Configuration JSONs
The SMT CLI expects a configuration JSON when running minimal downtime migrations in two modes:
- Non-Sharded Minimal Downtime Migrations: In this mode, the SMT CLI expects a streamingCfg parameter containing the configuration details in JSON format.
- Sharded Minimal Downtime Migrations: In this mode, the SMT CLI expects a config parameter containing the configuration details in JSON format. It is similar to streamingCfg, but caters to sharded deployments.
StreamingCfg for Non-Sharded Minimal Downtime Migrations
This JSON is passed to the streamingCfg parameter via the --source-profile flag when running non-sharded minimal downtime migrations.
Fields shown with empty values are optional.
{
  "datastreamCfg": {
    "streamId": "",
    "streamLocation": "us-central1",
    "streamDisplayName": "",
    "sourceConnectionConfig": {
      "Name": "my-source-connection-profile",
      "Location": "us-central1"
    },
    "destinationConnectionConfig": {
      "Name": "my-destination-connection-profile",
      "Location": "us-central1",
      "Prefix": ""
    },
    "properties": "replicationSlot=slot_name,publication=pub_name",
    "tableList": ["table1", "table2", "table3"],
    "maxConcurrentBackfillTasks": "50",
    "maxConcurrentCdcTasks": "5"
  },
  "gcsCfg": {
    "ttlInDaysSet": true,
    "ttlInDays": 8
  },
  "dataflowCfg": {
    "projectId": "my-project",
    "jobName": "",
    "location": "us-central1",
    "hostProjectId": "my-vpc-host-project-id",
    "network": "my-vpc-network",
    "subnetwork": "my-vpc-subnetwork",
    "maxWorkers": "50",
    "numWorkers": "1",
    "machineType": "n1-standard-2",
    "serviceAccountEmail": "",
    "additionalUserLabels": "",
    "kmsKeyName": "",
    "gcsTemplatePath": ""
  },
  "tmpDir": "gs://my-bucket/path/to/directory"
}
- datastreamCfg.properties is specific to PostgreSQL and specifies the replication slot and publication name.
- tmpDir (a top-level field in the example above) is used to store SMT metadata files.
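Hand-edited JSON is easy to break with a stray trailing comma, so one option is to generate the file programmatically. The Python sketch below builds a partial streamingCfg using field names taken from the example above and serializes it with the standard json module. The helper names build_streaming_cfg and write_cfg, their arguments, and the subset of fields they fill in are illustrative assumptions, not part of SMT; which fields the tool actually requires should be checked against the tool itself.

```python
import json


def build_streaming_cfg(src_profile, dst_profile, location, tmp_dir,
                        replication_slot=None, publication=None):
    """Return a partial streamingCfg dict (hypothetical helper).

    Field names are copied from the example JSON; only a subset is filled in.
    """
    datastream_cfg = {
        "streamLocation": location,
        "sourceConnectionConfig": {"Name": src_profile, "Location": location},
        "destinationConnectionConfig": {"Name": dst_profile, "Location": location},
    }
    # properties is PostgreSQL-specific: comma-separated key=value pairs.
    if replication_slot and publication:
        datastream_cfg["properties"] = (
            f"replicationSlot={replication_slot},publication={publication}"
        )
    # tmpDir sits at the top level, next to datastreamCfg.
    return {"datastreamCfg": datastream_cfg, "tmpDir": tmp_dir}


def write_cfg(cfg, path):
    """Serialize cfg to path; json.dump never emits trailing commas."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2)
```

Loading an existing file with json.load before passing it to the CLI is a similarly cheap way to catch syntax errors early.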
Config for Sharded Minimal Downtime Migrations
This JSON is passed to the config parameter via the --source-profile flag when running a sharded minimal downtime migration.
Fields shown with empty values are optional.
{
  "configType": "dataflow",
  "shardConfigurationDataflow": {
    "schemaSource": {
      "host": "127.0.0.1",
      "user": "root",
      "password": "mypass",
      "port": "3306",
      "dbName": "test"
    },
    "dataShards": [
      {
        "dataShardId": "smt_datashard_Jo1B_gVrJ",
        "srcConnectionProfile": {
          "name": "",
          "host": "",
          "user": "",
          "port": "",
          "password": "",
          "location": ""
        },
        "dstConnectionProfile": {
          "name": "",
          "location": ""
        },
        "tmpDir": "gs://my-bucket/path-to-folder",
        "streamLocation": "us-central1",
        "databases": [
          {
            "dbName": "test",
            "databaseId": "logical_shard1",
            "refDataShardId": "smt_datashard_Jo1B_gVrJ"
          }
        ]
      }
    ],
    "datastreamConfig": {
      "maxConcurrentBackfillTasks": "50",
      "maxConcurrentCdcTasks": "5"
    },
    "gcsConfig": {
      "ttlInDaysSet": true,
      "ttlInDays": "1"
    },
    "dataflowConfig": {
      "projectId": "my-project",
      "jobName": "",
      "location": "us-central1",
      "hostProjectId": "my-vpc-host-project",
      "network": "my-vpc-network",
      "subnetwork": "my-vpc-subnetwork",
      "maxWorkers": "50",
      "numWorkers": "1",
      "machineType": "n1-standard-2",
      "serviceAccountEmail": "",
      "additionalUserLabels": "",
      "kmsKeyName": "",
      "gcsTemplatePath": ""
    }
  }
}
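In the example above, the refDataShardId of each database entry equals the dataShardId of the shard that contains it. Assuming that relationship is meant to hold in general (the text does not state it explicitly), a small Python check like the following can flag entries that disagree before the config is handed to the CLI. The function name find_shard_id_mismatches is hypothetical.

```python
def find_shard_id_mismatches(config):
    """Return (dataShardId, databaseId, refDataShardId) tuples that disagree.

    Assumption: a database's refDataShardId should match the dataShardId
    of its enclosing shard, as in the example config.
    """
    mismatches = []
    shards = config.get("shardConfigurationDataflow", {}).get("dataShards", [])
    for shard in shards:
        shard_id = shard.get("dataShardId")
        for db in shard.get("databases", []):
            if db.get("refDataShardId") != shard_id:
                mismatches.append(
                    (shard_id, db.get("databaseId"), db.get("refDataShardId"))
                )
    return mismatches
```

An empty list means every database entry points back at its own shard.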
Automatic generation of Connection Profiles
Any source or destination connection profile that does not exist will be created.
- For the Source Connection Profile, host, user, port and password must be provided so that the profile can be created. If the profile name is not provided, it will be generated. If the profile location is not provided, the Spanner instance location will be used.
- For the Destination Connection Profile, no extra details need to be provided. Name and location are optional.
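The source-profile rule above can be sketched as a pre-flight check in Python: given one srcConnectionProfile object from the sharded config, report which of the fields needed for automatic creation are empty or absent. The function and constant names are hypothetical, and the check cannot tell whether a profile with the given name already exists, in which case these fields would not be needed.

```python
# Fields the text lists as required to create a source connection profile.
REQUIRED_FOR_CREATION = ("host", "user", "port", "password")


def missing_source_profile_fields(src_profile):
    """Return the required fields that are empty or absent in src_profile.

    name and location are deliberately not checked: both are optional.
    """
    return [key for key in REQUIRED_FOR_CREATION if not src_profile.get(key)]
```

For example, a profile with host, user and port set but an empty password yields a one-element list containing "password".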