Configuration JSONs

The SMT CLI expects configuration JSONs when running minimal downtime migrations in two key modes:

  • Non-Sharded Minimal Downtime Migrations: In this mode, the SMT CLI expects a streamingCfg parameter containing the configuration details in JSON format.

  • Sharded Minimal Downtime Migrations: In this mode, the SMT CLI expects a config parameter containing the configuration details in JSON format. The structure is similar to streamingCfg but caters to sharded deployments.
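
As a rough illustration of how the two parameters relate to the --source-profile flag, the sketch below assembles the flag value in Python. It assumes the flag takes comma-separated key=value pairs; the helper name source_profile_arg and the file names streaming.json and sharding.json are hypothetical and not part of the SMT CLI.

```python
def source_profile_arg(param: str, json_path: str, **extra: str) -> str:
    """Build a --source-profile flag carrying the JSON path under `param`.

    Assumption: the flag value is a comma-separated list of key=value pairs.
    """
    if param not in ("streamingCfg", "config"):
        raise ValueError(f"unexpected parameter: {param}")
    # Any extra pairs (hypothetical, e.g. connection details) come first.
    pairs = {**extra, param: json_path}
    value = ",".join(f"{key}={val}" for key, val in pairs.items())
    return f"--source-profile={value}"


# Non-sharded mode uses streamingCfg; sharded mode uses config.
print(source_profile_arg("streamingCfg", "streaming.json"))
print(source_profile_arg("config", "sharding.json"))
```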

Table of contents
  1. StreamingCfg for Non-Sharded Minimal Downtime Migrations
  2. Config for Sharded Minimal Downtime Migrations
    1. Automatic generation of Connection Profiles

StreamingCfg for Non-Sharded Minimal Downtime Migrations

This JSON is passed to the streamingCfg parameter via the --source-profile flag when running non-sharded minimal downtime migrations.

Fields shown with empty values are optional.

{
    "datastreamCfg": {
        "streamId": "",
        "streamLocation": "us-central1",
        "streamDisplayName": "",
        "sourceConnectionConfig": {
            "Name": "my-source-connection-profile",
            "Location": "us-central1"
        },
        "destinationConnectionConfig": {
            "Name": "my-destination-connection-profile",
            "Location": "us-central1",
            "Prefix": ""
        },
        "properties": "replicationSlot=slot_name,publication=pub_name", 
        "tableList": ["table1", "table2", "table3"],
        "maxConcurrentBackfillTasks": "50",
        "maxConcurrentCdcTasks": "5"
    },
    "gcsCfg": {
        "ttlInDaysSet": true,
        "ttlInDays": 8
    },
    "dataflowCfg": {
        "projectId": "my-project",
        "jobName": "",
        "location": "us-central1",
        "hostProjectId": "my-vpc-host-project-id",
        "network": "my-vpc-network",
        "subnetwork": "my-vpc-subnetwork",
        "maxWorkers": "50",
        "numWorkers": "1",
        "machineType": "n1-standard-2",
        "serviceAccountEmail": "",
        "additionalUserLabels": "",
        "kmsKeyName": "",
        "gcsTemplatePath": ""
    },
    "tmpDir": "gs://my-bucket/path/to/directory"
}
  • datastreamCfg.properties is specific to PostgreSQL sources and specifies the replication slot and publication name.
  • tmpDir (a top-level field) is used to store SMT metadata files.
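
Before handing the file to the CLI, it can help to confirm that it is strict JSON and to see which optional fields were left empty. The following is a minimal sketch using only Python's standard json module; the helper empty_fields and the trimmed SAMPLE are illustrative assumptions, not part of SMT.

```python
import json


def empty_fields(node, prefix=""):
    """Yield dotted paths of every field whose value is an empty string."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            if value == "":
                yield path
            else:
                yield from empty_fields(value, path)


# Trimmed-down version of the sample above (illustrative values only).
SAMPLE = """
{
    "datastreamCfg": {
        "streamId": "",
        "streamLocation": "us-central1",
        "destinationConnectionConfig": {
            "Name": "my-destination-connection-profile",
            "Location": "us-central1",
            "Prefix": ""
        }
    },
    "gcsCfg": {"ttlInDaysSet": true, "ttlInDays": 8},
    "dataflowCfg": {"projectId": "my-project", "jobName": ""},
    "tmpDir": "gs://my-bucket/path/to/directory"
}
"""

# json.loads is strict: a trailing comma raises json.JSONDecodeError.
cfg = json.loads(SAMPLE)
omitted = list(empty_fields(cfg))
print("optional fields left empty:", omitted)
```

Fields reported here are the ones left empty in the file; whether a given default is acceptable still depends on the migration.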

Config for Sharded Minimal Downtime Migrations

This JSON is passed to the config parameter via the --source-profile flag when running a sharded minimal downtime migration.

Fields shown with empty values are optional.

{
    "configType": "dataflow",
    "shardConfigurationDataflow": {
        "schemaSource": {
            "host": "127.0.0.1",
            "user": "root",
            "password": "mypass",
            "port": "3306",
            "dbName": "test"
        },
        "dataShards": [
            {
                "dataShardId": "smt_datashard_Jo1B_gVrJ",
                "srcConnectionProfile": {
                    "name": "",
                    "host": "",
                    "user": "",
                    "port": "",
                    "password": "",
                    "location": ""
                },
                "dstConnectionProfile": {
                    "name": "",
                    "location": ""
                },
                "tmpDir": "gs://my-bucket/path-to-folder",
                "streamLocation": "us-central1",
                "databases": [
                    {
                        "dbName": "test",
                        "databaseId": "logical_shard1",
                        "refDataShardId": "smt_datashard_Jo1B_gVrJ"
                    }
                ]
            }
        ],
        "datastreamConfig": {
            "maxConcurrentBackfillTasks": "50",
            "maxConcurrentCdcTasks": "5"
        },
        "gcsConfig": {
            "ttlInDaysSet": true,
            "ttlInDays": "1"
        },
        "dataflowConfig": {
            "projectId": "my-project",
            "jobName": "",
            "location": "us-central1",
            "hostProjectId": "my-vpc-host-project",
            "network": "my-vpc-network",
            "subnetwork": "my-vpc-subnetwork",
            "maxWorkers": "50",
            "numWorkers": "1",
            "machineType": "n1-standard-2",
            "serviceAccountEmail": "",
            "additionalUserLabels": "",
            "kmsKeyName": "",
            "gcsTemplatePath": ""
        }
    }
}
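
In the sample above, each entry under databases carries a refDataShardId equal to the dataShardId of the shard that contains it. The sketch below checks that pairing for a config with several shards. Treating a mismatch as an error is an assumption drawn only from the sample, and check_shard_refs is a hypothetical helper, not an SMT function.

```python
def check_shard_refs(config):
    """Return (dataShardId, databaseId) pairs whose refDataShardId differs
    from the dataShardId of the enclosing shard.

    Assumption: the two IDs are expected to match, as in the sample config.
    """
    problems = []
    shards = config["shardConfigurationDataflow"]["dataShards"]
    for shard in shards:
        for db in shard.get("databases", []):
            if db.get("refDataShardId") != shard["dataShardId"]:
                problems.append((shard["dataShardId"], db.get("databaseId")))
    return problems


# Two shards; the second one references the wrong physical shard.
example = {
    "configType": "dataflow",
    "shardConfigurationDataflow": {
        "dataShards": [
            {
                "dataShardId": "shard_a",
                "databases": [
                    {"dbName": "test", "databaseId": "logical_shard1",
                     "refDataShardId": "shard_a"}
                ],
            },
            {
                "dataShardId": "shard_b",
                "databases": [
                    {"dbName": "test", "databaseId": "logical_shard2",
                     "refDataShardId": "shard_a"}
                ],
            },
        ]
    },
}

mismatches = check_shard_refs(example)
print(mismatches)
```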

Automatic generation of Connection Profiles

Any source or destination connection profile that does not already exist will be created.

  1. For the source connection profile, host, user, port and password must be provided so that the profile can be created. Name and location are optional: if the name is omitted, one is generated; if the location is omitted, the Spanner instance location is used.
  2. For the destination connection profile, no extra details are required. Name and location are optional.
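
The rule for source connection profiles can be checked ahead of time. The sketch below reports which of the four required fields are missing from a srcConnectionProfile entry; it only inspects the JSON values and makes no claim about how SMT itself decides whether a profile already exists. The helper name missing_for_creation and the sample values are hypothetical.

```python
# Fields the text above lists as required to create a source profile.
REQUIRED_FOR_CREATION = ("host", "user", "port", "password")


def missing_for_creation(src_profile):
    """Return the required fields that are absent or empty in the profile."""
    return [field for field in REQUIRED_FOR_CREATION
            if not src_profile.get(field)]


# name and location are left empty on purpose: both are optional.
profile = {
    "name": "",
    "host": "10.0.0.5",
    "user": "root",
    "port": "3306",
    "password": "",
    "location": "",
}

missing = missing_for_creation(profile)
print(missing)
```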