Permissions & Connectivity
- Connectivity: Because both the Spanner migration tool and the underlying GCP services talk to the source database for schema and data migration, certain prerequisite connectivity configurations are required before using the tool.
- Permissions: The Spanner migration tool (SMT) runs in the customer's GCP account. To orchestrate migrations, SMT needs certain permissions.
Connectivity
API enablement
Ensure that the Datastream and Dataflow APIs are enabled on your project.
- Make sure that billing is enabled for your Google Cloud project.
- Follow the Datastream guidelines to enable the Datastream API.
- Enable the Dataflow API by using:
gcloud services enable dataflow.googleapis.com
- Google Cloud Storage APIs are generally enabled by default. If they have been disabled, you will need to enable them:
gcloud services enable storage.googleapis.com
- Enable the Pub/Sub API by using:
gcloud services enable pubsub.googleapis.com
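The individual commands above can also be combined, since `gcloud services enable` accepts several service names at once. The following is a minimal sketch, not part of SMT: the function name `print_enable_apis` is hypothetical, and `datastream.googleapis.com` is assumed to be the service name behind the Datastream guidelines. The function only prints the command so that it can be reviewed before it is run.

```shell
# Print a single `gcloud services enable` command covering the APIs listed above.
# Usage: print_enable_apis <PROJECT_ID>
# Note: print_enable_apis is a hypothetical helper name, not an SMT command.
print_enable_apis() {
  project_id="${1:-}"
  if [ -z "$project_id" ]; then
    echo "usage: print_enable_apis <PROJECT_ID>" >&2
    return 1
  fi
  # Datastream, Dataflow, Cloud Storage, and Pub/Sub, in that order.
  printf 'gcloud services enable %s --project=%s\n' \
    "datastream.googleapis.com dataflow.googleapis.com storage.googleapis.com pubsub.googleapis.com" \
    "$project_id"
}
```

For example, `print_enable_apis my-project` prints the command for a project named `my-project`; copy the printed line into a terminal once it looks right.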
Configuring connectivity for spanner-migration-tool
In order for SMT to read the information schema from the source database, ensure that the machine where you run spanner-migration-tool
is allowlisted to connect to the source database. In generic terms (your specific network settings may differ), do the following:
- Open your source database machine’s network firewall rules.
- Create an inbound rule.
- Set the source IP address to the IP address of the machine where you run spanner-migration-tool.
- Set the protocol to TCP.
- Set the port to the TCP port on which your database listens.
- Save the firewall rule, and then exit.
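Once the firewall rule is saved, it can help to confirm from the machine that runs spanner-migration-tool that the database port is actually reachable. The sketch below is one possible check, assuming `bash` and the coreutils `timeout` command are available on that machine. The function name `check_tcp` is hypothetical.

```shell
# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection opens, 1 if it does not, 2 on bad arguments.
# Note: check_tcp is a hypothetical helper, not part of SMT.
check_tcp() {
  host="${1:-}"
  port="${2:-}"
  wait_s="${3:-5}"
  if [ -z "$host" ]; then
    echo "host is required" >&2
    return 2
  fi
  case "$port" in
    ''|*[!0-9]*) echo "port must be a number" >&2; return 2 ;;
  esac
  # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the wait.
  if timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "reachable: $host:$port"
  else
    echo "NOT reachable: $host:$port" >&2
    return 1
  fi
}
```

For instance, `check_tcp db.example.internal 3306` would test a MySQL-style port on a placeholder host name; substitute your own host and port. A successful connection only shows that the network path is open, not that the database credentials are valid.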
Configure Datastream connectivity to source database
This is only needed for minimal downtime migrations via Spanner migration tool.
Follow the Datastream guidelines to allowlist Datastream to access the source database.
Configure source database to enable CDC capture via Datastream
This is only needed for minimal downtime migrations via Spanner migration tool.
Even if the source database is reachable via Datastream, certain prerequisites must be completed on the source database before Datastream can stream backfill and CDC events from it. The steps required vary for each database. Validate that the following steps have been performed on the source database before using SMT.
Configuring connectivity for Dataflow
This is only required if you plan to run Dataflow inside a VPC.
Follow the Internet access for Dataflow guidelines to allow the necessary access from the VPC in which you will run the Dataflow jobs.
Permissions
The Spanner migration tool interacts with many GCP services. Please refer to this list for permissions required to perform migrations.
Spanner
The recommended role to perform migrations is Cloud Spanner Database Admin.
The full list of Spanner permissions required for migration is:
spanner.instances.list
spanner.instances.get
spanner.databases.create
spanner.databases.list
spanner.databases.get
spanner.databases.getDdl
spanner.databases.updateDdl
spanner.databases.read
spanner.databases.write
spanner.databases.select
Refer to the grant permissions page for custom roles.
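If you prefer a custom role over Cloud Spanner Database Admin, the permission list above can be passed to `gcloud iam roles create`. This is a rough sketch rather than an official SMT procedure: the function name `print_create_smt_role` and the default role ID `smtSpannerMigration` are hypothetical, and the permissions are copied verbatim from the list above. The function prints the command for review instead of running it.

```shell
# Print a `gcloud iam roles create` command for a custom role that holds
# the Spanner permissions listed above.
# Usage: print_create_smt_role <PROJECT_ID> [ROLE_ID]
# Note: the helper name and the default role ID are hypothetical.
print_create_smt_role() {
  project_id="${1:-}"
  role_id="${2:-smtSpannerMigration}"
  if [ -z "$project_id" ]; then
    echo "usage: print_create_smt_role <PROJECT_ID> [ROLE_ID]" >&2
    return 1
  fi
  # Comma-separated, in the same order as the documented list.
  permissions="spanner.instances.list,spanner.instances.get"
  permissions="$permissions,spanner.databases.create,spanner.databases.list"
  permissions="$permissions,spanner.databases.get,spanner.databases.getDdl"
  permissions="$permissions,spanner.databases.updateDdl,spanner.databases.read"
  permissions="$permissions,spanner.databases.write,spanner.databases.select"
  printf 'gcloud iam roles create %s --project=%s --permissions=%s\n' \
    "$role_id" "$project_id" "$permissions"
}
```

Running `print_create_smt_role my-project` prints the command for a placeholder project; the resulting role still has to be granted to the user or service account that runs the migration.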
Datastream
Follow this guide to enable Datastream permissions.
Dataflow
Follow this guide to enable Dataflow permissions.
GCS
Grant the user the Editor role so that they can create buckets in the project.
GCE
Enable access to Datastream, Dataflow and Spanner using service accounts.
Pub/Sub
Grant the user the Pub/Sub Editor role so that they can create the Pub/Sub topic and subscription used for low downtime migrations.
Additionally, the GCS service agent needs the Pub/Sub Publisher role. This enables GCS to push a notification to a Pub/Sub topic whenever a new file is created. Refer to this page for more details.
- Get the GCS service agent ID using the following command:
gcloud storage service-agent --project=<PROJECT_ID>
- Grant the Pub/Sub Publisher role to the service agent using the following command:
gcloud projects add-iam-policy-binding <PROJECT_ID> --member=serviceAccount:<GCS_SERVICE_ACCOUNT_ID> --role=roles/pubsub.publisher
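The two steps above can be chained so that the service agent ID does not have to be copied by hand. The sketch below assumes that `gcloud storage service-agent` prints only the service agent's email address; check the output in your environment before relying on it. The function name `grant_gcs_pubsub_publisher` and the `GCLOUD` override variable are hypothetical conveniences, the latter so that the function can be exercised against a stub.

```shell
# Look up the GCS service agent and grant it the Pub/Sub Publisher role.
# Usage: grant_gcs_pubsub_publisher <PROJECT_ID>
# Note: the helper name and the GCLOUD override are hypothetical.
grant_gcs_pubsub_publisher() {
  project_id="${1:-}"
  gcloud_bin="${GCLOUD:-gcloud}"   # set GCLOUD to substitute a stub
  if [ -z "$project_id" ]; then
    echo "usage: grant_gcs_pubsub_publisher <PROJECT_ID>" >&2
    return 1
  fi
  # Step 1: get the GCS service agent ID for the project.
  agent="$("$gcloud_bin" storage service-agent --project="$project_id")" || return 1
  if [ -z "$agent" ]; then
    echo "could not determine the GCS service agent" >&2
    return 1
  fi
  # Step 2: allow that agent to publish notifications to Pub/Sub.
  "$gcloud_bin" projects add-iam-policy-binding "$project_id" \
    --member="serviceAccount:$agent" \
    --role="roles/pubsub.publisher"
}
```

Unlike a dry run, calling `grant_gcs_pubsub_publisher <PROJECT_ID>` with the real `gcloud` changes the project's IAM policy, so run it only with an account that is allowed to do so.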
Cloud Monitoring
To create the monitoring dashboard, the service account must be granted the Monitoring Editor role. To view the dashboard in the Cloud Console, the user must have the Monitoring Viewer role. To edit the dashboard, the Monitoring Editor role is required. For further information, follow this guide.
Other Permissions
In addition to these, the DatastreamToSpanner pipeline created by SMT requires the following roles as well:
- Dataflow service account:
  - Storage Object Creator
  - Storage Object Viewer
- Dataflow compute engine service account:
  - Cloud Spanner Database User
  - Cloud Spanner Restore Admin
  - Cloud Spanner Viewer
  - Dataflow Worker
  - Pub/Sub Subscriber
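The role grants above can be expressed as a series of `gcloud projects add-iam-policy-binding` commands. The following is an illustrative sketch: the function name `print_dataflow_role_bindings` is hypothetical, the role IDs are my mapping of the display names above to their usual IAM role IDs and should be verified, and both service account emails are supplied by the caller rather than derived. The function prints the commands for review instead of running them.

```shell
# Print the IAM binding commands for the DatastreamToSpanner pipeline roles.
# Usage: print_dataflow_role_bindings <PROJECT_ID> <DATAFLOW_SA> <COMPUTE_SA>
# Note: the helper name is hypothetical; verify the role IDs before use.
print_dataflow_role_bindings() {
  project_id="${1:-}"
  dataflow_sa="${2:-}"
  compute_sa="${3:-}"
  if [ -z "$project_id" ] || [ -z "$dataflow_sa" ] || [ -z "$compute_sa" ]; then
    echo "usage: print_dataflow_role_bindings <PROJECT_ID> <DATAFLOW_SA> <COMPUTE_SA>" >&2
    return 1
  fi
  fmt='gcloud projects add-iam-policy-binding %s --member=serviceAccount:%s --role=%s\n'
  # Dataflow service account: Storage Object Creator and Viewer.
  for role in roles/storage.objectCreator roles/storage.objectViewer; do
    printf "$fmt" "$project_id" "$dataflow_sa" "$role"
  done
  # Dataflow compute engine service account: Spanner, Dataflow, and Pub/Sub roles.
  for role in roles/spanner.databaseUser roles/spanner.restoreAdmin \
              roles/spanner.viewer roles/dataflow.worker roles/pubsub.subscriber; do
    printf "$fmt" "$project_id" "$compute_sa" "$role"
  done
}
```

The output is seven commands, one per role, which can be reviewed and then run one at a time.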