POC migrations for PostgreSQL
Spanner migration tool can either be used with a pg_dump file or it can be run directly on a PostgreSQL database (via go’s database/sql package).
Following instructions assume you have setup SMT by following the instructions in the installation guide.
Table of contents
Generating dump files
If you are using pg_dump , check that pg_dump is correctly configured to connect to your PostgreSQL database. Note that pg_dump uses the same options as psql to connect to your database. See the psql and pg_dump documentation.
Access to a PostgreSQL database is typically configured using the PGHOST, PGPORT, PGUSER, PGDATABASE environment variables, which are standard across PostgreSQL utilities.
It is also possible to configure access via pg_dump’s command-line options --host
, --port
, and --username
.
{ pg_dump } > file
and look at the output file. It should be a plain-text file containing SQL commands. If your database is large, consider just dumping the schema via the --schema-only
for pg_dump and --no-data
for pg_dump command-line option.
pg_dump can export data in a variety of formats, but Spanner migration tool only accepts plain
format (aka plain-text). See the pg_dump documentation for details about formats.
Using Spanner migration tool with pg_dump
The tool can used to migrate schema from an existing pg_dump file:
spanner-migration-tool schema -source=postgresql < my_pg_dump_file
You can use any of postgresql
, postgres
, or pg
as the argument to the -source
flag. They all specify PostgreSQL as the source database.
This will generate a session file with session.json
suffix. This file contains schema mapping from source to destination. You will need to specify this file during data migration. You also need to specify a particular Spanner instance and database to use during data migration.
For example, run
spanner-migration-tool data -session=mydb.session.json -source=pg -target-profile="instance=my-spanner-instance,dbName=my-spanner-database-name" < my_pg_dump_file
You can also run Spanner migration tool in a schema-and-data mode, where it will perform both schema and data migration. This is useful for quick evaluation when source database size is small.
spanner-migration-tool schema-and-data -source=pg -target-profile="instance=my-spanner-instance" < my_pg_dump_file
Spanner migration tool generates a report file, a schema file, and a bad-data file (if there are bad-data rows). You can control where these files are written by specifying a file prefix. For example,
spanner-migration-tool schema -prefix=mydb. -source=postgres < my_pg_dump_file
will write files mydb.report.txt
, mydb.schema.txt
, and mydb.dropped.txt
. The prefix can also be a directory. For example,
spanner-migration-tool schema -prefix=~/spanner-eval-mydb/ -source=postgres < my_pg_dump_file
would write the files into the directory ~/spanner-eval-mydb/
. Note that Spanner migration tool will not create directories as it writes these files.
Sample dump files
If you don’t have ready access to a PostgreSQL database, some example dump files can be found here. The file cart.pg_dump contains pg_dump for a very basic shopping cart application (just two tables, one for products and one for user carts). The file singers.pg_dump contains pg_dump output for a version of the Cloud Spanner singers example.
Directly connecting to a PostgreSQL database
In this case, Spanner migration tool connects directly to the PostgreSQL database to retrieve table schema and data. Set the -source=postgres
and corresponding source profile connection parameters host
, port
, user
, dbName
and password
.
For example, to perform schema conversion, run
spanner-migration-tool schema -source=postgres -source-profile="host=<>,port=<>,user=<>,dbName=<>"
Parameters port
and password
are optional. Port (port
) defaults to 5432
for PostgreSQL source. Password can be provided at the password prompt.
Alternatively, you can also set environment variables PGHOST
, PGPORT
, PGUSER
, PGDATABASE
for direct access. Password can be specified either in the PGPASSWORD
environment variable or provided at the password prompt.
Note that the various target-profile params described in the previous section are also applicable in direct connect mode.