In a set of four tutorials, we are exploring migrating a MySQL database instance to a PostgreSQL database by using AWS Database Migration Service (DMS). In the first tutorial, “Migrating MySQL to PostgreSQL on AWS RDS, Part 1,” we introduced DMS and created an IAM user for the DMS. In the second tutorial, Migrating MySQL to PostgreSQL on AWS RDS, Part 2,” we created instances of the MySQL and PostgreSQL on RDS and connected to the two instances. In this continuation tutorial, we shall create a DMS migration to perform the migration. This tutorial has the following sections:
- Creating a DMS Migration
- Creating a Replication Instance
- Creating Migration Task Endpoints
- Creating a Migration Task
- Conclusion
Creating a DMS Migration
In this section, we shall create a Database Migration Service (DMS) to migrate the MySQL Database to PostgreSQL database. A DMS migration consists of the following components:
- Replication Instance
- Database Endpoints
- Task
We shall discuss creating each of these in sub-sections, but first you need to create a DMS migration. Navigate to the DMS dashboard and click Create migration, as shown in Figure 1.
Figure 1: Create migration
The DMS wizard gets started. Click Next, as shown in Figure 2.
Figure 2: DMS Wizard
Creating a Replication Instance
Next, configure a replication instance, which establishes the connection between the source and target databases, transfers the data, and caches data changes that occur during the initial data load. Specify a replication instance Name and Description, and select an Instance class (the default is dms.t2.medium), as shown in Figure 3. Select a VPC and select the option to configure high availability with Mult-Az; the default setting is set to “No.” Select the option Publicly accessible.
Figure 3: Configuring a Replication Instance
In Advanced, keep the default settings for Allocated storage (50 GB), Replication Subnet Group, Availability zone (No preference), and VPC Security Group (Use default), as shown in Figure 4. Select the KMS master key (dbms) created earlier in the Setting the Environment section.
Figure 4: Configuring Advanced Settings for Replication Instance
Click Next, as shown in Figure 5.
Figure 5: Next
The replication instance begins to get created, as shown by the message in Figure 6. Next, the source and target database connections need to be configured, which we shall accomplish in the next sub-section.
Figure 6: Replication Instance begins to get created
Creating Migration Task Endpoints
The replication instance could take a few minutes to get created. The migration endpoints may be added while the replication instance is being created. Select the Source engine as “mysql,” as shown in Figure 7.
Figure 7: Selecting Source Engine
Select Target engine as “postgres,” as shown in Figure 8.
Figure 8: Selecting Target Engine
In Servername, specify the Endpoint for the RDS DB instance for the database by removing the :port suffix. For a MySQL database, the RDS Endpoint is mysqldb.crbmlbxmp8qi.us-east-1.rds.amazonaws.com:3306. Therefore, specify the Servername as mysqldb.crbmlbxmp8qi.us-east-1.rds.amazonaws.com, as shown in Figure 9. For a PostgreSQL database on RDS, the Endpoint is postgresdb.crbmlbxmp8qi.us-east-1.rds.amazonaws.com:5432; therefore, specify the Servername as postgresdb.crbmlbxmp8qi.us-east-1.rds.amazonaws.com. Specify the port separately for the source and target databases: 3306 for MySQL database and 5432 for Postgres.
Figure 9: Specifying Servername and Port for Source and Target Databases
Scroll down for more endpoint configuration settings, as shown in Figure 10. Select SSL mode as none for both the source and target databases. Specify the Username and Password as configured when creating the RDS DB instances. Specify the PostgreSQL Database name (postgresdb), also as configured when creating the RDS DB instance.
Figure 10: Specifying SSL mode, Username, Password, and SID or Database name
In Advanced settings, extra connection attributes available for MySQL Database and PostgreSQL database may be specified, but none are required for this tutorial. Select the KMS master key (dbms) for both the source and target databases (see Figure 11). To test the DMS endpoints, the Run test buttons are provided, as shown in Figure 11. The Run test buttons are grayed out or disabled till the replication instance has been created.
Figure 11: Configuring Advanced Settings for DMS Endpoints
When the replication instance has been created, the message “Replication instance created successfully” gets displayed, as shown in Figure 12.
Figure 12: Replication instance created successfully
After the replication instance has been created, the Run test buttons become enabled. Click Run test, as shown in Figure 13, for each database.
Figure 13: Run test
The connections start to get tested, as indicated by the message “Testing endpoint connection” in Figure 14.
Figure 14: Testing Endpoint Connections
If the endpoint connections have been configured as required, the message “Connection tested successfully” should get displayed, as shown in Figure 15. Click Next.
Figure 15: Connection tested successfully
Creating a Migration Task
Having created a replication instance and database endpoints, next we shall create a migration task to connect to the endpoints and actually transfer the data. In the Create task wizard, specify a Task name (a default is also specified) and add a task description (see Figure 16). Select the option to Start task on create.
Figure 16: Create task wizard
The Source and Target endpoints and the replication instance were previously configured and are non-modifiable when creating a task. Select a Migration type, which lists three options, as shown in Figure 17.
- Migrate existing data: Migrates the existing schemas, tables, and table data that already exist in the source database, but does not migrate subsequent changes on an ongoing basis.
- Migrate existing data and replicate ongoing changes: Migrates the existing schemas, tables, and table data that already exist in the source database and also migrates subsequent changes on an ongoing basis.
- Replicate data changes only: Does not migrate existing schemas, tables, and data and migrates only the data changes.
Select the Migrate existing data option, as shown in Figure 17. To migrate changes, which is what the other two options provide, the binary log retention time on the MySQL DB instance must be increased to 24 hours or more.
Figure 17: Selecting Migration type as Migrate existing data
Next, configure the Task Settings. The Target table preparation mode setting applies to the tables on the target database and provides three options:
- Do nothing: Data and metadata of the target tables are not changed
- Drop tables on target: The existing tables, if any, are dropped and new tables are created
- Truncate: Tables are truncated, which implies that the table data is deleted but the table metadata is not changed. The Include LOB columns in replication setting applies to the LOB data type columns in the source database, and provides three options:
- Don’t include LOB columns: LOB columns are excluded from the migration
- Full LOB mode: Migrates complete LOBs regardless of the size; LOBs are migrated in chunks, which could slow down the migration process
- Limited LOB mode: Truncate the LOBs to the size specified in max LOB size (kb)
Select Target table preparation mode as Do nothing, as shown in Figure 18. Select Include LOB columns in replication as Limited LOB mode and specify Max LOB size as 32 kb (default). Select the Enable logging option.
Figure 18: Task Settings
Next, add selection rules and transformation rules in Table mappings, as shown in Figure 19. Table mappings may be added by using the Guided user interface or as JSON. The Guided UI is recommended. At least one selection rule must be added and transformation rules are optional. Selection rules are applied, whereas selecting schemas, tables, and columns from the source database and transformation rules are applied before schemas, tables, and columns are migrated to the target database.
Figure 19: Table mappings
The schema may be selected from the Schema name is drop-down or, if none are listed, select Enter a schema and specify a schema in the Schema name is like field. The Action drop-down lists two options: Include and Exclude. The Include option includes the selections made for schemas and tables and the Exclude option excludes the schemas and tables. The Excludes are processed after the Includes. Not all schemas have to be selected and not all tables from a schema have to be selected.
We shall add the following selection rule:
- Include all schemas and all tables from the source database
For the selection rule, select Enter a schema and specify Schema name is as %, which selects all schemas in the source database, as shown in Figure 20. Specify Table name is like as %, which selects all tables in the selected schemas. Select Action as Include.
Figure 20: Configuring a Selection Rule
Click Add selection rule (see Figure 21).
Figure 21: Adding a selection rule
A selection rule gets added, as shown in Figure 22. When the task is created, an IAM role dms-cloudwatch-logs-role is created to allow DMS to access CloudWatch.
Figure 22: Selection rule added
Next, add a transformation rule. For this, click the add transformation rule link, as shown in Figure 23.
Figure 23: Add transformation rule
A transformation rule has settings for the Target. The rule has three options, as shown in Figure 24.
- Schema
- Table
- Column
Where specifies a subset of the objects selected by the selection rules. The Action is the transformation that is to be applied and the following options are available:
- Rename to (available for Schema and Table objects)
- Remove column (available for Columns)
- Make lowercase (available for Schemas, Tables, and Columns)
- Make uppercase (available for Schemas, Tables, and Columns)
- Add prefix (available for Schemas, Tables, and Columns)
- Remove prefix (available for Schemas, Tables, and Columns)
Figure 24: Transformation rules Settings
PostgreSQL makes use of lowercase in schemas, tables, and columns. We shall add three transformation rules:
- Make all the Schemas lowercase
- Make all the Tables lowercase
- Make all the Columns lowercase
To add the first of these transformation rules, select Target as Schema as shown in Figure 25. Specify Schema name is like %. Select Action as Make lowercase and click Add transformation rule.
Figure 25: Adding a Transformation Rule
A transformation rule gets added (see Figure 26). To add another transformation rule, click the add transformation rule link again.
Figure 26: Transformation rule added
Select Target as Table, as shown in Figure 27. Specify Schema name is like %. Select Table name is like %. Select Action as Make lowercase and click Add transformation rule.
Figure 27: Adding a Transformation Rule to rename a table
Select Target as Column, as shown in Figure 28. Specify Schema name is like %. Select Table name is like %. Select Column name is like %. Select Action as Make lowercase and click Add transformation rule.
Figure 28: Adding a transformation rule to make Column name lowercase
The three selection rules and a transformation rule added are shown in Figure 29. When the task is run, the selection rules are applied before the transformation rules. For advance settings, click Advanced Settings, as shown in Figure 29.
Figure 29: Selection Rules and Transformation Rules
In Advanced Settings, select Control table settings and specify Create control table in target using schema as public, as shown in Figure 30, and click Done.
Figure 30: Setting Control Table Schema in Target
The Advanced setting gets added (see Figure 31).
Figure 31: Advanced Setting
After configuring the task, click Create task, as shown in Figure 32, to create the task.
Figure 32: Create task
A migration task gets created, as shown in Figure 33. Initially, the task status is “Creating.”
Figure 33: Migration task
A task may have one of the statuses discussed in Table 1.
Status | Description |
Creating | The task is being created. |
Ready | The task is ready to be started and run. Usually follows the “Creating” status. |
Starting | The task is starting, during which time the task connects with the replication instance and the database endpoints. Selection and transformation rules get applied. |
Running | The task is running, which implies that the task is migrating the database. |
Load Complete | The task has completed loading (migrating) the database. |
Failed | The task has failed. |
Error | An error has occurred during migration. Some of the schemas and tables may have migrated successfully but at least one schema or table has failed to migrate. |
Modifying | The task is modifying, usually after a user has modified the task. |
Stopping | The task is stopping, which is usually after user has selected to stop a task. |
Stopped | The task is stopped, which could be due to a user having stopped a task or because a Migration type of Migrate existing data and replicate ongoing changes is used and the task has completed the initial load. |
Deleting | The task is getting deleted, which is due to the user having deleted the task. |
Table 1: Task Statuses
Two IAM roles, one for CloudWatch and the other for VPC, get created, as shown in Figure 34.
Figure 34: IAM Roles for DMS
Conclusion
In this third tutorial on migrating a MySQL database instance on RDS to a Postgres database instance on RDS, we created a DMS migration including replication instance, migration endpoints, and migration task. In the fourth tutorial, we shall discuss running the DMS migration to perform the migration and evaluate results.