This feature is in preview. This feature is subject to change. To share feedback and/or issues, contact Support.
MOLT Verify checks for data discrepancies between a source database and CockroachDB during a database migration.
The tool performs the following verifications to ensure data integrity during a migration:
- Table Verification: Check that the structure of tables between the source database and the target database are the same.
- Column Definition Verification: Check that the column names, data types, constraints, nullability, and other attributes between the source database and the target database are the same.
- Row Value Verification: Check that the actual data in the tables is the same between the source database and the target database.
For a demo of MOLT Verify, watch the following video:
Supported databases
The following databases are currently supported:
- PostgreSQL
- MySQL
- CockroachDB
Install and run MOLT Verify
To install MOLT Verify, download the binary that matches your system. To download the latest binary:
Operating System | AMD 64-bit | ARM 64-bit |
---|---|---|
Windows | Download | Download |
Linux | Download | Download |
Mac | Download | Download |
For previous binaries, refer to the MOLT version manifest. For release details, see the MOLT changelog.
Setup
Complete the following items before using MOLT Verify:
The SQL user running MOLT Verify must have the
SELECT
privilege on both the source and target CockroachDB tables.Percent-encode the connection strings for the source database and CockroachDB. This ensures that the MOLT tools can parse special characters in your password.
Given a password
a$52&
, pass it to themolt escape-password
command with single quotes:molt escape-password 'a$52&'
Substitute the following encoded password in your original connection url string: a%2452%26
Use the encoded password in your connection string. For example:
postgres://postgres:a%2452%26@localhost:5432/replicationload
Flags
Flag | Description |
---|---|
--source |
(Required) Connection string for the source database. |
--target |
(Required) Connection string for the target database. |
--concurrency |
Number of threads to process at a time when reading the tables. Default: 16 For faster verification, set this flag to a higher value. |
--continuous |
Verify tables in a continuous loop. Default: false |
--live |
Retry verification on rows before emitting warnings or errors. This is useful during live data import, when temporary mismatches can occur. Default: false |
--metrics-listen-addr |
Address of the metrics endpoint, which has the path {address}/metrics .Default: '127.0.0.1:3030' |
--row-batch-size |
Number of rows to get from a table at a time. Default: 20000 |
--schema-filter |
Verify schemas that match a specified regular expression. |
--table-filter |
Verify tables that match a specified regular expression. |
Usage
molt verify
takes two SQL connection strings as --source
and --target
arguments.
To compare a PostgreSQL database with a CockroachDB database:
molt verify \
--source 'postgresql://{username}:{password}@{host}:{port}/{database}' \
--target 'postgresql://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'
To compare a MySQL database with a CockroachDB database:
molt verify \
--source 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}' \
--target 'postgresql://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'
Use the optional flags to customize the verification results.
When verification completes, the output displays a summary message like the following:
{"level":"info","type":"summary","table_schema":"public","table_name":"common_table","num_truth_rows":6,"num_success":3,"num_conditional_success":0,"num_missing":2,"num_mismatch":1,"num_extraneous":2,"num_live_retry":0,"num_column_mismatch":0,"message":"finished row verification on public.common_table (shard 1/1)"}
num_missing
is the number of rows that are missing on the target database. You can add any missing data to the target database and runmolt verify
again.num_mismatch
is the number of rows with mismatched values on the target database.num_extraneous
is the number of extraneous tables on the target database.num_column_mismatch
is the number of columns with mismatched types on the target database, preventingmolt verify
from comparing the column's rows. For example, if your source table uses an auto-incrementing ID, MOLT Verify will identify a mismatch with CockroachDB'sUUID
type. In such cases, you might fix the mismatch by creating a composite type on CockroachDB that uses the auto-incrementing ID.num_success
is the number of rows that matched.num_conditional_success
is the number of rows that matched while having a column mismatch due to a type difference. This value indicates that all other columns that could be compared have matched successfully. You should manually review the warnings and errors in the output to determine whether the column mismatches can be ignored.
Known limitations
- MOLT Verify compares 20,000 rows at a time by default, and row values can change between batches, potentially resulting in temporary inconsistencies in data. If
--live
mode is enabled, MOLT Verify retries verification on these rows. To configure the row batch size, use the--row_batch_size
flag. - MOLT Verify checks for collation mismatches on primary key columns. This may cause validation to fail when a
STRING
is used as a primary key and the source and target databases are using different collations. - MOLT Verify might give an error in case of schema changes on either the source or target database.
- Geospatial types cannot yet be compared.
The following limitations are specific to MySQL:
- MySQL enums and set types are not supported.
- MOLT Verify only supports comparing one MySQL database to a whole CockroachDB schema (which is assumed to be
public
).