pg_chameleon

MySQL to PostgreSQL replica system

BSD-2-CLAUSE License

Downloads
889
Stars
375
Committers
9

pg_chameleon - Release v1.0-beta.1

Published by the4thdoctor over 7 years ago

Pg_chameleon is a replication tool from MySQL to PostgreSQL, written in Python and supporting Python 2.7 and Python 3.3+.
The system relies on the mysql-replication library to pull the changes from MySQL and convert them into a jsonb object.
A plpgsql function decodes the jsonb and replays the changes into the PostgreSQL database.
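
As a rough illustration of the conversion step, the sketch below turns one row change into a JSON document that could be stored in a jsonb column. The function names and the layout of the document (schema, table, action, values) are assumptions made for this example and are not taken from the pg_chameleon source. Values that the standard json module cannot serialise, such as datetime.timedelta, are converted to strings.

```python
import datetime
import decimal
import json


def encode_fallback(value):
    """Convert values that json.dumps cannot serialise natively."""
    if isinstance(value, (datetime.datetime, datetime.date, datetime.time)):
        return value.isoformat()
    if isinstance(value, (datetime.timedelta, decimal.Decimal)):
        return str(value)
    if isinstance(value, bytes):
        return value.hex()
    raise TypeError("cannot serialise %r" % type(value))


def row_change_to_json(schema, table, action, values):
    """Build a JSON document describing one row change (hypothetical layout)."""
    change = {
        "schema": schema,
        "table": table,
        "action": action,  # for example "insert", "update" or "delete"
        "values": values,
    }
    return json.dumps(change, default=encode_fallback, sort_keys=True)


# Example row with two values that need the fallback encoder.
payload = row_change_to_json(
    "sakila",
    "film",
    "insert",
    {
        "film_id": 1,
        "length": datetime.timedelta(minutes=86),
        "rate": decimal.Decimal("0.99"),
    },
)
```

On the PostgreSQL side the resulting string can be cast to jsonb and decoded by a function, as described above.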

The tool requires an initial replica setup, which pulls the data from MySQL in read-only mode.
To do this, the tool runs FLUSH TABLES WITH READ LOCK; on the MySQL server.
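
The locking pattern can be sketched as follows. This is only an outline under stated assumptions: cursor is any DB-API cursor connected to MySQL, copy_tables is a hypothetical callable that reads the tables to replicate, and reading the binary log coordinates with SHOW MASTER STATUS is an assumption about how a start position could be recorded, not a description of the tool's code.

```python
def copy_under_read_lock(cursor, copy_tables):
    """Run copy_tables() while MySQL holds a global read lock.

    cursor is a DB-API cursor connected to MySQL; copy_tables is a
    hypothetical callable that performs the initial data copy.
    """
    cursor.execute("FLUSH TABLES WITH READ LOCK;")
    try:
        # Record the binary log coordinates the replica would start from.
        cursor.execute("SHOW MASTER STATUS;")
        coordinates = cursor.fetchone()
        copy_tables()
    finally:
        # Always release the lock, even if the copy fails.
        cursor.execute("UNLOCK TABLES;")
    return coordinates
```

The lock belongs to the session that issued it, so the connection behind cursor has to stay open for the whole copy.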

The tool can pull the data from a cascading replica when the MySQL slave is configured with log-slave-updates.

Changelog from 1.0-alpha.4

  • changed how the non-Python files are packaged so that system-wide installations work properly
  • fixed an issue with ALTER TABLE ADD CONSTRAINT
  • added datetime.timedelta to the JSON encoding exceptions
  • added support for enum in ALTER TABLE MODIFY
  • now requires psycopg2 2.7, which installs without the PostgreSQL headers
  • the write_batch function now uses copy_expert to speed up the batch load. The fallback to inserts is still present.
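
The copy-with-fallback idea from the last item can be sketched as below. This is not the tool's write_batch implementation: the signature, the CSV format and the one-transaction-per-batch handling are assumptions for illustration. The only psycopg2 calls used are connection.cursor(), cursor.copy_expert(sql, file), cursor.execute(), connection.rollback() and connection.commit().

```python
import csv
import io


def write_batch(connection, table, columns, rows):
    """Load rows with COPY, falling back to one INSERT per row on failure.

    connection is a psycopg2 connection that is not in autocommit mode.
    table and columns are assumed to be trusted identifiers.
    """
    cursor = connection.cursor()
    column_list = ", ".join(columns)

    # Serialise the batch as CSV in memory. None is written as an empty
    # field, which COPY's csv format reads as NULL.
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    buffer.seek(0)

    copy_sql = "COPY %s (%s) FROM STDIN WITH (FORMAT csv)" % (table, column_list)
    try:
        cursor.copy_expert(copy_sql, buffer)
        method = "copy"
    except Exception:  # real code would catch psycopg2.Error
        # The failed COPY aborted the transaction, so start a fresh one.
        connection.rollback()
        placeholders = ", ".join(["%s"] * len(columns))
        insert_sql = "INSERT INTO %s (%s) VALUES (%s)" % (
            table,
            column_list,
            placeholders,
        )
        for row in rows:
            cursor.execute(insert_sql, row)
        method = "insert"
    connection.commit()
    return method
```

Note that with this CSV encoding an empty string is also loaded as NULL by the COPY path, so a real implementation would need to quote empty strings explicitly.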

Caveats

  • ensure you are running the latest pip version. You can upgrade it within your virtual env by running pip install pip --upgrade

pg_chameleon - Release v1.0-alpha.4

Published by the4thdoctor over 7 years ago

Pg_chameleon is a replication tool from MySQL to PostgreSQL, written in Python and supporting Python 2.7 and Python 3.3+.
The system relies on the mysql-replication library to pull the changes from MySQL and convert them into a jsonb object.
A plpgsql function decodes the jsonb and replays the changes into the PostgreSQL database.

The tool requires an initial replica setup, which pulls the data from MySQL in read-only mode.
To do this, the tool runs FLUSH TABLES WITH READ LOCK; on the MySQL server.

The tool can pull the data from a cascading replica when the MySQL slave is configured with log-slave-updates.

Changelog from 1.0-alpha.3

  • Added batch retention to avoid bloating t_replica_batch
  • Packaged for pip: you can now install the replica tool in a virtual env by typing pip install pg_chameleon

Caveats

  • ensure you are running the latest pip version. You can upgrade it within your virtual env by running pip install pip --upgrade
  • psycopg2 requires the Python and PostgreSQL development files. This will be solved by the upcoming psycopg2 2.7
  • when the tool is installed system wide, the user directory .pg_chameleon is not created automatically. You can either create it by hand or use the global /usr/local/etc/pg_chameleon dir (not recommended for security reasons).

pg_chameleon - pg_chameleon v1.0-alpha.3

Published by the4thdoctor over 7 years ago

Pg_chameleon is a replication tool from MySQL to PostgreSQL, written in Python and supporting Python 2.7 and Python 3.3+.
The system relies on the mysql-replication library to pull the changes from MySQL and convert them into a jsonb object.
A plpgsql function decodes the jsonb and replays the changes into the PostgreSQL database.

The tool requires an initial replica setup, which pulls the data from MySQL in read-only mode.
To do this, the tool runs FLUSH TABLES WITH READ LOCK; on the MySQL server.

The tool can pull the data from a cascading replica when the MySQL slave is configured with log-slave-updates.

Changelog from 1.0-alpha.2

  • Basic DDL Support (CREATE/DROP/ALTER TABLE, DROP PRIMARY KEY)
  • Replica from multiple MySQL schemas or servers
  • Python 3 support

Installation in virtualenv

To work properly, the tool should be installed in a virtualenv, with the requirements installed via pip.

No daemon yet

The script should be executed in a screen session to keep it running. Currently the process is not respawned on failure, and there is no failure detector.

psycopg2 requires python and postgresql dev files

Installing psycopg2 via pip requires the Python development files and the PostgreSQL source code.
Please refer to your distribution's documentation for how to fulfil those requirements.

DDL replica limitations

DDL and DML mixed in the same transaction are not decoded in the right order. This can break the replica because of a wrong jsonb descriptor if the DML changes the data in the same table modified by the DDL. I am aware of the issue and I'm working on a solution.

Test please!

Please submit the issues you find.
Bear in mind this is an alpha release. If you use the software in production, keep an eye on the process to ensure the data is correctly replicated.

pg_chameleon - 1.0-alpha.2

Published by the4thdoctor almost 8 years ago

This is the second alpha release.
The system comes with the following limitations.

Changelog from alpha 1

Several fixes in the DDL replica, and added support for the CHANGE statement.
Added a check for an already running process, to prevent two replica processes from running at the same time.
Ported to Python 3.6. This is still experimental. Any feedback is more than welcome.
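
One common way to implement an "already running" check is a PID file. The sketch below shows that general technique; it is not necessarily how pg_chameleon does it, and the function names and the PID file path are hypothetical. It assumes Python 3 on a POSIX system, where sending signal 0 tests whether a process exists without affecting it.

```python
import os


def is_running(pid_file):
    """Return True if pid_file names a process that is still alive."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False  # no file, or unreadable contents: nothing is running
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire(pid_file):
    """Write our PID to pid_file unless another process already holds it."""
    if is_running(pid_file):
        return False
    with open(pid_file, "w") as handle:
        handle.write(str(os.getpid()))
    return True
```

A replica script could call acquire() at startup and exit when it returns False. The check and the write are not atomic, so this guards against accidental double starts rather than against two processes launched at exactly the same moment.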

Installation in virtualenv

To work properly, the tool should be installed in a virtualenv, with the requirements installed via pip.

No daemon yet

The script should be executed in a screen session to keep it running. Currently the process is not respawned on failure, and there is no failure detector.

psycopg2 requires python and postgresql dev files

Installing psycopg2 via pip requires the Python development files and the PostgreSQL source code.
Please refer to your distribution's documentation for how to fulfil those requirements.

DDL replica limitations

DDL and DML mixed in the same transaction are not decoded in the right order. This can break the replica because of a wrong jsonb descriptor if the DML changes the data in the same table modified by the DDL. I am aware of the issue and I'm working on a solution.

Test please!

Please submit the issues you find.
Bear in mind this is an alpha release. If you use the software in production, keep an eye on the process to ensure the data is correctly replicated.

pg_chameleon - 1.0 Alpha 1

Published by the4thdoctor almost 8 years ago

This is the first alpha release.
The system comes with the following limitations.

Installation in virtualenv

To work properly, the tool should be installed in a virtualenv, with the requirements installed via pip.

No daemon yet

The script should be executed in a screen session to keep it running. Currently the process is not respawned on failure, and there is no failure detector.

psycopg2 requires python and postgresql dev files

Installing psycopg2 via pip requires the Python development files and the PostgreSQL source code.
Please refer to your distribution's documentation for how to fulfil those requirements.

DDL replica limitations

DDL and DML mixed in the same transaction are not decoded in the right order. This can break the replica because of a wrong jsonb descriptor if the DML changes the data in the same table modified by the DDL. I am aware of the issue and I'm working on a solution.

Test please!

Please submit the issues you find.
Bear in mind this is an alpha release. If you use the software in production, keep an eye on the process to ensure the data is correctly replicated.