pg_chameleon

MySQL to PostgreSQL replica system

BSD-2-CLAUSE License

Downloads: 889
Stars: 375
Committers: 9

pg_chameleon - v2.0.19 Latest Release

Published by the4thdoctor over 1 year ago

This maintenance release adds the following bug fixes and improvements.

Merges pull request #144. The mysql-replication support for PyMySQL>0.10.0 was introduced in mysql-replication v0.22.

Adds support for fillfactor when running init_replica: it is now possible to specify the fillfactor for the tables.
This is useful to mitigate bloat in advance when replicating or migrating from MySQL.

Improves logging on discarded rows: the discarded row image is now displayed in the log.

Adds DISTINCT to the group concat used when collecting foreign key metadata, to avoid duplicate fields in the foreign key definition.
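
The effect of the DISTINCT can be sketched with SQLite's group_concat standing in for MySQL's GROUP_CONCAT. The table fk_columns and its rows are hypothetical, and this is not the query pg_chameleon runs against the MySQL metadata; it only shows how duplicate rows leak into the aggregated column list.

```python
import sqlite3

# Hypothetical metadata rows: the same foreign key column is listed twice,
# as can happen when a metadata join returns duplicate rows.
rows = [
    ("fk_order_customer", "customer_id"),
    ("fk_order_customer", "customer_id"),
    ("fk_order_customer", "region_id"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fk_columns (constraint_name TEXT, column_name TEXT)")
conn.executemany("INSERT INTO fk_columns VALUES (?, ?)", rows)


def collect(distinct: bool) -> list:
    """Return the column names aggregated for the single constraint."""
    keyword = "DISTINCT " if distinct else ""
    sql = (
        f"SELECT group_concat({keyword}column_name) "
        "FROM fk_columns GROUP BY constraint_name"
    )
    (joined,) = conn.execute(sql).fetchone()
    return joined.split(",")


plain_columns = collect(distinct=False)     # the duplicate survives
distinct_columns = collect(distinct=True)   # each column appears once
```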

Uses mysql-replication>=0.31, which fixes the crash when replicating from MariaDB introduced in mysql-replication 0.27.

Changelog from v2.0.18

  • Merge pull request #144, mysql-replication support for PyMySQL>0.10.0 was introduced in v0.22
  • add support for fillfactor when running init_replica
  • improve logging on discarded rows
  • add distinct on group concat when collecting foreign keys
  • use mysql-replication>=0.31, fix for crash when replicating from MariaDB

pg_chameleon - v2.0.18

Published by the4thdoctor over 2 years ago

This maintenance release adds the following bug fixes and improvements.

Adds a new method copy_schema to copy only the schema without the data (EXPERIMENTAL).

Adds support for the ON DELETE and ON UPDATE clauses when creating the foreign keys in PostgreSQL with detach_replica
and copy_schema.

When running init_replica or copy_schema, the names of the indices and foreign keys are preserved.
Only if there are duplicate names does pg_chameleon intervene, ensuring that the names on PostgreSQL are unique within the same schema.

Adds a workaround for a regression introduced in mysql-replication by forcing the version to be lower than 0.27.

Changes the data type for the identifiers stored in the replica schema to varchar(64).

This release requires a replica catalogue upgrade, therefore it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.
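
The steps above can be sketched as a small dry-run helper that builds the commands in order. The config name, the database name and the backup file are hypothetical placeholders, and the pg_dump options shown are one possible way to back up only the sch_chameleon schema. The final step, starting the replicas, is left out because it depends on the sources configured.

```python
import shlex


def upgrade_commands(config: str, dbname: str, backup_file: str) -> list:
    """Build the upgrade steps as argument lists, in the order listed above."""
    return [
        ["chameleon", "stop_all_replicas", "--config", config],
        # Back up only the replica catalogue schema before touching it.
        ["pg_dump", "--schema=sch_chameleon", f"--file={backup_file}", dbname],
        ["pip", "install", "pg_chameleon", "--upgrade"],
        ["chameleon", "--version"],
        ["chameleon", "upgrade_replica_schema", "--config", config],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in upgrade_commands("default", "db_replica", "sch_chameleon.sql"):
        print(shlex.join(argv))
```

To execute the steps rather than print them, each argument list can be passed to subprocess.run(argv, check=True), so that a failing step stops the sequence.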

Changelog from v2.0.17

  • Support the ON DELETE and ON UPDATE clause when creating the foreign keys in PostgreSQL
  • change logic for index and foreign key names by managing only duplicates within same schema
  • use mysql-replication<0.27 as new versions crash when receiving queries
  • add copy_schema method for copying only the schema without data (EXPERIMENTAL)
  • change type for identifiers in replica schema to varchar(64)

pg_chameleon - v2.0.17

Published by the4thdoctor over 2 years ago

This maintenance release adds the following bug fixes.

Fix the wrong order in copy data/create indices when keep_existing_schema is No.

Previously the indices were created before the data was loaded into the target schema, which severely degraded performance.

This fix applies only if the parameter keep_existing_schema is set to No.

Collect the unique constraints when keep_existing_schema is Yes.

Previously unique constraints were not collected or dropped if they were defined as constraints instead of indices.

This fix applies only if the parameter keep_existing_schema is set to Yes.

This release adds the following changes:

  • Remove argparse from the requirements as now it's part of the python3 core dist
  • Remove check for log_bin when we replicate from Aurora MySQL
  • Manage the different behaviour in pyyaml to allow pg_chameleon to be installed as rpm in centos 7 via pgdg repository

This release works with Aurora MySQL. However, Aurora MySQL 5.6 segfaults when FLUSH TABLES WITH READ LOCK is issued.

The replica is tested on Aurora MySQL 5.7.

This release requires a replica catalogue upgrade, therefore it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

Changelog from v2.0.16

  • Remove argparse from the requirements
  • Collect the unique constraints when keep_existing_schema is Yes
  • Fix wrong order in copy data/create indices when keep_existing_schema is No
  • Remove check for log_bin when we are replicating from Aurora MySQL
  • Manage the different behaviour in pyyaml to allow pg_chameleon to be installed as rpm in centos 7

pg_chameleon - v2.0.16

Published by the4thdoctor about 4 years ago

This maintenance release fixes a crash in init_replica caused by an early disconnection during the fallback on insert.
This caused the end of the transaction to crash, aborting init_replica entirely.

Changelog from v2.0.15

  • Fix for issue #126 init_replica failure with tables on transactional engine and invalid data

pg_chameleon - v2.0.15

Published by the4thdoctor about 4 years ago

This maintenance release adds support for a reduced lock if the MySQL engine is transactional, thanks to @rascalDan.

The init_replica process checks whether the engine for the table is transactional and runs the initial copy within a transaction.
The process still requires a FLUSH TABLES WITH READ LOCK, but the lock is released as soon as the transaction snapshot is acquired.
This improvement allows pg_chameleon to run against primary databases with minimal impact during the init_replica process.
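
A minimal sketch of the statement order this implies, not the actual pg_chameleon code. It assumes the snapshot is taken with START TRANSACTION WITH CONSISTENT SNAPSHOT (the notes only say "transaction snapshot") and that, for a non-transactional engine, the lock is held for the whole copy. The "-- copy table data" entry is a placeholder for the data copy.

```python
def snapshot_statements(transactional: bool) -> list:
    """Return the MySQL statements around the initial copy of one table."""
    if not transactional:
        # Assumption: without a snapshot the lock must cover the whole copy.
        return [
            "FLUSH TABLES WITH READ LOCK",
            "-- copy table data",
            "UNLOCK TABLES",
        ]
    return [
        "FLUSH TABLES WITH READ LOCK",
        "START TRANSACTION WITH CONSISTENT SNAPSHOT",
        "UNLOCK TABLES",        # lock released as soon as the snapshot exists
        "-- copy table data",   # reads see the snapshot, writers are not blocked
        "COMMIT",
    ]


reduced = snapshot_statements(transactional=True)
full = snapshot_statements(transactional=False)
```

In the transactional case the lock only spans the time needed to open the snapshot, which is what keeps the impact on a primary database small.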

The python-mysql-replication requirement is now changed to version >=0.22, which adds support for PyMySQL >=0.10.0.
The requirement forcing PyMySQL to version <0.10.0 is therefore removed from setup.py.

Changelog from v2.0.14

  • Support for reduced lock if MySQL engine is transactional, thanks to @rascalDan
  • setup.py now requires python-mysql-replication to version 0.22 which adds support for PyMySQL >=0.10.0
  • removed PyMySQL requirement <0.10.0 from setup.py
  • prevent pg_chameleon from running as root

pg_chameleon - v2.0.14

Published by the4thdoctor about 4 years ago

This maintenance release improves the support for spatial datatypes.
When postgis is installed on the target database, the spatial data types
point, geometry, linestring, polygon, multipoint, multilinestring and geometrycollection are converted to
geometry and the data is replicated using the Well-Known Binary (WKB) format. As the MySQL implementation of WKB is not standard, pg_chameleon removes the first 4 bytes from the decoded binary data before sending it to PostgreSQL.
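
The byte handling can be illustrated with a short sketch. It assumes the value arrives in MySQL's internal geometry format, where the 4 leading bytes hold the SRID and the rest is standard WKB. The sample value and the helper names are made up for the example, and only the POINT type is decoded.

```python
import struct


def strip_mysql_prefix(raw: bytes) -> bytes:
    """Drop the 4 leading bytes MySQL puts in front of the WKB payload."""
    return raw[4:]


def parse_wkb_point(wkb: bytes) -> tuple:
    """Decode a standard WKB POINT into (x, y)."""
    byte_order = "<" if wkb[0] == 1 else ">"   # 1 = little endian, 0 = big endian
    (geom_type,) = struct.unpack(byte_order + "I", wkb[1:5])
    if geom_type != 1:
        raise ValueError("not a WKB POINT")
    return struct.unpack(byte_order + "dd", wkb[5:21])


# Hypothetical MySQL value: 4-byte SRID (0) followed by the WKB for POINT(1.5 2.5).
mysql_value = struct.pack("<I", 0) + struct.pack("<BIdd", 1, 1, 1.5, 2.5)
wkb = strip_mysql_prefix(mysql_value)
point = parse_wkb_point(wkb)
```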

When keep_existing_schema is set to yes, pg_chameleon now drops and recreates indices and primary keys during the init_replica process.
The foreign keys are dropped as well and recreated when the replica reaches the consistent status.
This way init_replica may complete successfully even when there are foreign keys in place, at the same speed as the usual init_replica.

The setup.py now forces PyMySQL to version <0.10.0 because it breaks the python-mysql-replication library (issue #117).

Thanks to @porshkevich, who fixed issue #115 by trimming the space from the PK index name.

This release requires a replica catalogue upgrade, therefore it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

If the upgrade procedure can't upgrade the replica catalogue because of running or errored replicas, it is possible to reset the statuses by
using the command chameleon enable_replica --source <source_name>.

If the catalogue upgrade is still not possible then you can downgrade pg_chameleon to the previous version. Please note that you may need to
manually install PyMySQL to fix the issue with the version 0.10.0.

pip install pg_chameleon==2.0.13

pip install "PyMySQL<0.10.0"

Changelog from v2.0.13

  • Add support for spatial data types (requires postgis installed on the target database)
  • When keep_existing_schema is set to yes, drop and recreate indices and constraints during the init_replica process
  • Fix for issue #115 thanks to @porshkevich
  • setup.py now forces PyMySQL to version <0.10.0 because it breaks the python-mysql-replication library (issue #117)

pg_chameleon - v2.0.13

Published by the4thdoctor over 4 years ago

This maintenance release adds EXPERIMENTAL support for the POINT datatype thanks to the contribution by @jovankricka-everon.

The support is currently limited to the POINT datatype only, with hardcoded logic to keep init_replica and the replica working.
However, as this feature is related to PostGIS, the next point release will rewrite this part of the code using a more general approach.

The release adds the keep_existing_schema parameter in the MySQL source type. When set to Yes, init_replica, refresh_schema and
sync_tables do not recreate the affected tables using the data from the MySQL source.
Instead the existing tables are truncated and the data is reloaded.

A REINDEX TABLE is executed in order to have the indices in good shape after the reload.
The next point release will very likely improve the approach on the reload and reindexing.

When keep_existing_schema is set to Yes the parameter grant_select_to has no effect.

From this release the codebase switched from tabs to spaces, following the guidelines in PEP-8.

Changelog from v2.0.12

  • EXPERIMENTAL support for Point datatype - @jovankricka-everon
  • Add keep_existing_schema in MySQL source type to keep the existing schema in place instead of rebuilding it from the mysql source
  • Change tabs to spaces in code

pg_chameleon - v2.0.12

Published by the4thdoctor almost 5 years ago

This maintenance release fixes the issue #96 where the replica initialisation failed on MySQL 8 because of the wrong field names pulled out from the information_schema.
Thanks to @daniel-qcode for contributing with his fix.

The configuration and SQL files are now moved into the directory pg_chameleon. This change simplifies the setup.py file and allows pg_chameleon to be
built as source and wheel package.

As Python 3.4 has now reached its end-of-life and has been retired, the minimum requirement for pg_chameleon has been updated to Python 3.5.

Changelog from v2.0.11

  • Fixes for issue #96 thanks to @daniel-qcode
  • Change for configuration and SQL files location
  • Package can build now as source and wheel
  • The minimum Python requirement is now 3.5

pg_chameleon - v2.0.11

Published by the4thdoctor almost 5 years ago

This maintenance release fixes a few things.
As reported in #95 the yaml files were not completely valid. @rebtoor fixed them.

@clifff made a pull request to have start_replica run in the foreground when log_file is set to stdout.
Previously the process remained in background with the log set to stdout.

As Travis seems to break down constantly, the CI configuration is disabled until a fix or a different CI is found.

Finally the method which loads the yaml file is now using an explicit loader as required by the new PyYAML version.

Previously, with newer versions of PyYAML, the library emitted a warning because the default loader is unsafe.

Changelog from v2.0.10

  • Fix wrong formatting for yaml example files. @rebtoor
  • Make start_replica run in foreground when log_file == stdout. @clifff
  • Travis seems to break down constantly. Disable the CI until a fix is found. Evaluate to use a different CI.
  • Add the loader to yaml.load as required by the new PyYAML version.

pg_chameleon - v2.0.10

Published by the4thdoctor about 6 years ago

This maintenance release fixes a regression caused by the new replay function with PostgreSQL 10. The unnested primary key was put in a cartesian product with the
json elements, generating NULL identifiers which made the subsequent format function fail.

This release adds a workaround for decoding the keys in MySQL's json fields. This allows the system to replicate the json data type as well.
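
A minimal sketch of such a conversion, assuming the keys and string values of the json field arrive as bytes from the replication library. The helper name to_text and the sample row are hypothetical, and the actual workaround in pg_chameleon may differ.

```python
def to_text(value):
    """Recursively convert bytes keys and values of a decoded json field to str."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    if isinstance(value, dict):
        return {to_text(key): to_text(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_text(item) for item in value]
    return value   # numbers, booleans and None are left untouched


decoded = to_text({b"name": b"foo", b"tags": [b"a", b"b"], b"size": 3})
```

After the conversion the dictionary can be serialised with json.dumps, which refuses bytes keys.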

The command enable_replica fixes a race condition when the maintenance flag is not returned to false (e.g. an application crash during the maintenance run) allowing the replica to start again.

The tokeniser for the CHANGE statement now parses the tables in the form of schema.table. However, the tokenised schema is not used to determine the
query's schema because the __read_replica_stream method uses the schema name pulled out from the MySQL binlog.

As this change requires a replica catalogue upgrade, it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

If the upgrade procedure refuses to upgrade the catalogue because of running or errored replicas, it is possible to reset the statuses using the command chameleon enable_replica --source <source_name>.

If the catalogue upgrade is still not possible, downgrading pg_chameleon to the previous version (e.g. pip install pg_chameleon==2.0.9) will make the replica startable again.

Changelog from v2.0.9

  • Fix regression in new replay function with PostgreSQL 10
  • Convert to string the dictionary entries pulled from a json field
  • Let enable_replica disable any leftover maintenance flag
  • Add capture in CHANGE for tables in the form schema.table

pg_chameleon - v2.0.9

Published by the4thdoctor about 6 years ago

This maintenance release fixes a wrong check for the next auto maintenance run if the maintenance wasn't run before.
Previously when changing the value of auto_maintenance from disabled to an interval, the process didn't run the automatic maintenance unless a manual maintenance
was executed before.
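
The corrected check can be sketched as follows. The function name and arguments are hypothetical, and the real check is made against the timestamps stored in the replica catalogue; the point is only that a missing previous run counts as due.

```python
from datetime import datetime, timedelta
from typing import Optional


def maintenance_due(last_run: Optional[datetime], interval: timedelta, now: datetime) -> bool:
    """Return True when the automatic maintenance should run."""
    if last_run is None:
        # No previous run recorded: treat the maintenance as due instead of
        # waiting for a manual run to set the first timestamp.
        return True
    return now - last_run >= interval


now = datetime(2024, 1, 2, 12, 0)
never_ran = maintenance_due(None, timedelta(days=1), now)
ran_recently = maintenance_due(datetime(2024, 1, 2, 6, 0), timedelta(days=1), now)
```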

This release adds improvements on the replay function's speed. The new version is now replaying the data without accessing the parent log partition and
the decoding logic has been simplified. Non-authoritative tests have shown a CPU gain of at least 10% and a better memory allocation.
However your mileage may vary.

The GTID operational mode has been improved by removing the blocking mode which caused increased lag in systems with larger binlog size.

As this change requires a replica catalogue upgrade, it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

If the upgrade procedure refuses to upgrade the catalogue because of running or errored replicas, it is possible to reset the statuses using the command chameleon enable_replica --source <source_name>.

If the catalogue upgrade is still not possible, downgrading pg_chameleon to the previous version (e.g. pip install pg_chameleon==2.0.8) will make the replica startable again.

Changelog from v2.0.8

  • Fix wrong check for the next auto maintenance run if the maintenance wasn't run before
  • Improve the replay function's speed
  • Remove blocking from the GTID operational mode

pg_chameleon - v2.0.8

Published by the4thdoctor over 6 years ago

This maintenance release adds the support for skip events. It is now possible to skip events (insert,delete,update) for single tables or for entire schemas.

A new optional source parameter skip_events: is available for the sources with type mysql.
Under skip_events there are three keys, one for each DML operation. It is possible to list an entire schema or single tables in the form of schema.table.
The example snippet disables the inserts on the table delphis_mediterranea.foo and the deletes on the entire schema delphis_mediterranea.

skip_events:
  insert:
    - delphis_mediterranea.foo #skips inserts on the table delphis_mediterranea.foo
  delete:
    - delphis_mediterranea #skips deletes on schema delphis_mediterranea
  update:   
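
How such a configuration could be evaluated is sketched below. The dictionary mirrors the snippet above as a YAML loader would return it, and is_skipped is a hypothetical helper, not a pg_chameleon function.

```python
# The skip_events section above, as loaded from YAML.
skip_events = {
    "insert": ["delphis_mediterranea.foo"],
    "delete": ["delphis_mediterranea"],
    "update": None,   # an empty YAML key loads as None
}


def is_skipped(event: str, schema: str, table: str, config: dict) -> bool:
    """Return True if the event on schema.table must not be replicated."""
    entries = config.get(event) or []
    # An entry matches either the whole schema or the single schema.table.
    return schema in entries or f"{schema}.{table}" in entries


skip_insert_foo = is_skipped("insert", "delphis_mediterranea", "foo", skip_events)
skip_insert_bar = is_skipped("insert", "delphis_mediterranea", "bar", skip_events)
skip_delete_bar = is_skipped("delete", "delphis_mediterranea", "bar", skip_events)
```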

The release 2.0.8 adds the EXPERIMENTAL support for the GTID for MySQL or Percona server. The GTID in MariaDB is currently not supported.
A new optional parameter gtid_enable: which defaults to No is available for the source type mysql.

When MySQL is configured with the GTID (https://dev.mysql.com/doc/refman/8.0/en/replication-gtids-concepts.html) and the parameter gtid_enable: is set to Yes, pg_chameleon will use the GTID to auto position the replica stream.
This allows pg_chameleon to reconfigure the source within the MySQL replicas without the need to run init_replica.

This feature has been extensively tested but, as it is new, it has to be considered EXPERIMENTAL.

ALTER TABLE RENAME is now correctly parsed and executed.
ALTER TABLE MODIFY is now parsed correctly when the field has a default value. Previously MODIFY with default values would be parsed wrongly and fail when translating to the PostgreSQL dialect.

The source no longer gets an error state when running with --debug.

The logged events are now cleaned when refreshing schema and syncing tables. Previously spurious logged events could lead to primary key violations when syncing single tables or refreshing single schemas.

As this change requires a replica catalogue upgrade, it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

If the upgrade procedure refuses to upgrade the catalogue because of running or errored replicas, it is possible to reset the statuses using the command chameleon enable_replica --source <source_name>.

If the catalogue upgrade is still not possible, downgrade pg_chameleon to the previous version, e.g. pip install pg_chameleon==2.0.7.

Changelog from v2.0.7

  • Add support for skip events as requested in issue #76. It is now possible to skip events (insert,delete,update) for single tables or for entire schemas.
  • EXPERIMENTAL support for the GTID. When configured on MySQL or Percona server pg_chameleon will use the GTID to auto position the replica stream. MariaDB is not supported by this change.
  • ALTER TABLE RENAME is now correctly parsed and executed
  • Add horrible hack to ALTER TABLE MODIFY. Previously modify with default values would parse wrongly and fail when translating to PostgreSQL dialect
  • Disable erroring the source when running with --debug switch enabled
  • Add cleanup for logged events when refreshing schema and syncing tables. Previously spurious logged events could lead to primary key violations when syncing single tables or refreshing single schemas.

pg_chameleon - v2.0.7

Published by the4thdoctor over 6 years ago

This maintenance release makes the multiprocess logging safe. Now each replica process logs in a separate file.

The --full option now works. Previously the option had no effect, causing the maintenance to always run a conventional vacuum.

This release fixes the issues reported in tickets #73 and #75 by pg_chameleon's users.

The bug reported in ticket #73 caused a wrong data type tokenisation when an alter table adds a column with options (e.g. ADD COLUMN foo DEFAULT NULL).

The bug reported in ticket #75 caused a wrong conversion to string for the row keys with None value during the cleanup of malformed rows for the init replica and the replica process.

A fix for the TRUNCATE TABLE tokenisation is implemented as well. Now if the statement specifies the table with the schema the truncate works properly.

A new optional source parameter, auto_maintenance, triggers a vacuum on the log tables after a specific timeout.
The timeout must be expressed as a PostgreSQL interval (e.g. "1 day"). The special value "disabled" disables the auto maintenance.
If the parameter is omitted the auto maintenance is disabled.

Changelog from v2.0.6

  • Fix for issue #71, make the multiprocess logging safe. Now each replica process logs in a separate file
  • Fix the --full option to store true instead of false. Previously the option had no effect.
  • Add auto_maintenance optional parameter to trigger a vacuum over the log tables after a specific timeout
  • Fix for issue #75, avoid the wrong conversion to string for None keys when cleaning up malformed rows during the init replica and replica process
  • Fix for issue #73, fix for wrong data type tokenisation when an alter table adds a column with options (e.g. ADD COLUMN foo DEFAULT NULL)
  • Fix wrong TRUNCATE TABLE tokenisation if the statement specifies the table with the schema.

pg_chameleon - v2.0.6

Published by the4thdoctor over 6 years ago

The maintenance release 2.0.6 fixes a crash occurring when a new column is added on the source database with the default value NOW().

The maintenance introduced in the version 2.0.5 is now less aggressive.
In particular the run_maintenance command now executes a conventional VACUUM on the source's log tables, unless the switch --full is specified. In that case a VACUUM FULL is executed.
The detach has been disabled and may be completely removed in future releases because it is very fragile and prone to errors.

However, running VACUUM FULL on the log tables can cause the other sources to be blocked during the maintenance run.

This release adds an optional parameter on_error_read: on the mysql type's sources which allows the read process to stay up if the MySQL database is refusing connections (e.g. MySQL down for maintenance).
Following the principle of least astonishment, the parameter, if omitted, doesn't cause any change of behaviour. If added with the value continue (e.g. on_error_read: continue),
it prevents the replica process from stopping in the case of connection issues with the MySQL database, and a warning is emitted in the replica log.
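
The behaviour can be sketched as below. The function read_step and the callable read_stream are hypothetical, and ConnectionError stands in for whatever exception the MySQL driver raises when the connection is refused.

```python
import logging

logger = logging.getLogger("replica")


def read_step(read_stream, on_error_read=None) -> bool:
    """Run one read iteration; return False if the read was skipped."""
    try:
        read_stream()
        return True
    except ConnectionError as error:
        if on_error_read != "continue":
            raise   # parameter omitted: behaviour unchanged, the error stops the process
        logger.warning("MySQL connection failed, the read will be retried: %s", error)
        return False
```

With on_error_read set to continue the caller can simply invoke read_step again on the next cycle, so the replica resumes once MySQL is reachable.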

This release adds the support for MySQL 5.5, which doesn't have the parameter binlog_row_image.

enable_replica now can reset the replica status to stopped even if the catalogue version is mismatched.
This simplifies the upgrade procedure in case of errored or wrongly running replicas.

As this change requires a replica catalogue upgrade, it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to open a screen session
  • Before upgrading pg_chameleon stop all the replica processes.
  • Upgrade the pg_chameleon package with pip install pg_chameleon --upgrade
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start the replica processes

If the upgrade procedure refuses to upgrade the catalogue because of running or errored replicas, it is possible to reset the statuses with the enable_replica command.

If the catalogue upgrade is still not possible, downgrading pg_chameleon to the version 2.0.5 with pip install pg_chameleon==2.0.5 should make the replicas startable again.

Changelog from v2.0.5

  • fix for issue #69 add source's optional parameter on_error_read: to allow the read process to continue in case of connection issues with the source database (e.g. MySQL in maintenance)
  • remove the detach partition during the maintenance process as this proved to be a very fragile approach
  • add switch --full to run a VACUUM FULL during the maintenance
  • when running the maintenance execute a VACUUM instead of a VACUUM FULL
  • fix for issue #68. fallback to binlog_row_image=FULL if the parameter is missing in mysql 5.5.
  • add cleanup for default value NOW() when adding a new column with ALTER TABLE
  • allow enable_replica to reset the source status in the case of a catalogue version mismatch

pg_chameleon - v2.0.5

Published by the4thdoctor over 6 years ago

The maintenance release 2.0.5 fixes a regression which prevented some tables from being synced with sync_tables when the parameter limit_tables was set.
Previously having two or more schemas mapped with only one schema listed in limit_tables prevented the other schema's tables from being synchronised with sync_tables.

This release adds two new commands to improve the general performance and the management.

The command stop_all_replicas stops all the running sources within the target PostgreSQL database.

The command run_maintenance performs a VACUUM FULL on the specified source's log tables.
To limit the impact on any other configured sources, the command performs the following steps.

  • The read and replay processes for the given source are paused
  • The log tables are detached from the parent table sch_chameleon.t_log_replica with the command NO INHERIT
  • The log tables are vacuumed with VACUUM FULL
  • The log tables are attached to the parent table sch_chameleon.t_log_replica with the command INHERIT
  • The read and replay processes are resumed
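
The middle three steps can be sketched as a generator of SQL statements. The log table names passed in are hypothetical, and the pause and resume of the read and replay processes are outside the scope of the sketch.

```python
PARENT = "sch_chameleon.t_log_replica"


def maintenance_statements(log_tables: list) -> list:
    """Return the detach, vacuum and attach statements for the given log tables."""
    statements = []
    for table in log_tables:
        # Detach first, so the VACUUM FULL does not involve the parent table.
        statements.append(f"ALTER TABLE {table} NO INHERIT {PARENT};")
    for table in log_tables:
        statements.append(f"VACUUM FULL {table};")
    for table in log_tables:
        statements.append(f"ALTER TABLE {table} INHERIT {PARENT};")
    return statements


# Hypothetical log table names for one source.
statements = maintenance_statements(
    ["sch_chameleon.t_log_replica_mysql_1", "sch_chameleon.t_log_replica_mysql_2"]
)
```

VACUUM cannot run inside a transaction block, so the statements would have to be executed on a connection in autocommit mode.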

Currently the process is manual but it will eventually become automated if it's proven to be sufficiently robust.

The pause for the replica processes creates the infrastructure necessary to have a self-healing replica.
This functionality will appear in future releases of the branch 2.0.

As this change requires a replica catalogue upgrade, it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to open a screen session
  • Before the upgrade stop all the replica processes.
  • Upgrade pg_chameleon with pip install pg_chameleon --upgrade
  • Run the upgrade command chameleon upgrade_replica_schema --config <your_config>
  • Start the replica processes

Changelog from v2.0.4

  • fix wrong exclusion when running sync_tables with limit_tables set
  • add run_maintenance command to perform a VACUUM FULL on the source's log tables
  • add stop_all_replicas command to stop all the running sources within the target postgresql database

pg_chameleon - v2.0.4

Published by the4thdoctor over 6 years ago

The maintenance release 2.0.4 fixes the wrong handling of the ALTER TABLE when generating the MODIFY translation.
The regression was added in the version 2.0.3 and can result in a broken replica process.

This version improves the way to handle the replica from tables with dropped columns in the future.
A commit in the python-mysql-replication library adds a way to
manage the replica with the tables having columns dropped before the read replica is started.

Previously the auto generated column name caused the replica process to crash as the type map dictionary didn't have the corresponding key.

The version 2.0.4 handles the KeyError exception and allows the row to be stored on the PostgreSQL target database.
However, this will very likely cause the table to be removed from the replica in the replay step. A debug log message is emitted when this happens in order to
track when the issue occurs.

Changelog from v2.0.3

  • Fix regression added in 2.0.3 when handling MODIFY DDL
  • Improved handling of dropped columns during the replica

pg_chameleon - v2.0.3

Published by the4thdoctor over 6 years ago

The bugfix release 2.0.3 fixes the issue #63 by changing all the fields i_binlog_position to bigint. Previously binlog files larger than 2GB would cause an integer overflow during the phase of write rows in the PostgreSQL database.
The issue can also affect MySQL databases with smaller max_binlog_size as it seems that this value is a soft limit.
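
The arithmetic behind the overflow, as a short worked example. The limits are those of PostgreSQL's 4-byte integer and 8-byte bigint types; the sample position is simply the 2 GB mark.

```python
INT4_MAX = 2**31 - 1      # largest value of a PostgreSQL integer: 2147483647
INT8_MAX = 2**63 - 1      # largest value of a PostgreSQL bigint

position = 2 * 1024**3    # a binlog position at the 2 GB mark: 2147483648

overflows_integer = position > INT4_MAX   # the position is one past the integer limit
fits_bigint = position <= INT8_MAX
```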

As this change requires a replica catalogue upgrade, it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to open a screen session
  • Before the upgrade stop all the replica processes.
  • Upgrade pg_chameleon with pip install pg_chameleon --upgrade
  • Run the upgrade command chameleon upgrade_replica_schema --config <your_config>
  • Start the replica processes

Please note that the upgrade command alters the data types, with a subsequent table rewrite.
The process can take a long time, in particular if the log tables are large.
If working over a remote machine the best way to proceed is to run the command in a screen session.

This release fixes a regression introduced with the release 2.0.1.
When an ALTER TABLE ADD COLUMN statement is in the form datatype DEFAULT (NOT) NULL, the parser captures two words instead of one,
causing the replica process to crash.

The speed of the initial cleanup when the replica starts has been improved, as now the delete runs only on the source's log tables instead of the parent table.
This improvement is more effective when many sources are configured all together.

From this version the setup.py switches the psycopg2 requirement to using the psycopg2-binary which ensures that psycopg2 will install using the wheel package when available.

Changelog from v2.0.2

  • fix regression added by commit 8c09ccb. when ALTER TABLE ADD COLUMN is in the form datatype DEFAULT (NOT) NULL the parser captures two words instead of one
  • Improve the speed of the cleanup on startup deleting only for the source's log tables instead of the parent table
  • fix for issue #63. change the field i_binlog_position to bigint in order to avoid an integer overflow error when the binlog is larger than 2 GB.
  • change to psycopg2-binary in install_requires. This change will ensure the psycopg2 will install using the wheel package when available.
  • add upgrade_catalogue_v20 for minor schema upgrades

pg_chameleon - v2.0.2

Published by the4thdoctor over 6 years ago

This bugfix release adds a missing functionality which wasn't added during the application development and fixes a bug in the sync_tables command.

Previously the parameter batch_retention was ignored, making the replayed batches accumulate in the table sch_chameleon.t_replica_batch
with the consequent performance degradation over time.

This release solves the issue by re-enabling the batch_retention.
Please note that after upgrading there will be an initial replay lag building.
This is normal as the first cleanup will have to remove a lot of rows.
After the cleanup is complete the replay will resume as usual.

The new private method _swap_enums added to the class pg_engine moves the enumerated types from the loading schema to the destination schema
when the method swap_tables is executed by the command sync_tables.

Previously when running sync_tables tables with enum fields were created on PostgreSQL without the corresponding enumerated types.
This happened because the custom enumerated types were not moved into the destination schema and were therefore dropped along with the loading schema when the
procedure performed the final cleanup.

Changelog from v2.0.1

  • Fix for issue #61, missing post replay cleanup for processed batches.
  • add private method _swap_enums to the class pg_engine which moves the enumerated types from the loading to the destination schema.

pg_chameleon - v2.0.1

Published by the4thdoctor almost 7 years ago

The first maintenance release of pg_chameleon v2 adds a performance improvement in the read replica process when
the variables limit_tables or skip_tables are set.

Previously all the rows were read from the replica stream as the BinLogStreamReader does not allow the usage of the tables in the form of
schema_name.table_name. This caused a large amount of useless data hitting the replica log tables as reported in the issue #58.

The private method __store_binlog_event now evaluates the row schema and table and returns a boolean value on whether the row or query
should be stored or not into the log table.
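
A minimal sketch of such a boolean filter, not the actual __store_binlog_event implementation. The function name store_event is hypothetical, and the sketch assumes that an entry in limit_tables only restricts its own schema, in line with the behaviour described for sync_tables in the 2.0.5 notes.

```python
def store_event(schema: str, table: str, limit_tables=(), skip_tables=()) -> bool:
    """Return True if a row image for schema.table should reach the log table."""
    name = f"{schema}.{table}"
    if name in skip_tables:
        return False
    # Assumption: a schema is limited only if at least one of its tables is listed.
    limited_schemas = {entry.split(".")[0] for entry in limit_tables}
    if schema in limited_schemas and name not in limit_tables:
        return False
    return True


kept = store_event("sales", "orders", limit_tables=["sales.orders"])
dropped = store_event("sales", "audit", limit_tables=["sales.orders"])
```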

The release also fixes a crash in read replica if an ALTER TABLE added a column of type character varying.

Changelog from v2.0.0

  • Fix for issue #58. Improve the read replica performance by filtering the row images when limit_tables/skip_tables are set.
  • Make the read_replica_stream method private.
  • Fix read replica crash if in alter table a column was defined as character varying

pg_chameleon - v2.0.0

Published by the4thdoctor almost 7 years ago

This stable release consists of the same code as the RC1 with a few usability improvements.

A new option is now available to set the maximum level for the messages to be sent to rollbar.
This is quite useful if we configure a periodical init_replica (e.g. pgsql source type refreshed every hour) and we don't want to fill rollbar with noise.
For example chameleon init_replica --source pgsql --rollbar-level critical will send to rollbar only messages marked as critical.
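
The filtering can be pictured as a severity threshold. This is only a sketch of the idea: the function send_to_rollbar is hypothetical and the ordering of the four accepted values from least to most severe is an assumption.

```python
# Accepted values for --rollbar-level, ordered from least to most severe.
LEVELS = ["info", "warning", "error", "critical"]


def send_to_rollbar(message_level: str, rollbar_level: str = "info") -> bool:
    """Return True if a message of message_level passes the configured level."""
    return LEVELS.index(message_level) >= LEVELS.index(rollbar_level)


only_critical = [level for level in LEVELS if send_to_rollbar(level, "critical")]
default_levels = [level for level in LEVELS if send_to_rollbar(level)]
```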

There is now a command line alias chameleon which is a wrapper for chameleon.py.

A new command enable_replica is now available to enable the source's replica if the source was not stopped cleanly.

Changelog from v2.0rc1

  • Add option --rollbar-level to set the maximum level for the messages to be sent to rollbar. Accepted values: "critical", "error", "warning", "info". The Default is "info".
  • Add command enable_replica used to reset the replica status in case of error or unexpected crash
  • Add script alias chameleon along with chameleon.py