pgloader

Migrate to PostgreSQL in a single command!

OTHER License

pgloader - pgloader 3.6.9 Latest Release

Published by df7cb almost 2 years ago

  • Add default support for MSSQL auto-incrementing bigint and smallint (#1435, @justinfalk)
  • Improve pgloader docs (#1440, @dimitri)
  • Update test dependencies for Debian (@df7cb)

pgloader - pgloader 3.6.8

Published by df7cb about 2 years ago

  • Upgrade Clozure-CL in the Dockerfile.ccl (@dimitri)
  • Use the unix-namestring as the hash key for SQL queries (#1420) (@dimitri)
  • Added support for sequences with minvalue defined (#1429) (@noctarius)
  • Fix mapping MySQL signed int with auto_increment to PostgreSQL serial (#1248, #1437) (@padinko)
  • Debian packaging: Depend on libsqlite3-0 (@df7cb)

pgloader - pgloader 3.6.7

Published by dimitri about 2 years ago

  • Set SBCL dynamic space size to 16 GB on 64 bit architectures.
  • Improve documentation with command lines and defaults.
  • SBCL compiler notes should not be fatal to pgloader.

pgloader - pgloader 3.6.6

Published by df7cb over 2 years ago

  • Maintenance release fixing the bundle build process

pgloader - pgloader 3.6.4

Published by df7cb over 2 years ago

  • Run testsuite from github action and Debian autopkgtest (@df7cb)
  • Fix looping over sbcl external-formats (@dimitri)
  • Force libcrypto reload in src/hooks.lisp (@athos-ribeiro)
  • Fix minor casing issue in intro docs (@JockeTF)
  • Allow underscores in cast type names (@swt30)
  • Specify v8 of freetds (@dagostinelli)
  • Parameterize DYNSIZE in dockerfiles (@BrendanBall)
  • Fix documentation typo (@Nedeas)

pgloader - pgloader 3.6.3

Published by df7cb almost 3 years ago

Fix dependencies with the latest/current libs (including esrap and cl-postmodern). Work done by the debian packaging maintenance team, kudos folks!

pgloader - pgloader v3.6.2

Published by phoe over 4 years ago

Bugfix release that fixes building pgloader with SBCL 2.0.1 and newer (GitHub issue #1087) due to CFFI 0.20.0 not being compatible with the new SBCL releases.

Among other noteworthy things, the DBF format received a lot of improvements.

pgloader - pgloader v3.6.1

Published by dimitri over 5 years ago

This release contains three major themes: usual maintenance and bug fixing, support for new database systems as sources and targets, and support for Citus distribution.

New Documentation System

The documentation also has received quite some attention and is now available at https://pgloader.readthedocs.io/en/latest/. It includes a new introduction page and the introduction section now has a Feature Matrix displaying what feature coverage you can expect depending on your database source type. Please read https://pgloader.readthedocs.io/en/latest/intro.html for more details.

PostgreSQL as a source

pgloader v3.6.1 now has integrated support for PostgreSQL as a source database system, a target database system, or both. Migrating from PostgreSQL to another PostgreSQL instance is best done with PostgreSQL tools such as logical replication or backup and restore facilities. That said, pgloader also supports PostgreSQL derivatives such as Redshift and Citus, so PostgreSQL-to-PostgreSQL migration is to be read with that in mind.

Support for Citus distribution keys

pgloader v3.6.1 includes support for Citus distribution, documented at https://pgloader.readthedocs.io/en/latest/ref/pgsql-citus-target.html. When your target database is Citus, then pgloader has support for:

  • distribution key declaration right in the pgloader command, allowing for the next items,
  • distribution key integration in the target schema, done on the fly by following foreign key definitions and adding the distribution key where needed (tables and constraints),
  • backfilling of the data from their sources, using SQL JOINs when migrating the data from the source to the target system.

Support for Redshift

pgloader v3.6.1 implements Redshift support both as a source and as a target database system. When Redshift is the target, pgloader takes care of dumbing down the data types compared to PostgreSQL, and needs the user to provide an S3 setup to which it uploads intermediary files: Redshift can COPY from S3 files, but not from standard input on the connection as PostgreSQL does.

When Redshift is the source system, pgloader uses SELECT queries, which allows fetching all the data over the same network protocol, as usual.

AFTER SCHEMA EXECUTE SQL

pgloader v3.6.1 implements new support for running SQL queries in between its handling of the schema and the data parts of the migration. This allows custom post-processing of the schema to happen before the data is loaded into the target database.

Improvements and bug fixes

The existing support for MS SQL, MySQL, SQLite, CSV and other formats has received improvements and bug fixes. The CSV parser, for instance, is now able to consider a subset of the target table columns when using a CSV header in the file, or columns and fields in different orders.
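
As a rough illustration of the header-driven column matching described above, here is a minimal Python sketch. It is not pgloader's implementation (pgloader is written in Common Lisp); the `rows_for_target` helper, the table columns and the sample data are all hypothetical.

```python
import csv
import io

def rows_for_target(csv_text, target_columns):
    """Yield one tuple per CSV row, ordered like target_columns.

    Fields are matched by header name, so the file may list its columns
    in a different order; target columns absent from the header are None.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for record in reader:
        yield tuple(record.get(col) for col in target_columns)

# Hypothetical target table with three columns; the file carries only two
# of them, and in a different order.
sample = "name,id\nalice,1\nbob,2\n"
result = list(rows_for_target(sample, ["id", "name", "created_at"]))
print(result)
```

Matching on names rather than positions is what lets a file carry only some of the target columns, in any order.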

Sponsoring pgloader

Some of the improvements in that release were made possible thanks to our sponsors! If you need new pgloader features, please talk to me about them, I'll be happy to make it happen! You can contribute to pgloader with either your time and skills, or money. See https://pgloader.io/moral-licence/ for more information and details.

pgloader - Pre-release for pgloader 3.6.1

Published by dimitri almost 6 years ago

This release is meant to allow for more testing of the current pgloader code, which is known to fix the most recent issues on docker and debian derived systems, and some compilation errors with outdated dependencies.

Please report oddities!

pgloader - pgloader v3.5.2

Published by dimitri over 6 years ago

This release is a maintenance release that contains a new feature, because it happened that way:

  • debian package fixes, thanks to Christoph Berg and Sébastien Villemot!
  • usual set of bug fixes, see the issues of the project for more information, or the commit logs,
  • add support for Redshift as a target database, using S3 as intermediary files.

The main driver behind that release is the debian packaging fixes!

pgloader - pgloader v3.5.1

Published by dimitri over 6 years ago

This release is mainly about lots of bug fixes thanks to user reports in GitHub issues, and also contains heavily optimised code for preparing the COPY buffers. This optimisation comes from the realisation that when using pgloader to migrate from a source database to PostgreSQL, it's best to fail early. As a result, we don't keep batches in memory at all (by default) in those cases, with clear benefits for CPU and memory usage in pgloader.

As usual, for a full list of changes you may have a look at the git history here: https://github.com/dimitri/pgloader/compare/v3.4.1...v3.5.1.

Guessing CSV formats

In this version of pgloader, when you load a CSV file into a target table without specifying the CSV format, pgloader expects the file to match the target table definition. This means that pgloader can guess the separator and quoting rules used in your source file!
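
The release notes do not describe how the guessing works. The Python sketch below shows one possible heuristic, under the assumption that the number of columns in the target table is known: try a few candidate separators and keep the first one that splits every row into the expected number of fields. The candidate list and the `guess_separator` helper are hypothetical, not pgloader's algorithm.

```python
import csv
import io

# Hypothetical list of separators worth trying, in priority order.
CANDIDATE_SEPARATORS = [",", ";", "\t", "|"]

def guess_separator(sample_text, expected_columns):
    """Return the first separator for which every row has expected_columns fields.

    Returns None when no candidate fits the expected column count.
    """
    for sep in CANDIDATE_SEPARATORS:
        rows = list(csv.reader(io.StringIO(sample_text), delimiter=sep))
        if rows and all(len(row) == expected_columns for row in rows):
            return sep
    return None

# The quoted fields contain commas, so only ";" yields three fields per row.
sample = '1;"Dupont, Jean";Paris\n2;"Martin, Anne";Lyon\n'
print(guess_separator(sample, 3))
```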

Casting Rules

It's now possible in User Defined Casting Rules to specify new guards and actions, as seen in the documentation.

Other improvements

pgloader now supports loading data into PostgreSQL Foreign Tables and Partitioned Tables.

Lots of bugs were fixed in the PostgreSQL support, in the SQLite support, in the MySQL support and in the MS SQL support.

SQLite improvements

The multiple ways SQLite can represent primary keys and indexes can be confusing, and pgloader got smarter about that. pgloader now deals with more default values for SQLite too.

MySQL improvements

When loading a huge MySQL table it's now possible to have pgloader work with more than one reader in parallel, each reader querying a range of primary key values from the source database. This technique might improve reading times in some cases.
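
To make the range-based reading concrete, here is a minimal Python sketch that splits a span of primary key values into one contiguous range per reader. It assumes a dense integer key; the `primary_key_ranges` helper and the `big_table` and `id` names are hypothetical, and pgloader's actual partitioning may differ.

```python
def primary_key_ranges(min_id, max_id, readers):
    """Split the inclusive range [min_id, max_id] into contiguous chunks.

    Returns at most `readers` (low, high) pairs whose sizes differ by at
    most one, so each reader gets a comparable share of the key space.
    """
    total = max_id - min_id + 1
    size, extra = divmod(total, readers)
    ranges = []
    start = min_id
    for i in range(readers):
        length = size + (1 if i < extra else 0)
        if length == 0:
            break  # more readers than key values: stop handing out ranges
        ranges.append((start, start + length - 1))
        start += length
    return ranges

# Each reader would then run its own range query against the source.
for low, high in primary_key_ranges(1, 10, 3):
    print(f"SELECT * FROM big_table WHERE id BETWEEN {low} AND {high}")
```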

MySQL connection string can now use the useSSL parameter, or a sslmode parameter like when using PostgreSQL URIs.

In the previous release, pgloader began targeting a PostgreSQL schema named after the MySQL database. In this release, pgloader also automatically adds that schema to the database search_path.

Load file templates

In pgloader v3.5.1 it's now possible to use the Mustache templating engine (https://mustache.github.io) in load files. Values can be given via the new command line parameter --context or from the process environment. This allows using the same load file with different source files, for instance, and has been a long-requested feature for pgloader.
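
The idea can be sketched in a few lines of Python. This handles only plain `{{NAME}}` variable tags, not the full Mustache specification, and the lookup order (context first, then environment) is an assumption rather than documented pgloader behaviour. The `render` helper, the variable names and the abbreviated load-file fragment are hypothetical.

```python
import os
import re

# Matches simple variable tags such as {{CSV_PATH}} or {{ DBNAME }}.
TAG = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render(template, context=None, environ=None):
    """Replace {{NAME}} tags using context first, then the environment."""
    context = context or {}
    environ = os.environ if environ is None else environ

    def lookup(match):
        name = match.group(1)
        if name in context:
            return str(context[name])
        if name in environ:
            return environ[name]
        raise KeyError(f"no value for template variable {name!r}")

    return TAG.sub(lookup, template)

# Abbreviated fragment for illustration only, not a complete load file.
load_file = "LOAD CSV FROM '{{CSV_PATH}}' INTO postgresql:///{{DBNAME}}"
rendered = render(load_file, {"CSV_PATH": "/tmp/data.csv"}, {"DBNAME": "target"})
print(rendered)
```

Keeping the variable values outside the template is what allows one load file to serve several source files.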

pgloader - pgloader 3.4.1 is now available!

Published by dimitri over 7 years ago

This new release brings stability to the table. Both memory allocation optimization and error handling from the command line have been strong focus points in preparing this release of pgloader. It is intended to be a just-works release... well, otherwise you know where to open new issues.

Users of MS SQL will appreciate a lot of bug fixes and improvements to the coverage of their source database.

The parallelism features introduced in 3.3.1 have been overhauled and simplified internally, without changing the user facing knobs for them. Please report oddities if you find some.

You can read a full article about the release, "from MySQL to PostgreSQL", on the blog!

As usual, enjoy Free Software, enjoy pgloader and enjoy PostgreSQL!

MySQL to PostgreSQL and schema target

When converting from MySQL to PostgreSQL with this new release of pgloader, the default is now to target (and create) a PostgreSQL schema with the same name as the MySQL database. If you want to target the public schema instead, use a load file with the following command:

ALTER SCHEMA 'dbname' RENAME TO 'public' -- in pgloader command.load file

Given this command, pgloader registers your source tables into the schema given in the load file, and the PostgreSQL commands all target that schema. This also applies to data-only migrations where the target schema has been created for you by another tool.

If you want to use your new PostgreSQL database easily with the new schema, you might want to alter PostgreSQL's target database to include it in the search_path automatically:

alter database dbname set search_path to dbname, public; -- at PostgreSQL prompt

pgloader - pgloader 3.3.2 is now available!

Published by dimitri almost 8 years ago

This is a maintenance release triggered by debian users having to deal with a bug fixed upstream. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843555

pgloader - pgloader 3.3.1

Published by dimitri about 8 years ago

pgloader 3.3.1 is now available!

This release contains about a year of changes to pgloader: code cleanup, new features, improvements and bug fixes. Thanks to all users who report use cases and bugs and open issues!

changelog

See the detailed changelog thanks to git log and the GitHub view of it here: https://github.com/dimitri/pgloader/compare/v3.2.2...v3.3.1

contributors

A lot of you guys did contribute to pgloader v3.3.1: thanks for joining the fun!

sponsoring

A special thanks to our new sponsors! Thanks to them our support for MS SQL is now solid and usable for everyone, and it's possible to properly load data into a pre-created PostgreSQL schema (see the ORM case below). If you want to consider sponsoring a pgloader feature, consider buying a pgloader Moral License as detailed at http://pgloader.io/pgloader-moral-license.html.

Thanks to sponsors, pgloader 3.3.1 is now even able to rewrite some Partial Indexes WHERE clauses automatically!

Sponsors are listed at http://pgloader.io/sponsors.html when they want to. It's easy to have your name here too!

release notes

Plenty of things are worth noting; let's focus on the main themes found in the 179 commits between pgloader 3.2.2 and 3.3.1.

online schema changes

It's now possible to use the new pgloader clauses ALTER TABLE and ALTER SCHEMA to edit the mapping between source tables and their destinations.

parallelism

The way the parallelism is handled by pgloader now offers more options and the ability to use even more cores! To that effect, we first make it possible for the next table load to begin straight away rather than waiting for the all-parallel index creation to be done on the previous just-loaded table. Then more options are available, see the code comment for details:

We allow WORKER-COUNT simultaneous workers to be active at the same time
in the context of this COPY object. A single unit of work consist of
several kinds of workers:

- a reader getting raw data from the COPY source with `map-rows',
- N transformers preparing raw data for PostgreSQL COPY protocol,
- N writers sending the data down to PostgreSQL.

The N here is setup to the CONCURRENCY parameter: with a CONCURRENCY of
2, we start (+ 1 2 2) = 5 concurrent tasks, with a CONCURRENCY of 4 we
start (+ 1 4 4) = 9 concurrent tasks, of which only WORKER-COUNT may be
active simultaneously.

To that end, new parallelism control options are available; see the section A NOTE ABOUT PARALLELISM in the main documentation.
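
The arithmetic in the code comment quoted above can be checked with a short Python sketch. The function names are hypothetical; only the formula (one reader, N transformers, N writers, capped by WORKER-COUNT) comes from the comment itself.

```python
def concurrent_tasks(concurrency):
    """Tasks started for one unit of work: 1 reader + N transformers + N writers."""
    readers = 1
    transformers = concurrency
    writers = concurrency
    return readers + transformers + writers

def active_tasks(concurrency, worker_count):
    """Upper bound on tasks running at once, limited by the worker count."""
    return min(concurrent_tasks(concurrency), worker_count)

print(concurrent_tasks(2))   # matches (+ 1 2 2) in the comment
print(concurrent_tasks(4))   # matches (+ 1 4 4) in the comment
print(active_tasks(4, 6))    # only WORKER-COUNT tasks may be active at once
```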

migrate to a pre-existing schema

Also, the infamous ORM case is now handled correctly. It's possible for pgloader to migrate from a source database to a pre-installed target in PostgreSQL, where the schema has already been installed, either manually or by your favorite tooling. See the test file test/sakila-data.load which uses that option when migrating from MySQL, using the following options:

 WITH concurrency = 1, workers = 6,
      max parallel create index = 4,
      create no tables, include drop, truncate

On the code cleaning front, some refactoring did take place. The main part of it is the introduction of our own internal catalog representation, which makes the previous feature possible by letting pgloader load and compare metadata from MySQL and PostgreSQL.

docker

pgloader now includes Dockerfiles for both SBCL and CCL, and uses the DockerHub service so that a build is triggered at each commit pushed to the master branch. See the following URL for details, and please use those pre-made docker images if you need them!

https://hub.docker.com/r/dimitri/pgloader/

bundle distribution

This 3.3.1 release is also the first release to come with a bundle file. This distribution format makes it easy to build pgloader from a single source archive that vendors in all the build dependencies. Look, if you don't use any library then you're mocked for choosing a poor programming language ecosystem where you have to do it all yourself, and when you pick a language that offers plenty of libs then packagers don't want to have to do the legwork themselves until users have proven their interest. And the users are basing their choice on the availability of the libs in their favorite distribution. Can you spell chicken and egg?

So, I hope having the bundle distribution of pgloader will help fellow packagers to work on including pgloader in their favorite distribution.

pgloader - pgloader 3.2.2

Published by dimitri about 9 years ago

pgloader 3.2.2 is now available!

This is the first release done in source format only. The previously available binary images were not good enough and I am not in a position to offer a good service here, so please check with the packagers of your OS of choice to obtain a binary release there directly. That said, I am taking care of Debian.

Release Notes

This release is mostly about lots of bug fixes, answering many GitHub issues. Thanks everybody for reporting bugs! Some of you did send Pull Requests (aka patches or bug fixes) along with the bug report, let's hope that more of you are going to do that in the future ;-)

Release Early, release often

I believe that release early is something that has been done correctly in pgloader, but the release often part has been neglected up to now. My intention is to fix that by only doing the parts I know how to do: source code and Debian package. Given that organisation, it's now quite easy for me to cut a release, and I intend to do that often enough!

pgloader - pgloader 3.2.1 preview

Published by dimitri over 9 years ago

This is a preview release that contains some bug fixes for pgloader 3.2.0 found by early testers. Not all binary formats are covered yet. It's an interim release motivated by a hosting server going down right when pgloader made it to Hacker News, done in the middle of the night to serve curious visitors: enjoy ;-)

It's a pre-release but as only bug fixes made it on top of pgloader 3.2.0, it's as safe as 3.2.0 really, just try it and tell me!

Binary Files

You have a choice of .pkg for MacOSX systems, .deb for debian sid (no backport to testing or stable yet, but these days I would expect that libc are at the same version so it's worth a try), and a RPM for CentOS 6.4 (Final).

Runtime dependencies

You might need to install freetds-devel package and openssl-devel depending on the features you're using from pgloader, freetds being the MS SQL driver.

CentOS binary file

The tar contains a single file pgloader that is an almost static binary:

[vagrant@localhost vagrant]$ cat /etc/redhat-release 
CentOS release 6.4 (Final)
[vagrant@localhost vagrant]$ ldd ./build/bin/pgloader
    linux-vdso.so.1 =>  (0x00007fff9d5ff000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007fcc88183000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcc87f66000)
    libz.so.1 => /lib64/libz.so.1 (0x00007fcc87d4f000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fcc87acb000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fcc87738000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fcc88390000)