osm2pgsql

OpenStreetMap data to PostgreSQL converter

GPL-2.0 License

osm2pgsql - Release 1.11.0 Latest Release

Published by lonvia 8 months ago

This release makes the new middle database format the default. If you have not switched already, you need to reimport your database to take advantage of that.

We have changed the way we are parsing the command line options. The new code uses the CLI11 library (a copy of which is included in the repository) and is much cleaner and also much stricter. You now get warnings (and sometimes errors) for many combinations of options that don't make sense. Please check the output from osm2pgsql and osm2pgsql-replication for such messages and fix your command lines accordingly. Note especially that duplicated options are not allowed any more. This can happen, for instance, when using osm2pgsql-replication which adds the database connection parameters (such as -d) when it calls osm2pgsql.
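Because duplicated options are now rejected, it can help to check a generated command line before handing it to osm2pgsql. The following is a minimal sketch of such a pre-flight check, not the CLI11 logic used by osm2pgsql itself. The helper name `find_duplicate_options` and the alias table are hypothetical; only the `-d`/`--database` pair is taken from these release notes.

```python
# Hypothetical alias table: only -d/--database comes from the release notes.
# Extend it with the short/long pairs your own scripts actually pass.
ALIASES = {"-d": "--database"}


def find_duplicate_options(argv):
    """Return the canonical option names that occur more than once in argv."""
    seen = set()
    duplicates = []
    for arg in argv:
        if not arg.startswith("-"):
            continue  # positional argument or option value
        name = arg.split("=", 1)[0]      # "--database=osm" -> "--database"
        name = ALIASES.get(name, name)   # "-d" -> "--database"
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates
```

The sketch treats every argument starting with `-` as an option, so option values that themselves begin with a dash would need extra handling.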

If all goes well this will be the last release starting with a 1. We are planning for a version 2.0.0 in the second quarter of 2024. In that release we will remove all the functionality that has been deprecated. We will also remove support for the legacy database middle format and only support the new format introduced in version 1.9.0.

Further changes:

  • The number of database connections that osm2pgsql opened could be quite large because it depended on the number of tables. This is no longer the case: osm2pgsql now opens far fewer connections, and you will usually not need to change the PostgreSQL max_connections setting any more.
  • Osm2pgsql now adds the context (the part of osm2pgsql responsible for a database connection) and the connection number to the application name used in the database connection. This allows you to better monitor what osm2pgsql is doing using the pg_stat_activity table in the database.
  • Bugfix: Using the new database format with -x, --extra-attributes did not work due to a wrong SQL command. This is fixed now.
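To watch the connections mentioned above, one could query pg_stat_activity and filter on the application name. This is a sketch only: the exact application name format is not given in these notes, so the assumption here is that it starts with `osm2pgsql`. The function name is hypothetical, and the `%s` placeholder follows the psycopg parameter style.

```python
def osm2pgsql_activity_query(app_name_prefix="osm2pgsql"):
    """Return (sql, params) listing connections whose application_name
    starts with the given prefix (the prefix is an assumption)."""
    sql = (
        "SELECT pid, application_name, state, query "
        "FROM pg_stat_activity "
        "WHERE application_name LIKE %s "
        "ORDER BY application_name"
    )
    return sql, (app_name_prefix + "%",)
```

The returned pair can be passed to a psycopg cursor as `cursor.execute(sql, params)`.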

Many thanks to Thunderforest who supported development of the features in this release.

osm2pgsql - Release 1.10.0

Published by lonvia 12 months ago

This is a relatively small but still important release.

The new middle table format has changed slightly: the tags field can now be NULL. This makes storage more efficient and indexing faster. The new middle format is now declared stable and production ready. To use it, use the command line option --middle-database-format=new. In a future version of osm2pgsql this will become the default. If you have already used this option with one of the 1.9.x versions of osm2pgsql, you have to reload your database or use this SQL command to update each table: ALTER TABLE <name> ALTER COLUMN tags DROP NOT NULL; For <name> use planet_osm_nodes, planet_osm_ways, and planet_osm_rels, or the equivalents if you are using a different table name prefix.
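For databases with a non-default table name prefix, the three statements can be generated rather than typed by hand. A minimal sketch, assuming the tables live in the default schema; the helper names are hypothetical:

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_not_null_statements(prefix="planet_osm"):
    """Return one ALTER TABLE statement per middle table for the given prefix."""
    return [
        f"ALTER TABLE {quote_ident(prefix + suffix)} "
        "ALTER COLUMN tags DROP NOT NULL;"
        for suffix in ("_nodes", "_ways", "_rels")
    ]


for statement in drop_not_null_statements():
    print(statement)
```

Quoting the table names keeps prefixes containing upper case characters intact.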

Other changes:

  • Emit a warning that the flex output area type and the add_row() functions are deprecated if you use them. If you get this warning, read https://osm2pgsql.org/doc/tutorials/switching-from-add-row-to-insert/ .
  • Add first/last timestamps to expire tables. Having these timestamps allows various expire/updating strategies.
  • The docs directory is now called man, because it only contains the man pages. All other docs are on the project web site.
  • Various improvements to the (still experimental) generalization code. The biggest change is that we switched from the CImg library to the OpenCV library, which makes the code an order of magnitude faster.

osm2pgsql - Release 1.9.2

Published by lonvia about 1 year ago

This release fixes a bug introduced in 1.9.0 with two-stage processing that will lead to crashes. If you are using any 1.9.x version, please upgrade to 1.9.2.

In one case we had some performance problems updating an osm2pgsql database with 1.9.1 due to the PostgreSQL query planning choosing a bad plan. This release contains a workaround for that problem.

We also improved the (experimental) generalizer code a bit:

  • More information is shown in log level 'info', including some timing information.
  • The Lua config run_sql() command now can have either a single SQL statement in the sql field (as before) or a list of SQL commands.
  • For convenience, the Lua config run_sql() command now has an optional transaction field which can be set to true to wrap the SQL commands in BEGIN/COMMIT.
  • The new if_has_rows field on the run_sql() command can be set to a string containing an SQL query. If that field is set, the SQL statements in the sql field are only run if the query returns at least one row.
  • Some performance improvements in low-level code in the generalizer.
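The semantics of the sql, transaction, and if_has_rows fields described above can be illustrated outside osm2pgsql. The sketch below is an emulation in Python, not the generalizer's code. It uses SQLite from the standard library in place of PostgreSQL so it runs without a server, and the table name is made up for the example.

```python
import sqlite3


def run_sql(conn, sql, transaction=False, if_has_rows=None):
    """Emulate the described fields; return True if the SQL was run."""
    statements = [sql] if isinstance(sql, str) else list(sql)
    # Guard query: skip everything when it returns no rows.
    if if_has_rows is not None and conn.execute(if_has_rows).fetchone() is None:
        return False
    if transaction:
        conn.execute("BEGIN")
    for statement in statements:
        conn.execute(statement)
    if transaction:
        conn.execute("COMMIT")
    return True


# isolation_level=None selects autocommit mode, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
run_sql(conn, "CREATE TABLE roads (id INTEGER, name TEXT)")
ran = run_sql(
    conn,
    ["DELETE FROM roads WHERE name IS NULL"],
    transaction=True,
    if_has_rows="SELECT 1 FROM roads LIMIT 1",
)
print(ran)  # the table is empty, so the guarded statements are skipped
```

Error handling (rolling back when a statement fails) is left out to keep the sketch short.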

osm2pgsql - Release 1.9.1

Published by lonvia about 1 year ago

This release fixes some small issues with 1.9.0:

  • Fix compatibility of osm2pgsql-replication with psycopg3
  • Fix architecture-dependent double to integer conversion
  • Some small code cleanups

osm2pgsql - Release 1.9.0

Published by lonvia about 1 year ago

This release brings three new major features:

  • a new osm2pgsql_properties table that saves command line options and reuses them on updates
  • a new database middle that saves raw OSM data in JSONB format and is explicitly designed to be queried by the user
  • the new (and still experimental) osm2pgsql-gen adds geometry generalization to osm2pgsql

Other changes include:

  • cleanup of schema handling
  • tile expiry output into database tables
  • a new spherical_area() function for flex config files to calculate the area of a (multi)polygon on the sphere.
  • when using the new database middle, the --middle-with-nodes option allows you to store all tagged nodes in the database (with their tags and location).
  • several improvements to osm2pgsql-replication to make it more flexible and better tested (thanks to @amandasaurus and @JakobMiksch)
  • don't do multi-statement SQL queries to be compatible with the PgPool-II connection pooler.

Please note that this version drops support for implicit DB schemas other than public. If you rely on implicit user schemas or custom schema paths, you must now configure the schema to be used with the --schema option.

For more information on all new features and changes read the more extensive release notes for 1.9.0.

osm2pgsql - Release 1.8.1

Published by lonvia over 1 year ago

This release contains some fixes and minor changes.

  • Fix osm2pgsql-replication script so it works correctly with PostgreSQL schemas.
  • Don't process objects without tags in outputs in append mode. This should speed up updates a little bit.
  • Count number of inserted rows and rows not inserted because of NOT NULL constraints for each table and log the numbers in debug mode.
  • Remove some extra-verbose debug logging when using the pole_of_inaccessibility() function.
  • Flush output tables generated from nodes and ways tables earlier.

osm2pgsql - Release 1.8.0

Published by lonvia over 1 year ago

The largest change is the addition of much more flexible index support in the flex output. The table definitions have a new (optional) field called indexes now which takes a list of index definitions. If the field is not there, we fall back to what we did before and create a GIST index on the only/first geometry column of a table. But you can also define any kind of index you want: define which index method (BTREE, GIST, ...) to use on which columns, define WHERE clauses and expression indexes and much more. See the flex-config/indexes.lua Lua config for some usage examples and the manual for all the details. You can also force osm2pgsql to always build the id indexes which are normally only built in slim mode.

The gazetteer output and the command line option --with-forward-dependencies are deprecated in this release and will be removed soon. They were only needed for Nominatim which switched to using the flex output recently.

Here are the other changes:

  • Fix a problem when using osm2pgsql with a projection other than WGS84 (EPSG:4326) or Web Mercator (EPSG:3857) which made the program really slow.
  • New pole_of_inaccessibility() Lua function to generate reasonably good label points from polygons. (This function is currently marked as experimental, which means it can change without notice at any time.)
  • Performance improvement for very small updates. Don't spin up multiple threads when there are fewer than 100 objects to process, because the extra overhead is not worth it.
  • Implement and use our own JSON writer. This removes the dependency on RapidJSON which hasn't seen a new release since 2016.
  • Add more checks (or do some checks earlier) to make sure that your database uses UTF-8 encoding, that necessary database extensions are loaded, and that index methods, schemas and tablespaces you refer to in the config are actually available.
  • A lot of code needed to be updated so it works correctly with any of the recent versions of the fmt library.
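The small-update optimization in the list above follows a common pattern: skip the thread pool when a batch is too small to amortize its startup cost. A minimal sketch of that pattern, not osm2pgsql's C++ code; the function name and the default thread count are hypothetical, and only the threshold of 100 comes from the note.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_OBJECTS_FOR_THREADS = 100  # threshold taken from the release note


def process_all(objects, handler, num_threads=4):
    """Apply handler to every object, using threads only for larger batches."""
    objects = list(objects)
    if len(objects) < MIN_OBJECTS_FOR_THREADS or num_threads < 2:
        return [handler(obj) for obj in objects]  # serial: no pool overhead
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(handler, objects))   # map keeps input order
```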

As always there were lots of code cleanups across the board, but especially in code accessing the database and in the C++/Lua glue code to make it more flexible and easier to use internally.

osm2pgsql - Release 1.7.2

Published by lonvia almost 2 years ago

This release has some small changes only:

  • The flex output now allows tables with only the id column (or columns).
  • The osm2pgsql-replication script now always expects the osm2pgsql binary to be in the same path as itself.
  • Adds the flag --middle-schema=SCHEMA to the osm2pgsql-replication script which allows placing the replication status table in a schema other than PUBLIC (Thanks to @JakobMiksch).
  • More tests have been converted to the new BDD format.
  • Various code cleanups and refactorings especially in the expire code.

osm2pgsql - Release 1.7.1

Published by lonvia about 2 years ago

This release fixes a few small bugs in osm2pgsql and closes some gaps in the geometry processing code released in 1.7.0. It also contains some security-related fixes as a result of the security audit.

  • Added as_multipoint() function to complement as_multilinestring() and as_multipolygon().
  • The functions as_multipoint(), as_multilinestring(), and as_multipolygon() will now always return single geometries if possible. Single geometries are always allowed where multi geometries are allowed, so this doesn't break anything.
  • The centroid() function now works for all geometry types.
  • New length() function to compute the length of a geometry in map units.
  • New reverse() function to turn geometries around (can be useful for ways tagged with oneway=-1).
  • The simplify() function is now available for multilinestrings, too. (Not for polygons yet.)
  • All example code in the flex-config directory has been updated for the new geometry handling capabilities.
  • Create nicer error messages when trying to access a missing database extension, schema, or tablespace.
  • Better checking of names (of tables, schemas, etc.) used in SQL in osm2pgsql and osm2pgsql-replication to avoid potential SQL injection issues.
  • Fix: Make sure relation members show up in the correct order in multi-geometries when using slim mode.
  • Fix: Do not try to run ST_IsValid() on create_only columns.
  • osm2pgsql-replication: The database parameter may be empty when connection parameters are supplied via environment variables.
  • osm2pgsql-replication: when installed, now runs the osm2pgsql binary that was installed with it to avoid potential security issues through PATH manipulation.
  • osm2pgsql-replication: Meaningful error when middle tables do not exist or the prefix is a bad one.
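As an illustration of what length() and reverse() from the list above compute for a linestring, here is a plain-Python sketch. It is not the osm2pgsql implementation; it assumes a linestring is a list of (x, y) tuples in map units.

```python
import math


def length(line):
    """Sum of the straight-line segment lengths, in map units."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def reverse(line):
    """Return the same points in opposite order, e.g. for oneway=-1 ways."""
    return line[::-1]


way = [(0, 0), (3, 4), (3, 10)]
print(length(way), reverse(way))
```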

osm2pgsql - Release 1.7.0

Published by lonvia about 2 years ago

For this version we rebuilt a lot of the code around geometry processing and around expire. The different parts -- creation of geometries from OSM data, transforming geometries (like merging and splitting linestrings) and finally writing them out in WKB format for import into the database -- are now well separated and tested on their own. And we added some functions for geometry processing, too. osm2pgsql can now calculate the centroid of a polygon and simplify linestrings using the Douglas-Peucker algorithm.
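For readers unfamiliar with the Douglas-Peucker algorithm mentioned above, the following is a compact textbook-style sketch in Python. It is not taken from the osm2pgsql source. It measures the distance to the segment between the end points, which is a common variant, and the function names are chosen for this example only.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: keep only points farther than tolerance from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = point_segment_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist
    if max_dist <= tolerance:
        return [first, last]  # everything in between is close enough
    # Split at the farthest point and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right
```

The recursion keeps the sketch short; a production implementation would typically use an explicit stack for very long lines.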

But the best part is that all of that new geometry goodness is now available from the Lua config files when using the flex output. There are many new ways of processing geometries from Lua:

  • The get_bbox() is now available for relations, too.
  • There are new functions as_point(), as_linestring(), as_polygon(), as_multilinestring(), as_multipolygon(), and as_geometrycollection() to create geometries from OSM objects.
  • Geometries can be manipulated in Lua with several functions modeled after the same functions in PostGIS: area(), centroid(), geometry_type(), line_merge(), num_geometries(), segmentize(), simplify(), srid(), and transform(). We expect more to come in the future. This way you can do more geometry processing on import removing the need for some post-processing in SQL.
  • We used to have the somewhat magic handling of geometries with the add_row() function which only allowed a limited set of operations. This function is still available for backwards compatibility, but there is a new function insert() now which doesn't have this magic. Instead geometries are treated like any other data type giving you a lot more flexibility. Check out the example config files addresses.lua, generic.lua, simple.lua and geometries-using-insert.lua in the flex-config directory for some ideas on what can be done.

In this version we enabled the bucket index for way nodes by default. This had been around for a while but you needed a command line option to enable it. After some positive feedback from the community we decided to make this the new default. It will be used on new imports (existing databases will keep using the old index). The new bucket index is much smaller and can save you hundreds of gigabytes of disk space. See https://osm2pgsql.org/doc/manual.html#bucket-index-for-slim-mode for the details.

And again a lot of code cleanups and some smaller bug fixes went into this release. To make writing tests easier we added a new BDD testing framework based on behave (https://behave.readthedocs.io/) and re-wrote a lot of the existing tests. Writing tests is now much easier and a lot less tedious.

There are also a bunch of changes to the osm2pgsql-replication script to make it easier to use.

This is the first version of osm2pgsql that needs a C++17 compiler. And there is a new dependency on the boost::geometry library.

osm2pgsql - Release 1.6.0

Published by lonvia over 2 years ago

  • The osm2pgsql-replication script which has been included in the osm2pgsql repository for a while will now be installed together with its man page on "make install". To use it you need Python3, psycopg2 (or psycopg3), and PyOsmium installed. See the manual for details.
  • Ignore relations with more than 32,000 members (which should never happen in real data) instead of failing.
  • Removed the dependency on boost::algorithm.
  • Included libosmium was updated to newest version 2.17.3 which contains an important fix for a problem which can lead to osm2pgsql locking up.

osm2pgsql - Release 1.5.2

Published by lonvia almost 3 years ago

This is a bugfix release with only minor changes.

Changes:

  • Fix parsing problems in style file reader. Some variables were not initialized correctly when parsing a style file, which led to some surprising behaviour with flags of one config line re-used by the next if the flags field of that line was empty. This could also have led to buffer overflows in the first line being parsed.
  • Fix: When there is an active progress display, log messages would show up after the progress display instead of the next line.
  • Release some allocated memory earlier in the processing chain.
  • Fix confusing log message: The message "Done postprocessing on table '{}' in {}" was logged twice when --drop is used. This changes one of the log messages to the more specific "Table '{}' dropped in {}".
  • Run ANALYZE on middle tables only in create mode saving some processing time.
  • Add 'status' command to osm2pgsql-replication. Prints the current replication status, and with --json prints that as JSON data. (thanks @amandasaurus)
  • Needs at least CMake 3.5.0 now.
  • Updates the included versions of the catch2, fmt, libosmium, and protozero libraries to current versions.

Note that this is the last version which will compile with C++14. The next version 1.6.0 will need C++17.

osm2pgsql - Release 1.5.1

Published by lonvia about 3 years ago

This is a bugfix release. It contains some important bug fixes, so everybody is encouraged to update.

Here are the changes:

  • When importing a planet file or a huge extract, something with more than about 1 billion nodes, the new RAM node location store could overflow a 32bit "offset" value which meant that the node locations would not be found again. The result was missing features, because osm2pgsql just ignores features with geometries that cannot be built due to missing node locations.
  • Osm2pgsql creates temporary tables as UNLOGGED to get better performance. We fixed a bug where non-temporary output tables were also created as UNLOGGED (when clustering was disabled.)
  • In the flex output table columns marked create_only are now only created in final tables, not temporary tables. This avoids some problems, for instance when using column type SERIAL.
  • Make the input data check more strict: Two versions of same object are not allowed in the input.
  • Remove IMMUTABLE volatility classification from validity check trigger function.
  • Make the directory where the config file is located available in the flex output through the osm2pgsql.config_dir global Lua variable.
  • Update required libosmium version to 2.17.0. The version 1.5.0 already required this, but it wasn't documented.

osm2pgsql - Release 1.5.0

Published by lonvia over 3 years ago

This release brings quite a lot of improvements. We removed the "experimental" label from the flex output which we introduced in version 1.3.0. There are some small changes you might have to make to your flex configurations, see the Upgrading chapter of the manual for details.

This release also contains a rewrite of the code used to temporarily store OSM data in memory while processing the data in non-slim mode, i.e. when you import data without --slim. It now uses much less memory.

Other changes:

  • The multi output which was marked as deprecated in the last versions has now been removed.
  • This is the first release that needs a C++14 compiler.
  • New cluster table option in the flex config file which allows you to disable clustering of the table data by geometry.
  • Do not try to create indexes for flex output tables without id.
  • Added flex config example (attributes.lua) showing how to access OSM object attributes (such as timestamp, user name, etc.) from Lua.
  • Added a warning if --flat-nodes/-F is used in non-slim mode.
  • Report cache memory usage when running with --log-level=debug.
  • Report thread number in all log lines when --log-level=debug is set.
  • Use trigger to check geometry validity on first import instead of only doing this when copying the data for clustering. In the flex output this validity check is not used any more for point geometries because they are always valid anyway.
  • The RapidJSON library is now used and included in the source.
  • Now needs libosmium 2.17.0 which is included in the source.
  • Lots of internal cleanups and restructurings.

osm2pgsql - Release 1.4.2

Published by lonvia over 3 years ago

This is a minor bugfix release that fixes the following issues:

  • Translate empty strings into NULL instead of 0.0 for columns of type double.
  • Consistently quote table names to handle upper case table prefixes correctly.
  • Avoid querying geometries in stage 2 of the flex output when expiry is disabled.
  • Fix syntax error in index creation with schema enabled.
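The first fix in the list above concerns how an empty tag value is converted for a double column. A minimal Python sketch of the corrected behaviour, not the osm2pgsql code; the function name is hypothetical, and Python's None stands in for SQL NULL.

```python
def to_double(value):
    """Convert a tag value to float; an empty or missing value becomes None
    (SQL NULL) instead of 0.0."""
    if value is None or value == "":
        return None
    return float(value)
```

How osm2pgsql treats values that are not valid numbers is not covered by the note; in this sketch they raise a ValueError.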

The release also adds a new osm2pgsql-replication script to simplify the process of downloading and applying updates. It requires pyosmium and psycopg2. See #1411 for more information.

osm2pgsql - Release 1.4.1

Published by lonvia over 3 years ago

This is a minor release with some bug fixes and internal cleanups and changes, mostly in the "middle" code and geometry processing.

We have released all example config files for the flex output into the Public Domain, so they can be used as widely as possible.

Fixes:

  • Some MultiLineStrings were not assembled correctly from relations. This happened when a relation had exactly two member ways forming a closed ring with the two ways oriented against each other (#1394).
  • Long LineStrings can (optionally) be split by osm2pgsql into shorter segments. In some cases this would produce invalid LineStrings (aee1be1b).
  • Do not try to display ANSI color codes on Windows terminals which don't understand them (ab96aebc).

Other changes:

  • When osm2pgsql is started without any arguments, it now shows the help text instead of an error.
  • Write PostGIS version to output when osm2pgsql starts up and show error message when a database without PostGIS extension is used (c7927e83, #1400).
  • Improved progress output and summary information when processing input files.
  • Add log entry with number of threads when thread pool is started (34cf9d8a).
  • Report overall memory usage at the end of running osm2pgsql.
  • Updated included library versions (fmt 7.1.3, libosmium 2.16.0, catch 2.13.4).

osm2pgsql - Release 1.4.0

Published by lonvia almost 4 years ago

The project has a new website at https://osm2pgsql.org now with extensive documentation and examples, and with sections on support, contributing, news, etc. Most of the documentation from the repository and the OSM wiki was moved there. We still have a man page, it is now maintained in markdown format. All the documentation, man page, help texts etc. have been cleaned up, made more consistent and brought up to date.

The program has a much improved log output now. Each line is prefixed with a date/timestamp and by default osm2pgsql isn't as verbose any more. You can change the verbosity using several options. You can even have super-verbose logging of all SQL commands issued and all data written to the database. Warnings and errors now appear in red color if your console supports it. Progress output can be disabled, for instance, when the output is redirected to a file. When printing how long something took, osm2pgsql will now not only print the seconds but also a more human readable format with hours, minutes, and seconds.
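The duration output described above can be pictured with a small formatting helper. This is a sketch only: the exact text osm2pgsql prints is not given in these notes, so the layout used here ("3723s (1h 2m 3s)") is an assumption, as is the function name.

```python
def format_duration(seconds):
    """Format whole seconds as the raw count plus an h/m/s breakdown."""
    seconds = int(seconds)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        human = f"{hours}h {minutes}m {secs}s"
    elif minutes:
        human = f"{minutes}m {secs}s"
    else:
        human = f"{secs}s"
    return f"{seconds}s ({human})"


print(format_duration(3723))  # prints "3723s (1h 2m 3s)"
```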

In the last release (version 1.3.0) we have already added a warning when you used input files with negative OSM object ids or input files which are not ordered correctly. These are now not allowed any more and osm2pgsql will stop with an error if it detects these. See the manual for how to work around this. This allowed us to improve the handling of multiple input files. Osm2pgsql now reads multiple input files at the same time merging the contents. This means that you can now import several extracts in one go. Note that the extracts still have to come from the same point in time!
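Requiring ordered input with non-negative ids is what makes a streaming merge of several files possible. The sketch below shows the general idea in Python, using the usual OSM file order (nodes, then ways, then relations, each sorted by id). It is not osm2pgsql's implementation; objects are reduced to hypothetical (type, id) tuples.

```python
import heapq

TYPE_ORDER = {"node": 0, "way": 1, "relation": 2}


def sort_key(obj):
    """Order (type, id) tuples: nodes, ways, relations, each by id."""
    obj_type, obj_id = obj
    return (TYPE_ORDER[obj_type], obj_id)


def checked(stream):
    """Yield objects from one input, rejecting negative ids and unordered data."""
    previous = None
    for obj in stream:
        if obj[1] < 0:
            raise ValueError(f"negative id: {obj}")
        key = sort_key(obj)
        if previous is not None and key < previous:
            raise ValueError(f"input not ordered at {obj}")
        previous = key
        yield obj


def merge_inputs(*streams):
    """Lazily merge several ordered inputs into one ordered stream."""
    return heapq.merge(*(checked(s) for s in streams), key=sort_key)
```

The sketch does not deal with objects that appear in more than one input; it simply emits both copies.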

Changes in the flex output (which is still marked experimental). Note that some of these are breaking changes compared to the behaviour in version 1.3.0:

  • Fix: Flex output sometimes created two id indexes on the same table.
  • Set projection for geometry columns in the table configuration. The command line options --latlong, -m, --merc, -E, and --proj are not used by the flex output any more.
  • Flex mode setting type_column fixed. Now also supports id columns compatible with Imposm.
  • Optionally wrap polygon geometries in multipolygons if the geometry column of the target table is of type MultiPolygon.
  • Switch multipolygon generation from default off to default on. The multi option on the area geometry transformation has been removed and there is a new option split_at.
  • Add several Lua helper functions for flex config files.

Other changes:

  • The middle, pgsql output and flex output now all support setting the schema used for tables, indexes and functions.
  • Add support for the new API of the PROJ library (used since PROJ version 6).
  • Fixed bug in 1.3.0 that didn't disable the PostgreSQL JIT processing which slows down osm2pgsql considerably when using PostgreSQL 12 and above.
  • Do not create planet_osm_nodes table if flat nodes are used.
  • Print database version and check that we are using a supported PostgreSQL version.
  • Allow PostgreSQL conninfo string in -d/--database option.
  • Removed legacy code that tried to alter existing database tables if your config changed since the initial import. This code could only detect and do very few necessary changes and therefore could not be relied upon anyway.
  • Add --with-forward-dependencies option. This allows you to disable dependency management. Used for Nominatim.
  • There is a new "bucket index" for the node-to-way-lookup in the middle. It needs a lot less disk space and imports are much faster, but updates will be slower. It is currently not enabled, but osm2pgsql experts are encouraged to try it out and give us feedback. See https://osm2pgsql.org/doc/manual.html#bucket-index-for-slim-mode for details.
  • As usual, there are various code cleanups and bug fixes.

The 'multi' output was already deprecated in the last version (1.3.0), it will be removed in the next release.

osm2pgsql - Release 1.3.0

Published by lonvia about 4 years ago

This release introduces the new "flex" output. It allows a more flexible
definition of output tables and columns. It also adds a second stage of
processing which makes it possible to get information from relations to
their members, allowing, for instance, to render tags from bicycle route
relations on their member ways. The "flex" output is configured through
Lua scripts.

The flex output is currently still marked as experimental, because it is new
and we want to collect feedback from the community before finalizing the API.
But it already works well and users are encouraged to try it out. Some new
features are only or will only be available in the flex output and we expect
that it will replace the other outputs in the long term.

Some features have been marked as deprecated:

  • The "multi" output will be removed in a future version of osm2pgsql. If you
    are using the multi output, switch to the flex output now and tell us if
    you have any problems.
  • When the input file uses negative OSM object IDs a warning is now generated.
    Negative IDs never worked correctly for all use cases. Future versions of
    osm2pgsql will not allow negative IDs at all. Use "osmium renumber"
    to get rid of the negative IDs.
  • Input files that are not ordered generate a warning. Future versions of
    osm2pgsql will not work any more with unordered files. If you have unordered
    files use "osmium sort" to order them.

Further changes:

  • The multi output now looks for Lua scripts relative to the style.json file.
    This is a breaking change. Users might have to change the file names of
    their lua scripts in the style files.
  • Use the fmt library for formatting strings now instead of a mixture of
    boost::format and hand-written mechanisms. A version of fmt is included
    in the contrib directory.
  • Make PROJ library optional. If the proj library cannot be found by cmake,
    do not offer the option to use arbitrary projections. Only WGS84 and
    WebMercator are supported then.
  • Don't use ST_GeoHash for ordering tables by geometry on Postgis >= 2.4.
    Instead use the default ordering which works better now.
  • Fix: Always print correct relations count and more correct count per seconds
    when showing processing stats.
  • Fix: If a function run in the thread pool throws an exception, this exception
    was never "collected", it was silently ignored. This meant that some errors,
    especially in communication with the database, were not detected correctly.
  • The dependency management, the part of the code which tracks which changes
    in the OSM data trigger which changes in the outputs, was reorganized
    making it much cleaner and removing the last remnants of code written to
    support "old style" multipolygons.
  • Tests have been moved to the Catch framework, extended and the regression
    tests have been reorganised, so they can run independently of each other.
  • A lot of code was cleaned up, modernized, made more robust, and sometimes
    removed.

osm2pgsql - Release 1.2.2

Published by lonvia over 4 years ago

This release only updates the bundled version of libosmium. The new version 2.15.6 fixes an issue where complicated multipolygons make osm2pgsql hang.