delta-rs

A native Rust library for Delta Lake, with bindings into Python

APACHE-2.0 License

Downloads: 5.6M
Stars: 2K
Committers: 79

delta-rs - python-v0.8.1

Published by wjones127 over 1 year ago

What's Changed

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.8.0...python-v0.8.1

delta-rs - python-v0.8.0

Published by wjones127 over 1 year ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.7.0...python-v0.8.0

delta-rs - rust-v0.8.0

Published by wjones127 over 1 year ago

Full Changelog

Implemented enhancements:

  • feat(rust): support additional types for partition values #1170

Fixed bugs:

  • File pruning does not occur on partition columns #1175
  • Bug: Error loading Delta table locally #1157
  • Deltalake 0.7.0 with s3 feature compilation error due to rusoto_dynamodb version conflict #1191
  • Writing from a Delta table scan using WriteBuilder fails due to missing object store #1186

Merged pull requests:

delta-rs - rust-v0.7.0

Published by wjones127 over 1 year ago

Full Changelog

Implemented enhancements:

  • Support FSCK REPAIR TABLE Operation #1092
  • Expose the Delta Log in a DataFrame that's easy for analysis #1031
  • Provide case-insensitive storage options in backend #999
  • Support local file path in CreateBuilder::with_location() #998
  • Save operational params in the same way with delta io #1054 (ismoshkov)

Fixed bugs:

  • DeltaTable DataFusion TableProvider does not support filter pushdown #1064
  • DeltaTable DataFusion scan does not prune files properly #1063
  • deltalake.DeltaTable constructor hangs in Jupyter #1093
  • Transaction log JSON formatting issue when writing data via Python bindings #1017
  • crates.io entry is missing link to rustdoc documentation #1076
  • URL Registered with ObjectStore registry is different from url in DeltaScan #1018
  • Not able to connect to Azure Storage with client id/secret #977
  • Deltalake 0.5 crate s3 feature dynamodb version mismatch #973
  • Overwrite mode does not work with Azure #939
  • Use Chrono without default features #914
  • cargo test does not run due to tls conflict #985
  • Azure SAS authorization fails with <AuthenticationErrorDetail>Signature fields not well formed. #910

Merged pull requests:

  • Make rustls default across all packages #1097 (wjones127)
  • Implement filesystem check #1103 (Blajda)
  • refactor: move vacuum command to operations module #1045 (roeap)
  • feat: enable passing storage options to Delta table builder via DataFusion's CREATE EXTERNAL TABLE #1043 (gruuya)
  • feat: improve storage location handling #1065 (roeap)
  • Fix to support UTC timezone #1022 (andrei-ionescu)
  • feat: harmonize and simplify storage configuration #1052 (roeap)
  • feat: expose function to get table of add actions #1033 (wjones127)
  • fix: change unexpected field logging level to debug #1112 (houqp)
  • fix: datafusion predicate pushdown and dependencies #1071 (roeap)
  • fix: azure sas key url encoding #1036 (roeap)
  • Add provisional workaround to support CDC #1039 #1042 (Fazzani)
  • improve debuggability of json ser/de errors #1119 (houqp)
  • Add an example of writing to a delta table with a RecordBatch #1085 (rtyler)
  • minor: optimize partition lookup for vacuum loop #1120 (houqp)
  • Add missing documentation metadata to Cargo.toml #1077 (johnbatty)
  • add test for null_count_schema_for_fields #1135 (marijncv)
  • add test for min_max_schema_for_fields #1122 (marijncv)
  • add test for get_boolean_from_metadata #1121 (marijncv)
  • add test for left_larger_than_right #1110 (marijncv)
  • Add test for: to_scalar_value #1086 (marijncv)
  • Fix typo in delta-inspect #1072 (byteink)
  • chore: update datafusion #1114 (roeap)

* This Changelog was automatically generated by github_changelog_generator

delta-rs - python-v0.7.0

Published by wjones127 over 1 year ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.6.4...python-v0.7.0

delta-rs - rust-v0.6.0

Published by wjones127 almost 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.6.4...rust-v0.6.0

delta-rs - python-v0.6.4

Published by fvaleye almost 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.6.3...python-v0.6.4

delta-rs - rust-v0.5.0

Published by houqp almost 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/rust-v0.4.0...rust-v0.5.0

delta-rs - python-v0.6.3

Published by fvaleye almost 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.6.2...python-v0.6.3

delta-rs - python-v0.6.2

Published by fvaleye about 2 years ago

What's Changed

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.6.1...python-v0.6.2

delta-rs - python-v0.6.1

Published by fvaleye about 2 years ago

What's Changed

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.6.0...python-v0.6.1

delta-rs - python-v0.6.0

Published by fvaleye about 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.5.8...python-v0.6.0

delta-rs - python-v0.5.8

Published by fvaleye over 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.5.7...python-v0.5.8

delta-rs - python-v0.5.7

Published by fvaleye over 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.5.6...python-v0.5.7

delta-rs - python-v0.5.6

Published by fvaleye over 2 years ago

  • Bump version of Python binding to 0.5.6 (#558)
  • Move delta-inspect to its own crate (#557)
  • Fix VACUUM by using table_uri when filtering files to delete (#551)
  • Formally verify S3 atomic rename (#540)
  • Implement missing Azure storage backend methods (#499)
  • Implement polling for table updates (#550)
  • Add target in Python release Github action workflow. (#548)

Credits:
QP Hou, Thomas Vollmer, David Blajda, Florian Valeye

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.5.5...python-v0.5.6

delta-rs - python-v0.5.5

Published by fvaleye over 2 years ago

  • Add storage options for backends (#544)
  • Remove coupling of DynamoDbLockClient from S3 storage (#535)
  • add macOS 11 support in python binding release (#541)
  • Refresh Python usage documentation (#539)
  • [Python] Create PyArrow dataset fragments from delta log (#525)
  • Fix Delta metadata transaction schema (#531)
  • Add gcs test and improve credential error (#533)
  • Return complete history (#526)
  • Move dynamodb lock into its own crate (#508)
  • Add datafusion examples to docs (#519)
  • Fix S3 list_objs and cleanup_metadata (#518)
  • Add support for creating List and Map schema types (#517)
  • Update datafusion version to 6 (#516)
  • Retry S3 get request on 500 Internal Server Error (#510)
  • Fix memory overhead when creating checkpoint (#502)
  • Fix nullable partition values (#498)
  • Fix cleanup_expired_logs timestamp (#503)
  • Add bool config enableExpiredLogCleanup. (#500)
  • pin arrow to major version (#501)

Credits:
Florian Valeye, ahmedriza, Will Jones, Liang-Chi Hsieh, Gabriel J. Michael, Matthew Turner, Mykhailo Osypov, Andrei Ionescu, QP Hou

Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.5.4...python-v0.5.5

delta-rs - python-v0.5.4

Published by fvaleye almost 3 years ago

  • Clean up expired delta table commit logs after checkpoint (#484)
  • Add authorization options for azure storage backend (#486)
  • Bump arrow to 6.1.0 (#494)
  • Add DeltaTableError in Python binding. Add markers for integration tests with pytest. (#496)
  • Change Rust edition from 2018 to 2021 (#490)
  • Add docs for ADLS Gen2. (#492)
  • Add gt, gte, lt and lte partition filters. (#478)
  • Fix python build (#487)
  • Try to fix flaky rename under Windows (#485)
  • Update azure crates (#474)
  • Update README.adoc (#482)
  • Fix documentation for the DeltaStorageHandler (#483)
  • Throw an error when filter key is not in partitioned columns. (#475)
  • Add GCS feature to the Python Cargo.toml file (#476)
  • Make file storage backend's atomic rename async (#471)
  • materialize tables in python via native storage backend (#463)
  • Fix coverage of the Python tests (#467)
  • Support hash lookup by path string for Remove action (#462)
  • Add new module for DeltaTableState (#464)
  • Avoid table stats override in datafusion extension. (#459)
  • Fix action reconciliation for add after remove (#456)
  • Add pool_idle_timeout options for s3 and sts clients (#458)
  • Generate new session name on assume role credentials provider refresh (#451)
  • return lazy iterator in get tombstone methods (#452)
  • Support no tombstone loading & new table builder API (#445)
  • Fix broken tombstones metadata when extended_file_metadata is different between tombstones in state (#450)
  • README: mark Checkpoint creation as done for Rust (#449)
  • Add maturin develop command with extra (#448)
  • Run all tests under s3 feature flag (#447)
  • Update datafusion links (#446)
  • Batch-apply remove actions in tombstone handling (#444)
  • Fixing test to compare sorted vec (#443)
  • Add delete_lock and fix release_lock (#440)

Credits:
Liang-Chi Hsieh, Robert Pack, Mykhailo Osypov, Florian Valeye, Thomas Vollmer, Yuan Zhou, roeap, Denny Lee, Kelvin S. do Prado, QP Hou, Thomas Peiselt, Bruno Bigras, Akshay Ghiya
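
Two entries in this release concern partition filtering: the gt, gte, lt and lte filters (#478) and the error raised when a filter key is not a partition column (#475). The sketch below illustrates the idea in plain Python and is not the delta-rs implementation. The function name matches, the operator strings, the sample file paths, and the choice to compare values as integers are all assumptions made for illustration.

```python
import operator
from typing import Dict, List, Tuple

# Hypothetical mapping from filter operator strings to comparison functions.
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

# (column, operator, value); values are assumed to be integers in this sketch.
PartitionFilter = Tuple[str, str, int]


def matches(
    partition_values: Dict[str, str],
    filters: List[PartitionFilter],
    partition_columns: List[str],
) -> bool:
    """Return True if a file's partition values satisfy every filter."""
    for key, op, value in filters:
        # Reject filters on columns the table is not partitioned by.
        if key not in partition_columns:
            raise ValueError(f"{key!r} is not a partition column")
        if op not in _OPS:
            raise ValueError(f"unsupported operator {op!r}")
        raw = partition_values.get(key)
        # A missing (null) partition value never satisfies an ordering filter.
        if raw is None:
            return False
        # Partition values are kept as strings, so parse before comparing.
        if not _OPS[op](int(raw), value):
            return False
    return True


# Made-up files and their partition values.
files = {
    "year=2019/part-0.parquet": {"year": "2019"},
    "year=2020/part-0.parquet": {"year": "2020"},
    "year=2021/part-0.parquet": {"year": "2021"},
}

selected = [
    path
    for path, values in files.items()
    if matches(values, [("year", ">=", 2020)], ["year"])
]
print(selected)
```

Parsing before comparing matters because string comparison would order "9" after "10"; a real implementation would pick the parser from the column's schema type rather than assuming integers.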

delta-rs - python-v0.5.3

Published by fvaleye about 3 years ago

  • Add history command in delta-rs (#428)
  • reenable datafusion integration with temporary fork (#436)
  • Decode path in Add and Remove actions. (#434)
  • Optimize remove action apply with early iteration exit #424 (#431)
  • Clean up DeltaTransactionError (#432)
  • Add is_non_acquirable field to the dynamodb lock (#429)
  • Expose valid primitive type list to public doc (#430)
  • Support partition value string deserialization for timestamp/binary (#371)
  • Bump arrow to 6.0.0-SNAPSHOT and bring map support to schema (#375)
  • Update README.adoc (#426)
  • Introduce DeltaConfig and tombstones retention policy (#420)
  • Sync Action attributes with delta (#380)
  • Add LICENSE file in the Python binding and refer it in the pyproject.toml (#422)
  • Change checkpoint creation logs from info to debug (#423)
  • Add the Glue Data Catalog for reading the DeltaTable (#419)
  • Add S3StorageOptions to allow configuring S3 backend explicitly (#418)
  • BUGFIX: writes to gcs must include the content length header
  • Ensures that all table schemas are of StructType (#415)
  • Fix reading nullable action fields from parquet (#417)
  • Add filesystem argument for reading DeltaTable in Python binding (#414)
  • Add implementation for load_with_datetime in Python package. (#411)
  • Add a Makefile build task in the Python binding (#410)
  • Use update_incremental in update (#398)
  • Use tokio::fs::rename in put_obj. (#403)
  • Update python readme (#406)
  • Update pyproject definition in pyproject.toml (#405)
  • Add examples for reading delta table with Rust API. (#400)
  • Implement delete_objs in fs and s3 storage backends. (#395)
  • Remove version param from create_checkpoint_from_table (#399)
  • Google cloud storage backend (#355)
  • added initial commit info on create method for a DeltaTable (#387)
  • Upgrade to DataFusion 5.0 (#389)
  • additional error handling to atomic_rename (#386)
  • Reuse table/storage instances in checkpoints (#384)
  • Add sts assume role creds for S3 (#383)
  • Update datafusion and ballista links in README (#382)
  • Merge Cargo.toml into pyproject.toml (#381)
  • Implement consistent behavior in Windows with regard to swap parameter. (#379)
  • Refactoring of black, isort, mypy tools usages into pyproject.toml (#378)
  • Wrap DeltaTransactionError with DeltaTableError. (#374)
  • Allow filesystem backend put_obj to overwrite existing (#376)
  • Make Format.options to be required field (#370)
  • Implement atomic put_obj. (#367)
  • support partition value string deserialization for float/double/date (#363)
  • Add '.tmp' suffix to temporary file of prepared commit (#366)
  • cache cargo builds in CI (#359)

delta-rs - python-v0.5.2

Published by houqp about 3 years ago

  • new update_incremental API for streaming table update
  • fix a bug in load_version method causing duplicated data @zijie0
  • fix crash on table load caused by null partition value @zijie0
  • support filtering on null partition value in table load predicate @zijie0

delta-rs - python-v0.5.1

Published by houqp over 3 years ago

  • added columns argument to to_pyarrow_table method to support projections on PyArrow Table conversion @zijie0
  • added to_pandas shortcut method to convert a DeltaTable directly to pandas dataframe @bramrodenburg