A native Rust library for Delta Lake, with bindings into Python
APACHE-2.0 License
Published by ion-elgreco about 1 month ago
`Add::get_json_stats` public by @gruuya in https://github.com/delta-io/delta-rs/pull/2822
`in` pushdowns in early_filter by @ion-elgreco in https://github.com/delta-io/delta-rs/pull/2807
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.19.1...python-v0.19.2
Published by ion-elgreco 2 months ago
`max_spill_size` default value by @mrjsj in https://github.com/delta-io/delta-rs/pull/2795
`file_actions` call sites by @roeap in https://github.com/delta-io/delta-rs/pull/2787
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.19.0...python-v0.19.1
Published by ion-elgreco 2 months ago
The default writer engine has changed to rust. Replace your partition_filters with a predicate (SQL) instead. The PyArrow engine is now deprecated and will be removed in v1.0.
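As a rough illustration of the migration, the sketch below converts a list of partition filter tuples into a SQL predicate string. `filters_to_predicate` is a hypothetical helper and not part of deltalake. It assumes every value can be written as a quoted string literal, which may not suit non-string partition columns.

```python
from typing import List, Sequence, Tuple, Union

# Hypothetical helper, not part of deltalake: turns partition filters such as
# [("year", "=", "2024")] into a SQL predicate string.
FilterValue = Union[str, Sequence[str]]
PartitionFilter = Tuple[str, str, FilterValue]


def _quote(value: str) -> str:
    # Escape single quotes by doubling them, as in standard SQL string literals.
    return "'" + str(value).replace("'", "''") + "'"


def filters_to_predicate(filters: List[PartitionFilter]) -> str:
    clauses = []
    for column, op, value in filters:
        if op.lower() in ("in", "not in"):
            # Set membership: render the values as a parenthesised list.
            items = ", ".join(_quote(v) for v in value)
            clauses.append(f"{column} {op.upper()} ({items})")
        else:
            clauses.append(f"{column} {op} {_quote(value)}")
    # Every filter in the list must hold, so the clauses are joined with AND.
    return " AND ".join(clauses)


predicate = filters_to_predicate(
    [("year", "=", "2024"), ("country", "in", ["NL", "US"])]
)
print(predicate)  # year = '2024' AND country IN ('NL', 'US')
```

The resulting string would then be passed as the `predicate` argument of an overwrite with `write_deltalake` in place of `partition_filters`. Check the literal types against your table schema before relying on it.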
delta.enableExpiredLogCleanup = false
`ADD column` operation
`delete` operation by @ion-elgreco in https://github.com/delta-io/delta-rs/pull/2721
`overwrite` and `replacewhere` writes by @ion-elgreco in https://github.com/delta-io/delta-rs/pull/2722
`add column` operation by @ion-elgreco in https://github.com/delta-io/delta-rs/pull/2562
`RUF` ruleset for `ruff` by @fpgmaas in https://github.com/delta-io/delta-rs/pull/2677
`ruff` and `mypy` versions in the `lint` stage in the CI pipeline by @fpgmaas in https://github.com/delta-io/delta-rs/pull/2679
`Makefile` by @fpgmaas in https://github.com/delta-io/delta-rs/pull/2688
`Literal` by @fpgmaas in https://github.com/delta-io/delta-rs/pull/2676
`write_deltalake` in `writer.py` by @fpgmaas in https://github.com/delta-io/delta-rs/pull/2695
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.18.2...python-v0.19.0
Published by ion-elgreco 4 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.18.1...python-v0.18.2
Published by ion-elgreco 4 months ago
`files_by_partition` to public api by @edmondop in https://github.com/delta-io/delta-rs/pull/2533
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.18.0...python-v0.18.1
Published by ion-elgreco 4 months ago
`set table properties` operation by @ion-elgreco in https://github.com/delta-io/delta-rs/pull/2264
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.17.4...python-v0.18.0
Published by ion-elgreco 5 months ago
`to_pyarrow_dataset` by @ion-elgreco in https://github.com/delta-io/delta-rs/pull/2485
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.17.3...python-v0.17.4
Published by ion-elgreco 6 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.17.2...python-v0.17.3
Published by rtyler 6 months ago
Implemented enhancements:
Fixed bugs:
`add.stats_parsed` with wrong type #2312
`deltalake_core::kernel::snapshot::log_segment::list_log_files_with_checkpoint::{{closure}}` #2290
`logRetentionDuration` #2180
Published by ion-elgreco 6 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.17.1...python-v0.17.2
Published by ion-elgreco 6 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.17.0...python-v0.17.1
Published by ion-elgreco 6 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.16.4...python-v0.17.0
Published by ion-elgreco 7 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.16.3...python-v0.16.4
Published by ion-elgreco 7 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.16.2...python-v0.16.3
Published by ion-elgreco 7 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.16.1...python-v0.16.2
Published by ion-elgreco 7 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.16.0...python-v0.16.1
Published by ion-elgreco 8 months ago
This version introduces the timestampNtz data type. If your writer previously wrote timestamps without a time zone to a timestamp column, those writes will now fail. Under the new behavior, only timestamps with the UTC time zone can be written to the timestamp primitive type.
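One way to prepare data for this change is to normalize timestamps to UTC before writing. The sketch below uses only the Python standard library. `to_utc` is a hypothetical helper, and it assumes that naive timestamps already represent UTC wall-clock time, which may not be true for your data.

```python
from datetime import datetime, timezone


def to_utc(ts: datetime) -> datetime:
    """Return a UTC-aware copy of ts.

    Assumption for this sketch: a naive timestamp already holds UTC wall time,
    so it is only tagged with UTC; an aware timestamp is converted to UTC.
    """
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


naive = datetime(2024, 1, 1, 12, 0, 0)
aware = to_utc(naive)
print(aware.isoformat())  # 2024-01-01T12:00:00+00:00
```

Values that should stay without a time zone would instead target a timestampNtz column and need no conversion.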
`drop constraint` operation by @ion-elgreco in https://github.com/delta-io/delta-rs/pull/2071
`is_commit_file` should only catch commit jsons by @emcake in https://github.com/delta-io/delta-rs/pull/2213
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.15.3...python-v0.16.0
Published by rtyler 8 months ago
⚠️ The release of 0.17.0 removes the legacy dynamodb lock functionality, AWS users must read these release notes! ⚠️
The 0.17.0 release moves storage implementations into their own crates, such as `deltalake-aws`. A consequence of that refactoring is that custom storage and file scheme handlers must be registered/initialized at runtime. Storage subcrates conventionally define a `register_handlers` function which performs that task. Users may see errors such as:
thread 'main' panicked at /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/deltalake-core-0.17.0/src/table/builder.rs:189:48:
The specified table_uri is not valid: InvalidTableLocation("Unknown scheme: s3")
Users of the meta-crate (`deltalake`) can call the storage crate via: `deltalake::aws::register_handlers(None);` at the entrypoint for their code.
Users who adopt `core` and storage crates independently (e.g. `deltalake-aws`) can register via `deltalake_aws::register_handlers(None);`.
The AWS, Azure, and GCP crates must all have their custom file schemes registered in this fashion.
The locking mechanism is fundamentally different between `deltalake` v0.16.x and v0.17.0. Starting with this release, the `deltalake` and `deltalake-aws` crates rely on the same protocol for concurrent writes on AWS as the Delta Lake/Spark implementation.
Fundamentally the DynamoDB table structure changes, which is documented here. The configuration of a Rust process should continue to use the `AWS_S3_LOCKING_PROVIDER` environment value of `dynamodb`. The new table must be specified with the `DELTA_DYNAMO_TABLE_NAME` environment or configuration variable, and that should name the new `S3DynamoDbLogStore` compatible DynamoDB table.
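As a minimal sketch of the configuration described above, the snippet below sets the two variables in the process environment before any table is opened. The table name `delta_log` and the helper `locking_env` are hypothetical; substitute the name of your own `S3DynamoDbLogStore` compatible table.

```python
import os

# Hypothetical table name, used only for illustration.
LOCK_TABLE = "delta_log"


def locking_env(table_name: str) -> dict:
    """Build the two settings named in the release notes."""
    return {
        # Unchanged from 0.16.x: DynamoDB remains the locking provider.
        "AWS_S3_LOCKING_PROVIDER": "dynamodb",
        # New in 0.17.0: points at the new-format DynamoDB table.
        "DELTA_DYNAMO_TABLE_NAME": table_name,
    }


# Export the settings so a process started from here inherits them.
os.environ.update(locking_env(LOCK_TABLE))
```

Keeping the settings in one function makes it harder for a writer to be started with only one of the two variables set.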
Because locking is required to ensure safe, consistent writes, there is no iterative migration: 0.16 and 0.17 writers cannot safely coexist. The following steps should be taken when upgrading:
Implemented enhancements:
`write_deltalake` silently changes nothing #2108
`ensure_table_uri` when creating a table `with_log_store` #2036
`Schema` in `write_deltalake` #1862
Fixed bugs:
`WriteBuilder::with_input_execution_plan` does not apply the schema to the log's metadata fields #2105
`Unknown scheme: s3` #2065
`~` #1806
Closed issues:
Published by ion-elgreco 9 months ago
Full Changelog: https://github.com/delta-io/delta-rs/compare/python-v0.15.2...python-v0.15.3