TileDB

The Universal Storage Engine

MIT License

Stars
1.8K
Committers
73


TileDB - 2.4.0

Published by Shelnutt2 about 3 years ago

TileDB v2.4.0 Release Notes

Disk Format

  • Store array schemas under __schema directory #2258

New features

  • Perform early audit for acceptable aws sdk windows path length #2260
  • Support setting via config s3 BucketCannedACL and ObjectCannedACL via SetACL() methods #2383
  • Update spdlog dependency to 1.9.0 fixing c++17 compatibility and general improvements #1973
  • Added Azure SAS token config support and new config option #2420
  • Load all array schemas in storage manager and pass the appropriate schema pointer to each fragment #2415
  • First revision of the Interval class #2417
  • Add tiledb_schema_evolution_t and new apis for schema evolution #2426
  • Add ArraySchemaEvolution to cpp_api and its unit tests are also added. #2462
  • Add c and cpp api functions for getting the array schema of a fragment #2468
  • Add capnp serialization and rest support for array schema evolution objects #2467

Improvements

  • encryption_key and encryption_type parameters have been added to the config; internal APIs now use these parameters to set the key. #2245
  • Initial read refactor #2374
  • Create class ByteVecValue from typedef #2368
  • Encapsulate spdlog.h #2396
  • Update OSX target to 10.14 for release artifacts #2401
  • Add nullable (and unordered, nullable) support to the smoke test. #2405
  • Initial sparse global order reader #2395
  • Remove sm.sub_partitioner_memory_budget #2402
  • Update the markdown documents for our new version of array schemas #2416
  • Sparse global order reader: no more result cell slab copy. #2411
  • Sparse global order reader: initial memory budget improvements. #2413
  • Optimization of result cell slabs generation for sparse global order reader. #2414
  • Remove selective unfiltering. #2410
  • Updated Azure Storage Lite SDK to 0.3.0 #2419
  • Respect memory budget for sparse global order reader. #2425
  • Use newer Azure patch for all platforms to solve missing header error #2433
  • Increased diagnostic output for differences reported by tiledb_unit (some of which may be reasonable) #2437
  • Adjustments to schema evolution new attribute reads #2484
  • Change Quickstart link in readthedocs/doxygen index.rst #2448
  • Initial sparse unordered with duplicates reader. #2441
  • Add calls to malloc_trim on context and query destruction on Linux to potentially reduce idle memory usage #2443
  • Add logger internals for std::string and std::stringstream for developer convenience #2454
  • Allow empty attribute writes. #2461
  • Refactored readers: serialization. #2458
  • Allow null data pointers for writes. #2481
  • Update backwards compatibility arrays for 2.3.0 #2487

Deprecations

Bug fixes

  • Fix to correctly apply capnproto create_symlink avoidance patch #2264
  • Fix a bug in calculating max_size_validity for var_size attributes that caused incomplete queries #2266
  • Always run ASAN with matching compiler versions #2277
  • Fix some loop bounds that reference non-existent elements #2282
  • Stop treating std::vector like an array and accessing an element that is not present to get its address. #2276
  • Fix buffer arguments in unit-curl.cc #2287
  • Stop loop iterations within limits of vector being initialized. #2320
  • Modify FindCurl_EP.cmake to work for WIN32 -EnableDebug builds #2319
  • Fixing test failure because of an uninitialized buffer. #2386
  • Change a condition that assumed MSVC was the only compiler for WIN32 #2388
  • Fix defects in buffer classes: read, set_offset, advance_offset #2342
  • Use CHECK_SAFE() to avoid multi-threaded conflict #2394
  • Use tiledb _SAFE() items when overlapping threads may invoke code #2418
  • Changes to address issues with default string dimension ranges in query #2436
  • Only set cmake policy CMP0076 if cmake version in use knows about it #2463
  • Fix handling curl REST request having all data in single call back #2485
  • Write queries should post start/end timestamps for REST arrays #2492

API additions

  • Introduce new tiledb_experimental.h c-api header for new features that don't have a stabilized api yet #2453
  • Introduce new tiledb_experimental cpp-api header for new features that don't have a stabilized api yet #2453

C API

  • Refactoring [get/set]_buffer APIs #2315
  • Add tiledb_fragment_info_get_array_schema functions for getting the array schema of a fragment #2468
  • Add tiledb_schema_evolution_t and new apis for schema evolution #2426

C++ API

  • Refactoring [get/set]_buffer APIs #2399
  • Add FragmentInfo::array_schema functions for getting the array schema of a fragment #2468
  • Add ArraySchemaEvolution to cpp_api and its unit tests are also added. #2462
TileDB - 2.4.0-rc0

Published by Shelnutt2 about 3 years ago

This is a pre-release for the upcoming TileDB 2.4.0. The full change history will be posted in the official release. This release should be used only for testing and validating the upcoming release. This version is not covered by TileDB compatibility or stability guarantees.

TileDB - 2.3.4

Published by Shelnutt2 about 3 years ago

TileDB v2.3.4 Release Notes

Improvements

  • Query::set_layout: setting the layout on the subarray. #2451
  • Allow empty attribute writes. #2461

Bug fixes

  • Fix deserialization of buffers in write queries with nullable var-length attributes #2442
TileDB - 2.3.3

Published by Shelnutt2 about 3 years ago

TileDB v2.3.3 Release Notes

Improvements

  • Increase REST (TileDB Cloud) retry count from 3 to 25 to be in line with S3/GCS retry times #2421
  • Avoid unnecessary est_result_size computation in must_split #2431
  • Use newer Azure patch for all platforms to solve missing header error #2433

Bug fixes

  • Fix c-api error paths to always reset any allocated pointers to nullptr in addition to deleting them #2427
TileDB - 2.3.2

Published by Shelnutt2 over 3 years ago

TileDB v2.3.2 Release Notes

Improvements

  • Support more env selectable options in both azure-windows.yml and azure-windows-release.yml #2384
  • Enable Azure/Serialization for windows CI artifacts #2400

Bug fixes

  • Correct check for last offset position so that undefined memory is not accessed. #2390
  • Fix failure to read array written with tiledb 2.2 via REST #2404
  • Use the correct buffer for validity deserialization #2407
TileDB - 2.3.1

Published by Shelnutt2 over 3 years ago

TileDB v2.3.1 Release Notes

Improvements

  • Update bzip2 in windows build to 1.0.8 #2332
  • Fixing S3 build for OSX11 #2339
  • Fixing possible overflow in Dimension::tile_num #2265
  • Fixing tile extent calculations for signed integer domains #2303
  • Add support for cross compilation on OSX in superbuild #2354
  • Remove curl link args for cross compilation #2359
  • Enable MacOS arm64 release artifacts #2360
  • Add more stats for compute_result_coords path #2366
  • Support credentials refresh for AWS Assume Role #2376
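
The overflow and signed-domain entries above concern tile arithmetic on integer domains. As a rough illustration of how such an overflow can arise and be avoided, here is a minimal sketch. It is not TileDB's code: the free function `tile_num` below is a hypothetical stand-in, and the actual fix in `Dimension::tile_num` may look different.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical sketch, not TileDB's implementation: number of tiles that
// cover the inclusive integer domain [lo, hi] with the given tile extent.
template <class T>
uint64_t tile_num(T lo, T hi, T extent) {
  static_assert(std::is_integral<T>::value, "integer domains only");
  // For signed types, hi - lo can overflow T (for example int32_t with
  // [INT32_MIN, INT32_MAX]), so subtract as unsigned 64-bit values instead.
  // Unsigned subtraction wraps modulo 2^64 and yields the true distance.
  const uint64_t span = static_cast<uint64_t>(hi) - static_cast<uint64_t>(lo);
  return span / static_cast<uint64_t>(extent) + 1;
}
```

The sketch assumes `lo <= hi` and `extent > 0`, and it does not handle a domain that spans the full 64-bit range with an extent of 1.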

Bug fixes

  • Fixing intermittent metadata test failure #2338
  • Fix query condition validation check for nullable attributes with null conditions #2344
  • Multi-range single dimension query fix #2347
  • Rewrite Dimension::overlap_ratio #2304
  • Follow up fixes to floating point calculations for tile extents #2341
  • Fix for set_null_tile_extent_to_range #2361
  • Subarray partitioner, unordered should be unordered, even for Hilbert. #2377
TileDB - 2.3.1-rc0

Published by Shelnutt2 over 3 years ago

This is a pre-release for TileDB 2.3.1. The main purpose is to test and validate the new artifact naming scheme and macos arm64 images. This pre-release should only be used for testing and is not covered by TileDB compatibility or stability guarantees.

The change log will be included in the official 2.3.1 release.

TileDB - 2.3.0

Published by Shelnutt2 over 3 years ago

TileDB v2.3.0 Release Notes

Disk Format

  • Format version incremented to 9. #2108

Breaking behavior

  • The setting of `sm.read_range_oob` now defaults to `warn`, allowing queries to run with bounded ranges that errored before. #2176
  • Removes TBB as an optional dependency #2181
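
The difference between the two `sm.read_range_oob` behaviors can be pictured with a small standalone sketch. It does not call the TileDB API: `OobMode`, `Range`, and `bound_range` are hypothetical names chosen for illustration, and the library's internal handling may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <utility>

// Hypothetical stand-ins for illustration; these are not TileDB types.
enum class OobMode { Error, Warn };
using Range = std::pair<int64_t, int64_t>;  // inclusive [first, second]

// Returns `query` bounded to `domain`. In Error mode an out-of-bounds range
// throws; in Warn mode it is clamped to the domain and a warning is printed.
// Assumes the query range overlaps the domain.
Range bound_range(const Range& query, const Range& domain, OobMode mode) {
  const bool oob = query.first < domain.first || query.second > domain.second;
  if (!oob) return query;
  if (mode == OobMode::Error)
    throw std::out_of_range("read range exceeds the dimension domain");
  std::cerr << "warning: read range bounded to the dimension domain\n";
  return Range(std::max(query.first, domain.first),
               std::min(query.second, domain.second));
}
```

With a domain of [1, 100], a query for [-5, 50] would be bounded to [1, 50] in `Warn` mode and rejected in `Error` mode.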

New features

  • Support TILEDB_DATETIME_{SEC,MS,US,NS} in arrow_io_impl.h #2228
  • Adds support for filtering query results on attribute values #2141
  • Adding support for time datatype dimension and attribute #2140
  • Add support for serialization of config objects #2164
  • Add C and C++ examples to the examples/ directory for the tiledb_fragment_info_t APIs. #2160
  • Support building serialization (using capnproto) on Windows #2100
  • Config option "vfs.s3.sse" for S3 server-side encryption support #2130
  • Name attribute/dimension files by index. This is fragment-specific and updates the format version to version 9. #2107
  • Smoke Test, remove nullable structs from global namespace. #2078

Improvements

  • replace ReadFromOffset with ReadRange in GCS::read() to avoid excess gcs egress traffic #2307
  • Hilbert partitioning fixes #2269
  • Stats refactor #2267
  • Improve Cap'n Proto cmake setup for system installations #2263
  • Runtime check for minimum validity buffer size #2261
  • Enable partial vacuuming when vacuuming with timestamps #2251
  • Consolidation: de-dupe FragmentInfo #2250
  • Consolidation: consider non empty domain before start timestamp #2248
  • Add size details to s3 read error #2249
  • Consolidation: do not re-open array for each fragment #2243
  • Support back compat writes #2230
  • Serialization support for query conditions #2240
  • Make SubarrayPartitioner's member functions return Status after calling Subarray::get_range_num. #2235
  • Update bzip2 super build version to 1.0.8 to address CVE-2019-12900 in libbzip2 #2233
  • Timestamp start and end for vacuuming and consolidation #2227
  • Fix memory leaks reported on ASAN when running with leak-detection. #2223
  • Use relative paths in consolidated fragment metadata #2215
  • Optimize Subarray::compute_relevant_fragments #2216
  • AWS S3: improve is_dir #2209
  • Add nullable string to nullable attribute example #2212
  • AWS S3: adding option to skip Aws::InitAPI #2204
  • Added additional stats for subarrays and subarray partitioners #2200
  • Introduces config parameter "sm.skip_est_size_partitioning" #2203
  • Add config to query serialization. #2177
  • Consolidation support for nullable attributes #2196
  • Adjust unit tests to reduce memory leaks inside the tests. #2179
  • Reduces memory usage in multi-range range reads #2165
  • Add config option `sm.read_range_oob` to toggle bounding read ranges to domain or erroring #2162
  • Windows msys2 build artifacts are no longer uploaded #2159
  • Add internal log functions to log at different log levels #2161
  • Parallelize Writer::filter_tiles #2156
  • Added config option "vfs.gcs.request_timeout_ms" #2148
  • Improve fragment info loading by parallelizing fragment_size requests #2143
  • Allow open array stats to be printed without read query #2131
  • Cleanup the GHA CI scripts - put common code into external shell scripts. #2124
  • Reduced memory consumption in the read path for multi-range reads. #2118
  • The latest version of dev was leaving behind a test/empty_string3/. This ensures that the directory is removed when make check is run. #2113
  • Migrating AZP CI to GA #2111
  • Cache non_empty_domain for REST arrays like all other arrays #2105
  • Add additional stats printing to breakdown read state initialization timings #2095
  • Places the in-memory filesystem under unit test #1961
  • Adds a Github Action to automate the HISTORY.md #2075
  • Change printfs in C++ examples to cout, edit C print statements to fix format warnings #2226

Deprecations

  • The following APIs have been deprecated: tiledb_array_open_at, tiledb_array_open_at_with_key, tiledb_array_reopen_at. #2142

Bug fixes

  • Fix a segfault on VFS::ls for the in-memory filesystem #2255
  • Fix rare read corruption in S3 #2253
  • Update some union initializers to use strict syntax #2242
  • Fix race within S3::init_client #2247
  • Expand accepted windows URIs. #2237
  • Write fix for unordered writes on nullable, fixed attributes. #2241
  • Fix tile extent to be reported as domain extent for sparse arrays with Hilbert ordering #2231
  • Do not consider option sm.read_range_oob for set_subarray() on Write queries #2211
  • Change avoiding generation of multiple, concatenated, subarray flattened data. #2190
  • Change mutex from basic to recursive #2180
  • Fixes a memory leak in the S3 read path #2189
  • Fixes a potential memory leak in the filter pipeline #2185
  • Fixes misc memory leaks in the unit tests #2183
  • Fix memory leak of `tiledb_config_t` in error path of `tiledb_config_alloc`. #2178
  • Fix check for null pointer in query deserialization #2163
  • Fixes a potential crash when retrying incomplete reads #2137
  • Fixes a potential crash when opening an array with consolidated fragment metadata #2135
  • Corrected a bug where sparse cells may be incorrectly returned using string dimensions. #2125
  • Fix segfault in serialized queries when partition is unsplittable #2120
  • Always use original buffer size in serialized read queries serverside. #2115
  • Fix an edge-case where a read query may hang on array with string dimensions #2089

API additions

C API

  • Added tiledb_array_set_open_timestamp_start and tiledb_array_get_open_timestamp_start #2285
  • Added tiledb_array_set_open_timestamp_end and tiledb_array_get_open_timestamp_end #2285
  • Addition of tiledb_array_set_config to directly assign a config to an array. #2142
  • tiledb_query_get_array now returns a deep-copy #2184
  • Added `tiledb_serialize_config` and `tiledb_deserialize_config` #2164
  • Add new api, tiledb_query_get_config to get a query's config. #2167
  • Removes non-default parameter in "tiledb_config_unset". #2099

C++ API

  • Added Array::set_open_timestamp_start and Array::open_timestamp_start #2285
  • Added Array::set_open_timestamp_end and Array::open_timestamp_end #2285
  • add Query::result_buffer_elements_nullable support for dims #2238
  • Addition of tiledb_array_set_config to directly assign a config to an array. #2142
  • Add new api, Query.config() to get a query's config. #2167
  • Removes non-default parameter in "Config::unset". #2099
  • Add support for a string-typed, variable-sized, nullable attribute in the C++ API. #2090
TileDB - 2.3.0-rc0

Published by Shelnutt2 over 3 years ago

This is a pre-release for the upcoming TileDB 2.3. The full change history will be posted in the official release. This release should be used only for testing and validating the upcoming release. This version is not covered by TileDB compatibility or stability guarantees.

TileDB - 2.2.9

Published by joe-maley over 3 years ago

TileDB v2.2.9 Release Notes

Bug fixes

  • Fix rare read corruption in S3 #2254
  • Write fix for unordered writes on nullable, fixed attributes #2241
TileDB - 2.2.8

Published by joe-maley over 3 years ago

TileDB v2.2.8 Release Notes

New features

  • Support TILEDB_DATETIME_{SEC,MS,US,NS} in arrow_io_impl.h #2229
  • Add support for serialization of config objects #2164
  • Add support for serialization of query config #2177

Improvements

  • Optimize Subarray::compute_relevant_fragments #2218
  • Reduces memory usage in multi-range range reads #2165
  • Add config option sm.read_range_oob to toggle bounding read ranges to domain or erroring #2162
  • Updates bzip2 to v1.0.8 on Linux/OSX #2233

Bug fixes

  • Fixes a potential memory leak in the filter pipeline #2185
  • Fixes misc memory leaks in the unit tests #2183
  • Fix memory leak of tiledb_config_t in error path of tiledb_config_alloc. #2178

C API

  • tiledb_query_get_array now returns a deep-copy #2188
  • Add new api, tiledb_query_get_config to get a query's config. #2167
  • Added tiledb_serialize_config and tiledb_deserialize_config #2164

C++ API

  • Add new api, Query.config() to get a query's config. #2167
TileDB - https://github.com/TileDB-Inc/TileDB/releases/tag/2.2.8-rc0

Published by Shelnutt2 over 3 years ago

TileDB - TileDB v2.2.7

Published by joe-maley over 3 years ago

TileDB v2.2.7 Release Notes

Improvements

  • Added config option vfs.gcs.request_timeout_ms #2148
  • Improve fragment info loading by parallelizing fragment_size requests #2143
  • Apply 'var_offsets.extra_element' mode to string dimension offsets too #2145
TileDB - TileDB v2.2.6

Published by Shelnutt2 over 3 years ago

TileDB v2.2.6 Release Notes

Bug fixes

  • Fixes a potential crash when retrying incomplete reads #2137
TileDB - TileDB v2.2.5

Published by Shelnutt2 over 3 years ago

TileDB v2.2.5 Release Notes

New features

  • Config option vfs.s3.sse for S3 server-side encryption support #2130

Improvements

  • Reduced memory consumption in the read path for multi-range reads. #2118
  • Cache non_empty_domain for REST arrays like all other arrays #2105
  • Add additional timer statistics for opening array for reads #2027
  • Allow open array stats to be printed without read query #2131

Bug fixes

  • Fixes a potential crash when opening an array with consolidated fragment metadata #2135
  • Corrected a bug where sparse cells may be incorrectly returned using string dimensions. #2125
  • Always use original buffer size in serialized read queries serverside. #2115
  • Fix segfault in serialized queries when partition is unsplittable #2120
TileDB - TileDB v2.2.5 Release Candidate 0

Published by Shelnutt2 over 3 years ago

This is a release candidate for the upcoming TileDB 2.2.5. This is not an official release, please continue to use TileDB 2.2.4 until TileDB 2.2.5 is finalized.

TileDB v2.2.5 Release Candidate 0 Notes

Improvements

  • Cache non_empty_domain for REST arrays like all other arrays #2105
  • Add additional timer statistics for opening array for reads #2027
  • Allow open array stats to be printed without read query #2131

Bug fixes

  • Corrected a bug where sparse cells may be incorrectly returned using string dimensions. #2125
  • Always use original buffer size in serialized read queries serverside. #2115
  • Fix segfault in serialized queries when partition is unsplittable #2120
TileDB - TileDB v2.2.4

Published by joe-maley over 3 years ago

TileDB v2.2.4 Release Notes

Improvements

  • Add additional stats printing to breakdown read state initialization timings #2095
  • Improve GCS multipart locking #2087

Bug fixes

  • Fix an edge-case where a read query may hang on array with string dimensions #2089
  • Fix mutex locking bugs on Windows due to unlocking on different thread and missing task join #2077

C++ API

  • Add support for a string-typed, variable-sized, nullable attribute in the C++ API. #2090
TileDB - TileDB v2.2.3

Published by Shelnutt2 over 3 years ago

TileDB v2.2.3 Release Notes

New features

  • Add support for retrying REST requests that fail with certain http status code such as 503 #2060
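
The general idea of retrying on selected HTTP status codes can be sketched without any networking code. The helper below is hypothetical and not part of TileDB: `request_with_retries`, its parameters, and the choice to pass the request as a callable are all assumptions made for illustration.

```cpp
#include <functional>
#include <set>

// Hypothetical helper, not a TileDB function: call `request` until it
// returns a status code outside `retry_codes` or until `max_retries`
// retries have been used. Returns the last status code seen.
long request_with_retries(const std::function<long()>& request,
                          const std::set<long>& retry_codes,
                          unsigned max_retries) {
  long status = request();
  for (unsigned attempt = 0;
       attempt < max_retries && retry_codes.count(status) > 0; ++attempt) {
    // A real client would typically wait (often with backoff) before
    // issuing the request again; the wait is omitted in this sketch.
    status = request();
  }
  return status;
}
```

A request that answers 503 twice and then 200 would be issued three times in total and return 200, provided `max_retries` is at least 2.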

Improvements

  • Parallelize across attributes when closing a write #2048
  • Support for dimension/attribute names that contain commonly reserved filesystem characters #2047
  • Remove unnecessary is_dir in FragmentMetadata::store, this can increase performance for s3 writes #2050
  • Improve S3 multipart locking #2055
  • Parallelize loading fragments and array schema #2061
TileDB - TileDB v2.2.2

Published by ihnorton over 3 years ago

TileDB v2.2.2 Release Notes

New features

  • REST client support for caching redirects #1919

Improvements

  • Add additional timer statistics for opening array for reads #2027
  • Add rest.creation_access_credentials_name configuration parameter #2025

Bug fixes

  • Fixed ArrowAdapter export of string arrays with 64-bit offsets #2037
  • Fixed ArrowAdapter export of TILEDB_CHAR arrays with 64-bit offsets #2039

API additions

C API

  • Add tiledb_query_set_config to apply a tiledb_config_t to query-level parameters #2030

C++ API

  • Added Query::set_config to apply a tiledb::Config to query-level parameters #2030
TileDB - TileDB v2.2.1

Published by joe-maley almost 4 years ago

TileDB v2.2.1 Release Notes

Breaking behavior

  • The tile extent can now be set to null, in which case internally TileDB sets the extent to the dimension domain range. #1880
  • The C++ API std::pair<uint64_t, uint64_t> Query::est_result_size_var has been changed to 1) return std::array<uint64_t, 2> and 2) report the offsets as a size in bytes rather than elements. #1946
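
The null tile extent fallback can be illustrated with a short sketch. It assumes an integer dimension with an inclusive domain and interprets "dimension domain range" as the number of cells in that domain; `effective_tile_extent` is a hypothetical helper, not a TileDB function, and it requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: the effective tile extent of an integer dimension
// with inclusive domain [lo, hi]. An unset (null) extent falls back to the
// number of cells in the domain, i.e. one tile spanning the whole dimension.
uint64_t effective_tile_extent(int64_t lo, int64_t hi,
                               std::optional<uint64_t> extent) {
  // Subtract in unsigned arithmetic so wide signed domains cannot overflow.
  const uint64_t domain_range =
      static_cast<uint64_t>(hi) - static_cast<uint64_t>(lo) + 1;
  return extent.value_or(domain_range);
}
```

For a domain of [1, 100], an unset extent yields 100 under this interpretation, while an explicit extent of 10 is returned unchanged.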

New features

  • Support for nullable attributes. #1895 #1938 #1948 #1945
  • Support for Hilbert order sorting for sparse arrays. #1880
  • Support for AWS S3 "AssumeRole" temporary credentials #1882
  • Support for zero-copy import/export with the Apache Arrow adapter #2001
  • Experimental support for an in-memory backend used with bootstrap option "--enable-memfs" #1873
  • Support for element offsets when reading var-sized attributes. #1897
  • Support for an extra offset indicating the size of the returned data when reading var-sized attributes. #1932
  • Support for 32-bit offsets when reading var-sized attributes. #1950
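
The three offset-related features above (element offsets, an extra trailing offset, and 32-bit offsets) can be pictured as transformations of an ordinary byte-offset buffer. The sketch below uses only the standard library; the function names are hypothetical, and in TileDB these behaviors are selected through configuration rather than by calling helpers like these.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helpers for illustration; these are not TileDB functions.

// Convert byte offsets into element offsets for cells of `elem_size` bytes.
std::vector<uint64_t> to_element_offsets(const std::vector<uint64_t>& bytes,
                                         uint64_t elem_size) {
  std::vector<uint64_t> out;
  out.reserve(bytes.size());
  for (uint64_t b : bytes) out.push_back(b / elem_size);
  return out;
}

// Append one extra offset equal to the total data size, so the length of
// cell i is always offsets[i + 1] - offsets[i], including the last cell.
std::vector<uint64_t> with_extra_element(std::vector<uint64_t> offsets,
                                         uint64_t total_size) {
  offsets.push_back(total_size);
  return offsets;
}

// Narrow 64-bit offsets to 32 bits, rejecting values that do not fit.
std::vector<uint32_t> to_32bit(const std::vector<uint64_t>& offsets) {
  std::vector<uint32_t> out;
  out.reserve(offsets.size());
  for (uint64_t o : offsets) {
    if (o > std::numeric_limits<uint32_t>::max())
      throw std::overflow_error("offset does not fit in 32 bits");
    out.push_back(static_cast<uint32_t>(o));
  }
  return out;
}
```

For three var-sized cells of 4-byte values starting at bytes 0, 8, and 20 in a 32-byte buffer, the element offsets are 0, 2, and 5, and the extra-element form is 0, 8, 20, 32. The extra trailing offset matches the layout that Apache Arrow uses for variable-length data.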

Improvements

  • Optimized string dimension performance.
  • Added functionality to get fragment information from an array. #1900
  • Prevented unnecessary sorting when (1) there is a single fragment and (i) either the query layout is global order, or (ii) the number of dimensions is 1, and (2) when there is a single range for which the result coordinates have already been sorted. #1880
  • Added extra stats for consolidation. #1880
  • Disabled checking if cells are written in global order when consolidating, as it was redundant (the cells are already being read in global order during consolidation). #1880
  • Optimize consolidated fragment metadata loading #1975

Bug fixes

  • Fix tiledb_dimension_alloc returning a non-null pointer after error #1959
  • Fixed issue with string dimensions and non-set subarray (which implies spanning the whole domain). There was an assertion being triggered. Now it works properly.
  • Fixed bug when checking the dimension domain for infinity or NaN values. #1880
  • Fixed bug with string dimension partitioning. #1880

API additions

C API

  • Added functions for getting fragment information. #1900
  • Added APIs for getting and setting ranges of queries using a dimension name. #1920

C++ API

  • Added class FragmentInfo and functions for getting fragment information. #1900
  • Added function Dimension::create that allows not setting a space tile extent. #1880
  • Added APIs for getting and setting ranges of queries using a dimension name. #1920
  • Changed std::pair<uint64_t, uint64_t> Query::est_result_size_var to std::array<uint64_t, 2> Query::est_result_size_var. Additionally, the size estimate for the offsets has been changed from elements to bytes. #1946
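
Code that previously treated the offsets estimate as an element count now has to convert from bytes. A minimal sketch of that conversion follows; it assumes the first array element is the offsets size and the second is the data size, and that offsets are 64-bit. `offsets_count` is a hypothetical helper, not part of the API.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: `est` mimics the two-element result described above,
// assumed here to be {offsets size in bytes, data size in bytes}. With
// 64-bit offsets, the number of offsets is the byte size divided by
// sizeof(uint64_t).
uint64_t offsets_count(const std::array<uint64_t, 2>& est) {
  return est[0] / sizeof(uint64_t);
}
```

An estimate of 800 bytes of offsets would correspond to 100 offsets under these assumptions.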