kamu-cli

New generation decentralized data lake and a streaming data pipeline

OTHER License

Downloads: 1.3K, Stars: 277, Committers: 8

kamu-cli - Release v0.178.0

Published by github-actions[bot] 6 months ago

kamu-cli - Release v0.177.0

Published by github-actions[bot] 6 months ago

[0.177.0] - 2024-04-25

Added

  • REST data APIs and CLI commands now support different variations of JSON representation, including:
    • Array-of-Structures e.g. [{"c1": 1, "c2": 2}, {"c1": 3, "c2": 4}] (default)
    • Structure-of-Arrays e.g. {"c1": [1, 3], "c2": [2, 4]}
    • Array-of-Arrays e.g. [[1, 2], [3, 4]]
  • REST data APIs now also return schema of the result (pass ?schema=false to switch it off)
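
The three layouts carry the same rows and differ only in nesting. As a minimal sketch (plain Python, not kamu's implementation; the helper names are hypothetical), the default Array-of-Structures form can be converted into the other two like this:

```python
import json


def to_structure_of_arrays(records):
    """Turn [{"c1": 1, ...}, ...] into {"c1": [1, ...], ...}."""
    # Column order is taken from the first record (assumes uniform records).
    columns = list(records[0].keys()) if records else []
    return {name: [row[name] for row in records] for name in columns}


def to_array_of_arrays(records):
    """Turn [{"c1": 1, "c2": 2}, ...] into [[1, 2], ...], dropping column names."""
    columns = list(records[0].keys()) if records else []
    return [[row[name] for name in columns] for row in records]


# Array-of-Structures input, copied from the release note above.
records = [{"c1": 1, "c2": 2}, {"c1": 3, "c2": 4}]

soa = to_structure_of_arrays(records)
aoa = to_array_of_arrays(records)

print(json.dumps(soa))  # {"c1": [1, 3], "c2": [2, 4]}
print(json.dumps(aoa))  # [[1, 2], [3, 4]]
```

Array-of-Arrays drops the column names, which is presumably where a schema returned alongside the result becomes useful.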

Changed

  • Upgraded to datafusion v37.1.0
  • Split the Flow system crates

Fixed

  • kamu system diagnose command no longer crashes when executed outside the workspace directory

kamu-cli - Release v0.176.3

Published by github-actions[bot] 6 months ago

[0.176.3] - 2024-04-18

Changed

  • Updated KAMU_WEB_UI image to latest release 0.18.1

kamu-cli - Release v0.176.2

Published by github-actions[bot] 6 months ago

[0.176.2] - 2024-04-16

Fixed

  • Fixed the instant run of the ExecuteTransform flow after the HardCompacting flow finishes successfully
  • Extended the FlowFailed error type to indicate when ExecuteTransform failed due to HardCompacting of the root dataset

kamu-cli - Release v0.176.1

Published by github-actions[bot] 6 months ago

[0.176.1] - 2024-04-16

Fixed

  • Split the result type for the different flow configuration mutations (setFlowConfigResult, setFlowBatchingConfigResult, setFlowCompactingConfigResult)

kamu-cli - Release v0.176.0

Published by github-actions[bot] 6 months ago

[0.176.0] - 2024-04-15

  • New engine based on the RisingWave streaming database that provides a mature streaming alternative to Flink. See:
    • Updated supported engines documentation
    • New top-n dataset example highlighting retractions
    • Updated examples/covid dataset where RisingWave replaced Flink in tumbling window aggregation

kamu-cli - Release v0.175.0

Published by github-actions[bot] 6 months ago

[0.175.0] - 2024-04-15

Added

  • The kamu ingest command can now accept an --event-time hint, which is useful for snapshot-style data that doesn't have an event time column
  • The /ingest REST API endpoint also supports event time hints via the odf-event-time header
  • New --system-time root parameter allows overriding time for all CLI commands
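
To illustrate the header-based hint, here is a minimal sketch that builds (but does not send) an ingest request using only the Python standard library. Only the /ingest path and the odf-event-time header name come from the notes above. The base URL, the dataset path layout, the payload, and the RFC 3339 timestamp format are assumptions made for illustration.

```python
from datetime import datetime, timezone
import urllib.request


def build_ingest_request(base_url, dataset, payload, event_time):
    """Build a POST request carrying an event time hint in a header."""
    # Hypothetical URL layout; check the actual REST API docs for the real path.
    url = f"{base_url.rstrip('/')}/{dataset}/ingest"
    req = urllib.request.Request(url, data=payload, method="POST")
    req.add_header("Content-Type", "application/json")
    # Assumption: the hint is passed as an RFC 3339 timestamp.
    req.add_header("odf-event-time", event_time.isoformat())
    return req


req = build_ingest_request(
    "http://localhost:8080",  # hypothetical server address
    "my-dataset",             # hypothetical dataset name
    b'[{"city": "A", "population": 100}]',
    datetime(2024, 4, 15, tzinfo=timezone.utc),
)

print(req.full_url)
# urllib stores header names capitalized; HTTP header names are case-insensitive.
print(req.get_header("Odf-event-time"))
```

Sending the request would be a matter of passing req to urllib.request.urlopen against a running server.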

Fixed

  • CLI now shows errors even when not running under a TTY
  • Removed paused from setConfigCompacting mutation
  • Extended GraphQL FlowDescriptionDatasetHardCompacting empty result with a resulting message
  • GraphQL Dataset Endpoints object: fixed the query endpoint

kamu-cli - Release v0.174.1

Published by github-actions[bot] 6 months ago

[0.174.1] - 2024-04-12

Fixed

  • Set correct ODF push/pull websocket protocol

kamu-cli - Release v0.174.0

Published by github-actions[bot] 6 months ago

[0.174.0] - 2024-04-12

Added

  • Added HardCompacting to the flow system

kamu-cli - Release v0.173.0

Published by github-actions[bot] 6 months ago

[0.173.0] - 2024-04-09

Added

  • OData API now supports querying by collection ID/key (e.g. account/covid.cases(123))
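
As a rough sketch of what such a key lookup URL looks like, the helper below assembles one from its parts. The account/covid.cases(123) shape is taken from the example above. The base URL and the helper name are hypothetical, and only integer keys are handled.

```python
from urllib.parse import quote


def odata_entity_url(base_url, account, dataset, key):
    """Build a URL addressing a single entity of a collection by its key."""
    # Percent-encode the path segments; letters, digits and "_.-~" stay as-is.
    collection = f"{quote(account)}/{quote(dataset)}"
    # OData addresses a single entity by appending the key in parentheses.
    return f"{base_url.rstrip('/')}/{collection}({int(key)})"


# Hypothetical base URL; the collection and key mirror the release note.
url = odata_entity_url("http://localhost:8080/odata", "account", "covid.cases", 123)
print(url)  # http://localhost:8080/odata/account/covid.cases(123)
```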

Fixed

  • Handle broken pipe panic when a process piping data into kamu exits with an error

kamu-cli - Release v0.172.2

Published by github-actions[bot] 6 months ago

kamu-cli - Release v0.172.1

Published by github-actions[bot] 6 months ago

[0.172.1] - 2024-04-08

Fixed

  • Add precondition flow checks
  • Fix URLs for Get Data panel

kamu-cli - Release v0.172.0

Published by github-actions[bot] 6 months ago

[0.172.0] - 2024-04-08

Added

  • Added persistence infrastructure prototype based on the sqlx engine:
    • supports Postgres, MySQL/MariaDB, SQLite database targets
    • sketched simplistic Accounts domain (not integrated yet)
    • converted Task System domain to use persistent repositories
    • added test infrastructure for database-specific features
    • automated and documented development flow procedures in database offline/online modes

kamu-cli - Release v0.171.0

Published by github-actions[bot] 7 months ago

[0.171.0] - 2024-04-05

Added

  • Support ArrowJson schema output format in GraphQL API and CLI commands
  • New kamu system compact <dataset> command that compacts data slices for the given dataset

Changed

  • Case-insensitive comparisons of datasets, accounts and repos

kamu-cli - Release v0.170.0

Published by github-actions[bot] 7 months ago

[0.170.0] - 2024-03-29

Added

  • Added GraphQL Dataset Endpoints object

Changed

  • REST API: /ingest endpoint will return HTTP 400 error when data cannot be read correctly
  • Improved API token generation command

kamu-cli - Release v0.169.0

Published by github-actions[bot] 7 months ago

[0.169.0] - 2024-03-25

Changed

  • Updated embedded Web UI to v0.17.0

Fixed

  • S3 Repo: Ignore dataset entries without a valid alias and leave them to be cleaned up by GC
  • Caching object repo: Ensure directory exists before writing objects

kamu-cli - Release v0.168.0

Published by github-actions[bot] 7 months ago

[0.168.0] - 2024-03-23

Changed

  • FlightSQL: For expensive queries, GetFlightInfo now only prepares schemas and does not compute results. This avoids doing double the work just to return total_records and total_bytes in FlightInfo before the result is fetched via DoGet
  • Optimized implementation of DataFusion catalog, schema, and table providers that includes caching and maximally delays metadata scanning

kamu-cli - Release v0.167.2

Published by github-actions[bot] 7 months ago

[0.167.2] - 2024-03-23

Fixed

  • Improved Python connectivity examples (ADBC, Sqlalchemy, DBAPI2, JDBC)
  • Fixed invalid location info returned by the FlightSQL protocol in FlightInfo that might have been causing errors in some client libraries and slowing down others

kamu-cli - Release v0.167.1

Published by github-actions[bot] 7 months ago

[0.167.1] - 2024-03-20

Fixed

  • Fixed a bug where the handle created during dataset creation had an empty account name in a multi-tenant repo

kamu-cli - Release v0.167.0

Published by github-actions[bot] 7 months ago

[0.167.0] - 2024-03-19

Added

  • Implementation of ObjectRepository that can cache small objects on local file system (e.g. to avoid too many calls to S3 repo)
  • Optional S3RegistryCache component that can cache the list of datasets under an S3 repo to avoid very expensive bucket prefix listing calls
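
The small-object caching idea can be sketched in a few lines. This is an illustrative Python model, not kamu's Rust ObjectRepository implementation: the class names, the size threshold, and the in-memory upstream are all hypothetical, and object keys are assumed to be safe file names such as content hashes.

```python
import os
import tempfile


class InMemoryUpstream:
    """Hypothetical stand-in for a slow remote repository such as S3."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.calls = 0  # counts how often the "remote" is actually hit

    def get(self, key):
        self.calls += 1
        return self.objects[key]


class CachingObjectRepo:
    """Serves small objects from a local directory, falling back to upstream."""

    def __init__(self, upstream, cache_dir, max_cached_size=1024):
        self.upstream = upstream
        self.cache_dir = cache_dir
        self.max_cached_size = max_cached_size  # bytes; arbitrary threshold

    def get(self, key):
        path = os.path.join(self.cache_dir, key)
        if os.path.isfile(path):
            with open(path, "rb") as f:
                return f.read()  # cache hit, no upstream call
        data = self.upstream.get(key)
        if len(data) <= self.max_cached_size:
            # Make sure the cache directory exists before writing the object.
            os.makedirs(self.cache_dir, exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
        return data


upstream = InMemoryUpstream({"small": b"abc", "large": b"x" * 4096})
repo = CachingObjectRepo(upstream, os.path.join(tempfile.mkdtemp(), "cache"))

repo.get("small")
repo.get("small")  # served from the local cache
repo.get("large")
repo.get("large")  # too big to cache, fetched again
print(upstream.calls)  # 3
```

Only the small object is written to disk, so repeated reads of it cost one upstream call in total while large objects always go to the remote.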