spiceai

A unified SQL query interface and portable runtime to locally materialize, accelerate, and query datasets from any database, data warehouse, or data lake.

APACHE-2.0 License

Stars
1.5K
Committers
24

spiceai - v0.17.4-beta Latest Release

Published by ewgenius about 1 month ago

spiceai - v0.17.3-beta

Published by Jeadie about 2 months ago

Spice v0.17.3-beta (Sep 2, 2024)

The v0.17.3-beta release further improves data accelerator robustness and adds a new github data connector that makes accelerating GitHub Issues, Pull Requests, Commits, and Blobs easy.

Highlights in v0.17.3-beta

Improved benchmarking, testing, and robustness of data accelerators: Continued improvements to benchmarking and testing have made data accelerators more robust and reliable.

GitHub Connector (alpha): Connect to GitHub and accelerate Issues, Pull Requests, Commits, and Blobs.

datasets:
  # Fetch all rust and golang files from spiceai/spiceai
  - from: github:github.com/spiceai/spiceai/files/trunk
    name: spiceai.files
    params:
      include: '**/*.rs; **/*.go'
      github_token: ${secrets:GITHUB_TOKEN}

  # Fetch all issues from spiceai/spiceai. Similar for pull requests, commits, and more.
  - from: github:github.com/spiceai/spiceai/issues
    name: spiceai.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}

Breaking Changes

None.

Contributors

  • @phillipleblanc
  • @Jeadie
  • @peasee
  • @sgrebnov
  • @Sevenannn
  • @lukekim
  • @dependabot
  • @ewgenius

What's Changed

Dependencies

  • delta_kernel from 0.2.0 to 0.3.0.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.2-beta...v0.17.3-beta

spiceai - v0.17.2-beta.1

Published by phillipleblanc about 2 months ago

This is the release candidate 0.17.2-beta.1

spiceai - v0.17.2-beta

Published by phillipleblanc about 2 months ago

Spice v0.17.2-beta (Aug 26, 2024)

The v0.17.2-beta release focuses on improving data accelerator compatibility, stability, and performance. Expanded data type support for DuckDB, SQLite, and PostgreSQL data accelerators (and data connectors) enables significantly more data types to be accelerated. Error handling and logging have also been improved, and several bugs have been fixed.

Highlights in v0.17.2-beta

Expanded Data Type Support for Data Accelerators: DuckDB, SQLite, and PostgreSQL Data Accelerators now support a wider range of data types, enabling acceleration of more diverse datasets.

Enhanced Error Handling and Logging: Improvements have been made to aid in troubleshooting and debugging.

Anonymous Usage Telemetry: Optional, anonymous, aggregated telemetry has been added to help improve Spice. This feature can be disabled. For details about collected data, see the telemetry documentation.

To opt out of telemetry:

  1. Use the CLI flag:

    spice run -- --telemetry-enabled false
    
  2. Add configuration to spicepod.yaml:

    runtime:
      telemetry:
        enabled: false
    

Improved Benchmarking: A suite of performance benchmarking tests has been added to the project to help maintain and improve runtime performance, a top priority for the project.

Breaking Changes

None.

Contributors

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @Sevenannn
  • @peasee
  • @ewgenius

What's Changed

Dependencies

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.1-beta...v0.17.2-beta

spiceai - v0.17.1-beta

Published by phillipleblanc 3 months ago

Spice v0.17.1-beta (Aug 5, 2024)

The v0.17.1-beta minor release focuses on enhancing stability, performance, and usability. The Flight interface now supports the GetSchema API and s3, ftp, sftp, http, https, and databricks data connectors have added support for a client_timeout parameter.

Highlights in v0.17.1-beta

Flight API GetSchema: The GetSchema API is now supported by the Flight interface. The schema of a dataset can be retrieved using GetSchema with the PATH or CMD FlightDescriptor types. The CMD FlightDescriptor type is used to get the schema of an arbitrary SQL query as the CMD bytes. The PATH FlightDescriptor type is used to retrieve the schema of a dataset.

Client Timeout: A client_timeout parameter has been added for Data Connectors: ftp, sftp, http, https, and databricks. When defined, the client timeout configures Spice to stop waiting for a response from the data source after the specified duration. The default timeout is 30 seconds.

datasets:
  - from: ftp://remote-ftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      # Example client timeout
      client_timeout: 30s
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_password}

Breaking Changes

TLS is now required to be explicitly enabled. Enable TLS on the command line using --tls-enabled true:

spice run -- --tls-enabled true --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or in the spicepod.yml with enabled: true:

runtime:
  tls:
    # TLS explicitly enabled
    enabled: true
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem

Contributors

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @peasee
  • @Sevenannn

What's Changed

Dependencies

  • Rust: Upgraded from v1.79.0 to v1.80.0

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.0-beta...v0.17.1-beta

spiceai - v0.17.0-beta

Published by phillipleblanc 3 months ago

Spice v0.17-beta (July 29, 2024)

Announcing the first beta release of Spice.ai OSS! 🎉

The core Spice runtime has graduated from alpha to beta! Components, such as Data Connectors and Models, follow independent release milestones. Data Connectors graduating from alpha to beta include databricks, spiceai, postgres, s3, odbc, and mysql. From beta to 1.0, the project will focus on improving performance and scaling to larger datasets.

This release also includes enhanced security with Transport Layer Security (TLS) secured APIs, a new spice install CLI command, and several performance and stability improvements.

Highlights in v0.17-beta

  • Encryption in transit with TLS: The HTTP, gRPC, Metrics, and OpenTelemetry (OTEL) API endpoints can be secured with TLS by specifying a certificate and private key in PEM format.

Enable TLS using the --tls-certificate-file and --tls-key-file command-line flags:

spice run -- --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or configure in the spicepod.yml:

runtime:
  tls:
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem

Get started with TLS by following the TLS Sample. For more details see the TLS Documentation.

  • spice install: Running the spice install CLI command will download and install the latest version of the runtime.

spice install

  • Improved SQLite and DuckDB compatibility: The SQLite and DuckDB accelerators support more complex queries and additional data types.

  • Pass through arguments from spice run to runtime: Arguments passed to spice run are now passed through to the runtime.

  • Secrets replacement within connection strings: Secrets are now replaced within connection strings:

datasets:
  - from: mysql:my_table
    name: my_table
    params:
      mysql_connection_string: mysql://user:${secrets:mysql_pw}@localhost:3306/db

Breaking Changes

The odbc data connector is now optional and has been removed from the released binaries. To use the odbc data connector, use the official Spice Docker image or build the Spice runtime from source.

To build Spice from source with the odbc feature:

cargo build --release --features odbc

To use the official Spice Docker image from DockerHub:

# Pull the latest official Spice image
docker pull spiceai/spiceai:latest

# Pull the official v0.17-beta Spice image
docker pull spiceai/spiceai:0.17.0-beta

Contributors

  • @y-f-u
  • @peasee
  • @digadeesh
  • @phillipleblanc
  • @ewgenius
  • @sgrebnov
  • @Sevenannn
  • @lukekim

What's Changed

Dependencies

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.16.0-alpha...v0.17-beta

spiceai - v0.16.0-alpha

Published by digadeesh 3 months ago

Spice v0.16-alpha (July 22, 2024)

The v0.16-alpha release is the first candidate release for the beta milestone on a path to finalizing the v1.0 developer and user experience. Upgraders should be aware of several breaking changes designed to improve the Secrets configuration experience and to make authoring spicepod.yml files more consistent. See the Breaking Changes section below for details. Additionally, the Spice Java SDK was released, providing Java developers a simple but powerful native experience to query Spice.

Highlights in v0.16-alpha

secrets:
  - from: env
    name: env
  - from: aws_secrets_manager:my_secret_name
    name: aws_secret

Secrets managed by configured Secret Stores can be referenced in component params using the syntax ${<store_name>:<key>}. E.g.

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${ env:MY_PG_PASS }

  • Java Client SDK: The Spice Java SDK has been released for JDK 17 or greater.

  • Federated SQL Query: Significant stability and reliability improvements have been made to federated SQL query support in most data connectors.

  • ODBC Data Connector: Providing a specific SQL dialect to query ODBC data sources is now supported using the sql_dialect param. For example, when querying Databricks using ODBC, the databricks dialect can be specified to ensure compatibility. Read the ODBC Data Connector documentation for more details.
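
To make the ${<store_name>:<key>} replacement syntax from the secrets example above concrete, here is a minimal Python sketch of how such placeholders could be substituted. It is an illustration only, not Spice's implementation: the resolve_secrets function, the in-memory stores dictionary, and the sample secret value are all hypothetical. The pattern tolerates spaces just inside the braces, as written in the ${ env:MY_PG_PASS } example.

```python
import re

# Matches ${store:key}, allowing optional spaces inside the braces.
_SECRET_REF = re.compile(
    r"\$\{\s*(?P<store>[A-Za-z0-9_]+)\s*:\s*(?P<key>[A-Za-z0-9_]+)\s*\}"
)


def resolve_secrets(value, stores):
    """Replace every ${store:key} reference in value.

    stores maps a store name to a dict of key -> secret; it is a
    hypothetical stand-in for real secret stores.
    """
    def _lookup(match):
        store, key = match.group("store"), match.group("key")
        try:
            return stores[store][key]
        except KeyError:
            raise KeyError(f"secret {key!r} not found in store {store!r}") from None

    return _SECRET_REF.sub(_lookup, value)


# Hypothetical store contents, for illustration only.
stores = {"env": {"MY_PG_PASS": "hunter2"}}
print(resolve_secrets("${ env:MY_PG_PASS }", stores))
print(resolve_secrets("mysql://user:${env:MY_PG_PASS}@localhost:3306/db", stores))
```

The second call shows the same substitution applied inside a connection string, which is the case the v0.17-beta notes describe.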

Breaking Changes

  • Secret Stores: Secret Stores support has been overhauled including required changes to spicepod.yml schema. File based secrets stored in the ~/.spice/auth file are no longer supported. See Secret Stores Documentation for full reference.

To upgrade Secret Stores, rename any parameters ending in _key to remove the _key suffix and specify a secret inline via the secret replacement syntax (${<secret_store>:<key>}):

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass_key: my_pg_pass

to:

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${secrets:my_pg_pass}

And ensure the MY_PG_PASS environment variable is set.

  • Datasets: The default value of time_format has changed from unix_seconds to timestamp.

To upgrade:

datasets:
  - from:
    name: my_dataset
    # Explicitly define format when not specified.
    time_format: unix_seconds

  • HTTP Port: The default HTTP port has changed from port 3000 to port 8090 to avoid conflicting with frontend apps which typically use the 3000 range. If an SDK is used, upgrade it at the same time as the runtime.

To upgrade and continue using port 3000, run spiced with the --http command line argument:

# Using Dockerfile or spiced directly
spiced --http 127.0.0.1:3000

  • HTTP Metrics Port: The default HTTP Metrics port has changed from port 9000 to 9090 to avoid conflicting with other metrics protocols which typically use port 9000.

To upgrade and continue using port 9000, run spiced with the --metrics command line argument:

# Using Dockerfile or spiced directly
spiced --metrics 127.0.0.1:9000

To upgrade, change:

json_path: my.json.path

To:

json_pointer: /my/json/pointer

  • Data Connector Configuration: Consistent connector name prefixing has been applied to connector-specific parameters. Prefixed parameter names help ensure parameters do not collide.

For example, the Databricks data connector specific params are now prefixed with databricks:

datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      token: MY_TOKEN

To upgrade:

datasets:
  # Example for Spark Connect
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com # Now prefixed with databricks
      databricks_token: ${secrets:my_token} # Now prefixed with databricks

Refer to the Data Connector documentation for parameter naming changes in this release.

  • Clickhouse Data Connector: The clickhouse_connection_timeout parameter has been renamed to connection_timeout as it applies to the client and is not Clickhouse configuration itself.

To upgrade, change:

clickhouse_connection_timeout: time

To:

connection_timeout: time

Contributors

  • @y-f-u
  • @phillipleblanc
  • @ewgenius
  • @github-actions
  • @sgrebnov
  • @lukekim
  • @digadeesh
  • @peasee
  • @Sevenannn

What's Changed

Dependencies

No major dependency updates.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.2-alpha...v0.16.0-alpha

spiceai - v0.15.2-alpha

Published by digadeesh 3 months ago

Spice v0.15.2-alpha (July 15, 2024)

The v0.15.2-alpha minor release focuses on enhancing stability and performance, and introduces Catalog Providers for streamlined access to Data Catalog tables. Unity Catalog, Databricks Unity Catalog, and the Spice.ai Cloud Platform Catalog are supported in v0.15.2-alpha. The reliability of federated query push-down has also been improved for the MySQL, PostgreSQL, ODBC, S3, Databricks, and Spice.ai Cloud Platform data connectors.

Highlights in v0.15.2-alpha

Catalog Providers: Catalog Providers streamline access to Data Catalog tables. Initial catalog providers supported are Databricks Unity Catalog, Unity Catalog and Spice.ai Cloud Platform Catalog.

For example, to configure Spice to connect to tpch tables in the Spice.ai Cloud Platform Catalog use the new catalogs: section in the spicepod.yml:

catalogs:
  - name: spiceai
    from: spiceai
    include:
      - tpch.*

sql> show tables
+---------------+--------------+---------------+------------+
| table_catalog | table_schema | table_name    | table_type |
+---------------+--------------+---------------+------------+
| spiceai       | tpch         | region        | BASE TABLE |
| spiceai       | tpch         | part          | BASE TABLE |
| spiceai       | tpch         | customer      | BASE TABLE |
| spiceai       | tpch         | lineitem      | BASE TABLE |
| spiceai       | tpch         | partsupp      | BASE TABLE |
| spiceai       | tpch         | supplier      | BASE TABLE |
| spiceai       | tpch         | nation        | BASE TABLE |
| spiceai       | tpch         | orders        | BASE TABLE |
| spice         | runtime      | query_history | BASE TABLE |
+---------------+--------------+---------------+------------+

Time: 0.001866958 seconds. 9 rows.

ODBC Data Connector Push-Down: The ODBC Data Connector now supports query push-down for joins, improving performance for joined datasets configured with the same odbc_connection_string.

Improved Spicepod Validation: Improved spicepod.yml validation has been added, including warnings when loading resources with duplicate names (datasets, views, models, embeddings).

Breaking Changes

None.

Contributors

  • @phillipleblanc
  • @peasee
  • @y-f-u
  • @ewgenius
  • @Sevenannn
  • @sgrebnov
  • @lukekim

What's Changed

Dependencies

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.1-alpha...v0.15.2-alpha

spiceai - v0.15.1-alpha

Published by digadeesh 3 months ago

Spice v0.15.1-alpha (July 8, 2024)

The v0.15.1-alpha minor release focuses on enhancing stability, performance, and usability. Memory usage has been significantly improved for the postgres and duckdb acceleration engines which now use stream processing. A new Delta Lake Data Connector has been added, sharing a delta-kernel-rs based implementation with the Databricks Data Connector supporting deletion vectors.

Highlights

Improved memory usage for PostgreSQL and DuckDB acceleration engines: Large dataset acceleration with PostgreSQL and DuckDB engines has reduced memory consumption by streaming data directly to the accelerated table as it is read from the source.

Delta Lake Data Connector: A new Delta Lake Data Connector has been added for using Delta Lake outside of Databricks.

ODBC Data Connector Streaming: The ODBC Data Connector now streams results, reducing memory usage, and improving performance.

GraphQL Object Unnesting: The GraphQL Data Connector can automatically unnest objects from GraphQL queries using the unnest_depth parameter.

Breaking Changes

None.

New Contributors

None.

Contributors

What's Changed

Dependencies

The MySQL, PostgreSQL, SQLite and DuckDB DataFusion TableProviders developed by Spice AI have been donated to the datafusion-contrib/datafusion-table-providers community repository.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.0-alpha...v0.15.1-alpha

spiceai - v0.15.0-alpha

Published by digadeesh 4 months ago

Spice v0.15-alpha (July 1, 2024)

The v0.15-alpha release introduces support for streaming database changes with Change Data Capture (CDC) into accelerated tables via a new Debezium connector, configurable retry logic for data refresh, and the release of a new C# SDK to build with Spice in Dotnet.

Highlights

  • Debezium data connector with Change Data Capture (CDC): Sync accelerated datasets with Debezium data sources over Kafka in real-time.

  • Data Refresh Retries: By default, accelerated datasets attempt to retry data refreshes on transient errors. This behavior can be configured using refresh_retry_enabled and refresh_retry_max_attempts.

  • C# Client SDK: A new C# Client SDK has been released for developing applications in Dotnet.

Debezium data connector with Change Data Capture (CDC)

Integrating Debezium CDC is straightforward. Get started with the Debezium CDC Sample, read more about CDC in Spice, and read the Debezium data connector documentation.

Example Spicepod using Debezium CDC:

datasets:
  - from: debezium:cdc.public.customer_addresses
    name: customer_addresses_cdc
    params:
      debezium_transport: kafka
      debezium_message_format: json
      kafka_bootstrap_servers: localhost:19092
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes

Data Refresh Retries

Example Spicepod configuration limiting refresh retries to a maximum of 10 attempts:

datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      refresh_retry_enabled: true
      refresh_retry_max_attempts: 10
      refresh_check_interval: 30s
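
As a rough illustration of what a bounded retry such as refresh_retry_max_attempts implies, the following Python sketch retries a refresh on transient errors up to a maximum number of attempts. It is not Spice's retry implementation: the TransientError class, the refresh_with_retries function, and the exponential backoff are assumptions made for the example.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for an error worth retrying (e.g. a dropped connection)."""


def refresh_with_retries(refresh, max_attempts=10, base_delay=0.5, sleep=time.sleep):
    """Call refresh() until it succeeds or max_attempts is reached.

    Returns the number of attempts used. The exponential backoff is an
    illustrative choice, not a description of Spice's retry policy.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            refresh()
            return attempt
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: a refresh that fails twice before succeeding.
failures = iter([True, True, False])


def flaky_refresh():
    if next(failures):
        raise TransientError("source temporarily unavailable")


print(refresh_with_retries(flaky_refresh, sleep=lambda _: None))  # prints 3
```

Only errors classified as transient are retried here; any other exception propagates immediately, which matches the notes' description of retrying on transient errors.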

Breaking Changes

None.

New Contributors

Contributors

What's Changed

Dependencies

No major dependency updates.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.1-alpha...v0.15.0-alpha

spiceai - v0.14.1-alpha

Published by digadeesh 4 months ago

Spice v0.14.1-alpha (Jun 24, 2024)

The v0.14.1-alpha release is focused on quality, stability, and type support with improvements in PostgreSQL, DuckDB, and GraphQL data connectors.

Highlights

  • PostgreSQL acceleration and data connector: Support for Composite Types and UUID data types.
  • DuckDB acceleration and data connector: Support for LargeUTF8 and DuckDB functions.
  • GraphQL data connector: Improved error handling on invalid query syntax.
  • Refresh SQL: Improved stability when overwriting STRUCT data types.

Breaking Changes

None.

New Contributors

Contributors

  • @lukekim
  • @y-f-u
  • @ewgenius
  • @phillipleblanc
  • @Jeadie
  • @sgrebnov
  • @gloomweaver
  • @phungleson
  • @peasee
  • @digadeesh

What's Changed

Dependencies

No major dependency updates.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.0-alpha...v0.14.1-alpha

spiceai - v0.13.3-alpha

Published by phillipleblanc 4 months ago

Spice v0.13.3-alpha (June 10, 2024)

The v0.13.3-alpha release is focused on quality and stability with improvements to metrics, telemetry, and operability.

Highlights

Ready API: Adds the /v1/ready API, which returns success once all datasets and models are loaded and ready.

Enhanced Grafana dashboard: The dashboard now includes charts for query duration and failures, the last update time of accelerated datasets, the count of refresh errors, and the last successful time the runtime was able to access federated datasets.
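
A readiness endpoint like /v1/ready is typically polled before sending traffic. The Python sketch below shows one way to do that with the standard library. The helper names, the retry counts, and the example address are assumptions; only the /v1/ready path comes from the notes above, and treating any 2xx response as "ready" is an interpretation of "returns success".

```python
import time
import urllib.request


def is_ready(base_url, timeout=2.0):
    """Return True if GET {base_url}/v1/ready answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/ready", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False


def wait_until_ready(check, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll check() until it returns True; give up after `attempts` tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example (the address is an assumption; use the runtime's HTTP endpoint):
# wait_until_ready(lambda: is_ready("http://localhost:3000"))
```

Separating the check from the polling loop keeps the loop testable without a running runtime.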

Contributors

  • @Jeadie
  • @ewgenius
  • @phillipleblanc
  • @sgrebnov
  • @gloomweaver
  • @y-f-u
  • @mach-kernel

What's Changed

Dependencies

  • DuckDB 1.0.0: Upgrades embedded DuckDB to 1.0.0.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.2-alpha...v0.13.3-alpha

spiceai - v0.13.2-alpha

Published by ewgenius 5 months ago

Spice v0.13.2-alpha (June 3, 2024)

The v0.13.2-alpha release is focused on quality and stability with improvements to federated query push-down, telemetry, and query history.

Highlights

  • Filesystem Data Connector: Adds the Filesystem Data Connector for directly using files as data sources.

  • Federated Query Push-Down: Improved stability and schema compatibility for federated queries.

  • Enhanced Telemetry: Runtime Metrics now include last update time for accelerated datasets, count of refresh errors, and new metrics for query duration and failures.

  • Query History: Enabled query history logging for Arrow Flight queries in addition to HTTP queries.
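
Logged queries can be inspected with ordinary SQL against the spice.runtime.query_history dataset (the dataset name is given in the v0.13.1-alpha notes). The Python sketch below builds such a request over HTTP without sending it. The /v1/sql path, the plain-text request body, and the address are assumptions not stated in these notes; check the HTTP API documentation for the runtime version in use.

```python
import urllib.request


def build_sql_request(base_url, sql):
    """Build (without sending) a POST carrying a SQL statement as the body."""
    return urllib.request.Request(
        f"{base_url}/v1/sql",  # assumed endpoint path
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_sql_request(
    "http://localhost:3000",  # assumed address; adjust to the runtime's HTTP endpoint
    "SELECT * FROM spice.runtime.query_history LIMIT 10",
)
print(request.get_method(), request.full_url)

# To send it against a running runtime:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read().decode("utf-8"))
```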

Contributors

  • @lukekim
  • @y-f-u
  • @ewgenius
  • @phillipleblanc
  • @Jeadie
  • @Sevenannn
  • @sgrebnov
  • @gloomweaver
  • @mach-kernel

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.1-alpha...v0.13.2

spiceai - v0.13.1-alpha

Published by y-f-u 5 months ago

Spice v0.13.1-alpha (May 27, 2024)

The v0.13.1-alpha release of Spice is a minor update focused on stability, quality, and operability. Query result caching provides protection against bursts of queries, and schema support for datasets has been added for logical grouping. An issue where Refresh SQL predicates were not pushed down to underlying data sources has been resolved, along with improved Acceleration Refresh logging.

Highlights in v0.13.1-alpha

  • Results Caching: Introduced query results caching to handle bursts of requests and support caching of non-accelerated results, such as refresh data returned on zero results. Results caching is enabled by default with a 1s item time-to-live (TTL).

  • Query History Logging: Recent queries are now logged in the new spice.runtime.query_history dataset with a default retention of 24-hours. Query history is initially enabled for HTTP queries only (not Arrow Flight queries).

  • Dataset Schemas: Added support for dataset schemas, allowing logical grouping of datasets by separating the schema name from the table name with a dot (.). E.g.

    datasets:
      - from: mysql:app1.identities
        name: app.users
    
      - from: postgres:app2.purchases
        name: app.purchases
    

    In this example, queries against app.users will be federated to app1.identities, and app.purchases will be federated to app2.purchases.
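
To illustrate why a short time-to-live protects against bursts, as described in the Results Caching highlight above, here is a small Python sketch of a TTL cache keyed by query text. It is a conceptual model only: the TTLCache class and its methods are hypothetical and say nothing about how Spice keys, stores, or invalidates cached results.

```python
import time


class TTLCache:
    """Tiny query-result cache; each entry expires ttl seconds after it is stored."""

    def __init__(self, ttl=1.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # sql text -> (expires_at, result)

    def get_or_compute(self, sql, run_query):
        now = self.clock()
        hit = self._entries.get(sql)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the query entirely
        result = run_query(sql)
        self._entries[sql] = (now + self.ttl, result)
        return result


# Usage: a burst of identical queries reaches the backend once.
calls = []


def run_query(sql):
    calls.append(sql)
    return [("rows for", sql)]


cache = TTLCache(ttl=1.0)
for _ in range(100):
    cache.get_or_compute("SELECT 1", run_query)
print(len(calls))  # prints 1
```

With a 1s TTL, results are at most one second stale, while repeated identical queries within that second cost a dictionary lookup instead of a query execution.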

Contributors

  • @y-f-u
  • @Jeadie
  • @sgrebnov
  • @ewgenius
  • @phillipleblanc
  • @lukekim
  • @gloomweaver
  • @Sevenannn

New in this release

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.0-alpha...v0.13.1-alpha

spiceai - v0.13.0-alpha

Published by Jeadie 5 months ago

Spice v0.13-alpha (May 20, 2024)

The v0.13.0-alpha release significantly improves federated query performance and efficiency with Query Push-Down. Query push-down allows SQL queries to be directly executed by underlying data sources, such as joining tables using the same data connector. Query push-down is supported for all SQL-based and Arrow Flight data connectors. Additionally, runtime metrics, including query duration, can now be collected and accessed in the spice.runtime.metrics table. This release also includes a new FTP/SFTP data connector and improved CSV support for the S3 data connector.

Highlights

  • Federated Query Push-Down (#1394): All SQL and Arrow Flight data connectors support federated query push-down.

  • Runtime Metrics (#1361): Runtime metric collection can be enabled using the --metrics flag and accessed by the spice.runtime.metrics table.

  • FTP & SFTP data connector (#1355) (#1399): Added support for using FTP and SFTP as data sources.

  • Improved CSV support (#1411) (#1414): S3/FTP/SFTP data connectors support CSV files with expanded CSV options.

Contributors

  • @Jeadie
  • @digadeesh
  • @ewgenius
  • @gloomweaver
  • @lukekim
  • @phillipleblanc
  • @sgrebnov
  • @y-f-u

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.2-alpha...v0.13.0-alpha

spiceai - v0.12.1-alpha

Published by phillipleblanc 6 months ago

Spice v0.12.1-alpha (May 6, 2024)

The v0.12.1-alpha release introduces a new Snowflake data connector, support for UUID and TimestampTZ types in the PostgreSQL connector, and improved error messages across all data connectors. The Clickhouse data connector enables data streaming by default. The public SQL interface now restricts DML and DDL queries. Additionally, accelerated tables now fully support NULL values, and issues with schema conversion in these tables have been resolved.

Highlights

  • Snowflake Data Connector: Initial support for Snowflake as a data source.

  • Clickhouse Data Streaming: Enables data streaming by default, eliminating in-memory result collection.

  • Read-only SQL Interface: Disables DML (INSERT/UPDATE/DELETE) and DDL (CREATE/ALTER TABLE) queries for improved data source security.

  • Error Message Improvements: Improved the error messages for commonly encountered issues with data connectors.

  • Accelerated Tables: Supports NULL values across all data types and fixes schema conversion errors for consistent type handling.
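
The Read-only SQL Interface highlight above amounts to rejecting statements that modify data or schema. The Python sketch below shows the idea with a deliberately naive first-keyword check. It is an assumption-laden illustration: the is_read_only function is hypothetical, the keyword list covers only the statement types named in the notes, and a real implementation would work on parsed SQL rather than on the leading word.

```python
# Statement types named in the release notes: DML (INSERT/UPDATE/DELETE)
# and DDL (CREATE/ALTER TABLE).
BLOCKED_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER"}


def is_read_only(sql):
    """Return False when the statement starts with a blocked DML or DDL keyword.

    Naive by design: it inspects only the first word, so it cannot see
    modifications nested inside other statements.
    """
    stripped = sql.strip()
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    return first_word not in BLOCKED_KEYWORDS


print(is_read_only("SELECT * FROM t"))  # prints True
print(is_read_only("delete from t"))    # prints False
```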

Contributors

  • @ahirner
  • @y-f-u
  • @sgrebnov
  • @ewgenius
  • @Jeadie
  • @gloomweaver
  • @Sevenannn
  • @digadeesh
  • @phillipleblanc

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.0-alpha...v0.12.1-alpha

spiceai - v0.12-alpha

Published by ewgenius 6 months ago

Spice v0.12-alpha (Apr 29, 2024)

The v0.12-alpha release introduces Clickhouse and Apache Spark data connectors, adds support for limiting refresh data periods for temporal datasets, and includes upgraded Spice Client SDKs compatible with Spice OSS.

Highlights

  • Clickhouse data connector: Use Clickhouse as a data source with the clickhouse: scheme.

  • Apache Spark Connect data connector: Use Apache Spark Connect connections as a data source using the spark: scheme.

  • Refresh data window: Limit accelerated dataset data refreshes to a specified window, configured as a duration from now, for faster and more efficient refreshes.

  • ODBC data connector: Use ODBC connections as a data source using the odbc: scheme. The ODBC data connector is currently optional and not included in default builds. It can be conditionally compiled using the odbc cargo feature when building from source.

  • Spice Client SDK Support: The official Spice SDKs have been upgraded with support for Spice OSS.

Breaking Changes

  • Refresh interval: The refresh_interval acceleration setting has been changed to refresh_check_interval to make it clearer it is the check versus the data interval.

Contributors

  • @phillipleblanc
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim
  • @digadeesh
  • @gloomweaver
  • @edmondop
  • @mach-kernel

New Contributors

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.1-alpha...v0.12.0-alpha

spiceai - 0.11.1-alpha

Published by y-f-u 6 months ago

Spice v0.11.1-alpha (Apr 22, 2024)

The v0.11.1-alpha release introduces retention policies for accelerated datasets, native Windows installation support, and integration of catalog and schema settings for the Databricks Spark connector. Several bugs have also been fixed for improved stability.

Highlights

  • Retention Policies for Accelerated Datasets: Automatic eviction of data from accelerated time-series datasets when a specified temporal column exceeds the retention period, optimizing resource utilization.

  • Windows Installation Support: Native Windows installation support, including upgrades.

  • Databricks Spark Connect Catalog and Schema Settings: Improved translation between DataFusion and Spark, providing better Spark Catalog support.
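
The retention policy described in the first highlight above can be pictured as a periodic filter on a time column. The Python sketch below shows that filter on in-memory rows. It is a simplified model, not Spice's implementation: the evict_expired function, the row layout, and the created_at column name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def evict_expired(rows, time_column, retention_period, now=None):
    """Keep only rows whose time_column value falls within the retention period."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention_period
    return [row for row in rows if row[time_column] >= cutoff]


# Usage with a fixed "now" so the result is reproducible.
now = datetime(2024, 4, 22, tzinfo=timezone.utc)
rows = [
    {"id": 1, "created_at": now - timedelta(days=40)},
    {"id": 2, "created_at": now - timedelta(days=5)},
]
kept = evict_expired(rows, "created_at", timedelta(days=30), now=now)
print([row["id"] for row in kept])  # prints [2]
```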

Contributors

  • @phillipleblanc
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim
  • @digadeesh
  • @Sevenannn
  • @gloomweaver

New in this release

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.0-alpha...v0.11.1-alpha

spiceai - Spice.ai v0.11.0-alpha

Published by sgrebnov 6 months ago

The Spice v0.11.0-alpha release significantly improves the Databricks data connector with Databricks Connect (Spark Connect) support, adds the DuckDB data connector, and adds the AWS Secrets Manager secret store. In addition, enhanced control over accelerated dataset refreshes, improved SSL security for MySQL and PostgreSQL connections, and overall stability improvements have been added.

Highlights in v0.11.0-alpha

DuckDB data connector: Use DuckDB databases or connections as a data source.

AWS Secrets Manager Secret Store: Use AWS Secrets Manager as a secret store.

Custom Refresh SQL: Specify a custom SQL query for dataset refresh using refresh_sql.

Dataset Refresh API: Trigger a dataset refresh using the new CLI command spice refresh or via API.

Expanded SSL support for Postgres: SSL mode now supports disable, require, prefer, verify-ca, verify-full options with the default mode changed to require. Added pg_sslrootcert parameter for setting a custom root certificate and the pg_insecure parameter is no longer supported.

Databricks Connect: Choose between using Spark Connect or Delta Lake when using the Databricks data connector for improved performance.

Internal architecture refactor: The internal architecture of spiced was refactored to simplify the creation of data components and to improve alignment with DataFusion concepts.

New Contributors

@edmondop made their first contribution in github.com/spiceai/spiceai/pull/1110!

Contributors

  • @phillipleblanc
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim
  • @digadeesh
  • @Sevenannn
  • @gloomweaver
  • @ahirner

New in this release

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.2-alpha...v0.11.0-alpha

spiceai - Spice.ai v0.10.2-alpha

Published by Jeadie 6 months ago

The v0.10.2-alpha release adds the MySQL data connector and makes external data connections more robust on initialization.

Highlights in v0.10.2-alpha

  • MySQL data connector: Connect to any MySQL server, including SSL support.

  • Data connections verified at initialization: Verify endpoints and authorization for external data connections (e.g. databricks, spice.ai) at initialization.

New Contributors

Contributors

  • @phillipleblanc
  • @y-f-u
  • @ewgenius
  • @sgrebnov
  • @lukekim
  • @digadeesh
  • @Jeadie

New in this release

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.1-alpha...v0.10.2-alpha