Published by Jeadie about 2 months ago
The v0.17.3-beta release further improves data accelerator robustness and adds a new GitHub Data Connector that makes accelerating GitHub Issues, Pull Requests, Commits, and Blobs easy.
Improved benchmarking, testing, and robustness of data accelerators: Continued benchmarking and testing improvements make data accelerators more robust and reliable.
GitHub Connector (alpha): Connect to GitHub and accelerate Issues, Pull Requests, Commits, and Blobs.
```yaml
datasets:
  # Fetch all rust and golang files from spiceai/spiceai
  - from: github:github.com/spiceai/spiceai/files/trunk
    name: spiceai.files
    params:
      include: '**/*.rs; **/*.go'
      github_token: ${secrets:GITHUB_TOKEN}

  # Fetch all issues from spiceai/spiceai. Similar for pull requests, commits, and more.
  - from: github:github.com/spiceai/spiceai/issues
    name: spiceai.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
```
None.
- `delta_kernel`: from 0.2.0 to 0.3.0
- `files` support (basic fields) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2393
- `--force` flag to `spice install` to force it to install the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2395
- `spice chat` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2396
- `include` param support to GitHub Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2397
- `content` column to GitHub Connector when dataset is accelerated by @sgrebnov in https://github.com/spiceai/spiceai/pull/2400
- `crates/llms/src/chat/` by @Jeadie in https://github.com/spiceai/spiceai/pull/2439
- `spice chat` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2442
- `labels` and `hashes` to primitive arrays by @sgrebnov in https://github.com/spiceai/spiceai/pull/2452
- `datafusion` version to the latest by @sgrebnov in https://github.com/spiceai/spiceai/pull/2456
- `/` for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2458
- `accelerated_refresh` to `task_history` table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2459
- `assignees` and `labels` fields to GitHub issues and GitHub pulls datasets by @ewgenius in https://github.com/spiceai/spiceai/pull/2467
- `updatedAt` field to GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2474
- `updated_at` by @lukekim in https://github.com/spiceai/spiceai/pull/2479
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.2-beta...v0.17.3-beta
Published by phillipleblanc about 2 months ago
This is the release candidate 0.17.2-beta.1
Published by phillipleblanc about 2 months ago
The v0.17.2-beta release focuses on improving data accelerator compatibility, stability, and performance. Expanded data type support for DuckDB, SQLite, and PostgreSQL data accelerators (and data connectors) enables significantly more data types to be accelerated. Error handling and logging have also been improved, and several bugs have been fixed.
Expanded Data Type Support for Data Accelerators: DuckDB, SQLite, and PostgreSQL Data Accelerators now support a wider range of data types, enabling acceleration of more diverse datasets.
Enhanced Error Handling and Logging: Improvements have been made to aid in troubleshooting and debugging.
Anonymous Usage Telemetry: Optional, anonymous, aggregated telemetry has been added to help improve Spice. This feature can be disabled. For details about collected data, see the telemetry documentation.
To opt out of telemetry, use the CLI flag:

```shell
spice run -- --telemetry-enabled false
```

Or add configuration to `spicepod.yaml`:

```yaml
runtime:
  telemetry:
    enabled: false
```
Improved Benchmarking: A suite of performance benchmarking tests has been added to the project, helping to maintain and improve runtime performance, a top priority for the project.
None.
- `v0.17.2-beta` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2203
- `retrieved_primary_keys` in `v1/search` by @Jeadie in https://github.com/spiceai/spiceai/pull/2176
- `runtime.task_history` table for queries and embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2191
- `metrics-rs` with OpenTelemetry Metrics by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2240
- `description` field from `spicepod.yaml` and include in LLM context by @ewgenius in https://github.com/spiceai/spiceai/pull/2261
- `connection_pool_size` in the Postgres Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2251
- `DocumentSimilarityTool` by @Jeadie in https://github.com/spiceai/spiceai/pull/2263
- `runtime.metrics` table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2296
- `secrets.inject_secrets` when secret not found by @Jeadie in https://github.com/spiceai/spiceai/pull/2306
- `DataAccelerator::init()` for SQLite acceleration federation by @peasee in https://github.com/spiceai/spiceai/pull/2293
- `disable_query_push_down` option to acceleration settings by @y-f-u in https://github.com/spiceai/spiceai/pull/2327
- `v1/assist` by @Jeadie in https://github.com/spiceai/spiceai/pull/2312
- `v1/search`: include WHERE condition, allow extra columns in projection by @Jeadie in https://github.com/spiceai/spiceai/pull/2328
- `task_history` nested spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2337
- `bytes_processed` telemetry metric by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2343
- `runtime.metrics`/Prometheus as well by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2352
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.1-beta...v0.17.2-beta
Published by phillipleblanc 3 months ago
The v0.17.1-beta minor release focuses on enhancing stability, performance, and usability. The Flight interface now supports the `GetSchema` API, and the `s3`, `ftp`, `sftp`, `http`, `https`, and `databricks` data connectors have added support for a `client_timeout` parameter.
Flight API GetSchema: The `GetSchema` API is now supported by the Flight interface. The schema of a dataset can be retrieved using `GetSchema` with the `PATH` or `CMD` FlightDescriptor types. The `CMD` FlightDescriptor type is used to get the schema of an arbitrary SQL query passed as the CMD bytes. The `PATH` FlightDescriptor type is used to retrieve the schema of a dataset.
Client Timeout: A `client_timeout` parameter has been added for the `ftp`, `sftp`, `http`, `https`, and `databricks` Data Connectors. When defined, Spice stops waiting for a response from the data source after the specified duration. The default timeout is 30 seconds.
```yaml
datasets:
  - from: ftp://remote-ftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      # Example client timeout
      client_timeout: 30s
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_password}
```
TLS is now required to be explicitly enabled. Enable TLS on the command line using `--tls-enabled true`:

```shell
spice run -- --tls-enabled true --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem
```
Or in the `spicepod.yml` with `enabled: true`:

```yaml
runtime:
  tls:
    # TLS explicitly enabled
    enabled: true
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem
```
- `v1/models` by @Jeadie in https://github.com/spiceai/spiceai/pull/2152
- `EmbeddingConnector` by @Jeadie in https://github.com/spiceai/spiceai/pull/2165
- `CREATE TABLE...` and infer on first write by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2167
- `GetSchema` API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2169
- `flightsubscriber`/`flightpublisher` tools by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2194
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.0-beta...v0.17.1-beta
Published by phillipleblanc 3 months ago
Announcing the first beta release of Spice.ai OSS! 🎉
The core Spice runtime has graduated from alpha to beta! Components, such as Data Connectors and Models, follow independent release milestones. Data Connectors graduating from alpha to beta include `databricks`, `spiceai`, `postgres`, `s3`, `odbc`, and `mysql`. From beta to 1.0, the project will focus on improving performance and scaling to larger datasets.
This release also includes enhanced security with Transport Layer Security (TLS) secured APIs, a new `spice install` CLI command, and several performance and stability improvements.
Enable TLS using the `--tls-certificate-file` and `--tls-key-file` command-line flags:

```shell
spice run -- --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem
```
Or configure in the `spicepod.yml`:

```yaml
runtime:
  tls:
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem
```
Get started with TLS by following the TLS Sample. For more details see the TLS Documentation.
`spice install`: Running the `spice install` CLI command will download and install the latest version of the runtime.

```shell
spice install
```
Improved SQLite and DuckDB compatibility: The SQLite and DuckDB accelerators support more complex queries and additional data types.
Pass through arguments from `spice run` to runtime: Arguments passed to `spice run` are now passed through to the runtime.
Secrets replacement within connection strings: Secrets are now replaced within connection strings:

```yaml
datasets:
  - from: mysql:my_table
    name: my_table
    params:
      mysql_connection_string: mysql://user:${secrets:mysql_pw}@localhost:3306/db
```
The `odbc` data connector is now optional and has been removed from the released binaries. To use the `odbc` data connector, use the official Spice Docker image or build the Spice runtime from source.

To build Spice from source with the `odbc` feature:

```shell
cargo build --release --features odbc
```
To use the official Spice Docker image from DockerHub:

```shell
# Pull the latest official Spice image
docker pull spiceai/spiceai:latest

# Pull the official v0.17-beta Spice image
docker pull spiceai/spiceai:0.17.0-beta
```
- `unixodbc` for E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/2063
- `json_pointer` param optional for the GraphQL connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2072
- `spice install` CLI command by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2090
- `delta_kernel` to 0.2.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2102
- `spice run` and `spice sql` to runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2123
- `spice sql` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2125
- `--tls` flag by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2128
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.16.0-alpha...v0.17-beta
Published by digadeesh 3 months ago
The v0.16-alpha release is the first candidate release for the beta milestone on a path to finalizing the v1.0 developer and user experience. Upgraders should be aware of several breaking changes designed to improve the Secrets configuration experience and to make authoring `spicepod.yml` files more consistent. See the Breaking Changes section below for details. Additionally, the Spice Java SDK was released, providing Java developers a simple but powerful native experience to query Spice.
`secrets` configuration in `spicepod.yaml`:

```yaml
secrets:
  - from: env
    name: env
  - from: aws_secrets_manager:my_secret_name
    name: aws_secret
```
Secrets managed by configured Secret Stores can be referenced in component `params` using the syntax `${<store_name>:<key>}`. E.g.
```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${ env:MY_PG_PASS }
```
Java Client SDK: The Spice Java SDK has been released for JDK 17 or greater.
Federated SQL Query: Significant stability and reliability improvements have been made to federated SQL query support in most data connectors.
ODBC Data Connector: Providing a specific SQL dialect to query ODBC data sources is now supported using the `sql_dialect` param. For example, when querying Databricks using ODBC, the `databricks` dialect can be specified to ensure compatibility. Read the ODBC Data Connector documentation for more details.
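As a hedged sketch of what such a dataset might look like (the table reference and connection-string secret name below are placeholders for illustration, not from the release notes):

```yaml
datasets:
  - from: odbc:my_catalog.my_schema.my_table
    name: my_databricks_table
    params:
      odbc_connection_string: ${secrets:databricks_odbc_connection_string}
      # Use the Databricks SQL dialect when querying over ODBC
      sql_dialect: databricks
```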
`spicepod.yml` schema. File-based secrets stored in the `~/.spice/auth` file are no longer supported. See Secret Stores Documentation for full reference. To upgrade Secret Stores, rename any parameters ending in `_key` to remove the `_key` suffix and specify a secret inline via the secret replacement syntax (`${<secret_store>:<key>}`):
```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass_key: my_pg_pass
```
to:
```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${secrets:my_pg_pass}
```
And ensure the `MY_PG_PASS` environment variable is set.
`time_format` has changed from `unix_seconds` to `timestamp`. To upgrade:
```yaml
datasets:
  - from:
    name: my_dataset
    # Explicitly define format when not specified.
    time_format: unix_seconds
```
`3000` to port `8090` to avoid conflicting with frontend apps, which typically use the 3000 range. If an SDK is used, upgrade it at the same time as the runtime. To upgrade and continue using port 3000, run spiced with the `--http` command-line argument:

```shell
# Using Dockerfile or spiced directly
spiced --http 127.0.0.1:3000
```
`9000` to `9090` to avoid conflicting with other metrics protocols, which typically use port 9000. To upgrade and continue using port 9000, run spiced with the `--metrics` command-line argument:

```shell
# Using Dockerfile or spiced directly
spiced --metrics 127.0.0.1:9000
```
`json_path` has been replaced with `json_pointer` to access nested data from the result of the GraphQL query. See the GraphQL Data Connector documentation for full details, and RFC-6901 - JSON Pointer. To upgrade, change:

```yaml
json_path: my.json.path
```

To:

```yaml
json_pointer: /my/json/pointer
```
`params` parameters. Prefixed parameter names help ensure parameters do not collide. For example, the Databricks data connector specific params are now prefixed with `databricks`:
```yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      token: MY_TOKEN
```
To upgrade:
```yaml
datasets:
  # Example for Spark Connect
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com # Now prefixed with databricks
      databricks_token: ${secrets:my_token} # Now prefixed with databricks
```
Refer to the Data Connector documentation for parameter naming changes in this release.
Clickhouse Data Connector: The `clickhouse_connection_timeout` parameter has been renamed to `connection_timeout` as it applies to the client and is not Clickhouse configuration itself.

To upgrade, change:

```yaml
clickhouse_connection_timeout: time
```

To:

```yaml
connection_timeout: time
```
No major dependency updates.
- `spice chat` command, to interact with deployed spiced instance in Spice.ai Cloud by @ewgenius in https://github.com/spiceai/spiceai/pull/1990
- `/v1/chat/completions` with streaming in `spice chat` CLI command by @ewgenius in https://github.com/spiceai/spiceai/pull/1998
- `spice chat` command, add `--model` flag by @ewgenius in https://github.com/spiceai/spiceai/pull/2007
- `${ <secret>:<key> }` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2026
- `connector` and `runtime` categories by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2028
- `dataset configure` endpoint param by @sgrebnov in https://github.com/spiceai/spiceai/pull/2052
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.2-alpha...v0.16.0-alpha
Published by digadeesh 3 months ago
The v0.15.2-alpha minor release focuses on enhancing stability, performance, and introduces Catalog Providers for streamlined access to Data Catalog tables. Unity Catalog, Databricks Unity Catalog, and the Spice.ai Cloud Platform Catalog are supported in v0.15.2-alpha. The reliability of federated query push-down has also been improved for the MySQL, PostgreSQL, ODBC, S3, Databricks, and Spice.ai Cloud Platform data connectors.
Catalog Providers: Catalog Providers streamline access to Data Catalog tables. Initial catalog providers supported are Databricks Unity Catalog, Unity Catalog, and Spice.ai Cloud Platform Catalog.
For example, to configure Spice to connect to `tpch` tables in the Spice.ai Cloud Platform Catalog, use the new `catalogs:` section in the `spicepod.yml`:

```yaml
catalogs:
  - name: spiceai
    from: spiceai
    include:
      - tpch.*
```
```
sql> show tables
+---------------+--------------+---------------+------------+
| table_catalog | table_schema | table_name    | table_type |
+---------------+--------------+---------------+------------+
| spiceai       | tpch         | region        | BASE TABLE |
| spiceai       | tpch         | part          | BASE TABLE |
| spiceai       | tpch         | customer      | BASE TABLE |
| spiceai       | tpch         | lineitem      | BASE TABLE |
| spiceai       | tpch         | partsupp      | BASE TABLE |
| spiceai       | tpch         | supplier      | BASE TABLE |
| spiceai       | tpch         | nation        | BASE TABLE |
| spiceai       | tpch         | orders        | BASE TABLE |
| spice         | runtime      | query_history | BASE TABLE |
+---------------+--------------+---------------+------------+

Time: 0.001866958 seconds. 9 rows.
```
ODBC Data Connector Push-Down: The ODBC Data Connector now supports query push-down for joins, improving performance for joined datasets configured with the same `odbc_connection_string`.
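As a sketch, two datasets sharing one `odbc_connection_string` (the table references and secret name below are illustrative placeholders) can have joins pushed down to the ODBC source:

```yaml
datasets:
  - from: odbc:sales.orders
    name: orders
    params:
      odbc_connection_string: ${secrets:my_odbc_connection_string}
  - from: odbc:sales.customers
    name: customers
    params:
      odbc_connection_string: ${secrets:my_odbc_connection_string}
```

A join such as `SELECT ... FROM orders JOIN customers ON ...` can then be executed by the ODBC source directly rather than in the runtime.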
Improved Spicepod Validation: Improved `spicepod.yml` validation has been added, including warnings when loading resources with duplicate names (`datasets`, `views`, `models`, `embeddings`).
None.
- `catalog` from Spicepod by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1903
- `Runtime` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1906
- `spice.ai` `CatalogProvider` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1925
- `UnityCatalog` catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1940
- `Databricks` catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1941
- `params` into `dataset_params` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1947
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.1-alpha...v0.15.2-alpha
Published by digadeesh 3 months ago
The v0.15.1-alpha minor release focuses on enhancing stability, performance, and usability. Memory usage has been significantly improved for the `postgres` and `duckdb` acceleration engines, which now use stream processing. A new Delta Lake Data Connector has been added, sharing a delta-kernel-rs based implementation with the Databricks Data Connector and supporting deletion vectors.
Improved memory usage for PostgreSQL and DuckDB acceleration engines: Large dataset acceleration with PostgreSQL and DuckDB engines has reduced memory consumption by streaming data directly to the accelerated table as it is read from the source.
Delta Lake Data Connector: A new Delta Lake Data Connector has been added for using Delta Lake outside of Databricks.
ODBC Data Connector Streaming: The ODBC Data Connector now streams results, reducing memory usage and improving performance.
GraphQL Object Unnesting: The GraphQL Data Connector can automatically unnest objects from GraphQL queries using the `unnest_depth` parameter.
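A minimal sketch of an `unnest_depth` configuration (the endpoint and other values here are illustrative placeholders, not from the release notes):

```yaml
datasets:
  - from: graphql:https://api.example.com/graphql
    name: my_graphql_dataset
    params:
      json_pointer: /data/items
      # Flatten nested objects up to two levels deep
      unnest_depth: 2
```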
None.
None.
The MySQL, PostgreSQL, SQLite and DuckDB DataFusion TableProviders developed by Spice AI have been donated to the datafusion-contrib/datafusion-table-providers community repository.
- `datafusion-table-providers` crate by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1873
- `delta-rs` with `delta-kernel-rs` and add new `delta` data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1878
- `delta` tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1891
- `delta` to `delta_lake` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1892
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.0-alpha...v0.15.1-alpha
Published by digadeesh 4 months ago
The v0.15-alpha release introduces support for streaming database changes with Change Data Capture (CDC) into accelerated tables via a new Debezium connector, configurable retry logic for data refresh, and the release of a new C# SDK to build with Spice in Dotnet.
Debezium data connector with Change Data Capture (CDC): Sync accelerated datasets with Debezium data sources over Kafka in real-time.
Data Refresh Retries: By default, accelerated datasets attempt to retry data refreshes on transient errors. This behavior can be configured using `refresh_retry_enabled` and `refresh_retry_max_attempts`.
C# Client SDK: A new C# Client SDK has been released for developing applications in Dotnet.
Integrating Debezium CDC is straightforward. Get started with the Debezium CDC Sample, read more about CDC in Spice, and read the Debezium data connector documentation.
Example Spicepod using Debezium CDC:
```yaml
datasets:
  - from: debezium:cdc.public.customer_addresses
    name: customer_addresses_cdc
    params:
      debezium_transport: kafka
      debezium_message_format: json
      kafka_bootstrap_servers: localhost:19092
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes
```
Example Spicepod configuration limiting refresh retries to a maximum of 10 attempts:
```yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      refresh_retry_enabled: true
      refresh_retry_max_attempts: 10
      refresh_check_interval: 30s
```
None.
No major dependency updates.
- `feature--` branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1788
- `symlink` -> `symlink_file` by @Jeadie in https://github.com/spiceai/spiceai/pull/1793
- `Unsupported DataType: conversion` for time predicates by @sgrebnov in https://github.com/spiceai/spiceai/pull/1795
- `clippy::module_name_repetitions` lint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1812
- `v1/search` that performs vector search by @Jeadie in https://github.com/spiceai/spiceai/pull/1836
- `embeddings` with `models` by @Jeadie in https://github.com/spiceai/spiceai/pull/1829
- `"cmake-build"` feature to `rdkafka` for Windows by @Jeadie in https://github.com/spiceai/spiceai/pull/1840
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.1-alpha...v0.15.0-alpha
Published by digadeesh 4 months ago
The v0.14.1-alpha release is focused on quality, stability, and type support with improvements in PostgreSQL, DuckDB, and GraphQL data connectors.
None.
No major dependency updates.
- `spiceai/async-openai` to solve `Deserialize` issue in `v1/embed` by @Jeadie in https://github.com/spiceai/spiceai/pull/1707
- `v1/assist` into a `VectorSearch` struct by @Jeadie in https://github.com/spiceai/spiceai/pull/1699
- `spiceai/duckdb-rs`, support LargeUTF8 by @Jeadie in https://github.com/spiceai/spiceai/pull/1746
- `tonic::async_trait` -> `async_trait::async_trait` by @Jeadie in https://github.com/spiceai/spiceai/pull/1757
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.0-alpha...v0.14.1-alpha
Published by github-actions[bot] 4 months ago
The v0.14-alpha release focuses on enhancing accelerated dataset performance and data integrity, with support for configuring primary keys and indexes. Additionally, the GraphQL data connector has been introduced, along with improved dataset registration and loading error information.
Accelerated Datasets: Ensure data integrity using primary key and unique index constraints. Configure conflict handling to either upsert new data or drop it. Create indexes on frequently filtered columns for faster queries on larger datasets.
GraphQL Data Connector: Initial support for using GraphQL as a data source.
Example Spicepod showing how to use primary keys and indexes with accelerated datasets:
```yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      engine: duckdb # Use DuckDB acceleration engine
      primary_key: '(hash, timestamp)'
      indexes:
        number: enabled # same as `CREATE INDEX ON blocks (number);`
        '(number, hash)': unique # same as `CREATE UNIQUE INDEX ON blocks (number, hash);`
      on_conflict:
        '(hash, number)': drop # possible values: drop (default), upsert
        '(hash, timestamp)': upsert
```
Primary Keys, constraints, and indexes are currently supported when using SQLite, DuckDB, and PostgreSQL acceleration engines.
Learn more with the indexing quickstart and the primary key sample.
Read the Local Acceleration documentation.
None.
- `runtime.metrics` table by @ewgenius in https://github.com/spiceai/spiceai/pull/1678
- `runtime.metrics` by @ewgenius in https://github.com/spiceai/spiceai/pull/1681
- `labels` to `properties` and make it nullable by @ewgenius in https://github.com/spiceai/spiceai/pull/1686
- `tpch_q7`, `tpch_q8`, `tpch_q9`, `tpch_q14` by @sgrebnov in https://github.com/spiceai/spiceai/pull/1683
- `v1/assist` by @Jeadie in https://github.com/spiceai/spiceai/pull/1653
- `primary_key` in Spicepod and create in accelerated table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1687
- `ArrayDistance` scalar UDF by @Jeadie in https://github.com/spiceai/spiceai/pull/1697
- `on_conflict` behavior for accelerated tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1688
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.3-alpha...v0.14.0-alpha
Published by phillipleblanc 4 months ago
The v0.13.3-alpha release is focused on quality and stability with improvements to metrics, telemetry, and operability.
Ready API: Adds a `/v1/ready` API that returns success once all datasets and models are loaded and ready.
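A deployment script might poll this endpoint before routing traffic. A minimal sketch in Python using only the standard library (the host and port are assumptions; adjust them to your runtime's HTTP configuration):

```python
# Sketch: poll the /v1/ready endpoint until the runtime reports ready.
# The base URL is an assumption for illustration.
import time
import urllib.error
import urllib.request

def wait_until_ready(base_url="http://localhost:8090", timeout=60.0, interval=1.0):
    """Return True once GET /v1/ready responds 200, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/ready", timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # runtime not up yet; keep polling
        time.sleep(interval)
    return False
```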
Enhanced Grafana dashboard: The dashboard now includes charts for query duration and failures, the last update time of accelerated datasets, the count of refresh errors, and the last successful time the runtime was able to access federated datasets.
- `array_distance` as euclidean distance between Float32[] by @Jeadie in https://github.com/spiceai/spiceai/pull/1601
- `crates/runtime/src/http/v1/` by @Jeadie in https://github.com/spiceai/spiceai/pull/1619
- `/v1/ready` API that returns 200 when all datasets have loaded by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1629
- `v1/assist` response and panic bug; include primary keys in response too by @Jeadie in https://github.com/spiceai/spiceai/pull/1635
- `err_code` to `query_failures` metric by @sgrebnov in https://github.com/spiceai/spiceai/pull/1639
- `ObjectStoreMetadataTable` & `ObjectStoreTextTable` by @Jeadie in https://github.com/spiceai/spiceai/pull/1649
- `v1/assist` by @Jeadie in https://github.com/spiceai/spiceai/pull/1648
- `Time Since Offline` chart to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1664
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.2-alpha...v0.13.3-alpha
Published by ewgenius 5 months ago
The v0.13.2-alpha release is focused on quality and stability with improvements to federated query push-down, telemetry, and query history.
Filesystem Data Connector: Adds the Filesystem Data Connector for directly using files as data sources.
Federated Query Push-Down: Improved stability and schema compatibility for federated queries.
Enhanced Telemetry: Runtime Metrics now include last update time for accelerated datasets, count of refresh errors, and new metrics for query duration and failures.
Query History: Enabled query history logging for Arrow Flight queries in addition to HTTP queries.
- `spice_cloud` - connect to cloud API by @ewgenius in https://github.com/spiceai/spiceai/pull/1523
- `llm` UX in `spicepod.yaml` by @Jeadie in https://github.com/spiceai/spiceai/pull/1545
- `runtime.metrics` schema, if remote (spiceai) data connector provided by @ewgenius in https://github.com/spiceai/spiceai/pull/1554
- `object_store` table provider for UTF8 data formats by @Jeadie in https://github.com/spiceai/spiceai/pull/1562
- `query_duration_seconds` and `query_failures` metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/1575
- `/app` as a default workdir in spiceai Docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1586
- `EmbeddingConnector` by @Jeadie in https://github.com/spiceai/spiceai/pull/1592
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.1-alpha...v0.13.2
Published by y-f-u 5 months ago
The v0.13.1-alpha release of Spice is a minor update focused on stability, quality, and operability. Query result caching provides protection against bursts of queries, and schema support for datasets has been added for logical grouping. An issue where Refresh SQL predicates were not pushed down to underlying data sources has been resolved, along with improved Acceleration Refresh logging.
Results Caching: Introduced query results caching to handle bursts of requests and support caching of non-accelerated results, such as refresh data returned on zero results. Results caching is enabled by default with a `1s` item time-to-live (TTL). Learn more.
Query History Logging: Recent queries are now logged in the new `spice.runtime.query_history` dataset with a default retention of 24 hours. Query history is initially enabled for HTTP queries only (not Arrow Flight queries).
Dataset Schemas: Added support for dataset schemas, allowing logical grouping of datasets by separating the schema name from the table name with a `.`. E.g.
```yaml
datasets:
  - from: mysql:app1.identities
    name: app.users
  - from: postgres:app2.purchases
    name: app.purchases
```
In this example, queries against `app.users` will be federated to `app1.identities`, and `app.purchases` will be federated to `app2.purchases`.
@y-f-u
@Jeadie
@sgrebnov
@ewgenius
@phillipleblanc
@lukekim
@gloomweaver
@Sevenannn
- `file_format` parameter required for S3/FTP/SFTP connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1455
- `file_format` from dataset path by @ewgenius in https://github.com/spiceai/spiceai/pull/1489
- `file_format` to Helm chart sample dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/1493
- `file_format` prompt for s3 and ftp datasets in Dataset Configure CLI if no extension detected by @ewgenius in https://github.com/spiceai/spiceai/pull/1494
- `runtime` schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1524
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.0-alpha...v0.13.1-alpha
Published by Jeadie 5 months ago
The v0.13.0-alpha release significantly improves federated query performance and efficiency with Query Push-Down. Query push-down allows SQL queries to be directly executed by underlying data sources, such as joining tables using the same data connector. Query push-down is supported for all SQL-based and Arrow Flight data connectors. Additionally, runtime metrics, including query duration, are now collected and accessible in the `spice.runtime.metrics` table. This release also includes a new FTP/SFTP data connector and improved CSV support for the S3 data connector.
Federated Query Push-Down (#1394): All SQL and Arrow Flight data connectors support federated query push-down.
Runtime Metrics (#1361): Runtime metric collection can be enabled using the `--metrics` flag and accessed via the `spice.runtime.metrics` table.
FTP & SFTP data connector (#1355) (#1399): Added support for using FTP and SFTP as data sources.
Improved CSV support (#1411) (#1414): S3/FTP/SFTP data connectors support CSV files with expanded CSV options.
- `release` cargo feature to Docker builds by @ewgenius in https://github.com/spiceai/spiceai/pull/1377
- `spice.runtime.metrics` table by @ewgenius in https://github.com/spiceai/spiceai/pull/1361
- `runtime.metrics` table by @ewgenius in https://github.com/spiceai/spiceai/pull/1408
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.2-alpha...v0.13.0-alpha
Published by github-actions[bot] 5 months ago
The v0.12.2-alpha release introduces data streaming and key-pair authentication for the Snowflake data connector, enables general `append` mode data refreshes for time-series data, improves connectivity error messages, adds nested folders support for the S3 data connector, and exposes `nodeSelector` and `affinity` keys in the Helm chart for better Kubernetes management.
Improved Connectivity Error Messages: Error messages provide clearer, actionable guidance for misconfigured settings or unreachable data connectors.
Snowflake Data Connector Improvements: Enables data streaming by default and adds support for key-pair authentication in addition to passwords.
API for Refresh SQL Updates: Update dataset Refresh SQL via API.
Append Data Refresh: Append mode data refreshes for time-series data are now supported for all data connectors. Specify a dataset `time_column` with `refresh_mode: append` to fetch only data more recent than the latest local data.
Docker Image Update: The spiceai/spiceai:latest Docker image now includes the ODBC data connector. For a smaller footprint, use spiceai/spiceai:latest-slim.
Helm Chart Improvements: nodeSelector and affinity keys are now supported in the Helm chart for improved Kubernetes deployment management.
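The append data refresh described above could be configured in a spicepod roughly like this; the source table and column names are illustrative, while time_column and refresh_mode: append come from these notes:

```yaml
datasets:
  - from: postgres:public.events   # illustrative source table
    name: events
    time_column: created_at        # temporal column used to find the latest local data
    acceleration:
      enabled: true
      refresh_mode: append         # fetch only rows newer than what is already local
```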
POST /v1/datasets/:name/refresh has been renamed to POST /v1/datasets/:name/acceleration/refresh to be consistent with the Spicepod.yaml structure.
release feature in docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1324
DataConnectorResult and DataConnectorError by @ewgenius in https://github.com/spiceai/spiceai/pull/1339
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.1-alpha...v0.12.2-alpha
Published by phillipleblanc 6 months ago
The v0.12.1-alpha release introduces a new Snowflake data connector, support for UUID and TimestampTZ types in the PostgreSQL connector, and improved error messages across all data connectors. The Clickhouse data connector enables data streaming by default. The public SQL interface now restricts DML and DDL queries. Additionally, accelerated tables now fully support NULL values, and issues with schema conversion in these tables have been resolved.
Snowflake Data Connector: Initial support for Snowflake as a data source.
Clickhouse Data Streaming: Enables data streaming by default, eliminating in-memory result collection.
Read-only SQL Interface: Disables DML (INSERT/UPDATE/DELETE) and DDL (CREATE/ALTER TABLE) queries for improved data source security.
Error Message Improvements: Improved the error messages for commonly encountered issues with data connectors.
Accelerated Tables: Supports NULL values across all data types and fixes schema conversion errors for consistent type handling.
GITHUB_TOKEN environment variable in the installation script, if available, to avoid rate limiting in CI workflows by @ewgenius in https://github.com/spiceai/spiceai/pull/1302
spice login spark by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1303
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.0-alpha...v0.12.1-alpha
Published by ewgenius 6 months ago
The v0.12-alpha release introduces Clickhouse and Apache Spark data connectors, adds support for limiting refresh data periods for temporal datasets, and includes upgraded Spice Client SDKs compatible with Spice OSS.
Clickhouse data connector: Use Clickhouse as a data source with the clickhouse: scheme.
Apache Spark Connect data connector: Use Apache Spark Connect connections as a data source using the spark: scheme.
Refresh data window: Limit accelerated dataset refreshes to a specified window, configured as a duration from the present, for faster and more efficient refreshes.
ODBC data connector: Use ODBC connections as a data source using the odbc: scheme. The ODBC data connector is currently optional and not included in default builds. It can be conditionally compiled using the odbc cargo feature when building from source.
Spice Client SDK Support: The official Spice SDKs have been upgraded with support for Spice OSS.
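A rough sketch of limiting refreshes to a recent window for a Clickhouse-backed dataset; the refresh_data_window setting name and duration syntax are assumptions, while the clickhouse: scheme comes from these notes:

```yaml
datasets:
  - from: clickhouse:default.trades   # clickhouse: scheme introduced in this release
    name: trades
    time_column: ts                   # illustrative temporal column
    acceleration:
      enabled: true
      refresh_data_window: 24h        # assumed setting name: refresh only the last 24 hours
```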
The refresh_interval acceleration setting has been changed to refresh_check_interval to make it clearer that it is the check interval rather than the data interval.
SELECT count(*) for Sqlite Data Accelerator by @sgrebnov in https://github.com/spiceai/spiceai/pull/1166
show tables in Spice SQL & update next version to v0.12.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1206
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.1-alpha...v0.12.0-alpha
Published by y-f-u 6 months ago
The v0.11.1-alpha release introduces retention policies for accelerated datasets, native Windows installation support, and integration of catalog and schema settings for the Databricks Spark connector. Several bugs have also been fixed for improved stability.
Retention Policies for Accelerated Datasets: Automatic eviction of data from accelerated time-series datasets when a specified temporal column exceeds the retention period, optimizing resource utilization.
Windows Installation Support: Native Windows installation support, including upgrades.
Databricks Spark Connect Catalog and Schema Settings: Improved translation between DataFusion and Spark, providing better Spark Catalog support.
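A retention policy as described above might look like the following in a spicepod; the retention_* parameter names and duration syntax are illustrative assumptions, not confirmed by these notes:

```yaml
datasets:
  - from: spark:samples.nyctaxi.trips   # illustrative Databricks Spark source
    name: trips
    time_column: pickup_datetime        # temporal column checked against the retention period
    acceleration:
      enabled: true
      retention_period: 7d              # assumed parameter names; evict rows older than 7 days
      retention_check_interval: 1h
      retention_check_enabled: true
```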
refresh_sql and manual refresh to e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1125
spice dataset configure by @ewgenius in https://github.com/spiceai/spiceai/pull/1140
spice upgrade on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1155
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.0-alpha...v0.11.1-alpha