unitycatalog

Open, Multi-modal Catalog for Data & AI

APACHE-2.0 License


unitycatalog - v0.2.0

Published by tdas 21 days ago

We are excited to announce Unity Catalog 0.2, which adds many new features.

Artifacts in this release:

For more information on the UC roadmap, see "Proposed UC Roadmap CY2024Q4" (#411).

Identity, Authentication, and Authorization

  • Support for user identities with OAuth/OIDC user authentication: You can now configure the UC server to authenticate against an identity provider such as Google, Okta, or Keycloak, and limit access to authenticated identities only.
  • Support of authorization/access control: In addition to authentication, you can now configure the UC server to control access to UC assets.
    • APIs: Added REST APIs to manage authorization of all assets (tables, models, etc.)
    • CLI: Added CLI commands to manage authorizations for identities in the user database. 
  • Refer to the documentation for more details on how to configure all these features.
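
As a rough illustration of the authorization REST APIs mentioned above, the sketch below builds (but does not send) a request that grants a privilege to a user. The base URL, the `/permissions/{securable_type}/{full_name}` path, the payload shape, and the helper name `build_grant_request` are assumptions for illustration; check them against the UC OpenAPI specification before relying on them.

```python
import json
import urllib.request

# Assumed address of a locally running UC server and its REST API prefix.
BASE_URL = "http://localhost:8080/api/2.1/unity-catalog"


def build_grant_request(securable_type, full_name, principal, privileges, token):
    """Build a PATCH request that adds privileges for one principal.

    The endpoint path and body layout are assumptions, not confirmed
    by the release notes.
    """
    body = {
        "changes": [
            {"principal": principal, "add": list(privileges), "remove": []}
        ]
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/permissions/{securable_type}/{full_name}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_grant_request(
        "table", "unity.default.numbers", "user@example.com", ["SELECT"], "<token>"
    )
    # Inspect the request; pass it to urllib.request.urlopen to actually send it.
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before pointing the code at a real server.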

ML Models

  • Support for registering ML models and model versions: You can now store ML models in UC and use them with MLflow 2.16.1.
    • APIs: Added APIs for registering models and model versions in local and all cloud locations
    • CLI: Added CLI commands to register models and model versions
    • MLflow integration: MLflow 2.16.1+ can now use UC OSS as its default model registry.
      • Refer to the documentation for more details on how to configure MLflow to use models in UC.
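
A minimal sketch of pointing MLflow at a UC server as its model registry follows. It assumes the registry URI takes the form `uc:<server URL>` and that the server runs at `http://127.0.0.1:8080`; the helper names are hypothetical, and `mlflow` is imported lazily so the URI helpers can be used without it installed.

```python
def uc_registry_uri(server_url: str = "http://127.0.0.1:8080") -> str:
    """Return a registry URI of the assumed form 'uc:<server url>'."""
    return f"uc:{server_url.rstrip('/')}"


def model_full_name(catalog: str, schema: str, model: str) -> str:
    """Compose the three-level name under which a model is registered."""
    return f"{catalog}.{schema}.{model}"


def configure_mlflow(server_url: str = "http://127.0.0.1:8080") -> None:
    """Tell MLflow (2.16.1 or later) to use UC as its model registry."""
    import mlflow  # imported here so the helpers above work without MLflow

    mlflow.set_registry_uri(uc_registry_uri(server_url))


if __name__ == "__main__":
    print(uc_registry_uri())
    print(model_full_name("unity", "default", "iris"))
```

After `configure_mlflow()` has run, a model would be registered under a three-level name such as the one returned by `model_full_name`.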

Credential Vending

  • Expanded support for credential vending APIs
    • Added support for vending S3, Azure and GCS credentials
    • Added support for vending credentials for models and paths
    • Credentials are now temporary and scoped down to give clients access to only the asset directory
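
To make the idea of temporary, scoped credentials concrete, the sketch below builds a request for table credentials and checks whether a returned credential has expired. The `/temporary-table-credentials` path, the `table_id` and `operation` request fields, and the `expiration_time` response field (taken here to be epoch milliseconds) are assumptions to verify against the UC OpenAPI specification; the helper names are hypothetical.

```python
import json
import time
import urllib.request

# Assumed address of a locally running UC server and its REST API prefix.
BASE_URL = "http://localhost:8080/api/2.1/unity-catalog"


def build_table_credentials_request(table_id, operation="READ", token=""):
    """Build a POST request asking UC for temporary credentials for one table."""
    if operation not in ("READ", "READ_WRITE"):
        raise ValueError(f"unsupported operation: {operation}")
    body = {"table_id": table_id, "operation": operation}
    return urllib.request.Request(
        url=f"{BASE_URL}/temporary-table-credentials",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def is_expired(credentials, now_ms=None):
    """Return True if the credential's assumed expiration_time has passed."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return credentials["expiration_time"] <= now_ms
```

Because vended credentials are short-lived, a client would typically check `is_expired` before reusing a cached credential and request a fresh one when it returns `True`.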

Spark and Delta Integration

  • Support for operating on tables in UC using Spark 3.5.3 and Delta 3.2.1: You can now use Spark SQL and DataFrame APIs to operate on tables. Specifically, you can now do the following:
    • List external and managed tables.
    • Create, read, and write external tables in Spark-native formats such as Parquet and CSV with Spark 3.5.3.
    • Create, read, and write external Delta tables with Delta 3.2.1.
    • Read managed Delta tables (creating and writing coming soon!).
    • Configure Spark to access tables from multiple catalogs from multiple UC deployments.
  • Support listing, creating, updating, deleting databases/schemas using Spark SQL.
  • Refer to the documentation for more details on how to configure Spark and Delta.
  • Note: With this integration, you no longer have to configure your entire Spark application with one set of credentials that allows access to all your tables. Instead, the Spark integration automatically acquires per-table credentials from UC when running your Spark jobs, provided the user has the required permissions.
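
The multi-catalog setup described above can be sketched as a helper that produces Spark configuration entries for several UC deployments. The catalog implementation class `io.unitycatalog.spark.UCSingleCatalog`, the `.uri` and `.token` keys, and the package coordinates are assumptions based on the versions named in this release; confirm them in the Spark integration documentation. The helper names are hypothetical.

```python
def uc_spark_conf(catalogs, default_catalog=None):
    """Build Spark settings for one or more UC catalogs.

    `catalogs` maps a Spark catalog name to a (server_uri, token) pair.
    Class names and package coordinates below are assumptions.
    """
    conf = {
        "spark.jars.packages": (
            "io.delta:delta-spark_2.12:3.2.1,"
            "io.unitycatalog:unitycatalog-spark_2.12:0.2.0"
        ),
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
    }
    for name, (uri, token) in catalogs.items():
        prefix = f"spark.sql.catalog.{name}"
        conf[prefix] = "io.unitycatalog.spark.UCSingleCatalog"
        conf[f"{prefix}.uri"] = uri
        conf[f"{prefix}.token"] = token
    if default_catalog is not None:
        if default_catalog not in catalogs:
            raise ValueError(f"unknown default catalog: {default_catalog}")
        conf["spark.sql.defaultCatalog"] = default_catalog
    return conf


def build_session(conf):
    """Create a SparkSession from the settings (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily

    builder = SparkSession.builder.appName("uc-example")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    settings = uc_spark_conf(
        {
            "unity": ("http://localhost:8080", ""),
            "prod": ("http://uc-prod.example.com:8080", "<token>"),
        },
        default_catalog="unity",
    )
    for key in sorted(settings):
        print(key, "=", settings[key])
```

Note that no storage credentials appear in these settings: only the UC server address and an access token are configured per catalog, and table-level credentials are obtained from UC at run time.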

Backend Database

User Interface

  • Support for a new UI for exploring UC: You can now do the following from the UI:
    • Log in via Google authentication
    • Browse all assets (catalogs, schemas, tables, functions, volumes, models)
    • View asset metadata, such as descriptions, creation timestamps, and function definitions
    • Update descriptions for catalogs, schemas, and volumes
    • Create catalogs and schemas
    • Delete assets (catalogs, schemas, tables, functions, volumes)
  • Refer to the documentation for more details on the Unity Catalog UI.

Credits

This release has been made possible by 63 contributors with 31 new contributors as highlighted below. Thank you to everyone for their support and for making this release possible.

Full Changelog:

https://github.com/unitycatalog/unitycatalog/compare/v0.1.0...v0.2.0

unitycatalog - Unity Catalog 0.1 Latest Release

Published by tdas 3 months ago

We are excited to announce Unity Catalog 0.1, the first release of the Unity Catalog (UC) open source project! This release gives a preview of the following exciting new features.

Artifacts in this release: 

UC Server

This is the metastore server that exposes REST APIs (defined by an OpenAPI specification) for cataloging data and AI assets of all kinds. In version 0.1, the server's REST APIs support the following:

  • Catalogs and Schemas 

    • UC supports three-level namespaces: catalog, schema, and asset (tables, functions, etc.)
    • List, get, create, update, and delete catalogs and schemas
  • Tables for storing tabular, structured data

    • External tables - List, get, and create tables.
    • Managed tables - List and get managed tables. Other operations (create) are coming soon.
    • Support for configuring and vending S3 credentials
  • Volumes for storing non-tabular datasets (unstructured data)

    • External volumes - List, get, and create external volumes.
    • Managed volumes - List and get managed volumes. Other operations (create) are coming soon.
    • Support for configuring and vending S3 credentials
  • Functions - List, get, and create functions for Python functions (AI/ML workloads) and SQL functions

See the documentation for more details.
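
As a small illustration of the three-level namespace and the REST API listed above, the sketch below splits a full asset name into its parts and composes a URL for listing the tables in a schema. The base URL, the `/tables` path, and the `catalog_name` and `schema_name` query parameters are assumptions to check against the OpenAPI specification; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of a locally running UC server and its REST API prefix.
BASE_URL = "http://localhost:8080/api/2.1/unity-catalog"


def split_full_name(full_name):
    """Split 'catalog.schema.asset' into a (catalog, schema, asset) tuple."""
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.schema.asset, got: {full_name!r}")
    return tuple(parts)


def list_tables_url(catalog, schema, base_url=BASE_URL):
    """Compose the URL for listing the tables of one schema."""
    query = urlencode({"catalog_name": catalog, "schema_name": schema})
    return f"{base_url}/tables?{query}"


if __name__ == "__main__":
    catalog, schema, table = split_full_name("unity.default.numbers")
    print(list_tables_url(catalog, schema))
```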

UC Java SDK for building connectors

This Java SDK, generated from the OpenAPI specification, works with any UC server that is compliant with the Unity REST API.

UC Command Line Interface (CLI)

Though not part of this release, this is an example UC connector that demonstrates how to use the UC SDK to operate on various data assets. Specifically, it has the following functionality:

  • Catalogs and Schemas - List, get, create, update, and delete catalogs and schemas
  • Tables - Supports both metadata and data operations
    • Metadata: List, get, create, update, and delete Delta tables 
    • Data: Read from and write to Delta tables using Delta Kernel
  • Volumes - Supports both metadata and data operations
    • Metadata: List, get, create, update, and delete volumes
    • Data: List directories and show file content
  • Functions - Support for Python and SQL functions
    • Python: List, get, create, delete and run Python functions
    • SQL: List, get, and delete SQL functions

See the documentation for more details.

Credits

This release has been made possible by 33 contributors. Thank you to everyone for their support and for making this release possible.

New Contributors

Detailed change log

Full Changelog: https://github.com/unitycatalog/unitycatalog/commits/v0.1.0