OpenMetadata

Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.

APACHE-2.0 License

OpenMetadata - OpenMetadata 0.11.3-Release

Published by Vj-L about 2 years ago

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.11.2-release...0.11.3-release

OpenMetadata - OpenMetadata 0.11.2-release

Published by akash-jain-10 over 2 years ago

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.11.1-release...0.11.2-release

OpenMetadata - OpenMetadata 0.11.1-Release

Published by Vj-L over 2 years ago

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.11.0-release...0.11.1-release

OpenMetadata - OpenMetadata 0.11.0-Release

Published by Rabi-Sahoo over 2 years ago

Data Collaboration - Tasks and Emojis

Data Collaboration has been the prime focus of the 0.11 Release, the groundwork for which has been laid in the past several releases. In the 0.9 release, we introduced Activity Feeds, Conversation Threads, and the ability to request descriptions. In this release, we’ve added Tasks, as an extension to the ability to create conversations and post replies.
We are particularly excited about the ability to suggest tasks. This takes collaboration a step further: an organization can crowdsource knowledge and continuously improve descriptions.

Column Level Lineage

https://github.com/open-metadata/OpenMetadata/issues/2931
In OpenMetadata, we primarily compute column-level lineage through SQL query analysis. Lineage information is consolidated from various sources, such as ETL pipelines, DBT, query analysis, and so on. In the backend, we’ve added column-level lineage API support. The UI now supports exploring this rich column-level lineage for understanding the relationship between tables and performing impact analysis. While exploring the lineage, users can manually edit both the table and column level lineage to capture any information that is not automatically surfaced.
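The impact analysis described above can be pictured as a graph walk over column-to-column edges. The following is a minimal sketch under stated assumptions: the edge list, the "table.column" naming, and the `downstream_columns` helper are all hypothetical and are not part of the OpenMetadata API.

```python
from collections import defaultdict, deque

# Hypothetical column-level lineage edges: (upstream column, downstream column).
# Columns are named "table.column"; none of these names come from OpenMetadata.
EDGES = [
    ("raw_orders.amount", "stg_orders.amount_usd"),
    ("stg_orders.amount_usd", "fct_revenue.total"),
    ("raw_orders.customer_id", "stg_orders.customer_id"),
]


def downstream_columns(edges, start):
    """Return every column reachable downstream of `start`, breadth-first."""
    graph = defaultdict(list)
    for upstream, downstream in edges:
        graph[upstream].append(downstream)

    seen, queue = [], deque([start])
    while queue:
        for nxt in graph[queue.popleft()]:
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


# Columns that would be affected by a change to raw_orders.amount.
impacted = downstream_columns(EDGES, "raw_orders.amount")
```

Manually edited lineage would simply add more pairs to the same edge list, so automatic and user-supplied lineage can be traversed together.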

Custom Properties

The key goal of the OpenMetadata project is to define Open Metadata Standards to make metadata centralized, easily shareable, and make tool interoperability easier. We take a schema-first approach for strongly typed metadata types and entities modeled using JSON schema.

OpenMetadata now supports adding new types and extending entities when organizations need to capture custom metadata. New types and custom fields can be added to entities either using API or in OpenMetadata UI. This extensibility is based on JSON schema and hence has all the benefits of strong typing, rich constraints, documentation, and automatic validation similar to the core OpenMetadata schemas.
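To illustrate what schema-based validation of custom fields buys you, here is a small sketch. The property names (`dataRetentionDays`, `costCenter`) and the hand-rolled `validate_extension` helper are assumptions made for this example; a real deployment would rely on a full JSON Schema validator rather than this subset.

```python
# Hypothetical custom property definitions for an entity, written in a
# JSON-Schema-like style. The property names are invented for illustration.
CUSTOM_PROPERTIES = {
    "type": "object",
    "properties": {
        "dataRetentionDays": {"type": "integer", "minimum": 1},
        "costCenter": {"type": "string"},
    },
    "additionalProperties": False,
}

# Maps JSON Schema type names to Python types (a small subset only).
PY_TYPES = {"integer": int, "string": str, "boolean": bool}


def validate_extension(extension, schema):
    """Return a list of error strings; an empty list means the value is valid."""
    errors = []
    props = schema["properties"]
    for name, value in extension.items():
        if name not in props:
            if not schema.get("additionalProperties", True):
                errors.append(f"{name}: unknown property")
            continue
        spec = props[name]
        expected = PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            errors.append(f"{name}: expected {spec['type']}")
        elif "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}: below minimum {spec['minimum']}")
    return errors
```

Calling `validate_extension({"dataRetentionDays": 0}, CUSTOM_PROPERTIES)` reports the constraint violation instead of silently storing the value, which is the benefit of strong typing that the paragraph above refers to.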

Advanced Search

Users can search by multiple parameters to narrow down the search results. Separate advanced search options are available for Tables, Topics, Dashboards, Pipelines, and ML Models. All these entities are searchable by common search options such as Owner, Tag, and Service.
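A multi-parameter search of this kind maps naturally onto an Elasticsearch bool query, with the free text in `must` and the exact-match options in `filter`. The sketch below is illustrative only: the index field names (`owner.name`, `tags.tagFQN`, `service.name`) and the `build_search_query` helper are assumptions, not the fields OpenMetadata actually indexes.

```python
def build_search_query(text, owner=None, tag=None, service=None):
    """Build an Elasticsearch bool query: free text plus exact-match filters."""
    # Field names are assumptions made for this illustration.
    candidates = {
        "owner.name": owner,
        "tags.tagFQN": tag,
        "service.name": service,
    }
    filters = [
        {"term": {field: value}}
        for field, value in candidates.items()
        if value is not None
    ]
    return {
        "query": {
            "bool": {
                "must": [{"query_string": {"query": text}}],
                "filter": filters,
            }
        }
    }


query = build_search_query("orders", owner="data-eng", service="mysql_prod")
```

Filters that are left unset are simply omitted, so the same function covers both a plain text search and a fully narrowed one.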

Glossary UI Updates

The Glossary UI has been upgraded. The existing glossary functionality remains the same, with the ability to add Glossaries, Terms, Tags, Descriptions, Reviewers, and so on. In the UI, the arrangement of the Summary, Related Terms, Synonyms, and References has been changed. The Reviewers are shown on the right panel with an option to add or remove existing reviewers.

Profiler and Data Quality Improvements

Profiling data and communicating quality across the organization is core to OpenMetadata. While numerous tools exist, they are often isolated and require users to navigate multiple interfaces. In OpenMetadata, these tests and data profiles are displayed alongside your assets (tables, views) and allow you to get a 360-degree view of your data.

Great Expectations Integration

While OpenMetadata allows you to set up and run data quality tests directly from the UI, we understand that certain organizations already have their own data quality tool. That’s why we have developed a direct integration between Great Expectations and OpenMetadata. Using our openmetadata-ingestion[great-expectations] Python submodule, you can now add custom actions to your Great Expectations checkpoint file that automatically ingest your data quality test results into OpenMetadata at the end of your checkpoint run.

ML Models

In this release, we are happy to share the addition of ML Model Entities to the UI. This allows users to describe and share models and their features like any other data asset. The UI support also includes ingestion from MLflow through the UI. In future releases, we will add connectors to other popular ML platforms.
This is just the beginning. We want to learn about the use cases from the community and connect with people that want to help us shape the vision and roadmap. Do not hesitate to reach out!

Connectors

In every release, OpenMetadata has maintained its focus on adding new connectors. In the 0.11 release, five new connectors have been added: Airbyte, Mode, AWS Data Lake, Google Cloud Data Lake, and Apache Pinot.

OpenMetadata - OpenMetadata 0.10.4-release

Published by akash-jain-10 over 2 years ago

0.10.4 is a minor release with the following fixes:
#5303
#5468
#5459
#5500
#5543

OpenMetadata - OpenMetadata 0.10.3-release

Published by akash-jain-10 over 2 years ago

0.10.3 is a minor release with the following fixes:
#5070
#5203
#5224
#5275

OpenMetadata - OpenMetadata 0.10.2-release

Published by akash-jain-10 over 2 years ago

0.10.2 is a minor release with the following fixes:
#5148
#5256
#5267
#5272

OpenMetadata - OpenMetadata 0.10.1-release

Published by akash-jain-10 over 2 years ago

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.10.0-release...0.10.1-release

OpenMetadata - OpenMetadata 0.10.0-release

Published by akash-jain-10 over 2 years ago

What's Changed

Support for Database Schema

Previously, OpenMetadata identified tables by service name, database, and table. We’ve added Database Schema as part of the fully qualified name (FQN). For each external data source, we now ingest the database, its schemas, and the tables that are contained underneath the schemas.
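As a rough picture of the resulting four-level name, the sketch below joins service, database, schema, and table with dots. The helper names and the example values are hypothetical, and the sketch deliberately ignores quoting of names that contain dots.

```python
def table_fqn(service, database, schema, table):
    """Join the four levels into one dot-separated fully qualified name."""
    parts = [service, database, schema, table]
    # Simplification: names containing dots would need quoting, which this
    # sketch does not attempt.
    if any(not part or "." in part for part in parts):
        raise ValueError("each part must be non-empty and must not contain '.'")
    return ".".join(parts)


def split_table_fqn(fqn):
    """Split an FQN produced by table_fqn back into its named parts."""
    service, database, schema, table = fqn.split(".")
    return {
        "service": service,
        "database": database,
        "schema": schema,
        "table": table,
    }


# Hypothetical example values.
fqn = table_fqn("mysql_prod", "shop", "public", "orders")
```

With the schema level included, two tables named `orders` in different schemas of the same database no longer collide.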

Support for Hard Delete

OpenMetadata already supported soft deletion. Now, we also support the hard deletion of entities through the UI, APIs, and ingestion. Hard deleting an entity removes the entity and all of its relationships. This also generates a change event.
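The difference between the two deletion modes can be sketched with a toy in-memory store. `EntityStore` and its methods are hypothetical stand-ins, not OpenMetadata classes; only the behavior (flagging versus removing the entity and its relationships, plus a change event on hard delete) follows the description above.

```python
class EntityStore:
    """In-memory stand-in for an entity repository (hypothetical)."""

    def __init__(self):
        self.entities = {}       # entity id -> entity record
        self.relationships = []  # (from_id, to_id) pairs
        self.events = []         # emitted change events

    def add(self, entity_id, name):
        self.entities[entity_id] = {"name": name, "deleted": False}

    def relate(self, from_id, to_id):
        self.relationships.append((from_id, to_id))

    def delete(self, entity_id, hard=False):
        if not hard:
            # Soft delete: keep the record and its relationships, flag it only.
            self.entities[entity_id]["deleted"] = True
            return
        # Hard delete: drop the record and every relationship that touches it.
        del self.entities[entity_id]
        self.relationships = [
            rel for rel in self.relationships if entity_id not in rel
        ]
        self.events.append({"eventType": "EntityDeleted", "entityId": entity_id})


store = EntityStore()
store.add("t1", "orders")
store.add("t2", "customers")
store.relate("t1", "t2")
store.delete("t1")             # soft delete: t1 stays, flagged as deleted
store.delete("t2", hard=True)  # hard delete: t2 and its relationships are gone
```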

Deploy Ingestion from UI

OpenMetadata has refactored the service connections to simplify the ingestion jobs from both the ingestion framework as well as the UI. We now use the pydantic models automatically generated from the JSON schemas for the connection definition. The ‘Add Service’ form is automatically generated in the UI based on the JSON schema specifications for the various connectors that are supported in OpenMetadata.
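Generating a form from a JSON schema amounts to walking the schema's `properties` and choosing a widget per type. The sketch below shows the idea; the `CONNECTION_SCHEMA` contents, the widget names, and the `form_fields` helper are assumptions for illustration and do not reproduce any actual OpenMetadata connector schema or UI code.

```python
# Hypothetical connection schema for a database service.
CONNECTION_SCHEMA = {
    "title": "ExampleConnection",
    "type": "object",
    "properties": {
        "hostPort": {"type": "string", "title": "Host and Port"},
        "username": {"type": "string", "title": "Username"},
        "password": {"type": "string", "title": "Password", "format": "password"},
        "useSsl": {"type": "boolean", "title": "Use SSL"},
    },
    "required": ["hostPort", "username"],
}

# Which input widget to render for each JSON Schema type (assumed names).
WIDGETS = {"string": "text", "boolean": "checkbox", "integer": "number"}


def form_fields(schema):
    """Derive a list of form field descriptors from a JSON schema object."""
    required = set(schema.get("required", []))
    fields = []
    for name, spec in schema["properties"].items():
        if spec.get("format") == "password":
            widget = "password"
        else:
            widget = WIDGETS[spec["type"]]
        fields.append(
            {
                "name": name,
                "label": spec.get("title", name),
                "widget": widget,
                "required": name in required,
            }
        )
    return fields


fields = form_fields(CONNECTION_SCHEMA)
```

Because the form is derived from the schema, adding a property to a connector's schema is enough for it to appear in the form, with no per-connector UI code.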

Download DBT Manifest Files from Amazon S3 or Google Cloud Storage

Previously, when ingesting the models and lineage from DBT, we passed the path of the DBT manifest and catalog files directly into the workflow. We’ve worked on making DBT ingestion more convenient: these files can now be downloaded dynamically from Amazon S3 or Google Cloud Storage. This way, any other process can connect to DBT, extract the catalog, and put it into a cloud storage service. We just need the path name and workflow job details from the metadata extraction to be able to ingest metadata.

JSON Schema based Connection Definition

Each service (database, dashboard, messaging, or pipeline service) has its own configuration specifications, with some unique requirements for some services. Instead of the ad hoc definitions of the source module in Python for each connector, we’ve worked on the full refactoring of the ingestion framework. We now use the pydantic models automatically generated from the JSON schemas for the connection definition.

Airflow Rest APIs

The Airflow REST APIs have been refactored. With our API-centric model, we are creating a custom Airflow REST API directly on top of Airflow using plugins. This passes the connection information to automatically generate all the DAGs and provides handy methods to help us test the connection to the source before creating the service.

UI Changes

  • The UI improvements are directed towards providing a consistent user experience.
  • Hard Deletion of Entities: With the support for the hard deletion of entities, we can permanently delete tables, topics, or services. When the entity is hard deleted, the entity and all its relationships are removed. This generates an ‘EntityDeleted’ change event.
  • Dynamic “Add Service” Forms: The ‘Add Service’ form is automatically generated in the UI based on the JSON schema specifications for the various connectors that are supported in OpenMetadata.
  • UI Support for Database Schema as part of FQN: The database schema has been introduced in the 0.10 release. All the entity pages now support Database Schema in the UI.
  • Lineage Editor: Improvements have been made to the lineage editor.
  • Teams: While signing up in OpenMetadata, the teams with restricted access are hidden and only the joinable teams are displayed.
  • Team Owner: An Owner field has been added to the Team entity. Only team owners can update the teams.
  • Activity Feeds: The Activity Feeds UI supports infinite scrolling.
  • Add User: A user can be added from the Users page.

Security Changes

Support Refresh Tokens for Auth0 and Okta SSO

The JWT tokens generated by the SSO providers expire by default in about an hour, forcing users to log in again frequently. In this release, we’ve added support for refresh tokens for Auth0 and Okta SSO. The tokens are refreshed silently behind the scenes to provide an uninterrupted user experience. In future releases, we’ll continue to stabilize authentication and add refresh tokens for the other SSO providers.
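The decision of when to refresh silently can be sketched as a check on the token's `exp` claim. This is a simplified illustration, not the OpenMetadata client logic: the five-minute leeway, the helper names, and the unsigned demo token are all assumptions, and the code reads the claim without verifying the signature.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Read the `exp` claim from a JWT without verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))["exp"]


def needs_refresh(token, now=None, leeway_seconds=300):
    """True when the token expires within `leeway_seconds` of `now`."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now <= leeway_seconds


def make_unsigned_token(exp):
    """Build a toy token for the demo; it carries no valid signature."""

    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    return f"{encode({'alg': 'none'})}.{encode({'exp': exp})}."


token = make_unsigned_token(exp=1000)
```

A client would call `needs_refresh` before each request and exchange the refresh token for a new access token when it returns true, so the user never sees an expired session.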

Custom OIDC SSO

OpenMetadata now supports integration with your custom-built OIDC SSO for authentication. This is supported both on the front end for user authentication and on the ingestion side.

Azure SSO

Support has been added for Azure SSO on Airflow.

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.8.1-release...0.10.0-release

OpenMetadata - OpenMetadata 0.9.1-release

Published by akash-jain-10 over 2 years ago

Collaboration

  • Conversations in the main feed
  • Users can ask each other questions, add suggestions and replies
  • Table details - Click through on usage to see who or what services are using it, and what queries are pulling from it.

Data Quality

  • Ability to create and monitor the test cases
  • Data Quality Tests support with JSON Schemas and APIs
  • UI Integration to enable users to write tests and run them on Airflow

Glossary

  • Glossaries are a Controlled Vocabulary in an organization used to define the concepts and terminologies specific to a particular domain.
  • API & Schemas to support Glossary
  • UI support to add Glossary and Glossary Terms.
  • Support for using Glossary terms to annotate Entities and Search using Glossary Terms

Connectors

  • Apache Iceberg
  • Azure SQL
  • Clickhouse
  • Clickhouse Usage
  • Databricks
  • Databricks Usage
  • Delta Lake
  • DynamoDB
  • IBM DB2
  • Power BI
  • MSSQL Usage
  • SingleStore
  • Apache Atlas, Import Metadata from Apache Atlas into OpenMetadata
  • Amundsen, Import Metadata from Amundsen into OpenMetadata

Lineage

  • DataSource SQL Parsing support to extract Lineage
  • View Lineage support

Pipeline

  • Capture pipeline status as it happens

Security

  • Security policies through the UI
  • Configuration personas and authorization based on policies
  • AWS SSO support

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.8.1-release...0.9.1-release

OpenMetadata - OpenMetadata 0.9.0-release

Published by akash-jain-10 over 2 years ago

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.8.1-release...0.9.0-release

OpenMetadata - OpenMetadata 0.8.4-release

Published by akash-jain-10 over 2 years ago

This is a bug-fix release for #2940.

OpenMetadata - OpenMetadata 0.8.3-release

Published by akash-jain-10 over 2 years ago

This is a bug-fix release for #2490.

OpenMetadata - OpenMetadata 0.8.2-release

Published by harshach over 2 years ago

OpenMetadata - OpenMetadata 0.8.1-release

Published by harshach over 2 years ago

This is a bug-fix release for #2843.

OpenMetadata - OpenMetadata 0.8.0-release

Published by akash-jain-10 over 2 years ago

Access Control Policy

  • New entities called ‘Role’ and ‘Policy’ have been added.
  • A User has a ‘Role’. A ‘Policy’ can be assigned to a Role.
  • A Policy has a set of ‘Rules’. Rules are used to provide access to functions like updateDescription, updateTags, updateOwner and so on.
  • Can provide access to metadata operations on any entity.
  • A standard set of Roles with their Policies have been added in the new release.
  • ‘Admins’ and ‘Bots’ can perform any metadata operation on any entity.
  • Admins can define policies through the Policy UI, and assign roles to the Users.
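The User, Role, Policy, and Rule chain listed above can be sketched as a simple lookup. The role and policy names below are hypothetical, as are the `isAdmin` flag and the `is_allowed` helper; the operation names and the `isBot` flag are the ones mentioned in these notes.

```python
# Hypothetical policies: each policy is a list of rules.
POLICIES = {
    "DataStewardPolicy": [
        {"operation": "updateDescription", "allow": True},
        {"operation": "updateTags", "allow": True},
    ],
    "DataConsumerPolicy": [
        {"operation": "updateDescription", "allow": True},
    ],
}

# Hypothetical roles, each with one policy assigned.
ROLE_POLICY = {
    "DataSteward": "DataStewardPolicy",
    "DataConsumer": "DataConsumerPolicy",
}


def is_allowed(user, operation):
    """Check whether `user` may perform `operation` on an entity."""
    # Admins and bots can perform any metadata operation.
    if user.get("isAdmin") or user.get("isBot"):
        return True
    policy_name = ROLE_POLICY.get(user.get("role"))
    rules = POLICIES.get(policy_name, [])
    return any(
        rule["operation"] == operation and rule["allow"] for rule in rules
    )


steward = {"name": "alice", "role": "DataSteward"}
consumer = {"name": "bob", "role": "DataConsumer"}
bot = {"name": "ingestion-bot", "isBot": True}
```

In this sketch a user without a role, or with a role that has no matching rule, is denied by default.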

Manual Lineage

  • Enhance the lineage captured from machine metadata with user knowledge.
  • Users can edit the lineage and connect the entities with a no-code editor.
  • Drag and drop UI has been designed to add lineage information manually for the table and column levels.
  • Entities like table, pipeline, and dashboard can be dragged and dropped to the lineage graph to create a node.
  • The required entity can be searched and clicked to insert into the graph.

Event Notification via Webhooks & Slack Integration

  • Subscribe to event notifications via webhooks.
  • Send metadata change events as Slack notifications
  • Provide timely updates to keep the data team informed of changes
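As an illustration of turning a change event into a Slack notification, the sketch below formats an event as the JSON body that a Slack incoming webhook accepts (an object with a `text` field). The event field names (`eventType`, `userName`, `entityType`, `entityName`) and the `slack_payload` helper are assumptions for this example, not the OpenMetadata event schema.

```python
import json

# Human-readable verbs for a few assumed event types.
VERBS = {
    "entityCreated": "created",
    "entityUpdated": "updated",
    "entityDeleted": "deleted",
}


def slack_payload(event):
    """Format a change event as the JSON body for a Slack incoming webhook."""
    verb = VERBS.get(event["eventType"], "changed")
    text = (
        f"{event['userName']} {verb} "
        f"{event['entityType']} `{event['entityName']}`"
    )
    return json.dumps({"text": text})


# Hypothetical event, shaped for this sketch only.
event = {
    "eventType": "entityUpdated",
    "userName": "alice",
    "entityType": "table",
    "entityName": "orders",
}
body = slack_payload(event)
```

The resulting string would be sent as the POST body to the webhook URL configured for the channel.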

Entity Deletion

  • API support has been added for entity deletion, both for soft delete and hard delete.
  • A deleted dataset is marked as deactivated in the OpenMetadata backend instead of hard deleting it.
  • Ingestion support has been added to publish entity deletion.
  • Enabled version support for deleted entities.
  • Version panel has been added for all the entities: Table, Topic, Pipeline, and Dashboard.
  • Previously, we were getting the change descriptions for a limited set of fields for the Topic entity; several other fields have now been included.

New Connectors

  • Supports Delta Lake, an open source project that enables building a Lakehouse architecture on top of data lakes.
  • Refactored the SQL connectors to extract lineage.
  • Connector API was refactored to capture the configs on the OpenMetadata side and to schedule the ingestion via UI.

Other Features

  • DataSource attribute has been added to the ML model entity.
  • Python API has been updated to add lineage for ML Model entities.
  • A new tab called ‘Bots’ has been added to group users with isBot set to true.
  • Support Application Default Credentials or a keyless, default service account in BigQuery data ingestion.
  • Includes a feature tour for new users.

OpenMetadata - OpenMetadata 0.7.1 Release - Bug Fix

Published by harshach almost 3 years ago

  1. Fixed migrate issue for ./bootstrap-storage.sh migrate-all
  2. Fixed sql v2 file to drop if exists on dbt_table_entity

OpenMetadata - OpenMetadata 0.7.0 Release

Published by harshach almost 3 years ago

Theme: Activity Feeds, DBT integration, Storage Location and Metabase, Druid, MLFlow connectors

Activity Feeds

  • Enables users to view a summary of the metadata change events.
  • The most recent changes are listed at the top.
  • Entities (tables, dashboards, team names) are clickable.
  • Displays 'All Changes' and 'My Changes'.
  • A feed has been provided for data you are 'Following'.

UX Improvements

  • New and improved UX has been implemented for search and its resulting landing pages.
  • The Explore page has an updated UI. Introduced a 3-column layout to present the search results and several visual element improvements to promote readability and navigation of the search results.
  • Users can filter by database name on the Explore page.
  • Support has been added to view the recent search terms.
  • Major version changes for backward incompatible changes are prominently displayed.

DBT Integration

  • DBT integration enables users to see what models are being used to generate tables.
  • DBT models have been associated with Tables and are no longer treated as a separate Entity on the UI.
  • Each DBT model can produce a Table, and all the logic and metadata is now being captured as part of it.
  • DBT model is accessible from a tab in the table view.

Storage Location

  • UI supports StorageLocation.
  • Can extract the location information from Glue.
  • The ingestion framework gets the details from Glue to publish the location information (such as S3 bucket, HDFS) for the external tables.
  • The table results view now displays the table location in addition to the service.

Elasticsearch

  • SSL-enabled Elasticsearch (including self-signed certs) is supported.
  • Automatically runs an indexing workflow as new entities are added or updated through ingestion workflows.
  • The metadata indexing workflow is still available in case indexes need to be cleanly created from the ground up.

New Connectors

  • Metabase, an open-source BI solution
  • Apache Druid has been added to integrate Druid metadata.
  • MLflow, an open-source platform for the machine learning lifecycle
  • Apache Atlas import connector, imports metadata from Atlas installations into OpenMetadata.
  • Amundsen import connector, imports metadata from Amundsen into OpenMetadata.
  • AWS Glue - improved to extract metadata for tables and pipelines. Now uses the AWS boto session to get the credentials from the different credential providers, and also allows the common approach to provide credentials. Glue ingestion now supports Location.
  • MSSQL now has SSL support.

Lineage View Improvements

  • Improved discoverability
  • Implemented incremental loading to lineage.
  • Users can inspect the Table’s columns.
  • Users can expand entity nodes to view more upstream or downstream lineage.
  • Helps data consumers and producers to assess the impact of changes to the data sources.

Other Features

  • Admins can now access the user listing page, filter users by team, and manage admin privileges
  • Updated the Redoc version to support search in the API and to fix query parameter fields.
  • 'Entity Name' character size has been increased to 256.
  • Upgraded Log4j to 2.15

OpenMetadata - OpenMetadata 0.6.0 release

Published by harshach almost 3 years ago

Theme: Metadata Versioning, Events API, One-Click Ingestion and more

Metadata Versioning

  • OpenMetadata maintains the version history for all entities using Major.Minor version numbers, starting with 0.1 as the initial version of an entity.
  • Entity views in OpenMetadata provide a timeline visualization of all the metadata changes from version to version.
  • Metadata versioning helps simplify the debugging process. Users can view the changes and instantly identify if a recent change led to the data issue.
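A Major.Minor scheme of this kind can be sketched as a small version-bump function. The rule used here (backward-compatible changes bump the minor part, backward-incompatible changes bump the major part and reset the minor part) is an assumption for illustration, as is treating the two parts as independent integers; `next_version` is a hypothetical helper.

```python
INITIAL_VERSION = "0.1"  # initial version of an entity, per the notes above


def next_version(current, backward_compatible):
    """Return the version that follows `current` after a metadata change."""
    # Assumption: the two parts are independent integers.
    major, minor = (int(part) for part in current.split("."))
    if backward_compatible:
        # For example, a description or tag was added.
        return f"{major}.{minor + 1}"
    # For example, a column was removed.
    return f"{major + 1}.0"


after_description_edit = next_version(INITIAL_VERSION, backward_compatible=True)
after_column_removed = next_version(after_description_edit, backward_compatible=False)
```

Keeping the two kinds of change apart in the version number is what lets a timeline view highlight the changes most likely to have caused a data issue.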

Events API

  • When the state of metadata changes, an event is produced that indicates which entity changed, who changed it, and how it changed
  • These events can be used to integrate metadata into other tools, or to trigger actions.
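To make "which entity changed, who changed it, and how it changed" concrete, the sketch below derives an event from two snapshots of an entity. The event shape and field names (`changeDescription`, `fieldsAdded`, and so on) and the `change_event` helper are assumptions for this example rather than the published event schema.

```python
def change_event(entity_type, entity_id, user, before, after):
    """Describe the difference between two snapshots of an entity."""
    added = sorted(key for key in after if key not in before)
    deleted = sorted(key for key in before if key not in after)
    updated = sorted(
        key for key in after if key in before and before[key] != after[key]
    )
    return {
        "entityType": entity_type,  # which entity changed
        "entityId": entity_id,
        "userName": user,           # who changed it
        "changeDescription": {      # how it changed
            "fieldsAdded": added,
            "fieldsUpdated": updated,
            "fieldsDeleted": deleted,
        },
    }


# Hypothetical snapshots of a table entity before and after an edit.
before = {"description": "Orders", "owner": "alice", "tier": "Tier2"}
after = {"description": "Customer orders", "owner": "alice", "tags": ["PII"]}
event = change_event("table", "t1", "bob", before, after)
```

A downstream tool can subscribe to such events and react only to the fields it cares about, for example triggering a review when `owner` appears in the updated fields.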

One-Click Ingestion deployment

  • UI integration with Apache Airflow as a workflow engine to run ingestion and data profiling workflows

New Entities: ML Models and Data Models

  • Two new data assets have been added - ML Models and Data Models.
  • ML Models are algorithms trained on data to find patterns or to make predictions. Data Modeling tools such as dbt are getting adopted by many organizations.

New Connectors

  • AWS Glue
  • DBT
  • MariaDB

User Interface

  • In the 0.6 release, the UI displays all the metadata changes of an entity over time as Version History. By clicking the Version button, users can view the change log of an entity from the very beginning. The earliest version corresponds to the metadata pulled from third-party systems by ingestion bots. Changes made by users or data engineers are displayed along with a timestamp, and changes made by automation are captured by the ingestion bot. Together, this provides a single-pane view of the metadata’s evolution over time.
  • The UI supports setting up metadata ingestion workflows.
  • Improvements have been made in drawing the entity node details for lineage.
  • Entity link is supported for each tab on the details page.
  • Guided steps have been added for setting up Elasticsearch.
  • The entity details, search results page (Explore), landing pages, and components have been redesigned for better project structure and code maintenance.

OpenMetadata - OpenMetadata 0.5.0 release

Published by harshach about 3 years ago

Theme: Lineage, Data Reliability, Pipelines, Complex Types

Lineage

  • Schema and API support for Lineage
  • UI integration to show the lineage for Pipelines and Tables
  • Ingestion support for Airflow to capture lineage

Data Reliability

  • UI Integration for Data profiler
  • Unique and null proportion metrics
  • See how the table data grows through interactive visualization
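The two column metrics named above can be computed as follows. This is a sketch under stated assumptions: "unique" is taken to mean values that occur exactly once, both proportions are measured against the total row count, and `profile_column` is a hypothetical helper rather than the OpenMetadata profiler.

```python
from collections import Counter


def profile_column(values):
    """Compute row count, null proportion, and unique proportion for a column."""
    total = len(values)
    if total == 0:
        return {"rowCount": 0, "nullProportion": 0.0, "uniqueProportion": 0.0}
    non_null = [value for value in values if value is not None]
    counts = Counter(non_null)
    # Assumption: a value is "unique" when it occurs exactly once.
    unique = sum(1 for count in counts.values() if count == 1)
    return {
        "rowCount": total,
        "nullProportion": (total - len(non_null)) / total,
        "uniqueProportion": unique / total,
    }


# None stands in for SQL NULL in this toy column.
profile = profile_column(["a", "b", "b", None])
```

Recording `rowCount` on each profiler run is also what makes it possible to chart how a table grows over time.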

New Entities: Pipeline

  • Add Apache Airflow as a pipeline service
  • Ingest all of your pipeline's metadata into OpenMetadata
  • Explore and Search Pipelines
  • Add descriptions, tags, ownership, and tiers to your pipelines

Complex Data types

  • Schema and API support to capture complex data types
  • Ingestion support to capture complex data types from Redshift, BigQuery, Snowflake and Hive
  • UI support for nested complex data types; users can add descriptions and tags to nested fields

New Connectors

  • Trino connector
  • Redash connector
  • Amazon Glue - In progress

User Interface

  • UI now completely built on top of code generated from JSON Schema
  • Expand complex data types and allow users to update descriptions and tags
  • Pipeline Service and Details page
  • Pipeline Explore & Search integration
  • Search results show whether the query matched the description or the column names

Full Changelog: https://github.com/open-metadata/OpenMetadata/compare/0.4.0-pre...0.5.0