Mirror of Apache Pig
APACHE-2.0 License
Mirror of Apache DataFu
Apache Tez
Data Lakehouse local stack with PySpark, Trino, and MinIO. Includes an example to process Raygun ...
Mirror of Apache Oozie
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitate...
Apache Atlas
Apache Flink
Mirror of Apache HCatalog
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
Apache Creadur RAT - Release Audit Tool
Apache Spark - A unified analytics engine for large-scale data processing
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
A distributed data integration framework that simplifies common aspects of big data integration s...
Fundamentals of Spark with Python (using PySpark), with code examples
More than 2,000 data engineering interview questions.