Official Dockerfile for Apache Spark
Apache-2.0 License
Spark Structured Streaming / Kafka / Cassandra / Elastic
Mirror of Apache Toree (Incubating)
End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to sche...
Apache Spark Website
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Apache Spark Kubernetes Operator
Docker packaging for Apache Flink
Convenience Docker images for Apache Tika Server
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
50+ DockerHub public images for Docker & Kubernetes - DevOps, CI/CD, GitHub Actions, CircleCI, Je...
Master's thesis on Big Data
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache CouchDB Continuous Integration (CI) support repository
DataStax Connector for Apache Spark to Apache Cassandra
Semi-official Apache CouchDB Docker images