Service for extracting tables from the CCAO system-of-record and uploading them to the Data Department's data warehouse
AGPL-3.0 License
Haskell on Apache Spark.
A simple Spark-powered ETL framework that just works 🍺
This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. T...
Spark data source for Cognite Data Fusion
Data Lakehouse local stack with PySpark, Trino, and MinIO. Includes an example to process Raygun ...
sbt plugin for spark-submit
A Python package to submit and manage Apache Spark applications on Kubernetes.
ETL pipeline using PySpark (Spark - Python)
REST job server for Apache Spark
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks...
Apache Spark - A unified analytics engine for large-scale data processing
EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for running complex Auditab...
SparkSQL.jl enables Julia programs to work with Apache Spark data using just SQL.
This construct builds some elements for you to quickly launch an EMR Serverless application. Afte...
The Almaren Framework provides a simplified consistent minimalistic layer over Apache Spark. Whil...