spark

Performance Observability for Apache Spark

APACHE-2.0 License

Stars: 171
Committers: 5

spark - Version 0.2.3 (Latest Release)

Published by menishmueli 2 months ago

  • New alert - Large data broadcast, raised when large data sets are broadcast with the broadcast() function
  • New alert - Large filter conditions, raised when long filter conditions are written instead of using join logic
  • UI Improvements
spark - Version 0.2.2

Published by menishmueli 5 months ago

Support Spark 2.4 event logs in a history server running a version later than 3.2.
Only a limited feature set is available, because these events carry less data than those from Spark 3.0 and up.

spark - Version 0.2.1

Published by menishmueli 5 months ago

  1. Better Databricks stage to node support
  2. Support spark.dataflint.runId in custom history server providers when appId is not the spark appId
spark - Version 0.2.0

Published by menishmueli 5 months ago

  • Better support for Databricks Photon plans
  • Input nodes show partition filters and push-down filters
  • Stage Breakdown - press the blue down arrow on a SQL node to see stage information
  • New alert - large number of small tasks
spark - Version 0.1.7

Published by menishmueli 6 months ago

Apache Iceberg alerts improvements

Add avg file size in read/write

More information when hovering on stage

spark - Version 0.1.6

Published by menishmueli 7 months ago

Apache Iceberg support:

  1. Better node naming
  2. Read metrics and reading small files alerts
  3. Write metrics and overwriting most of table alerts
    Write metrics require an Iceberg metrics reporter. DataFlint can enable it for you if you set spark.dataflint.iceberg.autoCatalogDiscovery to true, or you can set the reporter manually for each catalog, for example:
    spark.sql.catalog.[catalog name].metrics-reporter-impl org.apache.spark.dataflint.iceberg.DataflintIcebergMetricsReporter
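
The manual per-catalog setting above can be sketched as a small helper that produces one conf entry per catalog. This is only an illustration: the helper name iceberg_reporter_conf and the catalog names "prod" and "staging" are hypothetical, while the conf key pattern and the reporter class name are taken from the note above.

```python
# Sketch: build Spark conf entries that register the DataFlint Iceberg
# metrics reporter for each catalog. The helper name and the catalog
# names are hypothetical; the conf key pattern and the class name come
# from the release note above.
REPORTER_CLASS = "org.apache.spark.dataflint.iceberg.DataflintIcebergMetricsReporter"


def iceberg_reporter_conf(catalog_names):
    """Return {conf_key: reporter_class} for every given catalog name."""
    return {
        f"spark.sql.catalog.{name}.metrics-reporter-impl": REPORTER_CLASS
        for name in catalog_names
    }


conf = iceberg_reporter_conf(["prod", "staging"])
for key, value in sorted(conf.items()):
    # Each pair can be passed to spark-submit as a --conf argument.
    print(f"--conf {key}={value}")
```

If spark.dataflint.iceberg.autoCatalogDiscovery is set to true instead, these per-catalog entries should not be needed.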
spark - Version 0.1.5

Published by menishmueli 8 months ago

  • Add support for history server with cluster-mode jobs (i.e. with an attempt number)
  • Fix "wasted cores" calculation
  • Fix SQL flickering in the status tab when a query has subqueries
spark - Version 0.1.4

Published by menishmueli 8 months ago

Small fix for Scala 2.13 support

spark - Version 0.1.3

Published by menishmueli 8 months ago

Main Changes:

  1. DataFlint SaaS support
  2. Partition Skew Alert

You can see the full list of changes in the release notes

spark - Version 0.1.2

Published by menishmueli 9 months ago

Main Changes:

  1. Scala 2.13 support
  2. "Core Activity Rate" renamed to "Wasted Cores", new alert for high wasted cores
  3. The ability to disable anonymous telemetry

You can see the full list of changes in the release notes

spark - Version 0.1.1

Published by menishmueli 9 months ago

Adds a new tab: the Resources tab

Changelog can be found here: https://dataflint.gitbook.io/dataflint-for-spark/overview/release-notes#version-0.1.1

spark - Version 0.1.0

Published by menishmueli 10 months ago

First public release of DataFlint!

Changelog can be found here: https://dataflint.gitbook.io/dataflint-for-spark/overview/release-notes#version-0.1.0
