datasketches-hive

Sketch adaptors for Hive.

Apache-2.0 License

datasketches-hive - datasketches-hive-1.2.0 Latest Release

Published by AlexanderSaydakov over 2 years ago

This is a maintenance release that makes this Apache Hive component work with datasketches-java-3.1.0 and datasketches-memory-2.0.0.

datasketches-hive - Apache Release 1.1.0-incubating

Published by leerho over 4 years ago

  • Fixes a critical bug
  • Updates the datasketches-java dependency to 1.3.0-incubating
  • Minor licensing fixes
  • Minor code cleanup

datasketches-hive - Apache Release 1.0.0-incubating

Published by leerho about 5 years ago

This is the initial Apache release for this component.

  • The Java package structure has been changed to org.apache.datasketches
  • The file license headers have been updated with the Apache license header
  • The LICENSE, NOTICE, and DISCLAIMER-WIP files have been added and/or updated.

No other significant code changes from the prior version.

datasketches-hive - sketches-hive-0.13.0

Published by AlexanderSaydakov over 5 years ago

  • Based on sketches-core-0.13.0
  • CPC sketch UDFs
  • KLL sketch UDFs
  • additional quantiles sketch UDFs: toString, getN, getCDF
  • additional HLL sketch UDFs: SketchToString, getEstimateAndErrorBounds

datasketches-hive - sketches-hive-0.11.0

Published by AlexanderSaydakov over 6 years ago

Compatibility with sketches-core-0.11.0

datasketches-hive - sketches-hive-0.10.5: new core, HLL late init fix, char and varchar

Published by AlexanderSaydakov almost 7 years ago

  • based on sketches-core-0.10.3
  • support HLL sketch late init from Hive
  • support char and varchar types as HLL and Theta sketch input

datasketches-hive - sketches-hive-0.10.4: use sketches-core-0.10.2

Published by AlexanderSaydakov almost 7 years ago

This is a maintenance release that updates the dependency to sketches-core-0.10.2.

datasketches-hive - Sketches core 0.10.1, new Tuple sketch UDFs, performance improvement

Published by AlexanderSaydakov about 7 years ago

  • This is based on sketches-core-0.10.1 and memory-0.10.3
  • New Tuple sketch UDFs: ArrayOfDoublesSketchesTTestUDF, ArrayOfDoublesSketchToMeansUDF, ArrayOfDoublesSketchToVariancesUDF, ArrayOfDoublesSketchToEstimateAndErrorBoundsUDF, ArrayOfDoublesSketchToNumberOfRetainedEntriesUDF, ArrayOfDoublesSketchToQuantilesSketchUDF
  • Performance improvement: wrap() is used instead of heapify() in HLL UDFs
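
The performance note above contrasts reading a serialized sketch in place with first copying it onto the heap. The following is only an analogy built on the JDK's `java.nio.ByteBuffer`; it does not call the DataSketches API, and the class and method names (`WrapVersusCopyDemo`, `copyThenRead`, `wrapAndRead`) are hypothetical. It shows the general difference between a copy and a view over the same bytes:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class WrapVersusCopyDemo {
    // Copy-style access: duplicates the serialized bytes before reading them.
    // This stands in for the extra work a heapify-style call would do.
    static ByteBuffer copyThenRead(byte[] serialized) {
        return ByteBuffer.wrap(Arrays.copyOf(serialized, serialized.length));
    }

    // Wrap-style access: a view over the caller's array, with no copy made.
    static ByteBuffer wrapAndRead(byte[] serialized) {
        return ByteBuffer.wrap(serialized);
    }

    public static void main(String[] args) {
        byte[] serialized = {1, 2, 3, 4};
        ByteBuffer copied = copyThenRead(serialized);
        ByteBuffer wrapped = wrapAndRead(serialized);

        // Change the source array to reveal which buffer shares its storage.
        serialized[0] = 9;
        System.out.println(copied.get(0));  // 1 (independent copy)
        System.out.println(wrapped.get(0)); // 9 (view over the same bytes)
    }
}
```

The view avoids allocating and filling a second array on every call, which is presumably the kind of saving the release note refers to; the trade-off is that a view stays tied to the lifetime and contents of the underlying bytes.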

datasketches-hive - HllSketch performance improvement for strings

Published by AlexanderSaydakov over 7 years ago

  • HLL DataToSketchUDAF: input strings are now converted to char[] before being passed to HllSketch. This is substantially faster than passing strings because it avoids the UTF-8 conversion. Warning: strings are now effectively hashed with a different hash function, so unions of sketches produced by this version with sketches produced by the previous version will have no overlap and will therefore produce incorrect results. We recommend upgrading to this version and, if any sketches built from string inputs have been stored, recomputing them from the raw data.
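
The warning above comes down to a hash seeing different bytes for the same string. The following minimal sketch uses only the JDK to illustrate this. `CRC32` is a stand-in chosen for illustration and is not the hash used by the sketch library, and the class and method names (`StringHashDemo`, `utf8Bytes`, `charArrayBytes`, `checksum`) are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class StringHashDemo {
    // Stand-in checksum for illustration only; not the sketch library's hash.
    static long checksum(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    // Bytes a hash sees when the string is first encoded as UTF-8.
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes a hash sees when the string is passed as char[]:
    // two bytes per UTF-16 code unit.
    static byte[] charArrayBytes(String s) {
        char[] chars = s.toCharArray();
        ByteBuffer buf = ByteBuffer.allocate(chars.length * 2);
        for (char c : chars) {
            buf.putChar(c);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        String key = "user-42";
        System.out.println(utf8Bytes(key).length);      // 7
        System.out.println(charArrayBytes(key).length); // 14
        // Different input bytes, so the two checksums are not expected to match.
        System.out.println(checksum(utf8Bytes(key)) == checksum(charArrayBytes(key)));
    }
}
```

Because the same key yields different input bytes under the two paths, a sketch built the old way and one built the new way would treat identical strings as distinct items, which is why recomputing stored sketches from raw data is recommended.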

datasketches-hive - HLL sketch UDAFs and UDFs

Published by AlexanderSaydakov over 7 years ago

  • Added DataToSketchUDAF, UnionSketchUDAF, SketchToEstimateUDF, and UnionSketchUDF for the HLL sketch

datasketches-hive - Align with core 0.10.0

Published by AlexanderSaydakov over 7 years ago

  • This is based on sketches-core-0.10.0 and memory-0.10.1
  • Tuple Sketch: added ArrayOfDoublesSketchToValuesUDTF to dump values

datasketches-hive - Align with core 0.8.2

Published by leerho almost 8 years ago

  • Fixed Memory shading problem. The shaded hive jar now includes both the shaded core and shaded memory code.
  • Added TupleSketch UDFs

datasketches-hive - Align with core 0.7.0

Published by AlexanderSaydakov about 8 years ago

  • Added frequent items sketch UDFs
  • Added parameter K to quantiles sketch union UDAFs
  • Added UDFs to get K from quantiles sketches
  • Heavy refactoring of theta sketch UDFs, added seed support, changed the default number of nominal entries to be consistent with core library, fixed treatment of empty sketches in union
  • Added UDFs to get PMF from quantiles sketches
  • Code style and documentation improvements
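
For readers unfamiliar with what a PMF over split points means for a quantiles sketch, the following sketch computes the quantity exactly on a small array instead of approximately from a sketch. It does not call the DataSketches or Hive UDF APIs; the names (`PmfCdfDemo`, `pmf`, `cdf`) are hypothetical, and the rule that a value equal to a split point is counted in the interval above it is an assumption made for this illustration:

```java
class PmfCdfDemo {
    // Exact PMF: fraction of values in each of the splitPoints.length + 1 intervals.
    // splitPoints must be sorted in ascending order.
    // Assumption: a value equal to a split point is counted in the interval above it.
    static double[] pmf(double[] values, double[] splitPoints) {
        int[] counts = new int[splitPoints.length + 1];
        for (double v : values) {
            int bucket = 0;
            while (bucket < splitPoints.length && v >= splitPoints[bucket]) {
                bucket++;
            }
            counts[bucket]++;
        }
        double[] masses = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            masses[i] = (double) counts[i] / values.length;
        }
        return masses;
    }

    // CDF as the running sum of the PMF; the last entry covers all values.
    static double[] cdf(double[] values, double[] splitPoints) {
        double[] masses = pmf(values, splitPoints);
        double[] cumulative = new double[masses.length];
        double running = 0.0;
        for (int i = 0; i < masses.length; i++) {
            running += masses[i];
            cumulative[i] = running;
        }
        return cumulative;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 4.0};
        double[] splitPoints = {2.5};
        double[] masses = pmf(values, splitPoints);
        System.out.println(masses[0] + " " + masses[1]); // 0.5 0.5
    }
}
```

A quantiles sketch answers the same question approximately, without retaining every input value.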

datasketches-hive - Align with core 0.6.0

Published by leerho over 8 years ago

Major Additions

  • Doubles Quantiles Sketch
  • Generic Quantiles Sketch with additional String implementation

Fixes

  • DataToSketchUDAF was previously restricted to Writable types; it now accepts any input type.

Other

  • Minor performance improvements.
  • Javadoc improvements

datasketches-hive - Align with core 0.3.0

Published by leerho over 8 years ago

  • Updated Set Operations

datasketches-hive - Align with core 0.2.2

Published by leerho over 8 years ago

  • Updated Set Operations for Hive

datasketches-hive - Open Source

Published by leerho over 8 years ago

Initial open source release, aligned with sketches-core 0.2.0.