spark-util

Low-level helpers for Apache Spark libraries and tests

Apache-2.0 License

Spark, Hadoop, and Kryo utilities

Kryo registration

Classes that implement the Registrar interface can use various shorthands for registering classes with Kryo.

Adapted from RegistrationTest:

register(
  cls[A],                  // comes with an AlsoRegister that loops in other classes
  arr[Foo],                // register a class and an Array of that class
  cls[B] → BSerializer(),  // use a custom Serializer
  CDRegistrar              // register all of another Registrar's registrations
)
  • Custom Serializers and AlsoRegisters are picked up implicitly when they are not provided explicitly.
  • AlsoRegisters are recursive, which makes it easier to account for what is registered and why, and helps ensure that needed registrations aren't overlooked.
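
The recursive behaviour of AlsoRegisters can be illustrated with a minimal, dependency-free sketch. It does not use Kryo or spark-util's actual types: AlsoRegisterSketch, RegistrationSketch, the alsoRegister table, and resolve are hypothetical names invented here, and a plain Map stands in for the implicit lookup that the library performs.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an AlsoRegister: lists the other classes that
// must be registered whenever the owning class is registered.
final case class AlsoRegisterSketch(others: Class[_]*)

object RegistrationSketch {
  // Illustrative classes: A needs B, and B in turn needs C.
  class A; class B; class C

  // ASSUMPTION: a lookup table replaces the implicit resolution used by the
  // real library, so that this sketch has no dependencies.
  val alsoRegister: Map[Class[_], AlsoRegisterSketch] = Map(
    classOf[A] -> AlsoRegisterSketch(classOf[B]),
    classOf[B] -> AlsoRegisterSketch(classOf[C])
  )

  // Collect each root class plus everything reachable through the table,
  // in first-seen order, visiting every class at most once.
  def resolve(roots: Class[_]*): Vector[Class[_]] = {
    val seen = mutable.LinkedHashSet.empty[Class[_]]
    def visit(cls: Class[_]): Unit =
      if (seen.add(cls))
        alsoRegister.get(cls).foreach(_.others.foreach(visit))
    roots.foreach(visit)
    seen.toVector
  }
}
```

Under these assumptions, RegistrationSketch.resolve(classOf[RegistrationSketch.A]) returns A, B, and C in that order: asking for A alone is enough to pull in C, which is the property that keeps needed registrations from being overlooked.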

Configuration/Context wrappers

  • Configuration: serializable Hadoop-Configuration wrapper
  • Context: SparkContext wrapper that is also a Hadoop Configuration, unifying "global configuration access" patterns
  • Conf: load a SparkConf with settings from file(s) specified in the SPARK_PROPERTIES_FILES environment variable
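
As a rough sketch of what the Conf bullet above describes, the following shows one way settings could be collected from properties files named in SPARK_PROPERTIES_FILES. This is not the library's implementation: ConfSketch, pathsFromEnv, and loadSettings are hypothetical names, and both the comma-separated format of the variable and the "later files override earlier ones" rule are assumptions made here. Only the JDK and the Scala standard library are used.

```scala
import java.io.FileInputStream
import java.util.Properties
import scala.collection.JavaConverters._

object ConfSketch {
  // ASSUMPTION: the variable holds a comma-separated list of file paths.
  // The environment is passed in as a Map so the function is easy to test.
  def pathsFromEnv(env: Map[String, String] = sys.env): Seq[String] =
    env.get("SPARK_PROPERTIES_FILES").toList
      .flatMap(_.split(",").toList)
      .map(_.trim)
      .filter(_.nonEmpty)

  // Read key/value pairs from each file in order.
  // ASSUMPTION: a key in a later file overrides the same key in an earlier one.
  def loadSettings(paths: Seq[String]): Map[String, String] =
    paths.foldLeft(Map.empty[String, String]) { (acc, path) =>
      val props = new Properties()
      val in = new FileInputStream(path)
      try props.load(in) finally in.close()
      acc ++ props.asScala
    }
}
```

With Spark on the classpath, the resulting map could then be applied with new SparkConf().setAll(ConfSketch.loadSettings(ConfSketch.pathsFromEnv())).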

Spark Configuration

Misc