MaskedVByte

APACHE-2.0 License

Fast decoder for VByte-compressed integers in C.

It includes fast differential coding.
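
As a rough illustration of what differential coding means here, the sketch below stores each value as its difference from the previous one, which keeps the numbers small (and therefore cheap to VByte-encode) when the input is sorted. The function names `delta_encode` and `delta_decode` are hypothetical and are not the library's API; this is a plain scalar version, not the vectorized code the library provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names for illustration; not part of MaskedVByte. */

/* Replace each value in place with its difference from the previous value.
 * The first value is stored relative to `initial`. */
static void delta_encode(uint32_t *data, size_t n, uint32_t initial) {
    assert(data != NULL || n == 0);
    uint32_t prev = initial;
    for (size_t i = 0; i < n; i++) {
        uint32_t cur = data[i];
        data[i] = cur - prev; /* small when the input is sorted and dense */
        prev = cur;
    }
}

/* Inverse transform: a running (prefix) sum restores the original values. */
static void delta_decode(uint32_t *data, size_t n, uint32_t initial) {
    assert(data != NULL || n == 0);
    uint32_t prev = initial;
    for (size_t i = 0; i < n; i++) {
        prev += data[i];
        data[i] = prev;
    }
}
```

For example, the sorted list 100, 105, 106, 120 becomes 100, 5, 1, 14 after `delta_encode` with `initial` set to 0, and `delta_decode` restores the original list.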

We require an x64 processor that supports SSE 4.1 or better. This includes virtually all x64 processors in service today, except for very old or specialized processors.
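
If you want to check the SSE 4.1 requirement at run time, one possible sketch is shown below. It assumes GCC or Clang, whose `__builtin_cpu_supports` builtin is available on x86 targets. The helper names `has_sse41` and `require_sse41` are hypothetical and are not part of this library.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the CPU reports SSE 4.1, 0 otherwise.
 * On compilers or architectures without the builtin it conservatively
 * returns 0. */
static int has_sse41(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse4.1") ? 1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: stop early in debug builds if SSE 4.1 is missing.
 * With NDEBUG defined the assert compiles away and nothing is checked. */
void require_sse41(void) {
    assert(has_sse41() && "SSE 4.1 is required");
}
```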

The code should build with most standards-compliant modern C compilers (C99). The provided makefile expects a Linux-like system.

Usage:

  make
  ./unit 

See example.c for an example.

Short code sample:

size_t compsize = vbyte_encode(datain, N, compressedbuffer); // encode N integers from datain
// the result is now stored in compressedbuffer, using compsize bytes
size_t compsize2 = masked_vbyte_decode(compressedbuffer, recovdata, N); // decode N integers into recovdata (fast)
// compsize2 is the number of bytes read and should equal compsize
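
To make the byte format concrete, here is a minimal scalar sketch of VByte encoding and decoding. It assumes the common convention in which each byte carries 7 payload bits, least significant group first, and a set high bit means that more bytes follow. The names `scalar_vbyte_encode` and `scalar_vbyte_decode` are hypothetical; this is a reference-style illustration, not the library's vectorized decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference; names are illustrative only. */

/* Encode n integers into out, returning the number of bytes written.
 * The caller must provide room for up to 5 bytes per integer. */
static size_t scalar_vbyte_encode(const uint32_t *in, size_t n, uint8_t *out) {
    assert(out != NULL);
    uint8_t *p = out;
    for (size_t i = 0; i < n; i++) {
        uint32_t v = in[i];
        while (v >= 128) {
            *p++ = (uint8_t)((v & 0x7F) | 0x80); /* high bit: more bytes follow */
            v >>= 7;
        }
        *p++ = (uint8_t)v; /* last byte of this integer has the high bit clear */
    }
    return (size_t)(p - out);
}

/* Decode n integers from in, returning the number of bytes read. */
static size_t scalar_vbyte_decode(const uint8_t *in, uint32_t *out, size_t n) {
    assert(in != NULL);
    const uint8_t *p = in;
    for (size_t i = 0; i < n; i++) {
        uint32_t v = 0;
        unsigned shift = 0;
        uint8_t b;
        do {
            b = *p++;
            v |= (uint32_t)(b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) && shift < 32); /* at most 5 bytes per integer */
        out[i] = v;
    }
    return (size_t)(p - in);
}
```

Under this convention, values below 128 take one byte, and the value 128 is stored as the two bytes 0x80 0x01.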

Interesting applications

Greg Bowyer has integrated Masked VByte into Lucene for higher speeds:

https://github.com/GregBowyer/lucene-solr/tree/intrinsics

Reference

  • Daniel Lemire, Nathan Kurz, Christoph Rupp, Stream VByte: Faster Byte-Oriented Integer Compression, Information Processing Letters 130, February 2018, pages 1-6. https://arxiv.org/abs/1709.08990
  • Jeff Plaisance, Nathan Kurz, Daniel Lemire, Vectorized VByte Decoding, International Symposium on Web Algorithms, 2015. http://arxiv.org/abs/1503.07387
