A simple archive format designed for quickly reading some files without extracting the entire archive. It may eventually be used in Bun.
25x faster than unzip and 10x faster than tar at reading individual files (uncompressed)
Format | Random access | Fast extraction | Fast archiving | Compression | Encryption | Append |
---|---|---|---|---|---|---|
hop | ||||||
tar | ||||||
zip | (when small) |
Features:
- Faster at reading individual files than tar & zip (compression disabled)
- Faster extraction than zip, comparable to tar (compression disabled)
- Faster archiving than zip, comparable to tar (compression disabled)
Anti-features:
Download the binary from /releases
To create an archive:
hop ./path-to-folder
To extract an archive:
hop archive.hop
To print one file from the archive:
hop archive.hop package.json
Why can't software read many tiny files with similar performance characteristics as individual files?
- Listing directory entries (ls) in large directory trees is slow
Using the tigerbeetle GitHub repo as an example
Extracting:
Archiving:
Extracting a node_modules folder
- copy_file_range
- packed struct makes serialization & deserialization very fast because there is very little encoding or decoding to do.
package Hop;
struct StringPointer {
uint32 off;
uint32 len;
}
struct File {
StringPointer name;
uint32 name_hash;
uint32 chmod;
uint32 mtime;
uint32 ctime;
StringPointer data;
}
message Archive {
uint32 version = 1;
uint32 content_offset = 2;
File[] files = 3;
uint32[] name_hashes = 4;
byte[] metadata = 5;
}
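The schema above suggests how a single file can be read without touching the rest of the archive: find the entry through name_hashes, then slice its bytes out of the content region. The following is a minimal Python sketch of that idea, not hop's actual implementation. Several details are assumptions made only for illustration: each File is laid out as eight little-endian uint32 values, names live in a separate byte buffer, data offsets are relative to the start of the content region, and zlib.crc32 stands in for the real hash function, which this document does not specify. The helper names build_archive and read_file are hypothetical.

```python
import struct
import zlib
from typing import Dict, List, Optional, Tuple

# Assumed layout of one File record: eight little-endian uint32 fields
# (name.off, name.len, name_hash, chmod, mtime, ctime, data.off, data.len).
FILE_RECORD = struct.Struct("<8I")

Archive = Tuple[bytes, bytes, List[int], bytes]


def name_hash(name: bytes) -> int:
    # Stand-in hash (assumption); the real hash function is not given here.
    return zlib.crc32(name) & 0xFFFFFFFF


def build_archive(files: Dict[str, bytes]) -> Archive:
    """Pack files into (records, names, name_hashes, content)."""
    records = bytearray()
    names = bytearray()
    content = bytearray()
    hashes: List[int] = []
    for name, data in sorted(files.items()):
        raw_name = name.encode()
        digest = name_hash(raw_name)
        # chmod, mtime and ctime are filled with placeholder values.
        records += FILE_RECORD.pack(
            len(names), len(raw_name), digest, 0o644, 0, 0, len(content), len(data)
        )
        names += raw_name
        content += data
        hashes.append(digest)
    return bytes(records), bytes(names), hashes, bytes(content)


def read_file(archive: Archive, wanted: str) -> Optional[bytes]:
    """Return one file's bytes, or None if the name is not in the archive."""
    records, names, hashes, content = archive
    raw = wanted.encode()
    digest = name_hash(raw)
    for index, candidate in enumerate(hashes):
        if candidate != digest:
            continue
        # A single fixed-size unpack recovers every field of the record.
        n_off, n_len, _, _, _, _, d_off, d_len = FILE_RECORD.unpack_from(
            records, index * FILE_RECORD.size
        )
        # Compare the stored name to rule out a hash collision.
        if names[n_off:n_off + n_len] == raw:
            return content[d_off:d_off + d_len]
    return None


archive = build_archive({"package.json": b'{"name": "demo"}', "README.md": b"# demo\n"})
print(read_file(archive, "package.json"))
```

Because every record has the same fixed size, the entry at a given index sits at index * FILE_RECORD.size, so a lookup scans a small integer array and then performs one unpack and one slice. The linear scan is used here only for simplicity.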