FUSE filesystem that chunks files up.
splitfs takes a directory and mirrors it onto another. However, every file within that directory is presented as a directory itself, containing one or more "chunk files" that each correspond to a portion of the original file.
For example:
$ tree /testdata
/testdata
├── 10KB.data (size=10KiB)
├── 20KB.data (size=20KiB)
├── Andy_Mabbett_-_RSC_-_How_to_Edit_Wikipedia_-_01_-_italic_bold.webm (size=7.5M)
├── file-sources.txt (size=262 bytes)
├── Flower-300x300_dtf.jpg (size=168K)
└── subdir
    ├── 20KB.symlink -> ../20KB.data
    └── 40KB.data (size=40KiB)
1 directory, 7 files
$ splitfs --chunk_size=10KiB ./testdata /mnt
$ tree /mnt
/mnt
├── 10KB.data
│   └── 71da741724c3f289_00000001_of_00000001.splitfs.chunk
├── 20KB.data
│   ├── 1b15a91efc959b04_00000001_of_00000002.splitfs.chunk
│   └── 1b15a91efc959b04_00000002_of_00000002.splitfs.chunk
├── Andy_Mabbett_-_RSC_-_How_to_Edit_Wikipedia_-_01_-_italic_bold.webm
│   ├── 764cc4cecb7e72ce_00000001_of_00000765.splitfs.chunk
│   ├── 764cc4cecb7e72ce_00000002_of_00000765.splitfs.chunk
│   ├── ...
│   ├── 764cc4cecb7e72ce_00000764_of_00000765.splitfs.chunk
│   └── 764cc4cecb7e72ce_00000765_of_00000765.splitfs.chunk
├── file-sources.txt
│   └── 74c420ec46a3845a_00000001_of_00000001.splitfs.chunk
├── Flower-300x300_dtf.jpg
│   ├── efa4bffab14f7017_00000001_of_00000017.splitfs.chunk
│   ├── efa4bffab14f7017_00000002_of_00000017.splitfs.chunk
│   ├── ...
│   ├── efa4bffab14f7017_00000016_of_00000017.splitfs.chunk
│   └── efa4bffab14f7017_00000017_of_00000017.splitfs.chunk
└── subdir
    ├── 20KB.symlink -> ../20KB.data
    └── 40KB.data
        ├── 36d928335f3367da_00000001_of_00000004.splitfs.chunk
        ├── 36d928335f3367da_00000002_of_00000004.splitfs.chunk
        ├── 36d928335f3367da_00000003_of_00000004.splitfs.chunk
        └── 36d928335f3367da_00000004_of_00000004.splitfs.chunk
8 directories, 790 files
Note: The chunked filesystem is read-only.
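The chunk counts and names in the listing above follow from simple arithmetic, which the Go sketch below illustrates. It is not taken from the splitfs source: parseSize, chunkCount and chunkName are hypothetical helpers, the hash is passed in as an opaque string because how splitfs derives it is not described here, and the name layout is just the one visible in the example (default flags assumed).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// units lists the suffixes accepted by chunk_size, longest first so that
// "KiB" is matched before the bare "B" suffix it ends with.
var units = []struct {
	suffix string
	bytes  int64
}{
	{"TiB", 1 << 40},
	{"GiB", 1 << 30},
	{"MiB", 1 << 20},
	{"KiB", 1 << 10},
	{"B", 1},
}

// parseSize converts a string such as "10KiB" into a number of bytes.
// Hypothetical helper; splitfs may parse its flag differently.
func parseSize(s string) (int64, error) {
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, u.suffix), 10, 64)
			if err != nil || n <= 0 {
				return 0, fmt.Errorf("invalid size %q", s)
			}
			return n * u.bytes, nil
		}
	}
	return 0, fmt.Errorf("size %q is missing a unit suffix", s)
}

// chunkCount is a ceiling division: the last chunk may be shorter than
// chunkSize. What splitfs does for empty files is not covered here.
func chunkCount(fileSize, chunkSize int64) int64 {
	return (fileSize + chunkSize - 1) / chunkSize
}

// chunkName formats a name in the layout seen in the example listing.
// The hash argument is a placeholder supplied by the caller.
func chunkName(hash string, index, total int64) string {
	return fmt.Sprintf("%s_%08d_of_%08d.splitfs.chunk", hash, index, total)
}

func main() {
	chunkSize, err := parseSize("10KiB")
	if err != nil {
		panic(err)
	}
	// The 40KiB file from the example splits into four 10KiB chunks.
	total := chunkCount(40*1024, chunkSize)
	for i := int64(1); i <= total; i++ {
		fmt.Println(chunkName("36d928335f3367da", i, total))
	}
}
```

Run as written, this prints the four chunk names listed under subdir/40KB.data in the example.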
Think of it as a filesystem-wide split(1). Some use cases:
You need to split a lot of files in a large directory structure but don't want to hack up a recursive shell loop to do it.
Because go get uses https to download Git repositories, while perot.me only serves them over the git:// protocol, you have to manually fetch the repository into the right place.
# (if you haven't defined `GOPATH`, Go defaults to `GOPATH=~/go`)
$ export GOPATH="$HOME/go"
$ mkdir -p "$GOPATH/src/perot.me"
$ git clone git://perot.me/splitfs "$GOPATH/src/perot.me/splitfs"
$ go get -v perot.me/splitfs
$ go build perot.me/splitfs
$ ./splitfs
Usage of splitfs:
splitfs [options] <source directory> <target mountpoint>
[...]
splitfs [flags] <source_directory> <mountpoint>
chunk_size: The size of each chunk. Must be suffixed by a unit (B, KiB, MiB, GiB, TiB). Default is 32MiB.
exclude_regexp: If specified, files whose full path (rooted at the source directory) matches this regular expression show up as regular files in the mountpoint, rather than getting chunked.
filename_hash: Algorithm for the filename hashes in chunk filenames.
filename_includes_total_chunks: Controls whether chunk filenames contain the total number of chunks of the overall file.
filename_includes_mtime: Controls whether chunk filenames contain the mtime of the overall file.
To reassemble a chunked file as a one-off, just use cat:
$ cat /mnt/Flower-300x300_dtf.jpg/*.splitfs.chunk > /tmp/reconstituted.jpg
$ sha1sum /testdata/Flower-300x300_dtf.jpg /tmp/reconstituted.jpg
43e31dc3b3c541cf266d678b4309f73ca4d12cb6 /testdata/Flower-300x300_dtf.jpg
43e31dc3b3c541cf266d678b4309f73ca4d12cb6 /tmp/reconstituted.jpg
For more than just a one-off, wait until I implement unsplitfs, or do it yourself and send a pull request.