Binary-to-text encoding that is safe to pass through modern text processors
Binary data encoding schemes that are safe to be passed through processing systems that expect human readable text, without requiring escaping.
Alternative To:
The safe encodings have been specially designed to avoid numerous issues with other binary-to-text encoding schemes. Here are the relative advantages of the various encodings:
Encoding | Bloat | SGML | JSON | Code | URI | File | Host | Trunc | Sort | White | Length | Human | No-Pad | Alpha |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
safe16 | 2.0 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 16 |
safe32 | 1.6 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 32 |
safe64 | 1.33 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 64 | ||
safe80 | 1.27 | ✓ | ✓ | ✓ | 1 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 80 | ||
safe85 | 1.25 | ✓ | ✓ | ✓ | 2 | 3 | ✓ | ✓ | ✓ | ✓ | ✓ | 85 |
1. `!` `$` `(` `)` `,` `;` may need special handling.
2. `!` `$` `(` `)` `,` `;` `*` `=` may need special handling.

For comparison, baseXY encodings:
Encoding | Bloat | SGML | JSON | Code | URI | File | Host | Trunc | Sort | White | Length | Human | No-Pad | Alpha |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
base16 | 2.0 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 16 | ||||
base32 | 1.6 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 33 | ||||||
base64 | 1.33 | ✓ | ✓ | ✓ | 4 | 4 | 65 | |||||||
base85 | 1.25 | ✓ | 87 |
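
The Bloat figures in both tables follow from how many output characters are needed to hold one group of input bytes. The sketch below recomputes them, assuming group sizes of 1, 5, 3, 15, and 4 bytes for radix 16, 32, 64, 80, and 85. These group sizes and the helper names are illustrative assumptions, not taken from the specifications, and padding is ignored.

```python
def chars_per_group(radix: int, group_bytes: int) -> int:
    """Smallest count such that radix**count covers every value of group_bytes bytes."""
    target = 256 ** group_bytes
    count, capacity = 0, 1
    while capacity < target:
        capacity *= radix
        count += 1
    return count


def bloat(radix: int, group_bytes: int) -> float:
    """Output characters per input byte for one full group."""
    return chars_per_group(radix, group_bytes) / group_bytes


# Assumed group sizes (hypothetical; the specifications are authoritative).
ASSUMED_GROUPS = {16: 1, 32: 5, 64: 3, 80: 15, 85: 4}

if __name__ == "__main__":
    for radix, group_bytes in ASSUMED_GROUPS.items():
        print(f"radix {radix}: {bloat(radix, group_bytes):.2f}")
```

Exact integer arithmetic is used instead of logarithms so that the character count cannot be thrown off by floating-point rounding.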
The choice of radix affects compressibility. Here are some size comparisons, relative to the original file, using the Windows XP Service Pack 3 x86 ISO from MSDN (approx. 600 MB):
Mode | original | s16 | s32 | s64 | s80 | s85 |
---|---|---|---|---|---|---|
iso.sxx | 1.00 | 2.00 | 1.60 | 1.33 | 1.27 | 1.25 |
iso.gz.sxx | 0.93 | 1.86 | 1.49 | 1.24 | 1.18 | 1.16 |
With post-compression (e.g. gzipped HTTP response):
Mode | original | s16 | s32 | s64 | s80 | s85 |
---|---|---|---|---|---|---|
iso.sxx.gz | 1.00 | 1.06 | 1.00 | 0.95 | 0.98 | 0.96 |
iso.gz.sxx.gz | 0.93 | 1.06 | 0.98 | 0.94 | 0.94 | 0.94 |
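
One way to take this kind of measurement on your own data is sketched below. The safe encoders are not part of Python's standard library, so `base64.b16encode`, `base64.b32encode`, and `base64.b64encode` stand in for them here. The function name `size_ratios` and the random sample are illustrative assumptions, and the numbers will differ from those in the tables.

```python
import base64
import gzip
import os


def size_ratios(data: bytes, encode) -> tuple:
    """Return (encoded size, gzipped encoded size), both relative to len(data)."""
    encoded = encode(data)
    compressed = gzip.compress(encoded)
    return len(encoded) / len(data), len(compressed) / len(data)


if __name__ == "__main__":
    sample = os.urandom(30_000)  # stand-in input; random bytes barely compress
    for name, encode in [
        ("base16", base64.b16encode),
        ("base32", base64.b32encode),
        ("base64", base64.b64encode),
    ]:
        plain, gzipped = size_ratios(sample, encode)
        print(f"{name}: encoded {plain:.2f}, encoded+gzip {gzipped:.2f}")
```

Replacing `sample` with the contents of a real file gives ratios comparable to the `.sxx` and `.sxx.gz` rows.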
The table below shows the current timing of the reference implementations (using the same 600 MB ISO). Safe80 and Safe85 are considerably slower because they use multiplication and division. Safe80 is slowed further by its naive use of 128-bit integers. Optimized versions should fare much better.
Type | Time (s) | Relative |
---|---|---|
s16 | 15.08 | 1.40 |
s32 | 11.175 | 1.04 |
s64 | 10.742 | 1.00 |
s80 | 65.653 | 6.11 |
s85 | 27.484 | 2.56 |
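
The cost difference between power-of-two and other radixes comes from how a group of bytes is split into digits. As a rough sketch (assuming big-endian groups of 3 bytes for radix 64 and 4 bytes for radix 85, with hypothetical function names), a power-of-two radix needs only shifts and masks, while radix 85 needs repeated division. The functions return digit values only; mapping digits to characters is defined by each specification and is omitted here.

```python
def split_radix64(group: bytes) -> list:
    """Split 3 bytes into four 6-bit digits using only shifts and masks."""
    if len(group) != 3:
        raise ValueError("expected a 3-byte group")
    acc = int.from_bytes(group, "big")
    return [(acc >> shift) & 0x3F for shift in (18, 12, 6, 0)]


def split_radix85(group: bytes) -> list:
    """Split 4 bytes into five base-85 digits using repeated division."""
    if len(group) != 4:
        raise ValueError("expected a 4-byte group")
    acc = int.from_bytes(group, "big")
    digits = []
    for _ in range(5):
        acc, digit = divmod(acc, 85)  # one division per output digit
        digits.append(digit)
    return digits[::-1]  # most significant digit first


if __name__ == "__main__":
    print(split_radix64(b"\xff\xff\xff"))
    print(split_radix85(b"\xff\xff\xff\xff"))
```

A radix 80 encoder built the same way would need an accumulator wider than 64 bits for a 15-byte group, which is consistent with the 128-bit integers mentioned above.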
These specifications are part of the Specification Project.
The reference implementations contain libraries and command line executables:
safeenc is a command line program that can convert to/from any safe format.
The specifications are released under the Creative Commons Attribution 4.0 International Public License. The reference implementations are released under the MIT License.