Lo and behold: GitHub - HertzDevil/pack.cr: Crystal compile-time (un)pack macros from Perl / Ruby
Packing into an IO is done by Pack.pack_to, whereas .pack uses a temporary Bytes-based builder that is as compact as possible. Unpacking directly from an IO is not implemented yet; in fact, neither Perl nor Ruby has a similar capability. Note that the X and @ directives require seekable IOs in both directions.
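For anyone who hasn't used those two directives, here is a small sketch of what X and @ do in Ruby's own pack / unpack. It is plain Ruby, not pack.cr code, and it only illustrates why the current position has to be movable; how pack.cr seeks internally is not shown here.

```ruby
# Plain Ruby, shown only to illustrate the directives; this is not pack.cr code.

# X backs up one byte, so the next value overwrites the previous one.
packed_x = [1, 2, 3].pack("CCXC")
p packed_x.bytes # [1, 3]

# @ moves to an absolute offset, padding with null bytes when packing.
packed_at = [1, 2].pack("C@4C")
p packed_at.bytes # [1, 0, 0, 0, 2]

data = "\x01\x02\x03".b
# When unpacking, X steps back and re-reads a byte; @ jumps to an absolute offset.
p data.unpack("CXC")  # [1, 1]
p data.unpack("C@2C") # [1, 3]
```

Both directives move backwards or to arbitrary offsets, which a forward-only stream cannot do.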
There are currently two major design differences between this library and Ruby / Perl. The first is that every repeat count or glob corresponds to exactly one argument or return value, so the Crystal values are never flattened:
# Crystal
buffer = Pack.pack("Lc*", 1, Int8[2, 3, 4, 5]) # => Bytes[1, 0, 0, 0, 2, 3, 4, 5]
Pack.unpack(buffer, "Lc*") # => {1, Int8[2, 3, 4, 5]}
Pack.pack("Lc4", 1, {2_i8, 3_i8, 4_i8, 5_i8}) # => Bytes[1, 0, 0, 0, 2, 3, 4, 5]
Pack.pack("Lc4", 1, Int8.slice(2, 3, 4, 5)) # => Bytes[1, 0, 0, 0, 2, 3, 4, 5]
Pack.pack("Lc4", 1, (2_i8..)) # => Bytes[1, 0, 0, 0, 2, 3, 4, 5]
# Ruby
buffer = [1, 2, 3, 4, 5].pack("Lc*") # => "\x01\x00\x00\x00\x02\x03\x04\x05"
buffer.unpack("Lc*") # => [1, 2, 3, 4, 5]
[1, [2, 3, 4, 5]].pack("Lc*") # TypeError (no implicit conversion of Array into Integer)
[1, *[2, 3, 4, 5]].pack("Lc*") # => "\x01\x00\x00\x00\x02\x03\x04\x05"
For unpacking, this avoids creating very long Tuples from simple formats like c256. For packing, it maintains round-trip conversions and also works around the inability to splat arbitrary containers (you can splat arrays in Ruby and you can most certainly “splat” lists in Perl even when you don’t ask for them). This means that for us cccc and c4 represent entirely different things.
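For contrast, a quick Ruby sketch of the flattening behaviour being avoided: in Ruby cccc and c4 are interchangeable, and c256 really does produce 256 separate elements. The regrouping at the end is done by hand and only mimics the shape pack.cr is described as returning; it is not part of either library.

```ruby
# Plain Ruby: repeat counts are always flattened into the result array.
bytes = [2, 3, 4, 5].pack("c*")

four_singles = bytes.unpack("cccc")
one_group    = bytes.unpack("c4")
p four_singles == one_group # true

# A format as short as c256 yields 256 separate elements.
wide = ([0] * 256).pack("c*").unpack("c256")
p wide.length # 256

# Regrouping by hand to get one value per directive, similar in shape
# to the {1, Int8[2, 3, 4, 5]} result shown for Crystal above.
buffer = [1, 2, 3, 4, 5].pack("Lc*")
head, *tail = buffer.unpack("Lc*")
p [head, tail] # [1, [2, 3, 4, 5]]
```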
The second difference is that unpacking a / A / Z results in a Bytes instead of a String, because these directives say nothing about the string encoding of the byte sequences. Packing strings directly with those directives will probably still be allowed, via to_slice. In contrast, U produces a Char or a String depending on the count’s presence, and the result is always valid UTF-8.
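As a point of comparison, this is roughly how Ruby itself handles the same directives: a / A / Z give back binary-encoded strings (Ruby's closest equivalent to "just bytes"), while U deals in code points and packs to UTF-8. This is plain Ruby and says nothing about how pack.cr implements it.

```ruby
# Plain Ruby comparison; not pack.cr code.
raw = "abc \x00\x00".b

str_a = raw.unpack1("a6") # all 6 bytes, untouched
str_A = raw.unpack1("A6") # trailing spaces and nulls stripped
str_Z = raw.unpack1("Z6") # everything up to the first null

p str_a.bytesize # 6
p str_A          # "abc"
p str_Z          # "abc "
# None of them carry a text encoding, only raw bytes.
p [str_a, str_A, str_Z].all? { |s| s.encoding == Encoding::BINARY } # true

# U works on Unicode code points instead, and packing yields UTF-8.
codepoints = "a\u00E9".unpack("U*")
p codepoints # [97, 233]
utf8 = codepoints.pack("U*")
p utf8.encoding == Encoding::UTF_8 # true
p utf8.valid_encoding?             # true
```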