ReferenceStorage(T)
exposes the instance representation of a reference type T
. The original use case of this type and T.unsafe_construct
is to materialize instances of T
from memory locations other than the GC heap, such as on the call stack or embedded inside another reference type. However, a different kind of use is also possible where multiple T
instances are created side by side from a single allocation. Let’s say we want a fixed-size collection of Foo
s:
class Foo
def initialize(@x : Int32, @y : String)
end
end
foos = Slice.new(10) { |i| Foo.new(i, i.to_s) }
foos[0] # => #<Foo:0x103304e20 @x=0, @y="0">
foos[3] # => #<Foo:0x103304dc0 @x=3, @y="3">
foos[7] # => #<Foo:0x103304d40 @x=7, @y="7">
Behind the scenes, this will call GC.malloc(instance_sizeof(Foo))
10 times. Because Foo
does not require variable-size allocation like String
or Log::Metadata
, we could easily group all the allocations into one call:
foos = Pointer(ReferenceStorage(Foo)).malloc(10)
10.times { |i| Foo.unsafe_construct(foos + i, i, i.to_s) }
foos.as(Foo) # => #<Foo:0x10094af20 @x=0, @y="0">
(foos + 3).as(Foo) # => #<Foo:0x10094af50 @x=3, @y="3">
(foos + 7).as(Foo) # => #<Foo:0x10094af90 @x=7, @y="7">
Notice how the gaps between those Foo
s’ addresses are now smaller, since those instances are indeed stored contiguously:
puts foos.as(UInt8*).to_slice(instance_sizeof(Foo) * 10).hexdump
00000000 b6 00 00 00 00 00 00 00 b8 55 fe 02 01 00 00 00 .........U......
00000010 b6 00 00 00 01 00 00 00 c8 55 fe 02 01 00 00 00 .........U......
00000020 b6 00 00 00 02 00 00 00 a0 6f 30 03 01 00 00 00 .........o0.....
00000030 b6 00 00 00 03 00 00 00 90 6f 30 03 01 00 00 00 .........o0.....
00000040 b6 00 00 00 04 00 00 00 80 6f 30 03 01 00 00 00 .........o0.....
00000050 b6 00 00 00 05 00 00 00 70 6f 30 03 01 00 00 00 ........po0.....
00000060 b6 00 00 00 06 00 00 00 60 6f 30 03 01 00 00 00 ........`o0.....
00000070 b6 00 00 00 07 00 00 00 50 6f 30 03 01 00 00 00 ........Po0.....
00000080 b6 00 00 00 08 00 00 00 40 6f 30 03 01 00 00 00 ........@o0.....
00000090 b6 00 00 00 09 00 00 00 30 6f 30 03 01 00 00 00 ........0o0.....
On each row, b6 00 00 00
is Foo.crystal_instance_type_id
, the next 4 bytes are @x
, and finally the remaining 8 bytes are @y.object_id
; there is no extra bookkeeping data for the allocator between the instances. The foos
raw pointer and the unsafe_construct
call could be further hidden from the public using a helper type:
struct ReferenceSlice(T)
include Indexable(T)
@buf : ReferenceStorage(T)*
getter size : Int32
def initialize(@size : Int32, & : Int32 -> {Tuple, NamedTuple})
@buf = Pointer(ReferenceStorage(T)).malloc(size)
size.times do |i|
args, opts = yield i
T.unsafe_construct(@buf + i, *args, **opts)
end
end
def unsafe_fetch(index : Int) : T
(@buf + index).as(T)
end
end
foos = ReferenceSlice(Foo).new(10) do |i|
args = {i, i.to_s}
{args, NamedTuple.new}
end
foos[0] # => #<Foo:0x10262cf20 @x=0, @y="0">
foos[3] # => #<Foo:0x10262cf50 @x=3, @y="3">
foos[-2] # => #<Foo:0x10262cfa0 @x=8, @y="8">
Indexing a ReferenceSlice(T)
becomes simple pointer arithmetic, and the objects have good spatial locality. In contrast, an Slice(T)
always requires dereferencing its buffer pointer; both the buffer itself and the T
instances could also be very far apart, depending on what the GC is doing.
The main drawback is that all the objects in a ReferenceSlice
must share the same lifetime, since there is only one allocation after all. A live reference to any T
is implicitly a live reference to all other T
s in that slice, thus extracting just a handful objects and discarding the rest is a bad idea. Additionally, the GC is not primarily designed for large allocations on the order of several MBs, so slice size can be a concern for fat classes (there are plenty of them among the Crystal AST nodes). Finally, this approach does not readily generalize to a resizable array of references, because calling #realloc
on @buf
invalidates all the objects immediately; their object_id
s would change in-place, and all the old T
s would point to freed memory.
I wonder if there are any practical applications for such a ReferenceSlice
type?