If you are happy doing it at compile time, that means you only need to do it once. And if you only need to do it once, then you don't need to do it at compile time, because allocating it at runtime takes a negligible amount of time. How negligible? You would be saving somewhere between microseconds and milliseconds of startup time, depending on the size of the array elements.
Here’s a trivial benchmark:
require "benchmark"
struct Bytes32
getter v : StaticArray(UInt8, 32)
def initialize
@v = StaticArray(UInt8, 32).new(0)
end
end
struct Bytes64
getter v : StaticArray(UInt8, 64)
def initialize
@v = StaticArray(UInt8, 64).new(0)
end
end
struct Bytes128
getter v : StaticArray(UInt8, 128)
def initialize
@v = StaticArray(UInt8, 128).new(0)
end
end
SIZES = [1, 2, 4, 8, 32, 64, 128]
N = 1_000_000
puts "Benchmarking allocation of #{N} elements of different sizes"
puts "-----------------------------------------------------"
SIZES.each do |size|
puts "
--> Element size: #{size} bytes"
Benchmark.ips do |x|
case size
when 1
x.report("Array(UInt8)") { Array(UInt8).new(N, 0_u8) }
when 2
x.report("Array(UInt16)") { Array(UInt16).new(N, 0_u16) }
when 4
x.report("Array(UInt32)") { Array(UInt32).new(N, 0_u32) }
when 8
x.report("Array(UInt64)") { Array(UInt64).new(N, 0_u64) }
when 32
x.report("Array(Bytes32)") { Array(Bytes32).new(N) }
when 64
x.report("Array(Bytes64)") { Array(Bytes64).new(N) }
when 128
x.report("Array(Bytes128)") { Array(Bytes128).new(N) }
end
end
end
And here’s the output on my machine, running it as `crystal run --release alloc_bench.cr`:
```
Benchmarking allocation of 1000000 elements of different sizes
-----------------------------------------------------

--> Element size: 1 bytes
Array(UInt8)     46.52k ( 21.49µs) (± 3.34%)  0.95MB/op  fastest

--> Element size: 2 bytes
Array(UInt16)    27.13k ( 36.86µs) (± 7.55%)  1.91MB/op  fastest

--> Element size: 4 bytes
Array(UInt32)    10.85k ( 92.21µs) (± 9.40%)  3.81MB/op  fastest

--> Element size: 8 bytes
Array(UInt64)     2.09k (477.46µs) (± 7.09%)  7.63MB/op  fastest

--> Element size: 32 bytes
Array(Bytes32)   369.98 (  2.70ms) (± 7.21%)  30.5MB/op  fastest

--> Element size: 64 bytes
Array(Bytes64)   192.54 (  5.19ms) (± 4.22%)  61.0MB/op  fastest

--> Element size: 128 bytes
Array(Bytes128)   96.72 ( 10.34ms) (± 3.45%)   122MB/op  fastest
```
And that is the most you could save, assuming the compile-time instantiation itself takes no time (which I don't know for sure).
Also, yes, you probably want to use a `Slice` if you can.
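For illustration, here is a minimal sketch of what I mean by using a slice: a `Slice` gives you a fixed-size buffer without `Array`'s growth bookkeeping, which is usually what you want when the size never changes (the size `1024` here is just a made-up example):

```crystal
# Allocate a fixed-size buffer of 1024 zeroed bytes.
# Slice(UInt8).new(size, value) fills the whole buffer with `value`.
buf = Slice(UInt8).new(1024, 0_u8)

buf[0] = 0xFF_u8 # normal indexed read/write, bounds-checked

puts buf.size # => 1024
```

Unlike `Array`, a `Slice` can't be pushed to or resized, so the type itself documents that the buffer is fixed.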
Footnote: the 32/64/128-byte cases are much slower because the elements are structs rather than primitive integers like in the 1/2/4/8-byte cases, but your use case may be more like that, who knows.
Footnote 2: why are you creating the array inside the loop? Move it out, man!
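To spell that footnote out, here is a hypothetical sketch (the buffer size and loop are made up, not from the original code): allocate once before the loop and reset the buffer each iteration instead of reallocating it:

```crystal
# Allocate once, outside the loop.
buf = Array(UInt8).new(1_000_000, 0_u8)

3.times do
  buf.fill(0_u8) # reuse the same buffer: reset it instead of rebuilding it
  # ... do work with buf ...
end
```

This turns N allocations (and the GC pressure that comes with them) into one.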