Opinion on best practice for arrays/slices

jzakiya · March 18, 2021, 10:05pm

This is a simple question, to get devs opinion on.

If I have an array/slice that is dynamically created in the program at runtime, and I know its size, is it better (faster, uses less memory, etc) to explicitly create the full size of the array/slice at once, and then explicitly index elements to store values (I know what the indices are)

arry = Array(Int32).new(size_value) 
....

arry[i] = data

or is it “better” to just init the array and then push data into the arry (again, I control the order so it will fill up correctly)?

arry = Array(Int32).new
....

arry << data

The latter takes less code, but I assume by telling the compiler to create the full array, memory will/can be allocated/accessed more efficiently, or does it really make any real difference in the real world?

straight-shoota · March 18, 2021, 10:34pm

You can just combine both approaches:

arry = Array(Int32).new(size_value)
arry << data

Btw. your former example doesn’t even work because size_value is just the capacity of the internal buffer and arry's size is 0, thus you can’t insert values with #[]=. You would need to add pass a default value to the constructor.

jzakiya · March 18, 2021, 11:23pm

I guess that was confusing. size_value just meant the value length for the array.

drhuffman12 · March 19, 2021, 1:14am

If you’re interested in array-related performance comparisons, I have some examples here: ai4cr/array_bencher_spec.cr at master · drhuffman12/ai4cr · GitHub .

jgaskins · March 20, 2021, 2:31am

If you’re doing it once per larger unit of work, creating an array is perfectly fine, but if you know the resulting size that the array will be, I would 100% pass that to Array.new because it’s effectively free performance.

To create an array of 7 elements, for example, you are cumulatively allocating enough elements in the underlying buffer to hold 21 elements:

array = Array(Int32).new # capacity = 0
array << 1 # ALLOCATION, capacity = 3, see https://github.com/crystal-lang/crystal/blob/master/src/array.cr#L2067-L2069
array << 2
array << 3
array << 4 # ALLOCATION, capacity = 6, previous 3 freed
array << 5
array << 6
array << 7 # ALLOCATION, capacity = 12, previous 6 freed

Since each of these allocations isn’t simply extending the original buffer but instead moving it to a whole new space in memory, so each reallocation of the underlying buffer is a complete representation of the array in memory.

You can see the impact with a quick benchmark:

Benchmark code here (only difference between them is the initial_capacity parameter)

require "benchmark"

array_size = ENV.fetch("ARRAY_SIZE", "7").to_i

array = [] of Int32
Benchmark.ips do |x|
  x.report "simple" do
    array = Array(Int32).new
    array_size.times { |i| array << i }
  end

  x.report "initialized to size" do
    array = Array(Int32).new(initial_capacity: 7)
    array_size.times { |i| array << i }
  end
end

pp array.size # Ensure LLVM doesn't optimize it out

$ ARRAY_SIZE=4 crystal run --release bench_array.cr
             simple  17.06M ( 58.62ns) (± 1.27%)  80.0B/op   1.54× slower
initialized to size  26.24M ( 38.11ns) (± 1.36%)  64.0B/op        fastest
4
$ ARRAY_SIZE=7 crystal run --release bench_array.cr
             simple  10.84M ( 92.29ns) (± 0.74%)   144B/op   2.04× slower
initialized to size  22.05M ( 45.35ns) (± 1.14%)  64.0B/op        fastest
7
$ ARRAY_SIZE=25 crystal run --release bench_array.cr
             simple   4.83M (207.12ns) (± 0.65%)  464B/op   1.39× slower
initialized to size   6.71M (149.00ns) (± 0.66%)  256B/op        fastest
25

If that code path is being hit less than about 1000x/sec, though, I wouldn’t worry too much about optimizing via other data structures. Arrays are super flexible and when you’re building something out. Even Slice will usually allocate on the heap unless you tell it to use stack memory that you allocated yourself. This is a lot of work to save a handful of nanoseconds, so if those nanoseconds aren’t adding up to seconds you may want to stick with the easy thing.

jzakiya · March 20, 2021, 2:54am

Thanks @jgaskins for your detailed response. One reason I asked this is because I’ve seen different behavior in other languages.

It seems intuitive that if you allocate all the needed memory for the array at its creation there will be less work, as you show, to have to reallocate memory to increase the array when adding more elements. In my use though, it didn’t seem to make much differences in most cases, so you kind of confirm my intuition.

What I take from this then, from a strictly efficiency/performance perspective, it’s best to allocate the full array at creation if you know it, even if you use << to load it with data, otherwise it won’t make much difference in performance if you’re not filling the array with data too quickly.

Topic		Replies	Views
How to create a StaticArray with a variable size? Help & Support	13	770	October 21, 2022
StaticArray vs Array Help & Support	6	370	July 22, 2021
Is increasing array index size in the future? Crystal Contrib	33	654	March 1, 2025
Small Array Help & Support	1	232	July 16, 2021
Allocate Array of objects Help & Support	4	433	May 9, 2020

Opinion on best practice for arrays/slices

Related topics