I expected that if an array is my default value, I could append to it as in test2. However, the array doesn’t ‘stick’. Could someone explain the correct way to do this, and why I’m seeing the results I see?
test = Hash(String, Array(Int32)).new(default_value: [] of Int32)
test["cats"] = test["cats"] << 3
test["cats"] << 4
p test
test2 = Hash(String, Array(Int32)).new(default_value: [] of Int32)
test2["dogs"] << 1
test2["dogs"] << 3
p test2
Welcome to Crystal, @ducktape! Yeah, so default_value doesn’t automatically store that array under the key; it is only returned when a key is missing. And because the returned value has the Hash’s V type (Array(Int32) here), the compiler doesn’t complain about the << call.
The api docs under Hash(K, V).new say:
Creates a new empty Hash where the default_value is returned if a key is missing.
Which is to say, test2["dogs"] << 1 is basically default_value << 1: it appends to the default array that was returned, not to an array stored under "dogs" as intended.
It is rather strange behavior though, and I’m not sure if that’s conveyed well in the docs.
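A minimal sketch of what happens in test2, assuming the constructor hands back the very same default array on every missing key (the names default and counts are illustrative, not from the original post):

```crystal
# Keep a handle on the default array so we can inspect it afterwards.
default = [] of Int32
counts = Hash(String, Array(Int32)).new(default_value: default)

# Both appends go to the default array; nothing is stored under "dogs".
counts["dogs"] << 1
counts["dogs"] << 3

p counts                        # => {}
p counts.has_key?("dogs")       # => false
p counts["cats"]                # => [1, 3]
p counts["cats"].same?(default) # => true
```

So the appended values are not lost, they just end up in the one shared default array rather than under any key.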
The thing is, you are actually appending to an array that is kept. However, you’re appending to the “default value” array that is returned whenever the key is not in the hash, not to an array stored “at” a particular key. If you want a fresh array on every lookup instead of that one shared array, you can use the block constructor:
test2 = Hash(String, Array(Int32)).new() { [] of Int32 }
# in the next two lines, Hash#[] returns a new array, which is appended to and then never assigned to anything
test2["dogs"] << 1
test2["dogs"] << 3
p test2["dogs"] # => []
p test2["cats"] # => []
p test2 # => {}
However, that doesn’t really get you what you want. What you’re trying to do with default_value isn’t really what those constructors are meant for. They fit cases like counting the occurrences of each word in a manuscript: with default_value: 0, you can query the hash about any string at all and it’ll tell you 0 occurrences, unless you’ve actually given it an entry.
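A minimal sketch of that word count, assuming a made-up sentence in place of a manuscript (the names counts and word are illustrative):

```crystal
counts = Hash(String, Int32).new(default_value: 0)

# counts[word] += 1 expands to counts[word] = counts[word] + 1,
# so the first lookup of a new word reads the default 0 and then stores 1.
"the cat saw the dog".split.each do |word|
  counts[word] += 1
end

p counts["the"]             # => 2
p counts["zebra"]           # => 0
p counts.has_key?("zebra")  # => false
```

This works with an Int32 default because += assigns a new value back into the hash, whereas << on an array only mutates the object that was returned.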
Well, to be clear, the block version only does what you want when the block takes the hash itself and the missing key as arguments and assigns into the hash. Then you’re updating the actual hash, not just returning some arbitrary value.
# this does NOT work because in that block, [] is just an array... somewhere
test0 = Hash(String, Array(Int32)).new() { [] of Int32 }
test0["cats"] << 55 # this becomes [] << 55
p test0
# proof:
aux = test0["cats"] << 55
p aux
# the block is passed h (in our case, a Hash(String, Array(Int32))) and k (a String)
# this works because h is our hash, so it's updating itself here, whereas before, the block
# was just returning a value and leaving the hash unchanged
test3 = Hash(String, Array(Int32)).new() { |h, k| h[k] = [] of Int32 }
test3["dogs"] << 55
p test3
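As a quick check of the test3 pattern with more than one key, here is a small sketch (groups is an illustrative name) showing that each key ends up with its own stored array:

```crystal
groups = Hash(String, Array(Int32)).new { |h, k| h[k] = [] of Int32 }

# The first lookup of each key stores a fresh array under that key,
# and later lookups return that stored array.
groups["dogs"] << 1
groups["dogs"] << 3
groups["cats"] << 55

p groups                               # => {"dogs" => [1, 3], "cats" => [55]}
p groups["dogs"].same?(groups["cats"]) # => false
```

One thing to keep in mind with this pattern: merely reading a missing key also creates an entry for it, since the block assigns into the hash.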