Is there a Hash-like data structure that's not referencing its own elements?

I am wondering if there’s an elegant way to solve this.
The standard way for implementing a graph class with arbitrary node attributes probably is generics (variant #1 below).
(It just has a bare minimum, not useful interface, I know).

But this implementation has the drawback that some algorithms working on it cannot define their own data structures for the graph, e.g. a is_node_visited?.

Variant #1: Graph class with internalized node attributes:

class Graph(T)
    def initialize
        @nodes = Set(Node(T)).new
    end
    def add_node : Node(T)
        n = Node(T).new
        @nodes.add(n)
        n
    end
    def remove_node(node : Node(T))
        @nodes.delete(node) # deletes Node along with instance data (obviously)
    end
    class Node(T)
        property value = T.new
    end
end

g = Graph(String).new

n = g.add_node
n.value = "foo"
p n.value
g.remove_node(n)

On the other hand, if we remove the node attributes entirely from the graph class and use a generic hash, we will run in memory leaks, if the attribute (and graph) live long and are growing and shrinking a lot - since the hash will always keep referencing the old nodes.

Variant #2: Graph class with externalized node attributes:

class Graph
    def initialize
        @nodes = Set(Node).new
    end
    def add_node : Node
        n = Node.new
        @nodes.add(n)
        n
    end
    def remove_node(node : Node)
        @nodes.delete(node) # just deletes the structure information, attributes are outside...
    end
    class Node
    end
end

g = Graph.new
node_attrib = Hash(Graph::Node,String).new # generates a memory leak (when attribute lives long)

n = g.add_node
node_attrib[n] = "foo"
p node_attrib[n]
g.remove_node(n) # generates a memory leak in node_attrib (if attribute lives long)

Is there a way of defining a Hash-like data structure that is just capable of #[] and #[]=, but not able to enumerate its elements, hence not referencing them, so that the GC can do the cleanup all by itself?

Or is there any other elegant way (I know, one can always do some sort of reference counting on his own, but this seems very tedious)?

Thank you!

I think you’re looking for a WeakRef. With this you can make a WeakMap (name stolen from the same concept in Ruby). You can iterate over a WeakMap and, if the object has been GCed, it just won’t yield it:

class Person
  def initialize(@name : String)
  end
end

map = WeakMap(String, Person).new
map["jgaskins"] = Person.new("Jamie")

# Hold onto a reference to the Person object so it isn't GCed
wolfgang = Person.new("Wolfgang")
map["wolfgang371"] = wolfgang

pp map
# WeakMap(String,Person) {
#  "jgaskins" => #<Person:0x1032f0e60 @name="Jamie">,
#  "wolfgang371" => #<Person:0x1032f0e00 @name="Wolfgang">}

GC.collect

pp map
# WeakMap(String,Person) { "wolfgang371" => #<Person:0x1032f0e00 @name="Wolfgang">}

Notice that my entry was removed because nothing was referencing it other than the WeakMap itself, but since yours was stored in a variable, it was kept.

The implementation is here:

require "weak_ref"

class WeakMap(K, V)
  include Enumerable({K, V})
  @hash : Hash(K, WeakRef(V)) = Hash(K, WeakRef(V)).new

  def []=(key : K, value : V)
    @hash[key] = WeakRef.new(value)
  end

  def []?(key : K)
    if ref = @hash[key]?
      if value = ref.value
        value
      else
        @hash.delete key
        nil
      end
    end
  end

  def delete(key : K)
    @hash.delete key
  end

  def each
    @hash.each do |(key, ref)|
      if value = ref.value
        yield({key, value})
      else
        delete key
      end
    end
  end

  def pretty_print(pp)
    prefix = "WeakMap(#{K},#{V}) {"
    executed = exec_recursive(:pretty_print) do
      pp.surround(prefix, "}", left_break: nil, right_break: nil) do
        each_with_index do |(key, value), index|
          if index == 0
            pp.breakable
          else
            pp.comma
          end
          pp.group do
            key.pretty_print(pp)
            pp.text " => "
            pp.nest do
              pp.breakable ""
              value.pretty_print(pp)
            end
          end
        end
      end
    end
    unless executed
      pp.text "#{prefix} ...>"
    end
  end
end
3 Likes

A problem with that implementation is that if the only reference to a value comes from a key in the same WeakMap, that value can never be collected:

class Name
end

class Person
  def initialize(@name : Name)
  end
end

def foo
  name = Name.new
  # if either `name` below is replaced with `Name.new`,
  # then the key-value pair gets collected successfully
  WeakMap{Person.new(name) => name}
end

map = foo
pp map # => WeakMap(Person,Name) { #<Person:0x... @name=#<Name:0x...>> => #<Name:0x...>}

GC.collect
pp map # => WeakMap(Person,Name) { #<Person:0x... @name=#<Name:0x...>> => #<Name:0x...>}

So something like a Lua ephemeron table is needed for robustness.

1 Like

thanks for your answer, I really wasn’t aware about WeakRef!

However, there are two catches:
First, in your case the GC.collect only works, since you effectively call each when pretty printing, just before. It still works, if you call pp map.size instead, but it doesn’t work anymore, if you add a method x to WeakMap that returns @hash.size and then call pp map.x instead.

Second, for my use case, e.g. is_node_visited? I actually need the key to be a WeakRef, somehow - not the value…

I threw that code together to get you started. It was not meant to be a complete solution that covers every edge case. If you need something more robust, you will need to make that happen.

This definitely makes it more challenging and requires wrapping every key in a WeakRef. This requires making WeakMap#[]?(key : K) then also wrap the given key in a WeakRef, which then in turn requires overriding Reference#hash(hasher) and Reference#==(other) because WeakRef does not. You can do that by subclassing WeakRef and using that as your key type so you don’t need to monkeypatch. For example (again, this is for illustrative purposes and may or may not even work as written):

class WeakMap(K,V)
  private class Key(T) < WeakRef(T)
    def hash(hasher)
      value.hash hasher
    end

    def ==(other : Key(T))
      value == other.value
    end
  end
end

thanks again!

For my first point, I wasn’t expecting a complete solution - I was trying to say that the GC can only do the work if we manually walk all WeakMap elements. I don’t see a robust way of doing so - or is there a way to tell the GC he has to call some methods before actually doing it’s work?

For your second answer: I will definitely have a closer look into it, that might solve a (big) part of the problem.

However, when I was just playing around a bit, it seems that the WeakRef demo in the docs doesn’t work for me (anymore?)… (I’m using Crystal 1.7.3)

require "weak_ref"

ref = WeakRef.new("oof".reverse)
p ref.value # => "foo"
GC.collect
p ref.value # => nil, according to docs, but now I do get "foo"
1 Like

understood, the modified example works

ref = WeakRef.new("oof".reverse) # reverse is also necessary, otherwise it will reference "oof", which is still in scope
p ref # # => "foo" - don't call .value, because then it'll stay on the stack, hence will not get GC'ed
GC.collect
p ref # ... @target=Pointer(Void).null
p ref.value # .value here is safe => nil

do you have a clue why this is not working?

class Graph
    def initialize
        @nodes = Hash(InternalNode, Node).new
    end
    def add_node : Node
        ni = InternalNode.new
        n = Node.new(ni)
        @nodes[ni] = n
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node.value)
        nil
    end
    private class InternalNode
    end
    alias Node = WeakRef(InternalNode)
end

g = Graph.new

n = g.add_node
p n
g.remove_node(n)
GC.collect
p n # should print ... @target=Pointer(Void).null

I tried to use a manual wrapping of WeakRef and also used #value carefully, but it doesn’t get freed.

I think there might be something in Hash representation that keeps a potential reference.

I did a trivial and ugly implementation based on array and nodes get freed after a second GC.collect

require "weak_ref"

class Graph
  class Entry
    property strong : InternalNode
    property weak : Node
    def_equals_and_hash weak

    def initialize(@strong, @weak)
    end
  end

  def initialize
      @nodes = Array(Entry).new
  end
  def add_node : Node
      ni = InternalNode.new
      n = Node.new(ni)
      @nodes << Entry.new(ni, n)
      n
  end

  def remove_node(node : Node) : Nil
      ni = uninitialized InternalNode
      @nodes.delete(Entry.new(ni, node))
      nil
  end
  private class InternalNode
  end
  alias Node = WeakRef(InternalNode)
end

g = Graph.new

n = g.add_node
p n
g.remove_node(n)
GC.collect
p n
GC.collect
p n

For the hash I tried to check if things where cleared in the @entries and it seems so. With the following you get a memory representation of the entries, which are the ones that eventually holds references to the keys AFAIK

  def dump_nodes
    puts Slice.new(@nodes.@entries.as(Pointer(UInt8)), (1 << @nodes.@indices_size_pow2) // 2).hexdump
  end

  def remove_node(node : Node) : Nil
      dump_nodes
      @nodes.delete(node.value)
      dump_nodes
      nil
  end
#<WeakRef(Graph::InternalNode):0x105533fe0 @target=Pointer(Void)@0x10552aeb0>
00000000  f3 79 05 3f                                       .y.?
00000000  00 00 00 00                                       ....
#<WeakRef(Graph::InternalNode):0x105533fe0 @target=Pointer(Void)@0x10552aeb0>

So far that seems reasonable so I am unsure where the ref is coming from.

Also note that WeakRef is a pessimistic check. If there is an Int that happens to have the same value of the memory it will not be freed.

Thanks for your investigations. The Array version works for me as well now.

I had another look into the Hash version.
Before I posted yesterday, I already tried if several GC.collects actually help - they didn’t.
However, if I change the last three lines from…

g.remove_node(n)
GC.collect
p n # should print ... @target=Pointer(Void).null

to

g.remove_node(n)
p n # should print ... @target=Pointer(Void).null
GC.collect
p n # should print ... @target=Pointer(Void).null

the last output actually is null.

It seems accessing the WeakRef between destroying the object and GC’ing is important.
As a result this code runs fine:

require "weak_ref"

class Graph
    def initialize
        @nodes = Hash(InternalNode, Node).new
    end
    def add_node : Node
        ni = InternalNode.new
        n = Node.new(ni)
        @nodes[ni] = n
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node.value)
        node.inspect # necessary, otherwise very last line does not yield "nil"
        nil
    end
    private class InternalNode
    end
    alias Node = WeakRef(InternalNode)
end

g = Graph.new

n = g.add_node
g.remove_node(n)
GC.collect
p n.value # -> nil, if above "node.inspect" is in place

As for the WeakMap on the keys in the hash I now tried the following:

require "weak_ref"

class WeakMap(K, V)
    private class Key(T) < WeakRef(T)
        def_equals_and_hash value # identical to commented out block
        # def hash(hasher)
        #     value.hash hasher
        # end
        # def ==(other : Key(T))
        #     value == other.value
        # end
    end
    include Enumerable({Key(K), V})
    @hash = Hash(Key(K), V).new
    def []=(key : K, value : V)
        @hash[Key.new(key)] = value
    end
    def []?(key : K)
        @hash[Key.new(key)]?
    end
    def delete(key : K)
        @hash.delete(Key.new(key))
    end
    def each
        @hash.each do |(key, value)|
            p ["each", key, key.value, value]
            if key.value
                yield({key, value})
            else
                p "never called"
                @hash.delete(key)
            end
        end
    end
end

class Graph
    def initialize
        @nodes = Set(Node).new
    end
    def add_node : Node
        n = Node.new
        @nodes.add(n)
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node)
        nil
    end
    class Node
    end
end

g = Graph.new
node_attrib = WeakMap(Graph::Node,String).new

n = g.add_node
node_attrib[n] = "foo"
g.remove_node(n)
node_attrib.each {}
n = g.add_node
node_attrib[n] = "bar"
GC.collect
node_attrib.each {}
GC.collect
p node_attrib.size # still 2

Seems like I just don’t get rid of the old key…

Now I have two versions, that are pretty much comparable, side-by-side: a specialized one that works and a generic one that doesn’t work.

Working specialized:

require "weak_ref"

class WeakMap(K,V)
    include Enumerable({K,V})
    @hash = Hash(K,V).new
    def []=(key : K, value : V)
        @hash[key] = value
    end
    def []?(key : K)
        @hash[key]?
    end
    def delete(key : K)
        @hash.delete(key)
    end
    def each
        @hash.each do |(key, value)|
            yield({key, value})
        end
    end
    def clean
        @hash.each_key do |k|
            @hash.delete(k) if !k.value
        end
    end
end

class Graph
    def initialize
        @nodes = Hash(InternalNode, Node).new
    end
    def add_node : Node
        ni = InternalNode.new
        n = Node.new(ni)
        @nodes[ni] = n
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node.value)
        node.inspect # on weak type; necessary, otherwise very last line does not yield "nil"
        nil
    end
    private class InternalNode
    end
    alias Node = WeakRef(InternalNode)
end

def fn(g, h)
    n = g.add_node
    h[n] = "foo"
    g.remove_node(n)
end

g = Graph.new
h = WeakMap(Graph::Node, String).new
fn(g,h)
GC.collect
h.clean
p h.size # is 0

Broken generic:

require "weak_ref"

class WeakMap(K, V)
    private class Key(T) < WeakRef(T)
        def_equals_and_hash value # identical to commented out block
        # def hash(hasher)
        #     value.hash hasher
        # end
        # def ==(other : Key(T))
        #     value == other.value
        # end
    end
    include Enumerable({Key(K), V})
    @hash = Hash(Key(K), V).new
    def []=(key : K, value : V)
        @hash[Key.new(key)] = value
    end
    def []?(key : K)
        @hash[Key.new(key)]?
    end
    def delete(key : K)
        @hash.delete(Key.new(key))
    end
    def each
        @hash.each do |(key, value)|
            yield({key, value})
        end
    end
    def clean
        @hash.each_key do |k|
            @hash.delete(k) if !k.value
        end
    end
end

class Graph
    def initialize
        @nodes = Hash(Node,Bool).new
    end
    def add_node : Node
        n = Node.new
        @nodes[n] = true
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node)
        node.inspect # on strong type?!
        nil
    end
    class Node
    end
end

def fn(g, h)
    n = g.add_node
    h[n] = "foo"
    g.remove_node(n)
end

g = Graph.new
h = WeakMap(Graph::Node, String).new
fn(g,h)
GC.collect
h.clean
p h.size # should be 0, but is 1

I’m pretty much blind where the error is.

ok, this time both memory management and semantics play
(sorry, just had to delete two stupid posts of mine)

require "weak_ref"

class WeakKeyMap(K, V)
    # precondition (1): do _not_ subclass WeakRef(T)
    private class Key(T)
        def initialize(k : T)
            @k = WeakRef(T).new(k)
        end
        def value
            @k.value
        end
        def ==(other : Key(T)) # this is important for semantics
            value == other.value
        end
        # precondition (2): do _not_ use this
        # def_equals_and_hash @k.value
    end
    include Enumerable({Key(K), V})
    @hash = Hash(Key(K), V).new
    def initialize
        # since...
        # - GC has no hooks for cleaning
        # - Hash iterators are all private, so we cannot store the iterator state here
        # ... we have to use fibers
        spawn do
            while true
                @hash.each do |(key, value)|
                    # precondition (3a): remove keys regularly
                    @hash.delete(key) if !key.value
                    Fiber.yield
                end
                Fiber.yield
            end
        end
    end
    def []=(key : K, value : V)
        @hash[Key.new(key)] = value
    end
    def []?(key : K) : V|Nil
        @hash[Key.new(key)]?
    end
    def [](key : K) : V
        self[key]?.as(V) # V may or may not include Nil
    end
    def delete(key : K)
        @hash.delete(Key.new(key))
    end
    def each
        @hash.each do |(key, value)|
            # precondition (3b): remove keys regularly
            @hash.delete(key) if !key.value
            yield({key, value})
        end
    end
end

class Graph
    def initialize
        @nodes = Set(Node).new
    end
    def add_node : Node
        n = Node.new
        @nodes.add(n)
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node)
        nil
    end
    class Node
    end
end

def fn(g, h)
    n1 = g.add_node
    h[n1] = "foo"
    raise "fail" if h[n1]!="foo"
    g.remove_node(n1)
    n2 = g.add_node
    h[n2] = "bar"
    raise "fail" if h[n2]!="bar"
    g.remove_node(n2)
    n3 = g.add_node
    h[n3] = "foo1"
    raise "fail" if h[n3]!="foo1"
    g.remove_node(n3)
    n4 = g.add_node
    h[n4] = "foo2"
    raise "fail" if h[n4]!="foo2"
    g.remove_node(n4)
    n5 = g.add_node
    h[n5] = "foo3"
    raise "fail" if h[n5]!="foo3"
    g.remove_node(n5)
    n6 = g.add_node
    h[n6] = "foo4"
    raise "fail" if h[n6]!="foo4"
    g.remove_node(n6)
end

g = Graph.new
h = WeakKeyMap(Graph::Node, String).new
fn(g,h)
p h.size # 6
GC.collect
4.times {Fiber.yield}
p h.size # 2
2.times {Fiber.yield}
p h.size # 0

so, I probably have to give up now…

My last “solution” doesn’t work on more complex data structures (not included here).
This is obviously due to my missing hash(hasher) method. If I include it, the semantics are fine, but the memory leak is back.
Now, even if I try to trick the GC as in the following code, I cannot find a way to do so - it even seems to be independent of the offset I add (0x1234… wouldn’t be helping anyhow)…

require "weak_ref"

# trying to work around the GC
lib C
    # in C: void *memcpy(void *dest, const void *src, size_t n);
    fun memcpy(dest : UInt64*, src : UInt64*, n : UInt32)
end

class WeakKeyMap(K, V)
    # precondition (1): do _not_ subclass WeakRef(T)
    private class Key(T)
        def initialize(k : T)
            @k = WeakRef(T).new(k)
        end
        def value
            @k.value
        end
        def ==(other : Key(T)) # this is important for semantics
            value == other.value
        end
        def hash(hasher)
            v1 = @k.value.object_id
            v2 = 0u64
            C.memcpy(pointerof(v2), pointerof(v1), 4) # trying to work around GC logic
            v2 += 0x1234567812345678u64 # doesn't remove memory leak
            # v2 = 0u64 # uncommenting removes memory leak
            v2.hash(hasher)
        end
        # precondition (2): do _not_ use this
        # def_equals_and_hash @k.value
    end
    include Enumerable({Key(K), V})
    @hash = Hash(Key(K), V).new
    def initialize
        # since...
        # - GC has no hooks for cleaning
        # - Hash iterators are all private, so we cannot store the iterator state here
        # ... we have to use fibers
        spawn do
            while true
                @hash.each do |(key, value)|
                    # precondition (3a): remove keys regularly
                    @hash.delete(key) if !key.value
                    Fiber.yield
                end
                Fiber.yield
            end
        end
    end
    def []=(key : K, value : V)
        @hash[Key.new(key)] = value
    end
    def []?(key : K) : V|Nil
        @hash[Key.new(key)]?
    end
    def [](key : K) : V
        self[key]?.as(V) # V may or may not include Nil
    end
    def delete(key : K)
        @hash.delete(Key.new(key))
    end
    def each
        @hash.each do |(key, value)|
            # precondition (3b): remove keys regularly
            @hash.delete(key) if !key.value
            yield({key, value})
        end
    end
end

class Graph
    def initialize
        @nodes = Set(Node).new
    end
    def add_node : Node
        n = Node.new
        @nodes.add(n)
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node)
        nil
    end
    class Node
    end
end

def fn(g, h)
    n1 = g.add_node
    h[n1] = "foo"
    raise "fail" if h[n1]!="foo"
    g.remove_node(n1)
    n2 = g.add_node
    h[n2] = "bar"
    raise "fail" if h[n2]!="bar"
    g.remove_node(n2)
    n3 = g.add_node
    h[n3] = "foo1"
    raise "fail" if h[n3]!="foo1"
    g.remove_node(n3)
    n4 = g.add_node
    h[n4] = "foo2"
    raise "fail" if h[n4]!="foo2"
    g.remove_node(n4)
    n5 = g.add_node
    h[n5] = "foo3"
    raise "fail" if h[n5]!="foo3"
    g.remove_node(n5)
    n6 = g.add_node
    h[n6] = "foo4"
    raise "fail" if h[n6]!="foo4"
    g.remove_node(n6)
end

g = Graph.new
h = WeakKeyMap(Graph::Node, String).new
fn(g,h)
p h.size # 6
GC.collect
4.times {Fiber.yield}
p h.size # 2
2.times {Fiber.yield}
p h.size # 0
1 Like

@wolfgang371 Thanks for sharing your struggles/efforts. I learned several things about Crystal watching you and others try to solve this.

2 Likes

ha - this time, I think I have it :smiley:

After some more debugging it turned out that it’s not the GC’s fault, but the Key(T) has an issue.
The hash method just calculates different hashes for the keys before and after GC’ing, so it never succeeds removing the keys. Now I cache the latest Hasher and this seems to work for all my examples.

require "weak_ref"

class WeakKeyMap(K, V)
    # precondition: do _not_ subclass WeakRef(T)
    private class Key(T)
        @hasher = Crystal::Hasher.new
        def initialize(k : T)
            @k = WeakRef(T).new(k)
        end
        def value
            @k.value
        end
        def ==(other : Key(T))
            value == other.value
        end
        def hash(hasher)
            # this is crucial, since after GC is setting .value to nil, the hashes are different and deletion doesn't work anymore
            if @k.value
                @hasher = @k.value.hash(hasher)
            else
                @hasher
            end
        end
    end
    include Enumerable({Key(K), V})
    @hash = Hash(Key(K), V).new
    def initialize
        # since...
        # - GC has no hooks for cleaning
        # - Hash iterators are all private, so we cannot store the iterator state here
        # ... we have to use fibers
        spawn do
            while true
                @hash.each do |(key, value)|
                    @hash.delete(key) if !key.value
                    Fiber.yield
                end
                Fiber.yield
            end
        end
    end
    def []=(key : K, value : V)
        @hash[Key.new(key)] = value
    end
    def []?(key : K) : V|Nil
        @hash[Key.new(key)]?
    end
    def [](key : K) : V
        self[key]?.as(V) # V may or may not include Nil
    end
    def delete(key : K)
        @hash.delete(Key.new(key))
    end
    def each
        @hash.each do |(key, value)|
            @hash.delete(key) if !key.value
            yield({key, value})
        end
    end
end

class Graph
    def initialize
        @nodes = Set(Node).new
    end
    def add_node : Node
        n = Node.new
        @nodes.add(n)
        n
    end
    def remove_node(node : Node) : Nil
        @nodes.delete(node)
        nil
    end
    class Node
    end
end

def fn(g, h)
    n1 = g.add_node
    h[n1] = "foo"
    raise "fail" if h[n1]!="foo"
    g.remove_node(n1)
    n2 = g.add_node
    h[n2] = "bar"
    raise "fail" if h[n2]!="bar"
    g.remove_node(n2)
    n3 = g.add_node
    h[n3] = "foo1"
    raise "fail" if h[n3]!="foo1"
    g.remove_node(n3)
    n4 = g.add_node
    h[n4] = "foo2"
    raise "fail" if h[n4]!="foo2"
    g.remove_node(n4)
    n5 = g.add_node
    h[n5] = "foo3"
    raise "fail" if h[n5]!="foo3"
    g.remove_node(n5)
    n6 = g.add_node
    h[n6] = "foo4"
    raise "fail" if h[n6]!="foo4"
    g.remove_node(n6)
end

g = Graph.new
h = WeakKeyMap(Graph::Node, String).new
fn(g,h)
p h.size # 6
GC.collect
4.times {Fiber.yield}
p h.size # 2
2.times {Fiber.yield}
p h.size # 0
1 Like

Storing the entire Hasher value is not a good idea. This retains information about data outside this particular Key instance. The value of the hasher argument is derived from anything that was hashed before this particular Key instance. As part of a larger structure that is hashed as a whole, this practice prevents changes in anything that was hashed before a Key instance to embody. So the hash information would be wrong.

A better solution is to store a sub hash which only covers the key value.

def hash(hasher)
  if value = @k.value
    key_hasher = value.hash(Crystal::Hasher.new)
   @hasher = key_hasher
  else
    key_hasher = @hasher
  end
  key_hasher.hash(hasher)
end
1 Like

great, so, finally we have…

including the fix from below:

require "weak_ref"

class WeakKeyMap(K, V)
    private class Key(T)
        @hasher = Crystal::Hasher.new
        def initialize(k : T)
            @k = WeakRef(T).new(k) # using WeakRef(T) because subclassing doesn't work
        end
        def value
            @k.value
        end
        def ==(other : Key(T))
            value == other.value
        end
        def hash(hasher)
            if value = @k.value
                key_hasher = value.hash(Crystal::Hasher.new)
                @hasher = key_hasher
            else
                # this is crucial, since after GC is setting .value to nil, the hashes are different and deletion doesn't work anymore
                key_hasher = @hasher
            end
            key_hasher.hash(hasher)
        end
    end
    include Enumerable({Key(K), V})
    @hash = Hash(Key(K), V).new
    def initialize
        # since...
        # - GC has no hooks for cleaning
        # - Hash iterators are all private, so we cannot store the iterator state here
        # ... we have to use fibers
        spawn do
            while true
                @hash.each do |(key, value)|
                    @hash.delete(key) if !key.value
                    Fiber.yield
                end
                Fiber.yield
            end
        end
    end
    def []=(key : K, value : V)
        @hash[Key.new(key)] = value
    end
    def []?(key : K) : V|Nil
        @hash[Key.new(key)]?
    end
    def [](key : K) : V
        self[key]?.as(V) # V may or may not include Nil
    end
    def delete(key : K)
        @hash.delete(Key.new(key))
    end
    def each
        @hash.each do |(key, value)|
            @hash.delete(key) if !key.value
            yield({key, value})
        end
    end
end

The only (minor) drawback is the ugly purging of the deleted keys.
GC (post) hooks would be fine here, but monkey-patching doesn’t help.

Sorry, I just noticed the final expression in my previous comment was wrong: hasher.hash(key_hasher). It needs to be the other way round: key_hasher.hash(hasher). :see_no_evil:

1 Like

Regarding the GC having no hooks, there is the undocumented GC.before_collect method you can pass a block that will execute … before the collection. So it will be one off in the collection cycle if you do clean ups there.

4 Likes