Collecting all generic instances in a Program

I want to collect all types that appear in a Program.

So, after parsing some source code (only the prelude in the example below), I get a Program object from the compiler

Initialising the compiler
require "compiler/crystal/**"
src = Crystal::Compiler::Source.new("", "")
compiler = Crystal::Compiler.new
compiler.no_codegen = true

I have a class that traverses the types.

class TypeCollector
  getter types = Set(Crystal::Type).new

  def collect(type : Crystal::GenericType)
    @types << type
    type.generic_types.values.reject(&.unbound?).each do |bound_instance_type|
      collect bound_instance_type.as Crystal::GenericInstanceType
    end
    type.types?.try &.values.each do |sub_t|
      collect sub_t
    end
  end

  def collect(type : Crystal::NamedType | Crystal::GenericInstanceType)
    @types << type
    type.types?.try &.values.each do |sub_t|
      collect sub_t
    end
  end
end

program = compiler.compile(src, "").program
tc = TypeCollector.new
tc.collect(program)

I assumed this should in theory aggregate all types into one big set. While it does so for most types, there is an issue with generic modules.

To check if I did really get all the types, I decided to check if parents of each of the type in a set are themselves in this set. Turns out it’s not the case, so I aggregated all “uncollected” types into another set.

all_types = tc.types
uncollected = Set(Crystal::Type).new

all_types.each do |type|
  type_parents = type.parents
  type_parents.try &.each do |parent|
    unless all_types.includes? parent
      uncollected << parent
    end
  end
end

uncollected.each do |uncollected_type|
  type_args = uncollected_type.type_vars.as(Hash).map do |label, var|
    {label, var.type}.as Tuple(String, Crystal::Type)
  end
   uncollected_type.to_s.rjust(35) + " -- " + type_args.to_s
end

The output surprised me a little:

Output
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
             Comparable(Pointer(T)) -- [{"T", Pointer(T)}]
Comparable(Pointer(Pointer(UInt8))) -- [{"T", Pointer(Pointer(UInt8))}]
                       Indexable(T) -- [{"T", T}]
                       Indexable(T) -- [{"T", T}]
                   Indexable(T | U) -- [{"T", (T | U)}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
              Iterable(Tuple(K, V)) -- [{"T", Tuple(K, V)}]
            Enumerable(Tuple(K, V)) -- [{"T", Tuple(K, V)}]
              Iterator(Tuple(K, V)) -- [{"T", Tuple(K, V)}]
                        Iterator(K) -- [{"T", K}]
                        Iterator(V) -- [{"T", V}]
                        Iterable(B) -- [{"T", B}]
                      Enumerable(B) -- [{"T", B}]
                      Iterable(Int) -- [{"T", Int+}]
                    Enumerable(Int) -- [{"T", Int+}]
                    Iterable(Float) -- [{"T", Float+}]
                  Enumerable(Float) -- [{"T", Float+}]
                        Iterator(B) -- [{"T", B}]
                        Iterator(E) -- [{"T", E}]
                        Iterator(B) -- [{"T", B}]
                      Enumerable(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(U) -- [{"T", U}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(U) -- [{"T", U}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
          Iterator(Tuple(T, Int32)) -- [{"T", Tuple(T, Int32)}]
              Iterator(Tuple(T, O)) -- [{"T", Tuple(T, O)}]
            Iterator(Tuple(T1, T2)) -- [{"T", Tuple(T1, T2)}]
       Iterator(Tuple(U, Array(T))) -- [{"T", Tuple(U, Array(T))}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
                 Iterator(Array(T)) -- [{"T", Array(T)}]
                      Enumerable(T) -- [{"T", T}]
                        Iterable(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                        Iterator(T) -- [{"T", T}]
                         Channel(T) -- [{"T", T}]
                         Channel(T) -- [{"T", T}]
                       Indexable(T) -- [{"T", T}]
                        Iterable(T) -- [{"T", T}]
                      Enumerable(T) -- [{"T", T}]
                       Indexable(T) -- [{"T", T}]

I also printed uncollected_type.class, and they’re all GenericInstanceTypes, i.e. GenericClassInstanceType or GenericModuleInstanceType

In particular, I don’t understand the following

  • Iterable(T), Iterator(T), Indexable(T) appear many times, despite it being a set. I figured it’s probably because the Ts are bound to different things, but they appear to be bound to some type T in type_vars.
  • Iterable(Int), Iterable(Float) and Comparable(Pointer(Pointer(UInt8))) appear there. This are fully instantiated types. The latter one seems particularly weird to me.
  • And most importantly, why are there “uncollected” types in the first place?

@zlotnleo just as a quick comment (not an answer to your questions), there is https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/tools/print_hierarchy.cr that collect/shows types used in a program. It can be used from $ crystal tool hierarchy your_awesome.cr.

When you do:

class Foo(T)
  include Iterable(T)
end

then it will make Foo include a generic type instance of Iterable with a type parameter T. This will happen once for each include Iterable(T) found in the code. Unfortunately those are saved as generic type instances… which I think we shouldn’t do.

(unless I’m misunderstanding something)

Also note that that are currently many bugs surrounding generic bugs so what you expect to get will not always be what you get.

That is, that Iterable(T) in the above example will appear as a generic type instantiation of the generic type Iterable(T).

@asterite So, when Foo is instantiated, e.g. Foo(Int32), would the compiler also create an instance Iterable(Int32), or would it be the same Iterable(T) with T somehow bound to Int32 from Foo(Int32)?

@asterite So, when Foo is instantiated, e.g. Foo(Int32) , would the compiler also create an instance Iterable(Int32) , or would it be the same Iterable(T) with T somehow bound to Int32 from Foo(Int32) ?

It keeps the original type, but when you ask its parents (via parents) it will create instantiations of those.

This is probably not ideal.

It’s great that you are taking a look at the compiler’s source code, and that you are getting to learn it! The source code became to exist thanks to a lot of experimentation, and I believe there’s a lot to improve in how everything is modeled. Having said that, why are you fiddling with the compiler like this?

I’m looking into compiling Crystal to another language, so in this case I’d like to get a list of all “actual” types that can be used throughout the program

To which language?

@asterite JavaScript. I thought it would be nice to have something with the Crystal type system running in browser

Did you take a look to https://www.mint-lang.com/ ?

It’s a web programming language from Crystal.

Did you take a look at Crow: https://github.com/geppetto-apps/crow

It’s a transpiler of Crystal to Flow/JS, but it looks pretty outdated… (2-3 years old)

here are some GitHub issues about having crystal target web assembly. This is not Javascript but I thought you might be interested.