Reducible unions from type filters

While revisiting some of my local WIP branches I rediscovered this example:

abstract class A; end

class B1 < A
  def foo; end
end

class B2 < A
  def foo; end
end

class B3 < A; end

x = B1.new || B2.new || B3.new
typeof(x) # => A
if x.responds_to?(:foo)
  typeof(x) # => (B1 | B2)
end

x's type is a reducible union within the then-branch. It seems this is by design, so that x.foo always works; the result is the same if the condition is x.is_a?(B1) || x.is_a?(B2) instead. But weird things could happen:

x = B1.new || B2.new || B3.new
if x.responds_to?(:foo)
  y = uninitialized typeof(x)
  typeof(y)  # => (B1 | B2)
  y = B2.new # Error: type must be (B1 | B2), not A+
end

Moving the responds_to? call into the typeof eliminates the union:

x = B1.new || B2.new || B3.new
y = uninitialized typeof(x.responds_to?(:foo) ? x : raise "")
typeof(y)  # => A
y = B2.new # okay

Another example:

class A; end
class B < A; end

x = B.new || A.new
typeof(x) # => A
if x.is_a?(A) || x.is_a?(B)
  typeof(x) # => (A | B)
end

I am under the impression that no reducible unions should ever be exposed to Crystal programs. Can this be reconciled?

Maybe! But then how would type filtering for responds_to? work in this case? I think that’s the main reason we made it work like that. Of course it’s really unsound because, as you point out, whenever you put something into that union, it goes to the parent again.

I’m not sure how we can solve this. I feel that I’d really like to be able to talk about B1 | B2 in some cases without going to A.

See also: [RFC] Don't merge unions into parent type when explicitly said so · Issue #9050 · crystal-lang/crystal · GitHub

Maybe the type union B1 | B2 that results from a responds_to? should be a special union type that can’t be reduced. If you merge it with something else, say B3, it knows it needs to not be reduced, and it becomes B1 | B2 | B3.

The same would happen with the RFC I linked above.

However, if we do this, then it’s less clear for users when these unions reduce or not.

The reason we reduce B1 | B2 to A is because if A is a huge hierarchy, then a code might produce lots of different union types, like B1 | B2, B2 | B3, B1 | B3, B1 | B2 | B3… it’s a combinatorial explosion. And in practice it leads to very slow compile-times. So reduction is done to preserve this, not because it’s a nice semantic thing to have. I wonder if there’s a way to solve this without sacrificing type safety… :thinking:

There are three different issues here:

  1. Annotated types gets abstracted (I can’t explicitly mention B1 | B2 and mean it).
    Its reasonable that x gets the type A, but if I state x : B1 | B2 | B3 = B1.new || B2.new || B3.new, then it should not do the generalization. This is fixed in the PR by Ary.
    In a comment on the RFC linked by Ary, HertzDevil suggest having a different notation for non-reducible unions. I think we can try keeping the actual notation, as in the PR, and fix the problems below instead.

  2. When a non-reducible union is merged with a type, it forgets its non-reducibility.
    When doing

y = B2.new # Error: type must be (B1 | B2), not A+

That A+ comes from merging (B1 | B2) | B2. The compiler here should note that B2 is in B1 | B2, and that B1 | B2 shouldn’t be reduced/abstracted.

  1. (Totally unrelated) An uninitialized object is treated differently than an initialized one.
    In fact, there is no reason for the compiler to do that merge in 2. y gets a B2, so typeof(y) is B2. If I do
y = (B1.new || B2.new).as(typeof(x))
y = B2.new

it works fine.

There is a difference between (B1 | B2) and (A | B); the B is redundant, because it is a subtype of another type in the same union (A). At the minimum I think these redundant types should be eliminated.

1 Like