Are unions in Crystal tagged unions (sum types)?

Or are they untagged unions? I’m mainly trying to relate them to concepts from other programming languages, and I couldn’t find anything in the docs. Thanks.

I’m not sure what your question is aiming at. Why is this relevant to you?

For some simple code like

struct A
  @a : Int64 = 1
  @b : Int64 = 2
end

a = "foo".as(String | Int32)
b = "foo".as(String | Int64)
c = "foo".as(String?)
d = 1.as(Int32?)
e = "foo".as(String | A)

The types in the LLVM IR are:

%"(Int32 | String)" = type { i32, [1 x i64] }
%"(Int64 | String)" = type { i32, [1 x i64] }
%"(Int32 | Nil)" = type { i32, [1 x i64] }
%"(A | String)" = type { i32, [2 x i64] }

As you can see, a union type is represented as a tag plus a field big enough to hold the largest member value. The exception is a union between a reference type and Nil (such as String? above): there the nil value is simply represented as a null pointer, so no tag is needed.
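To connect the representation above to what you see in the language, here is a minimal sketch of how the compile-time union type and the runtime (tagged) type show up in code. The variable names are only illustrative.

```crystal
a = "foo".as(String | Int32)

# The compile-time type is the whole union.
puts typeof(a) # => (Int32 | String)

# The runtime type is whichever member the value currently holds.
puts a.class # => String

# is_a? checks the runtime type and narrows the compile-time type
# inside the branch, so String methods are available here.
if a.is_a?(String)
  puts a.upcase # => FOO
end

# A nilable reference type is still a union at the language level.
c = "foo".as(String?)
puts c.nil? # => false
```

None of this requires you to know how the tag is stored; `typeof`, `class` and `is_a?` are the language-level view of it.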

I was trying to find the best way in Crystal to model the IP address enum example from the Rust book (https://doc.rust-lang.org/book/ch06-01-defining-an-enum.html), given that Rust’s enums are basically sum types.

    enum IpAddr {
        V4(u8, u8, u8, u8),
        V6(String),
    }

    let home = IpAddr::V4(127, 0, 0, 1);

    let loopback = IpAddr::V6(String::from("::1"));

Perhaps there is a better place or category to talk about this, but for completeness I am posting the ways I would model it in Crystal. I wouldn’t mind hearing other people’s thoughts on this.

  1. using static duck typing

struct V4
  getter :octet1

  def initialize(
    @octet1 : Int32, 
    @octet2 : Int32, 
    @octet3 : Int32, 
    @octet4 : Int32)
  end
end

struct V6
  def initialize(@str : String)
  end
end

alias Ip = V4 | V6

home = V4.new(127,0,0,1)
loopback = V6.new("::1")

  2. using a module as an interface

module Ip
end

class V4
  include Ip
  getter :octet1

  def initialize(
    @octet1 : Int32, 
    @octet2 : Int32, 
    @octet3 : Int32, 
    @octet4 : Int32)
  end
end

class V6
  include Ip

  def initialize(@str : String)
  end
end

home = V4.new(127,0,0,1)
loopback = V6.new("::1")

  3. using an abstract class

abstract class Ip
end

class V4 < Ip
  getter :octet1

  def initialize(
    @octet1 : Int32, 
    @octet2 : Int32, 
    @octet3 : Int32, 
    @octet4 : Int32)
  end
end

class V6 < Ip
  def initialize(@str : String)
  end
end

home = V4.new(127,0,0,1)
loopback = V6.new("::1")

With all three approaches we can use case to match on the type:

def show_ip(ip)
  case ip
  when V4 then "ip v4 #{ip.octet1}"
  when V6 then "ip v6"
  end
end

Personally I would argue the aliased union, so your number 1, is the most idiomatic solution. But then I never understood why we added abstract classes…

My arguments for that would have nothing to do with the internal data representation though. Crystal should be viewed as a high-level language where you only care about these details when optimizing an identified performance bottleneck in your application. I would simply argue that the aliased union makes the best use of the language features.

I’m sure it’s just an example, but just for the benefit of anybody stumbling on this in the future, of course stdlib sports a datatype for IP addresses: https://crystal-lang.org/api/0.34.0/Socket/IPAddress.html


@epoch Only the alias is equivalent to Rust, because in the next version you will apparently be able to do:

case ip
in V4
in V6
end

and if you forget one of the cases you will get a compile error. With classes or modules you will not get an error. In Rust you do get an error, so only the alias is equivalent.

Ideally you would be able to use classes or modules with a sealed annotation, but it seems not many people like the idea, so maybe it won’t happen. It would be nice if defining a closed group of types were orthogonal to checking that they are all covered in a case, as you can do in Scala and Kotlin but not in Crystal.


The alias union is usually my favourite too, but I believe all three are checked for exhaustiveness.

On version 0.34.0 I am getting a warning for all three. I assume this will become a compile error in the next version?

 163 | case ip
       ^
Warning: case is not exhaustive.

Missing types:
 - V6

A total of 1 warnings were found.

Try this:

module Ip
end

class V4
  include Ip
end

class V6
  include Ip
end

class IpHolder
  getter ip

  def initialize(@ip : Ip)
  end
end

holder = IpHolder.new(V4.new)
case holder.ip
when V4
when V6
end

You get a warning where “Ip” is in the missing types.

Now try this:

class Ip
end

class V4 < Ip
end

class V6 < Ip
end

class IpHolder
  getter ip

  def initialize(@ip : Ip)
  end
end

holder = IpHolder.new(V4.new)
case holder.ip
when V4
when V6
end

Same warning.

If you do this:

ip = V4.new || V6.new
case ip
...
end

it’s not the same, because the compiler apparently types it as V4 | V6, not as an Ip. You need a value typed as Ip to verify it.

The alias is the only way that is equivalent to how it’s done in Rust.


The semantics are changing in 0.35.0.

case value
when x then ...
end

will now act like it did before 0.34.0. An exhaustive case is now opt-in via:

case value
in x then ...
in y then ...
end

See https://github.com/crystal-lang/crystal/pull/9258.
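Applied to the IP example from earlier in the thread, the opt-in exhaustive form might look like the sketch below. It assumes Crystal 0.35.0 or later and the aliased-union modelling; the `str` getter on V6 is added here only for illustration.

```crystal
struct V4
  getter octet1 : Int32

  def initialize(@octet1 : Int32, @octet2 : Int32, @octet3 : Int32, @octet4 : Int32)
  end
end

struct V6
  # Hypothetical getter, not in the original example.
  getter str : String

  def initialize(@str : String)
  end
end

alias Ip = V4 | V6

def show_ip(ip : Ip) : String
  # `in` instead of `when` asks the compiler to check that every
  # member of the union is covered.
  case ip
  in V4
    "ip v4 #{ip.octet1}"
  in V6
    "ip v6 #{ip.str}"
  end
end

puts show_ip(V4.new(127, 0, 0, 1)) # => ip v4 127
puts show_ip(V6.new("::1"))        # => ip v6 ::1
```

Removing the `in V6` branch should turn into a compile-time error rather than a warning, which is the behaviour that makes this comparable to a Rust match.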


Totally agree with you here, Hank. I learned it the hard way, thinking that module interfaces work like interfaces in Go (structural typing that matches on shape). That sort of works in Crystal for method arguments if you don’t annotate the type, but unfortunately it does not work for instance variables set in initialize.
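A small sketch of that difference, using made-up names (`label`, `describe`, `Holder`) that are not part of the examples above:

```crystal
struct V4
  def label
    "v4"
  end
end

struct V6
  def label
    "v6"
  end
end

# No type restriction on the argument: any value that responds to
# #label is accepted, checked at compile time per call site.
def describe(ip)
  "address family: #{ip.label}"
end

class Holder
  # An instance variable needs a concrete declared type, so the
  # "shape" has to be spelled out, here as a union. As far as I know,
  # leaving the restriction off makes the compiler complain that it
  # cannot infer the type of @ip.
  def initialize(@ip : V4 | V6)
  end

  def describe
    "held address family: #{@ip.label}"
  end
end

puts describe(V4.new)
puts Holder.new(V6.new).describe
```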

Thank you for the link to the sealed annotation. I’m keen to see where it goes in the future, but it seems the alias is the way to go for now.

@epoch unions in Crystal are not tagged in the Haskell or Rust sense. They are implicit.

In Haskell you have Maybe a = Just a | Nothing; in Crystal it is T | Nil directly, so there is no need to wrap a value in a Just.

Since they are implicit, there are ways to code in a more tagged, explicit fashion if you like. But then you might at some point need to fight with method restrictions: a Just a is not an a.
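To make the last point concrete, here is a minimal sketch contrasting the implicit T | Nil style with an explicit wrapper. The `Just` struct and the helper methods are hypothetical and only mimic the Haskell constructor; they are not part of Crystal’s standard library.

```crystal
# Hypothetical explicit wrapper, mimicking Haskell's Just.
struct Just(T)
  getter value : T

  def initialize(@value : T)
  end
end

def double(x : Int32) : Int32
  x * 2
end

# Implicit style: once nil is ruled out, the value *is* an Int32
# and satisfies the restriction on double directly.
def double_implicit(x : Int32 | Nil) : Int32?
  if x
    double(x)
  end
end

# Explicit style: a Just(Int32) is not an Int32, so it has to be
# unwrapped before it matches the restriction on double.
def double_explicit(x : Just(Int32) | Nil) : Int32?
  if x
    double(x.value)
  end
end

puts double_implicit(21)           # => 42
puts double_explicit(Just.new(21)) # => 42
```

Passing `x` itself to `double` inside `double_explicit` would be rejected by the compiler, which is the kind of friction with method restrictions mentioned above.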
