puts Union(NamedTuple(b: String), NamedTuple(a: String, b: String)).from_yaml "a: a\nb: b"
# {a: "a", b: "b"} (expected)
puts Union(NamedTuple(a: String), NamedTuple(a: String, b: String)).from_yaml "a: a\nb: b"
# {a: "a"} (unexpected)
puts Union(NamedTuple(a: String), NamedTuple(a: String, b: String)).from_yaml "b: b\na: a"
# {a: "a"} (unexpected)
I guess this can be solved using classes that include YAML::Serializable::Strict, but I'm still looking for a shorter solution.
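A minimal sketch of the YAML::Serializable::Strict idea, assuming strict types reject unknown keys with a YAML::ParseException so the union falls through to the next member. The type names OnlyA and AAndB are hypothetical and exist only for illustration.

```crystal
require "yaml"

# Hypothetical type: accepts a mapping with exactly the key `a`.
struct OnlyA
  include YAML::Serializable
  include YAML::Serializable::Strict

  getter a : String
end

# Hypothetical type: accepts a mapping with exactly the keys `a` and `b`.
struct AAndB
  include YAML::Serializable
  include YAML::Serializable::Strict

  getter a : String
  getter b : String
end

# Strict makes OnlyA fail on the unknown key `b`, and AAndB fails
# when `b` is missing, so at most one member can match each input.
both = Union(OnlyA, AAndB).from_yaml("a: a\nb: b")
only = Union(OnlyA, AAndB).from_yaml("a: a")

puts both.class
puts only.class
```

Because each input matches at most one member, the result should not depend on the order in which the union members are tried.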
What isn’t expected here? The implementation of .from_yaml (and .from_json) for a union tries to deserialize the data into each member of the union in turn, and the first one that succeeds wins, which is what you’re seeing here.
Unless I’m missing something, the solution to this simple example is to just not use a union when you already know the structure of the input data. Can you share more about your actual use case?
EDIT: The first example “works” because the two-member named tuple is actually first in the union. E.g.
crystal eval 'pp Union(NamedTuple(b: String), NamedTuple(a: String, b: String))'
(NamedTuple(a: String, b: String) | NamedTuple(b: String))
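The "try each member, first success wins" behavior can be approximated by hand when the priority needs to be explicit. The following is only a sketch, not the standard library's implementation; Full, Partial, and parse_most_specific are hypothetical names introduced for illustration.

```crystal
require "yaml"

# Hypothetical aliases for the two shapes from the example above.
alias Full = NamedTuple(a: String, b: String)
alias Partial = NamedTuple(a: String)

# Try the more specific shape first. If a required key is missing,
# from_yaml raises YAML::ParseException and we fall back to the
# less specific shape.
def parse_most_specific(yaml : String) : Full | Partial
  Full.from_yaml(yaml)
rescue YAML::ParseException
  Partial.from_yaml(yaml)
end

pp parse_most_specific("a: a\nb: b") # both keys present, Full matches
pp parse_most_specific("b: b\na: a") # key order in the YAML does not matter
pp parse_most_specific("a: a")       # `b` is missing, falls back to Partial
```

Here the priority lives in the method body rather than in the union's member order, so it does not depend on how the compiler normalizes the union.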
I just reordered the tuple fields so that the more “complex” type comes first in the Union, and it works fine.
@Blacksmoke16 Being order dependent on the union types sounds fragile. It means that A | B isn’t the same as B | A. What if the compiler decided to reorder the types for some reason?
The compiler normalizes the union order, so A | B and B | A are the same.
But still, this order-dependent deserialization is not ideal. It comes with a number of drawbacks (see, for example, issue #15436 in crystal-lang/crystal on GitHub, “YAML de-serialization has a performance issue with Bool and Float (even if not used)”). Yet I don’t see any better alternative.
I’ve run into this so many times and even ended up changing names of a few types so they’d come first in the A | B normalization. I never thought to use an explicit Union(A, B) to specify priority order. That’s a really neat trick. Not ideal, as @ysbaddaden mentioned, but definitely a better tradeoff vs changing the names of the types to be in lexical order for priority.
The fact that JSON and YAML deserialization even supports union types is honestly pretty amazing. I recognize that it’s necessary in order to support types like String?, but the fact that supporting that simple concept also allows it to support things like String | Int64 and even unions of complex nested objects is incredible and isn’t something I’ve seen in other statically typed programming languages, at least not in a way that’s this simple to work with.