Ensuring smooth language changes for 2.0

A few incompatible changes to language semantics were introduced between 0.36.1 and 1.0.0, and we do not want to repeat the same when 2.0 drops. For breaking changes initiated by Crystal code, the @[Suppress] annotation is one possible solution, but the same idea won’t work for language-level changes such as redefining overload order. I believe that, in order to ensure smooth migrations to 2.0:

  • The Crystal compiler under 2.0 semantics must be readily available at the same time ongoing development happens on 1.x.
  • Any breaking language changes must be introduced in such a way that there exists valid 1.x code that is also valid 2.0 code with no changes. (Entirely new language features can be introduced in 1.x directly.)
  • If the above is not possible (e.g. renaming a type and the existing code relies on the type’s name), the checks on Crystal::VERSION should be minimized.
  • Mixing 1.x and 2.0 semantics within the same program or project should be prohibited.

For a concrete example of “valid 1.x code that is also valid 2.0 code with no changes”, consider renaming a type Foo to Bar: (this is not a language-level change)

# Crystal 1.(x-1)
class Foo; end

# Crystal 1.x
@[Deprecated]
class Foo; end

@[Suppress(:deprecated_name)]
alias Bar = Foo

# Crystal 2.0
class Bar; end

Code that works on both 1.x and 2.0 is:

x.is_a?(Bar)
x.class == Bar
x.class.name.in?("Foo", "Bar")
x.class.name == Bar.name # both sides are "Foo" on 1.x

# avoid if possible
x.class.name == {% if compare_versions(Crystal::VERSION, "2.0.0") >= 0 %} "Bar" {% else %} "Foo" {% end %}

Code that doesn’t, and therefore needs to be migrated:

x.is_a?(Foo)   # Foo is undefined in 2.0
x.class == Foo # Foo is undefined in 2.0
x.class.name   # produces different results between 1.x and 2.0
x.class.name == "Foo"
x.class.name == "Bar"

If the renamed type were an AST node instead, we could not do the same, because its behaviour is hardcoded into the macro interpreter; this happened here where Global was briefly changed to SpecialVar. Renaming it would thus be a language-level change, and the renaming mechanism must leave open the possibility of code working under both semantics. More specifically, these must work under both 1.x and 2.0:

{% x.is_a?(SpecialVar) %}
{% %w(Global SpecialVar).includes?(x.class_name) %}

# AST node types themselves cannot be referred to from macros, so this fails
# {% x.class_name == SpecialVar.class_name %}

# avoid if possible
{% x.class_name == (compare_versions(Crystal::VERSION, "2.0.0") >= 0 ? "SpecialVar" : "Global") %}

Code that doesn’t, and therefore should be migrated:

{% x.is_a?(Global) %}
{% x.class_name %} # => "Global" on 1.x, "SpecialVar" on 2.0
{% x.class_name == "Global" %}
{% x.class_name == "SpecialVar" %}

This is what “valid 1.x code that is also valid 2.0 code with no changes” means, and to achieve this we now know that Crystal::MacroInterpreter#visit(node : IsA) needs to be revamped to support “aliases” of AST node types before Global can ever be renamed. We could generate deprecation warnings for uses of Foo and Global, but without the ability to test the same code under 1.x and 2.0 semantics right now, cases like direct uses of x.class.name would be very difficult to detect. So we should decide how those kinds of breaking changes are exposed as soon as possible.

To that end, here are some solutions that I could think of:

Distribute Crystal 2

$ cat code.cr
def f; puts "okay"; end
def f(x = 0); end
def f(x); end
f

$ crystal code.cr
Showing last frame. Use --error-trace for full trace.

In code.cr:4:1

 4 | f
     ^
Error: wrong number of arguments for 'f' (given 0, expected 1)

Overloads are:
 - f(x)

$ crystal2 code.cr
okay

Distribute 2.0.0-dev until we are done with 1.x (possibly never). This will most certainly create the Python 2 problem where too many legacy projects stick to 1.x, and we might end up having to keep two different branches up-to-date, but it has the cleanest CLI interface (and probably cleanest distribution workflow too).

Decouple language level from the compiler version

$ crystal --lang-level=2 code.cr
okay

$ crystal code.cr
Showing last frame. Use --error-trace for full trace.
...

Implement separate versioning for language semantics. The same compiler could support 1.x (--lang-level=1) semantics by default, and only opt in to what would currently be 2.0 (--lang-level=2) semantics if this value is provided. Then Crystal’s major version would be incremented if deprecated parts of the standard library are removed (which we do now), or if an old language level is no longer supported, which may or may not happen together with stdlib removals. Some additional bookkeeping is required for shards, e.g. they must declare the supported language levels, and for the official docs. A variant is to specify this through an environment variable instead, which more or less ensures all invocations of the compiler use the same level.

If everyone uses --lang-level=2, they are accepting breaking semantic changes within minor releases, since level 2 semantics are obviously unstable and there is no clear indicator of when they will become stable. To solve this, we would not actually allow level 2 until some kind of language feature freeze, but instead allow --lang-level=dev to signal that those semantics are indeed unstable and should not be used day to day. Continuous integration is all that’s needed to detect any incompatibilities between level 1 and level dev; if there are none, the latest stable level will be enough.

Have a “use strict” option

$ crystal --strict code.cr
okay

$ crystal code.cr
Showing last frame. Use --error-trace for full trace.
...

A stronger version of the above, where --strict on Crystal x.y implies --lang-level=dev, and every Crystal (x+1).0 release removes support for all previous levels. This reduces the maximum number of language levels to 2, current-major and next-major, but also means maintenance of 1.x will halt as soon as 2.x development starts (probably not a real issue, as we did stop supporting 0.x that soon). “Strict” also has different connotations compared to “development”, and having the former imply unstable behaviour is probably not a good thing.

I don’t necessarily agree on that. We’re using compiler flags for testing experimental compiler features. I could definitely see this as something that could continue after the test period, when a feature has been accepted but is not yet activated in the old semantics. Selecting compiler behaviour for specific features is useful for a gradual transition to the new semantics.
This plays well with a language level or strict option.

Maintenance is independent of feature development. We can happily continue providing supporting releases for 1.x while development of new features has moved to 2.x.

Thanks for bringing this up. I feel it’s a bit early to start discussing this, but nevertheless it’s good to have this in the back of our heads.

I wasn’t around for the transition between 0.36 and 1.0. But I think the transition between 1.x and 2.0 is a different one, since I wouldn’t expect people to have to move to 2.0 immediately. In fact, having the two co-exist, in whatever form, allows us to polish 2.0 without the pressure of forcing projects to move.

A brain-stormy alternative to your first proposal (having crystal and crystal2) would be to drop support for crystal at a given time after crystal2 is around (say, 2 years), but ensure it will still be installable for some more years, yet without bug fixes. This would let old code keep running without placing much of a burden on us.

I don’t think this is too early. We already have a couple of breaking changes in the pipeline that we’ll probably want to have for 2.0. It would be good to have a plan how to move on. We shouldn’t have to wait up until shortly before 2.0 to get them merged somewhere.

Also, not all changes are hard breaking changes, there’s also the case when it’s more gradual and a first iteration would introduce a warning which becomes an error in the next major release.

As I understand it, these proposals are about the development period. When 1.x is the stable release, but we already want to have some 2.0 functionality available as an option.

Maintenance after the 2.0 release is a completely different topic. But of course, we can and should support the latest 1.x release for some time after the 2.0 release.
And older releases will of course be available any time (although using unsupported versions wouldn’t be recommended).

Ah, thanks for the clarification. But then, I’d prefer to discuss specific things, like: what is so important that it can’t wait for 2.0? Or that can’t just be deprecated in 1.x?

I don’t think it’s about being too important to wait. But pulling breaking changes up gives more chances to find potential issues with them, and makes it easier for people to migrate partially, or to update (or start) their code bases with future-proof semantics.

A specific change that we could incorporate is https://github.com/crystal-lang/crystal/pull/8893 which is already tagged for 2.0 (I don’t think we’re at a stage where we can say it will definitely be in 2.0, but we could start providing it as a preview feature to see how it works). That PR also contains a good explanation of a feature flag workflow.

Has the core team discussed something like editions (as found in Rust) before, as a graceful, opt-in rollout strategy for new features and semantic changes to the language?

Nope. I’m not sure how they do the separation per crate, but I expect that would be rather difficult to do in Crystal.

If nothing else, it would be really nice to have a branch that is easy to install that consists of stuff aimed at 2.0 as well as having a place to merge stuff that is aimed at future 1.x branches. The current process seems to leave PRs floating for a while (until things are selected for the next release), and quickening the response time from submit to merge would be nice, even if it takes the same time to reach an official release.

It might also give quicker feedback, as some people would prefer to install the most up-to-date version to gain access to newer features and therefore might find errors faster than the release schedule allows. This would create a Rust-ish situation where people may install unstable versions, but is that bad? It would mean features that people want get added, while making sure there is quicker feedback.

I’m not all that big on command line feature flags. Seems like a mess, especially when taking crates into account.

What you suggest sounds like having multiple development branches in parallel. I fear that it would be hard to keep track of what’s happening where and cause a lot of confusion. At the current size of the project, I don’t think it is feasible to manage.

In the last core team meeting we decided to keep and maintain just one branch, adding the new, breaking features behind opt-in command-line flags. Version 2 of Crystal will then simply consist of the compiler with some of these turned on by default.

Continuation on compatibility profiles:

I’ve been thinking about this a bit more, and I’m more convinced that the schema for compatibility profiles proposed in issue #11706 (“Compatibility profiles for CI tests and the compiler”, crystal-lang/crystal) is not an adequate representation. It’s potentially ambiguous, because the meaning changes over time, and overall too confusing.

It might be good enough for the confined use case of Makefile configuration, but we should take the big picture into consideration. What we need is an easily understandable solution for users of the compiler to opt-in for progressive language changes in a consistent and comprehensible way.

The idea of decoupling the language level support from the compiler version looks very appealing.
We’re talking about bundling feature options of the compiler. Each bundle represents a variant of the language, expressed as a specific compiler configuration. So I think it makes a lot of sense that each compiler version can support different language levels.

I think a good example of this concept is language editions in Rust, which serve exactly the same purpose as what we’re looking for in Crystal. Using the year as the edition identifier seems to be a great idea. This is kind of an alternative versioning scheme. It only needs major versions. And it’s distinct from the compiler version, which avoids confusion.

A great aspect of Rust editions is that the choice of edition is per crate, so every crate can progress at its own pace. I’m not sure if that would work in Crystal as well, but we can try it. I suppose it means scoping feature flags in the compiler by source location. For syntax features, this should be trivial. For semantic changes, it will be more complex. But I figure it could be doable.
As far as I understand, that’s essentially how it works in Rust as well. The general memory representation (AST) is identical for all editions, only some behaviour changes depending on the selected edition for the crate that a syntax node belongs to. This obviously poses some limits on what you can change in an edition, but I think that’s fine.
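
To make “scoping feature flags by source location” a bit more concrete, here is a minimal sketch of how a file path could be mapped to an edition. Everything in it is an assumption for illustration: the names SHARD_EDITIONS, DEFAULT_EDITION, shard_for and edition_for are hypothetical, the edition values are made up, and it assumes dependencies are installed under lib/<shard>/ with everything else belonging to the root project. It is not how the compiler works today.

```crystal
# Hypothetical per-shard edition table, e.g. collected from each shard.yml.
SHARD_EDITIONS  = {"my_app" => "2024", "old_shard" => "2021"}
# Hypothetical fallback for shards that do not declare an edition.
DEFAULT_EDITION = "2021"

# Returns the name of the shard that owns a source file, assuming
# dependencies live under lib/<shard>/ and all other files belong
# to the root project.
def shard_for(path : String, root_shard : String) : String
  parts = path.split('/')
  idx = parts.index("lib")
  if idx && (name = parts[idx + 1]?)
    name
  else
    root_shard
  end
end

# Looks up the edition that would apply to nodes parsed from `path`.
def edition_for(path : String, root_shard : String) : String
  SHARD_EDITIONS.fetch(shard_for(path, root_shard), DEFAULT_EDITION)
end

puts edition_for("src/main.cr", "my_app")                    # => 2024
puts edition_for("lib/old_shard/src/old_shard.cr", "my_app") # => 2021
puts edition_for("lib/unknown/src/unknown.cr", "my_app")     # => 2021
```

A lookup like this only covers the “which edition applies to this node” part; the semantic changes would still need the edition threaded through wherever behaviour differs.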

From a user perspective, having the compiler support multiple language editions sounds like a fantastic feature that would make it much easier to try out version 2 and gradually upgrade projects.

@straight-shoota So something like --edition=2024 would enable all the flags from the 2024 edition (I like that Rust follows C/C++ in using the year as reference), which would in turn change how we lex/parse/analyse the code based on its location (stdlib, shard, ...).

It might make the compiler & stdlib internals a bit of a flag mess sometimes, but that sounds like a dream come true from a user perspective :+1:

Bonus: define individual flags right in shard.yml along with the general edition :heart_eyes:

We’re already putting individual flags into the compiler to enable future changes as preview features (preview_overload_order for example). An edition would just enable a collection of flags as a convenience for the user.
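
For readers who haven’t used these flags: they are ordinary compile-time flags passed with -D, and the flag? macro method can check for them. The sketch below only shows how a program could observe whether the flag was given; the constant name OVERLOAD_ORDER is made up for the example, and the compiler’s own handling of the flag happens internally rather than through user code like this.

```crystal
# Compile with `crystal run -Dpreview_overload_order file.cr` to take the
# first branch; without the flag the second branch is compiled instead.
{% if flag?(:preview_overload_order) %}
  OVERLOAD_ORDER = "preview"
{% else %}
  OVERLOAD_ORDER = "legacy"
{% end %}

puts "overload order semantics: #{OVERLOAD_ORDER}"
```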

The mess will grow if we add more flags, of course. And maybe editions would encourage having more flags because they’ll be more heavily used. But then I think there’s not really any alternative either if we want to introduce changes progressively. And that’s certainly better than eventually introducing 2.0 with a hard cut.

Though there would be some limitations to the per-shard approach, especially for flags that change stdlib behavior: once such a flag is enabled, it changes the shared stdlib code and affects all shards (see issue #14377, “Return type of `unbuffered_read`/`unbuffered_write`”, in crystal-lang/crystal for an example).

Maybe an opt-in / opt-out mechanism for each flag? Like I enable the 2024 edition but opt-out of io_buffered_strict_types, or I enable the 2023 edition and opt-in to some flags I know are compatible (or need).
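
Roughly, the resolution could look like the sketch below. This is only an illustration under assumptions: the EDITIONS table and its contents are invented (no edition has been defined), effective_flags is a hypothetical helper, and the flag names are simply the ones mentioned in this thread.

```crystal
# Hypothetical mapping from an edition to the preview flags it bundles.
EDITIONS = {
  "2023" => Set{"preview_overload_order"},
  "2024" => Set{"preview_overload_order", "io_buffered_strict_types"},
}

# Resolves the effective flag set: start from the edition's bundle,
# add the explicit opt-ins, then remove the explicit opt-outs.
def effective_flags(edition : String,
                    opt_in : Array(String) = [] of String,
                    opt_out : Array(String) = [] of String) : Set(String)
  base = EDITIONS.fetch(edition) { raise ArgumentError.new("unknown edition: #{edition}") }
  (base | opt_in.to_set) - opt_out.to_set
end

# 2024 edition, but opting out of one flag.
p effective_flags("2024", opt_out: ["io_buffered_strict_types"])

# 2023 edition, opting in to a flag known to be compatible.
p effective_flags("2023", opt_in: ["io_buffered_strict_types"])
```

Opt-outs are applied last here, so an explicit opt-out always wins over both the edition and an opt-in; that ordering is a design choice, not something settled in this thread.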

@straight-shoota I mentioned the potential for messy code, but that puts the burden on us, not on everyone using the language. We know first-hand how frustrating it can be (LLVM :face_exhaling:).

Editions might encourage us to focus on grouping changes, for example a more general io_strict_types (or even std_strict_types) to group all IO-related type constraints, instead of a specific io_buffered_strict_types.