Incremental compilation exploration

Sorry, that may have been a misleading expression. I am not good at English, so I have ChatGPT and DeepL write for me. What I wanted to say is that Crystal is a special language, with features that other languages do not have.

Even if all method type information is needed for incremental compilation, I don’t want to put it directly into the code.

Compared to Ruby, it’s harder to reuse library code in Crystal. While Ruby libraries often work even if they’re a decade old, in Crystal, they often don’t work as expected. One reason is that type information in the code becomes outdated. This means I need to modify the code, but I would prefer to avoid doing so. I just want a way to update or fix the external type information files.

In my opinion, type information should be kept in external files:
(1) External type files made by users
(2) .lock files with type information for all methods made by the compiler

It would be nice to have a command-line tool that guesses type information and updates or modifies these external type files.

External type files provided by the library’s author, and automatically generated external type files for all methods (as .lock files), are suggestions from the author and their compiler. Users can use them as is or modify them.

In Crystal, duck typing is possible. Therefore, even if the code is the same, the appropriate type specifications may differ from project to project.

However, I believe these ideas have already been discussed in detail by the core team members. Or perhaps these ideas are not possible at all. I am writing here simply because I feel the need to express my feelings, as these views are in the minority in the community.

(Keep in mind that I am more of a hobbyist programmer and am not driven by the need to prevent errors in my work. I always have respect for professional programmers.)

@asterite can you comment on how much would be gained by changing to explicit requires?

Just reading through the comments here to remind myself of what everyone said, it sure seems like people love a lot of the things that make Crystal Crystal … except for the Ruby-like require. If that alone could get us some miles towards a better LSP / faster compiles, it seems like the community wouldn’t complain.

Re. type inference: it seems like this one is more of a spectrum than a binary. Crystal already asks us to provide types at times. I wonder if there could be some kind of performance mode:

crystal -p0 my_entry_point.cr
crystal -p1 my_entry_point.cr
crystal -p2 my_entry_point.cr

so with low performance, it does lots of type inference. But then you bump that to p1 and you start getting “could not infer type of XXX in p1 mode, please provide it explicitly”. I’m used to seeing those as part of my Crystal development. If the compiler asked me to do that a little more in exchange for faster compile times / a better LSP, I wouldn’t mind. I really miss my nice LSP experiences from JS/TS/Python etc.!

It might be kinda weird having to add a “compiler performance level” to your shard.yml, or people offering help always asking “what performance level were you at”, but on the flip side it might be interesting to think of it as a “compiler smartness” setting. You choose whether you want a dumb-and-fast or a smart-and-slow compiler. Struggling with something? Maybe smarten it up a bit and get some help. Wanna go faster? Dumb it down and do a little more typing.

Crazy?! maybe. I really don’t know what I’m talking about. Just sharing my feelings :upside_down_face:

1 Like

I disagree. The reason Ruby adopted this solution is that Ruby didn’t support type declarations from the start, but Crystal does. And the advantage of type declarations is that, for the reader of the code, the types and the logic are very clear. External type files can only be a supplement for missing type declarations. I guess it could probably be partially implemented, in some way.

2 Likes

My ADHD-riddled brain is having issues concentrating on the entire thread, so apologies in advance if I miss something.

Crystal’s compilation speed is, at most, a mild annoyance for me, but one that tends to grow as a project gets larger. But maybe I’m just used to long compilation times. I also don’t use an LSP, which I gather compiles things sometimes? So I may just not notice the pain of long compilation times nearly as much.

A full compile of Benben on my desktop (Core i9-10850K) takes approximately 1 minute 7 seconds, not counting the time it takes for Shards to pull down dependencies. That’ll probably only get longer and longer as I expand its capabilities. Adding incremental compilation would help since the majority of my compilations come after changing just a single value or something very small. But, I would still be stuck in that edit->compile->test loop. What I would rather see in a hypothetical “Crystal 2: Remilia Edition” is a move to something closer to image based development, which would probably solve all the problems mentioned here. But, I dunno how feasible that is with something like LLVM.

I’m not part of the Crystal team, but I figured I’d give a response here based on my personal experience.

Early on when I was first learning this language, I remember getting burned when I didn’t specify types. Not in the sense that the compiler was producing incorrect code or that the language was ill-designed, but I kept running into one of two things:

  1. Obscure error messages that pointed to locations that had nothing to do with the actual problem.
  2. Subtle type errors because of typos I made, which caused the compiler to infer a union type where one wasn’t intended.

This led to me saying “to hell with this” and instead adding types everywhere I could except in instances where it was either painfully obvious what the code did, or it was a super simple method that could easily be debugged. Maybe this makes my code look overly verbose or overly typed, but it has saved me a ton of headaches. Rather than leaving off types by default, I make that the exception.

The actual bug in question was in my unfinished port of Doom to Crystal, and it took me about a week to finally figure out what was going on (it was an unintentional int->float conversion, iirc). I’m still undecided on whether or not to point blame at a poor error message in the compiler, or the “types are optional” design. But regardless, it influenced me to add types almost everywhere in my code.

:thinking: What would you call a “small” project? And why would you not recommend Crystal for something “large”? That’s kind of a concerning thing to hear :sweat_smile:

7 Likes

ADHD-riddled brain … compilation speed is, at most, a mild annoyance for me … 1 minute 7 seconds … majority of my compilations come after changing just a single value or something very small.

As a fellow neurospicy individual, the thought of waiting over a minute to change one line of code makes me squirm in my chair :joy:

I also don’t use an LSP, which I gather compiles things sometimes?

This is how I work in ~90% of languages:

  1. I write us, editor suggests user_values, I hit tab and the word is inserted.
  2. I write ., a menu comes up with suggestions, I see sum, write s, hit tab.

That’s six keystrokes, which in combination with the suggestions from the LSP means that I can essentially offload a large chunk of my focus to the LSP, and speed up my typing by quite a large amount.

This cycle is either:

  • Hundreds of milliseconds (Haskell, F#, Elixir, C#, Java, Kotlin, Swift…)
  • Tens of milliseconds (Erlang, Go, Zig, Rust, Ruby…)

For Crystal, that same cycle is the duration of the entire compilation time, as there’s no incremental compilation. If I was working in your project, it would mean that my us -> user_values workflow would take 1 minute and 7 seconds. By that time, I’m 1½ hours into doom scrolling the internets.

1 Like

Go and Dart do these things very well (with LSP).

But, those languages will never become Crystal.

AFAIK, Ruby programmers rarely write code like this.

How about we introduce a new kind of encapsulation, similar to how FFI works? For example, something like this:

mod Foo
  class Bar # exposing top level `Bar` class
    fun foo(x : self)
    fun self.bar
  end

  fun foo(a : Int32, b : Int32) # exposing top level `foo` method
  fun foo(a : Int32, b : String)
end

Basically, it is a wrapper that explicitly declares all required methods with their associated types beforehand. Since everything in this mod is explicit, it can be compiled separately and cached independently.
It works like a shared C library, but it can smartly resolve Crystal types by comparing memory layouts.

We do not need to make everything a mod; we just write a wrapper when we identify a piece of code that can be isolated and is slow to compile.

The advantages of this approach:

  • it does not change the current language design, it just extends it
  • it is flexible, because it can be written by users instead of library authors and can be modified at will
  • it can be automatically generated to some degree (like libgen)
  • code can be reused for multiple endpoints and compiled separately (and ideally in parallel)
  • everything inside a mod can be a black box, so we no longer need to keep a single huge AST

There might be some challenges around how to split the prelude/runtime code depended on by each mod, though.

Obviously, you can also write something like this:

mod Foo
  def foo(input : String) : String
     do_something_with_string
  end
end

Basically, it acts similarly to a current module, but with explicit types and auto delegation. It also takes care of type casting, like current FFI methods do.

Go and Dart do these things very well (with LSP).

But, those languages will never become Crystal.

AFAIK, Ruby programmers rarely write code like this.

Do I understand correctly that you’re suggesting that Go and Dart are the exceptions? If so, you have it backwards.

To expand on the list I gave above, here’s a list of languages that I’ve spent a non-trivial time on, with respect to their LSP/editor tooling situation:

  • None

    • Janet (LISP → C)
    • Fennel (LISP → Lua)
    • Most schemes
  • Some

    • JavaScript (partial due to weak types)
    • Crystal (incomplete compilation, very slow compile times)
  • Incomplete implementation, fast

    • Nim (kitchen sink language with 3+ layers of syntax sugar)
    • Zig (extremely fast)
  • Full, but slow

    • Haskell (it’s fine until it isn’t)
  • Full

    • C#, F#, Java, PHP, TypeScript, Ruby, Elixir, OCaml, Lua, C, C++, all of the Pascals, Swift, Dart, Kotlin, Scala, Python, Common Lisp, Clojure, Clojurescript… and a tonne more I can’t remember
  • Full, at blazing speeds

    • Go (single milliseconds)
    • Rust (faster than I can react)
    • Erlang (single milliseconds, written in C)

In that list you have:

  • Dynamic, weakly typed languages
  • Dynamic, strongly typed languages
  • Statically typed languages without inference
  • Statically typed languages with inference
  • Procedural languages
  • OOP languages
  • FP languages
  • All-of-the-above languages
  • Languages that also use LLVM as a backend

Replace object.method with package.function, module:function, or namespace/function for whatever flavor of language you use; it’s all the same regardless of the underlying implementation.

The functionality that this provides is:

  • Inline errors
  • Autocomplete
  • Suggestions
  • Go to definition
  • Global rename
  • Refactoring
  • Snippets

That’s the beauty of LSPs/editor tooling; you get to have all the nice things regardless of what your language is. As of today, not having a good LSP implementation is increasingly hard to justify.

The same goes for compile times. Our low end machines are millions of times faster than the ones used to send humans into space, and they’re falling over converting one form of text into another.

For the Lucky example, the 37 seconds are (visually, to me as an end user) spent to create this:

class Note < BaseModel
  table do
    belongs_to user : User
    description : String
  end
end

Because there’s no incremental compilation, I would imagine that 36.9 of those seconds are spent not generating a file that’s 6 lines long.

I want to apologize if I come across as negative, I realize that this is a difficult problem to solve and that a potential solution, whichever it is, might turn off some users.

I’m writing this because I love Crystal. It’s an amazing language… with a developer experience that’s decades behind the majority of (known, active, used) languages out there, including ones that are as old or younger than Crystal.

This is by no means critique of Crystal as it is now, rather what I perceive as the risk of stagnation in this area for the foreseeable future. I don’t think I’ve ever read a thread about the language on a forum without a couple of people writing something to the effect of “amazing language, but I stopped using it because of the very long compile times”.

5 Likes

It’s not about explicit requires. It’s about that and typing every method.

When a language has these two things, every file/module can be looked at almost independently of other modules. For example, if the compiler sees this:

# foo.cr
class Foo
  def self.bar(x : Int32) : Int32
    x + 1
  end
end

# bar.cr
require "foo"

Foo.bar("a")

First, foo.cr can be looked at and typed without needing actual code that calls Foo.bar. Also, foo.cr doesn’t require any file (well, I guess the prelude, but that never changes), so there’s no need to type that file ever again unless the prelude changes or that file changes. The compiler can cache that information, and even cache the generated object file.

Then, bar.cr requires foo (and the prelude). It knows the type of Foo.bar without having to “instantiate” it with an actual call, and can determine that the call Foo.bar("a") is an error. Also, bar.cr only needs to be re-typed if it changes, if foo changes, or if the prelude changes. The compiler can build a graph of files and know exactly what needs to be retyped when something changes.
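As a rough sketch (purely hypothetical, not actual compiler code), that graph of files amounts to a dirty-propagation check like this:

```crystal
# Hypothetical sketch of dirty-propagation over a require graph:
# a file needs re-typing if it changed, or if anything it requires
# (transitively) changed.
def needs_retype?(file : String, changed : Set(String), deps : Hash(String, Array(String))) : Bool
  return true if changed.includes?(file)
  deps.fetch(file) { [] of String }.any? { |dep| needs_retype?(dep, changed, deps) }
end

deps = {
  "bar.cr" => ["foo.cr"],
  "foo.cr" => [] of String,
}

puts needs_retype?("bar.cr", Set{"foo.cr"}, deps)    # foo.cr changed => true
puts needs_retype?("bar.cr", Set(String).new, deps)  # nothing changed => false
```

A real implementation would additionally cache type information and object files per node, invalidating only the affected subgraph.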

Compare this to Crystal now. In Crystal we have modules. For example this:

module Moo
  abstract def foo
end

Okay, Moo defines a foo method… Now we can have this:

require "moo"

class Foo
  getter moo : Moo

  def initialize(@moo : Moo)
  end
end

Okay, Foo holds a Moo. What’s the problem with that?

Do you know how Crystal implements modules (and types in general)? The compiler needs to know all possible types to be able to produce code. It handles calling methods on them by checking which type it is.

So, let’s say we have this code:

require "moo"
require "foo"

class MyMoo
  include Moo

  def moo
    1
  end
end

foo = Foo.new(MyMoo.new)
foo.moo.moo

Now the compiler knows that Moo has one child, MyMoo. The compiler will still know that Foo holds a Moo. When resolving foo.moo.moo it will say:

  • okay, foo.moo is of type Moo
  • foo.moo.moo is calling moo on something of type Moo. But that’s a module… so what are the possible values? Ah! The only type that includes Moo is MyMoo. Cool, so we have to call MyMoo#moo here. No other way.
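As a small self-contained sketch (hypothetical; real codegen differs), with MyMoo as the only includer the dispatch collapses to a single concrete call:

```crystal
module Moo
  abstract def moo
end

class MyMoo
  include Moo

  def moo
    1
  end
end

# Rough illustration of the dispatch the compiler can generate when
# MyMoo is the only type that includes Moo: one case arm, no real choice.
def call_moo(value : Moo)
  case value
  when MyMoo then value.moo
  else            raise "unreachable"
  end
end

puts call_moo(MyMoo.new) # => 1
```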

Now… let’s say I create a new file in my project, that I include in my compilation:

class AnotherMoo
  include Moo

  def moo
    "hello"
  end
end

Now… this new file actually affected existing files! Now a Moo can be a MyMoo or an AnotherMoo (I know that we only pass a MyMoo to Foo, but the compiler only knows that Foo holds a Moo), so it will have to re-compile that old code, consider those two values, and do a multi-dispatch. Not only that! Notice that this new moo returns a String. So now the moo call even returns a different type, so anything relying on that call has to be recompiled.

It gets worse.

class Generic(T)
  include Moo

  def initialize(@x : T)
  end

  def moo
    @x
  end
end

Great! Now we have a Moo that’s included in a generic type. Now the compiler needs to know all possible instantiations of Generic to be able to know how to resolve Moo#moo. And a generic instantiation can happen anywhere! The compiler needs to look at every single file in the project to see if someone creates a Generic(Int32) or a Generic(String).
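A minimal self-contained sketch of why this hurts: two instantiations, possibly in files the compiler hasn’t looked at yet, widen the type of every moo call made through the module:

```crystal
module Moo
  abstract def moo
end

class Generic(T)
  include Moo

  def initialize(@x : T)
  end

  def moo
    @x
  end
end

# Two instantiations, which could live in completely different files:
a = Generic.new(1)       # Generic(Int32)
b = Generic.new("hello") # Generic(String)

# Anything typed as Moo must now account for both instantiations,
# so the type of a moo call becomes the union of their return types.
moos = [a, b] of Moo
puts typeof(moos.first.moo) # the union of Int32 and String
```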

So now you know why the compiler compiles everything from scratch.

Explicit requires don’t solve this.

What solves this is to require the definition of Moo#moo to have a type:

module Moo
  abstract def moo : Int32
end

Now:

  1. We can easily type a call to Moo#moo: it always returns an Int32!
  2. We’ll eventually have to generate code when Moo#moo is called… do we need to know all possible Moo types? Not necessarily. Languages implement this with virtual tables (I won’t explain this here)
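To make point 1 concrete, here is a small sketch: once the abstract def carries a return type, a caller can be typed against the interface alone, without knowing the set of includers.

```crystal
module Moo
  abstract def moo : Int32
end

class MyMoo
  include Moo

  # Every includer must honour the declared return type.
  def moo : Int32
    1
  end
end

# The caller is typed against the interface alone: the result of
# m.moo is known to be Int32 without seeing a single includer.
def use(m : Moo) : Int32
  m.moo + 1
end

puts use(MyMoo.new) # => 2
```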

And this doesn’t work if some types are typed and some aren’t (gradual typing). If you have at least one module that isn’t typed, the compiler will have to look at the entire source code.

To improve compile-time speeds, it has to be all or nothing: either we change the language to require explicit imports and types on every boundary (every method declaration; we introduce interfaces or protocols, etc.) and get faster compile times, or we don’t, and the language will always take more and more time to compile as your project grows (which doesn’t scale!)

Why do I say this if it sounds negative? So you can have an informed decision about whether to use Crystal for your project or not. I still use Crystal for my hobby projects and it works great.

12 Likes

Great write up as always, thank you.

So, where does that put us? What does the rest of the core team say? Can we hear their voices?

Please correct me if I’m wrong, but it sounds like there are two options:

  1. The language stays the same, and will always have subpar compile times and editor tooling. The (official?) position of the core team is “don’t use Crystal for big projects”(?).

  2. The language changes, and (potentially) gets orders of magnitude better compile times and, by extension, editor tooling. The position of the core team is “use Crystal for any project”(?).

Option 1 sounds like the touch of death.

In my opinion, I do not consider Crystal’s compilation speed to be slow to begin with, so I personally wouldn’t pursue incremental compilation (or the interpreter, for that matter).

6 Likes

I do not consider Crystal’s compilation speed to be slow to begin with

Do you mean for your purposes, or compared to other languages?

If the former, how do you manage large projects? I’ve worked in (C++) projects where compilation times were on the order of tens of minutes, and it was really painful. CI/CD was even worse, and down the line, expensive.

If the latter:

Go, ~1000 lines

$ time go build .

real    0m0,120s
user    0m0,177s
sys     0m0,065s

Rust, 3 lines, “Hello World!”, LLVM

$ cargo build
   Compiling rust v0.1.0 (/home/.../programming/rust/hello_world)
    Finished dev [unoptimized + debuginfo] target(s) in 0.22s

Crystal, 1 line, “Hello World!”, LLVM

$ time crystal build hello_world.cr

real    0m2,975s
user    0m1,705s
sys     0m0,793s

I’ve worked in huge Go projects (millions of lines) where compilation took a few seconds.

I’ve worked in smaller Rust projects where (subsequent) compilation took, at worst, tens of seconds. Similar experience with C#, F#, Java and Swift.

Crystal is not these languages. But it is, and will be, compared with those languages.

This topic got hot all of a sudden!

Thanks @asterite for providing your thoughts, and exploring so much of this space in the first place. It’s really educational, and your example helped solidify the challenges moving forward.

My own experience reflects @MistressRemilia’s - I come from an enterprise Java background where even simple Java projects take a noticeable amount of time to build (unless using an IDE, which is a requirement for Java development at this point, but doesn’t solve CI/CD now taking forever), and the build times of Crystal are mere annoyances in comparison. My largest hobby Crystal project clocks in at ~7300 lines of code, and takes ~55 seconds to compile from a cold start, usually ~30 seconds after that. I also don’t use an LSP at all (Sublime has pretty good prediction with regular expressions, and I haven’t felt the need for one yet).

Also like @MistressRemilia, I run into an inflection point with new projects, where I’ll start by adding no typing information to anything, eventually get to a point of “Well dang, what was the type of this argument again? Sheep” (with potentially stronger language), and then react by adding types to everything.

I could see a future of Crystal development such that:

  1. New and small projects continue to build and develop as they are today
  2. This inflection point is encountered by a project when it gets $LARGE (for arbitrary values of $LARGE)
  3. Project adds typing information to everything, able to take advantage of incremental compilation, development continues.

The two crystal changes that I could see helping this (both suggestions were provided above, and I think both would be needed) are:

  1. A new crystal tool that adds missing typing information to methods once type inference is complete (in my mind it would operate similarly to the existing crystal tool format), to quickly add typing everywhere it’s needed.
  2. A new compiler flag that would fail fast whenever it encountered a new signature that was missing typing information (rather than having multiple “tiers” of compilation speed, I’d prefer just to have a one-and-done approach).

I think with these, crystal would be a viable approach for larger projects going forward. It wouldn’t help with the LSP problem for smaller projects as much. I wonder if there’s a different approach that could be used for that?

8 Likes

With the recent improvements to the compiler, Crystal is pretty usable now if I use it with a modern desktop CPU (AMD Zen 3); I presume it is even better if you use an Apple M1/M2.

Still, the current compiler is not fast enough for us to have a decent LSP (it only works well for tiny files), nor is it fast enough for a typical webdev workflow. The frustrating part is that you can’t even throw money at the problem; improving the CPU specs only helps to a certain point.

Incremental compilation might not be needed, but at the very least we need to find some way to utilize the hardware better.

2 Likes

Just my two cents, but I’ve always considered the interpreter a better solution than incremental compilation (when it comes to Crystal). Based on what I understood as the problem, and what was confirmed above by @asterite (that everything needs to be compiled from scratch in order to account for the interweaving of types), using the interpreter seems like the best approach, since it’s already running and just changes types at runtime.
Ideally, I’d like Crystal development to be done with the interpreter, with longer compile times only for production, which seems like a good tradeoff to me.

5 Likes

I want to start by saying I was working on the yearly plan for Crystal to let the community know our coming focus of attention. This will be in a separate message, but long story short, we are focusing on strengthening what we have today, and after a couple of releases we plan to put our attention to tooling and compilation times. We agree this is a very important topic, and it’s also one that requires significant time and effort.

There is an obvious tension between expressiveness and efficient compilation. To me, requiring annotations everywhere and giving up the dynamic feel of the language is a no-go. That’s what makes Crystal Crystal, and I would first explore any possible alternative before making a Java with nicer syntax.

As for alternatives, let me stress that incremental compilation and LSP/IDE tooling are not the same thing. For instance, one can imagine an autocomplete that is imperfect: one that works in specific scenarios where the type is obvious from the context, or that simply returns everything and just saves you some typing with fuzzy search. That would already improve the situation.

Regarding compilation, and the slow process of changing → compiling → testing, I would like to know if people tried the interpreter. In my experience it saves significant time. For instance, we can agree that the compiler is in itself a big project, yet running a specific spec takes little time.

Also, unlike Ary, I’m not yet convinced that we can’t save significant time by locally propagating changes from a previous compilation… But this requires further research.

Closing remark: what an interesting topic to have a panel in CrystalConf 2023! ;-)

15 Likes

Note that I haven’t been doing any Crystal-related work for a long time, so considering me a core team member isn’t fully correct at this point. And it’s only my opinion, not the entire core team’s opinion; I’m sure everyone will have a different one. I’m only saying what I’m saying based on what I heard from others using Crystal in big projects. But, like all things, everything is a trade-off. If it’s okay for you to wait a bit longer for compilation, but you really enjoy Crystal and the benefits it gives you, you can try it with bigger projects.

Also, I’m glad that my comment wasn’t taken negatively, and that it sparked this great discussion. Thank you! :heart_decoration:

17 Likes