Sponsorship for Incremental Compilation

Hello,

My company is now supporting the Crystal project with a monthly donation (DocSpring on BountySource).

My number one priority is incremental compilation. I want to eventually port my entire Rails application to Crystal + Lucky, but I would only want to do that if my develop => test feedback loops can be extremely fast. (In the meantime, I’m going to start by rewriting a few of my background jobs and running them with sidekiq.cr.)

I noticed that incremental compilation is on the roadmap. (I also thought this Rust article was really interesting.) Has anyone started working on this, or are there any notes / docs I could read that talk about the challenges?

In addition to my company’s monthly donation, I would like to sponsor a developer to work full-time on incremental compilation, until it’s ready for real-world usage. (My company is still very small so I don’t have a ton of resources, but I would be able to sponsor the development of this one feature.)

I’m not too worried about the initial compilation times, but I would like to see a “hello world” program compile nearly instantly on the second run. And for a large web app using the Lucky framework, I would like to see sub-second type-checking and compilation times during development. (Especially if no files have changed.)

I know this will take a lot of work, but I think this would be a really fun challenge. I personally have very little experience with compilers, but I would also like to see if I can contribute some ideas and follow along with the development. (It would be great to learn more about how the Crystal compiler works.)

Are any compiler developers interested in working on this? How long do you think it would take, and how much would it cost to pay for development during that time?

Also, are there any other companies in here who might be willing to co-sponsor the development?

Thanks!

11 Likes

The compiler already caches LLVM output in a cache folder (on macOS these are the *.bc/*.o files under home dir/.cache/crystal//).
We just need to check that their timestamps are not stale relative to the original sources, and then load them instead of parsing the sources and generating LLVM bitcode again.
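To make the timestamp idea concrete, here is a rough sketch in Python. It is not the compiler's actual code: the function name `is_cache_fresh` is made up for illustration, and it assumes we already know which source files a cached file was built from, which may be the hard part in practice.

```python
import os


def is_cache_fresh(cached_path, source_paths):
    """Return True if the cached file exists and is at least as new as every source.

    Hypothetical helper: it assumes the list of sources behind a cached
    .bc/.o file is already known.
    """
    if not os.path.exists(cached_path):
        return False
    cached_mtime = os.path.getmtime(cached_path)
    # One source newer than the cached file is enough to make the cache stale.
    return all(os.path.getmtime(src) <= cached_mtime for src in source_paths)
```

If the check passes, the cached file could be loaded as-is; otherwise the unit would have to be regenerated. Timestamps are cheap to compare but can be fooled by things like checkouts or clock changes, so this is only a first approximation.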

I can look at it after I finish the debugging fixes.

I also want to look at not loading all of the code in prelude.cr if it is not needed, trimming the codebase down to only what is actually used. That would reduce the code size a lot.

1 Like

This is great, thank you! :heart_decoration:

> I noticed that incremental compilation is on the roadmap

That roadmap is a bit outdated, though of course incremental compilation, or in general improving compile time performance, would be really nice to have.

> In addition to my company’s monthly donation, I would like to sponsor a developer to work full-time on incremental compilation, until it’s ready for real-world usage.

This again would be great!

I believe incremental compilation might be possible to achieve; it’s just really hard to implement because of how powerful the language is, or how dynamic it is at compile time. The main example of this is how the compiler tracks generic instantiations and does multi-dispatch on them, but there’s no easy way to enumerate all of the generic instantiations. Or how you can’t easily tell (unlike in Rust) what a method’s dependencies are until you actually instantiate the method, and each method instantiation might depend on different things (we have neither interfaces nor protocols in Crystal). Another thing is macro run: it can do arbitrary stuff at compile time, so how do you avoid running it on each compilation?

With all of this, I’m guessing a big chunk of the work involved in implementing incremental compilation is determining how feasible it is and, if it is, what changes need to be made to the language to achieve it (if any), how to implement all of that, and only then finally implementing it. And that’s only if there’s a clear way to do it.

So if you are willing to sponsor all of those stages, I believe Brian and/or Juan can do a fantastic job (they are very smart, talented, and way more patient than I am).

3 Likes

The main issue is that it’s not immediately clear what all the dependencies of a given LLVM bc/ll file are, or how to compute them. That is the main challenge in incremental compilation.

1 Like

Could we record each dependency and its SHA digest in some file, check whether any digest has changed, and if so invalidate the whole chain that depends on it?
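Roughly what I have in mind, as a small Python sketch. Everything here is hypothetical: the manifest format, the function names, and above all the `depends_on` map, which assumes the dependencies of each unit are already known.

```python
import hashlib
from collections import defaultdict, deque


def digest(content: bytes) -> str:
    """SHA-256 hex digest of a unit's source."""
    return hashlib.sha256(content).hexdigest()


def changed_units(old_manifest, current_sources):
    """Units whose digest differs from, or is missing in, the stored manifest."""
    return {
        name
        for name, content in current_sources.items()
        if old_manifest.get(name) != digest(content)
    }


def invalidated(changed, depends_on):
    """Changed units plus everything that transitively depends on them.

    depends_on maps a unit name to the set of units it depends on
    (assumed to be known, which is the hard part).
    """
    # Invert the map: for each unit, who uses it?
    dependents = defaultdict(set)
    for unit, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(unit)

    stale = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents[current]:
            if user not in stale:
                stale.add(user)
                queue.append(user)
    return stale
```

With `b` depending on `a` and `c` depending on `b`, a change to `a` marks all three as stale, while a change to `c` only marks `c`. Content digests avoid the false positives that timestamps can give, at the cost of reading every file.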

Another option might be to make existing compilation faster by using multiple cores.

I have no idea how feasible that is, but if it is easier, that would also be a huge boost. Lots of CPUs have 6 or more cores, so being able to use those at compile time would be huge. But maybe this is extremely hard, I don’t know :wink:

3 Likes

Thanks for your reply!

Ah, I can see that there are still a lot of unknowns and it will be very difficult to estimate how long this might take. So I have probably underestimated how complicated this will be! I was originally thinking that this might be something like 2-3 months, but I can see that it could easily turn into 6-12 months (or even longer.)

I have also been talking to Nico from Manas about this possibility! And it would be great if some other companies could also help to sponsor this work. It sounds like it will be an enormous project, so now I’m actually not too sure if I can cover all of the development, but I’m definitely still interested in contributing!

3 Likes

I think there comes a point where the compilation speed tops out (or, should I say, does not get exponentially slower). Over the past 3-5 months I’ve added around 1k lines of code, but compilation still takes roughly 3-7 seconds.

Last year, it was still around 3-7 seconds. I initially thought the compiler was going to poop on me and start being slow af, but it’s actually not that bad. Plenty fast for the amount of spaghetti code I throw at it.

I’ve learned over the years, if I treat the compiler well, it will treat me well.

I’ve also learned, the compiler doesn’t give a *** about my feelings, so I need to be careful.

2 Likes

Do you want incremental compilation or just “really fast” feedback loop on developing web apps? Just wondering…

1 Like

I don’t know what @nbroadbent is thinking, but for me personally I would like it to be faster. How that’s done is not relevant (IMO). So if there is no incremental compilation but it’s faster/uses more cores/whatever that’s great!

3 Likes

Thanks for the info girng_github! 3-7 seconds isn’t too bad.

My information was coming from this Medium article, where an Amber app with 100 models would take 6 minutes and 48 seconds to compile on a decent MacBook.

My Rails app isn’t extremely big, but it currently has about 200 Ruby files and 50k lines of code, plus around 200k lines of code from all of the gems that I’m using (and I’m hoping to eventually port most of these to Crystal). So my impression is that the Crystal compiler would probably struggle at this scale.

Incremental compilation seems like a clear solution, but I’m definitely interested in supporting other ways to speed up compilation.

1 Like

I should also mention that I’m not a big fan of splitting things up into micro services, and I really prefer working on a single application in a monorepo.

I’ve also considered Sorbet for Ruby, TypeScript, Nim, Elixir + Dialyzer, Python + mypy, and a few other options. But I keep coming back to Crystal. It’s a really beautiful language, so I would love to support the development and make it even better!

4 Likes

Granted he doesn’t do much with macros, or anything advanced (afaik), which is most likely the big time consumer when compiling.

IMO that article is a bit misleading. Just because one framework takes a good while to compile doesn’t mean the language itself is slow to compile; it just shows that the design and how you use the language can have adverse effects on compile times.

1 Like

It is still not clear why Amber has such slow compile times and it may have been resolved, but I did a similar test with Lucky when that article came out and the compile times were significantly faster. Part of it could be that Amber uses ECR and ECR is known to be slow. Lucky uses Crystal classes and methods for HTML which is much faster to compile.

I’m with you on not splitting things up into micro services (at least not until you really need to). I know a few people using Lucky with big projects and they’re hitting a bit over 10 seconds. Definitely not fast, but still workable IMO.

4 Likes

One other thing: I’ve changed how I work and it has made the compile times less painful. I tend to write more code upfront and have fewer cycles where I’m testing whether something compiles or not. Once I feel good about what I’ve written, then I’ll check the compiler. It’s nice because you can fix compiler errors and once they’re fixed it usually works.

The other nice thing is that when there is an error it appears much faster. If you open a Lucky project, write 1.foo in an action, and see how quickly it fails to compile, you’ll see what I mean. So I tend to use this to my advantage and write a bunch of code and fix errors as they come up, since they show up much more quickly. I also tend to write unit tests after the fact and do Compiler Driven Development for the first pass. It works pretty well!

4 Likes

That’s a really good point I didn’t notice before you said it!

Compiling the compiler takes about 32 seconds. However, that involves the semantic phase (around 10s) and the codegen phase (around 20s), and if there’s an error it can only happen in the first phase, usually long before it finishes. For example, I just made a change in a random file and it took 1.4s to get the compile error, which I think is pretty reasonable.

5 Likes

Nice! Glad it is fast even when compiling large projects. The only time this doesn’t work as well is when there is a runtime error you’re trying to debug or if there is something visual. So when I’m doing front end work it can be a bit painful. Still trying to figure that out, but I’ve been experimenting with making most of the CSS changes in the browser and then copy-pasting them back to the source code once it looks good.

1 Like

It might be worth finding out if this was the framework or the language. I thought macros were slow, but I recently read a benchmark that says they are faster than I thought.

Cool thing is, if you check out my WSL benchmark thread, compiling on native Linux is far faster than WSL. So my numbers are off if you are on native Linux. I’m on WSL.

Did you say 200k as in 200,000? Good lord LOL. That’s a lot of code!!

Haha!! I’d say not having @[MyAnnotation] abstract def public private annotation method names littered all over my code is far more advanced than having my eyes bleed while trying to read code.

2 Likes

It’s great to hear that the compile times aren’t as bad as I thought, and we can adapt to a slightly different dev/testing workflow.

I read through the conversation on this Reddit post about the spectator library, and I would love to use a full-featured spec library like this. (I also really prefer the “expect syntax” over Crystal’s “should syntax”, so that’s the main reason why I have been using spectator for my tests.) I’ve also been looking at the mocks.cr library, and mocking is also on the roadmap for spectator. I’ve learned that these libraries use a lot of macros, so it seems like I currently have to make a choice between powerful abstractions / productivity / developer happiness and much longer compile times. (Especially when working on a complex application with a large test suite.)

If incremental compilation makes it possible to cache a lot of the macro expansions and compilation, then we can “have our cake and eat it too”, and I think the development experience will be even more amazing. So I’m definitely still interested in supporting this, and I hope it’s possible to find a way forward!

3 Likes