Faster `--release` compile times but slightly worse performance?

I agree with @RespiteSage. Crystal’s compile times are already a pain point for many. I don’t think we want the default to take even longer.

That being said, the interpreted mode can eventually be an option for quick compilation speed while sacrificing runtime performance. As such it might take over some of the use cases for the current, unoptimized default mode.

Yeah, somebody with a very large project might find this option useful…

If I’ve understood it correctly, separate compilation units are the default in C and C++, and they are one of the driving forces behind header-only implementations.

Also, this relates to the link-time optimization (LTO) support that was recently removed, because the whole point of LTO is to be able to optimize across compilation unit boundaries. There used to be checks in the compiler that put release mode into a single file unless LTO was enabled; now it does so unconditionally. LTO in itself is not super fast, though. I’m not certain how the compile-time speedup from separate compilation units plus LTO compares to just putting everything in the same compilation unit, or, for that matter, how well the result is optimized compared to putting everything into a single file.

@asterite I don’t know how easy it is to enable (it needs to be enabled through the whole compilation chain, so it is more than just adding a linker flag), but if it is possible to run a similar benchmark with LTO flags supplied, that would also be very interesting.

It’s definitely harder than the one-line change I had to make, so for now I’ll drop it (I don’t have the time or the will to try something more complex).

I think the use cases for release and for development are mutually exclusive.

Use Case - PRODUCTION RELEASE

Many folks are fine with this taking a bit longer in exchange for better performance.

  • Compilation speed: can be slower, but not that slow
  • Performance: the best we can get

Use Case - DEVELOPMENT FEEDBACK LOOP

People need this to be faster so that tests run quickly and the development feedback loop stays fast.

  • Compilation speed: must be faster
  • Performance: can be moderate

I agree with @pseudonym on having -On (where n is 0, 1, 2, 3…), with --release being an alias for e.g. -O3 and the default being -O0… and maybe in the future adding a -Os for binary size optimization. So there would be no behavior change and everybody could be happy 🙂

Is there any update on this? I don’t agree that there is no use case for it, as some of you claimed.

I think the most important aspect is attention span: how long we can stay focused before getting distracted.

Development always involves experimenting: you tweak the source code, run it through the test samples, inspect the output, and tweak the source code again. Depending on the task, sometimes non-release mode is just too slow at runtime, and the compilation time of release mode doesn’t make the whole process any faster either. So it’s always nicer to have another option to choose from.

Also, I think in real-world scenarios, multiple LLVM files may not be much slower than a single fully optimized LLVM file, since code usually tends to be fairly isolated anyway.

And if I understand correctly, each of those LLVM files can be cached separately, right? I have multiple files that share the same common library but are compiled to separate binaries for different purposes. I guess they could reuse the big common part to some extent?

What if we added more verbose output to crystal build to let the user know what’s happening while they wait? I noticed Rust’s cargo build is very verbose, and that makes it “feel” faster, even though I have to wait just as long.

I have mixed feelings about more output during builds. Normally, I like minimal output by default, so as to not slow the build down. But I can imagine that some sort of verbose option could sometimes be helpful (e.g. for debugging).

There’s -s (short for --stats), which shows some steps of the build process.

To me, the Crystal compiler is actually quite fast. It finishes its job in under 5 seconds, which is acceptable to me for a compiled language with global type inference.

The painful part is the Codegen (bc+obj) phase, which always costs a few minutes even though the majority of the code has not changed.

Personally, I have multiple entry files (acting as long-running processes) that reuse a single large core; the differences are just several dozen lines of code. And yet the compilation time for each of those entry files remains the same. This quickly eats up patience.

Well, there are several ways to sidestep the issue (like combining all the entry files into a single binary and using option_parser to dispatch the tasks), but I think that is irrelevant to the subject.
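
As an aside, the single-binary workaround mentioned above could look roughly like the sketch below. It uses Crystal’s standard OptionParser; the task names, the --task flag, and the helper methods are hypothetical, since the thread doesn’t show the real entry points.

```crystal
require "option_parser"

# Hypothetical tasks standing in for the separate entry files;
# each one would call into the shared core in a real project.
TASKS = {
  "worker"    => ->{ "running worker" },
  "scheduler" => ->{ "running scheduler" },
}

# Parses the arguments and returns the name of the task to run.
def select_task(args : Array(String)) : String
  task = "worker" # assumed default task
  OptionParser.parse(args) do |parser|
    parser.banner = "Usage: app --task=NAME"
    parser.on("-t NAME", "--task=NAME", "Task to run") { |name| task = name }
  end
  task
end

# Looks up the selected task and runs it.
def run(args : Array(String)) : String
  name = select_task(args)
  handler = TASKS[name]? || raise ArgumentError.new("unknown task: #{name}")
  handler.call
end

puts run(ARGV.dup)
```

Built once, such a binary would be invoked as ./app --task=scheduler, so the large shared core would go through codegen a single time instead of once per entry file.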

Anyway, my suggestion is:

  • -O 0 maps to the current non-release mode
  • -O 1 is like -O 0, but strips all debug information
  • -O 2 is the “split into multiple files” mode
  • -O 3 is the same as the old release mode

Personally, I want --release to be -O 2 and --optimize to be -O 3, but keeping --release as -O 3, as before, is fine too.

I think for most use cases devs would prefer:

  • During development - Faster compilation speed, even if the executable perf is lower. The ability to get quick feedback is important for a lot of projects.

  • For release - Faster executable perf, even if it’s at the cost of compilation speed. Once your project is ready for production, the ability to iterate faster doesn’t matter.

I am not sure why we are even talking about the --release mode for these optimizations. As a developer, 90% of the time I am using normal builds, and speeding those up would be great for the developer experience.

I think @eliasjpr is right about this:

  • we should definitely do the compile time optimizations
  • we should do it for non --release modes
  • when users use --release, just continue with the current approach

Don’t mix oranges and bananas.

Performance optimization levels must be kept separate from symbolization options (like stripping or obfuscating symbols).

Without strawberries

Corresponding pull request: Add incremental release compilation by kostya · Pull Request #13464 · crystal-lang/crystal · GitHub
