Why can't the Crystal compiler use multiple cores when compiling an application?

What I mean is:

When I build an app like the following:

$: CRYSTAL_WORKERS=16 /usr/bin/crystal build --release src/procodile.cr

I observe that only one core is ever at 100% usage.

Thanks

As of pull request #14748 in crystal-lang/crystal ("Compiler: enable parallel codegen with MT", by ysbaddaden), the compiler is able to use multiple threads for codegen when not in release mode. The rest of the process is single-threaded, though.

Sorry, I didn’t understand.

  1. Should we build the compiler itself with -Dpreview_mt, so that this feature is enabled?
  2. Or do we still build the compiler as usual, and passing -Dpreview_mt when building the app enables this feature?

Thanks

My understanding is that it applies when you build your application with -Dpreview_mt in non-release mode.

While discussing this in a Telegram group, some people said that you need to add this flag when compiling the compiler itself, so I asked this question in the issue too.

To clarify this a bit:

  • The compiler frontend cannot make use of any multi-threading at all.
  • The LLVM backend can run parallel codegen in multiple threads, on the condition that it generates code for multiple compile units. As stated in the description of #14748, this is not possible with the --single-module flag, and by extension with --release and --cross-compile (because they imply --single-module).

Effectively, the compiler can only make limited use of multi-threading for the codegen stage of development builds.
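
To make the rule above concrete, here is a minimal sketch of it as a shell function. The name codegen_mode is made up for illustration and is not part of the Crystal toolchain. It only inspects the flags listed above and does not call the compiler.

```shell
#!/bin/sh
# Hypothetical helper (not part of Crystal): reports whether a set of
# `crystal build` arguments leaves multi-unit codegen available.
codegen_mode() {
  for arg in "$@"; do
    case "$arg" in
      --release|--single-module|--cross-compile)
        # These flags are, or imply, --single-module: there is only one
        # compile unit, so codegen has nothing to run in parallel.
        echo "single-module"
        return 0
        ;;
    esac
  done
  # No such flag found: codegen works on multiple compile units.
  echo "multi-unit"
}

codegen_mode build src/procodile.cr            # prints: multi-unit
codegen_mode build --release src/procodile.cr  # prints: single-module
```

The command from the original question falls into the second case, which is consistent with only one core being busy.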

Of course, this requires that the compiler itself was built with -Dpreview_mt. I’m not aware of any distribution package that builds the compiler with this flag at the moment, so you’d have to build it yourself.

Sorry, this is still not entirely clear to me. Is the following description correct?

  1. Build the compiler itself with -Dpreview_mt enabled.
  2. Use this compiler to compile the application without --single-module and with the environment variable CRYSTAL_WORKERS=4 set. Will this run parallel codegen in multiple threads?

I replied on GitHub.

Let’s keep the active usage discussion here instead of resurrecting a closed PR.

Yes, those are basically the steps you’d want to follow. You may want to adjust the value of CRYSTAL_WORKERS to fit your system specification (i.e. the number of cores you want to utilize). CRYSTAL_WORKERS=4 is the default value.

You should make sure to build the compiler with --release for best optimization.
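
Put together, the steps might look roughly like the script below. This is a sketch under assumptions: it expects a checkout of crystal-lang/crystal at the path in CRYSTAL_SRC, an existing Crystal installation to bootstrap from, and a repository Makefile that accepts release=1 and FLAGS=... (check `make help` in your checkout). The CRYSTAL_SRC and DRY_RUN variables and the run and build_steps helpers are made up for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# CRYSTAL_SRC, DRY_RUN, run() and build_steps() are illustrative names,
# not part of Crystal.
CRYSTAL_SRC="${CRYSTAL_SRC:-./crystal}"   # assumed path of the compiler checkout
DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print the commands

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

build_steps() {
  # 1. Build the compiler itself in release mode with -Dpreview_mt
  #    (the Makefile variable names are an assumption).
  run make -C "$CRYSTAL_SRC" crystal release=1 FLAGS=-Dpreview_mt

  # 2. Use that compiler for a development build of the application:
  #    no --release and no --single-module, worker count set explicitly.
  run env CRYSTAL_WORKERS=4 "$CRYSTAL_SRC/bin/crystal" build src/procodile.cr
}

build_steps
```

Set DRY_RUN=0 to execute the commands instead of printing them.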

You can get more detailed output about compiler performance with the --stats flag. It shows the time spent in the individual compiler stages. Only LLVM codegen (Codegen (bc+obj)) makes use of multi-threading.
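
One way to see the effect is to build with --stats under different worker counts and compare the Codegen (bc+obj) line. The sketch below only prints the commands to run. stats_command is a hypothetical helper, and the worker counts are arbitrary examples.

```shell
#!/bin/sh
# stats_command is an illustrative helper, not part of Crystal.
stats_command() {
  # $1: number of worker threads, $2: entry file of the application
  printf 'CRYSTAL_WORKERS=%s crystal build --stats %s\n' "$1" "$2"
}

# Print one build command per worker count. Run them one at a time and
# compare the "Codegen (bc+obj)" timing between the runs.
for workers in 1 4 16; do
  stats_command "$workers" src/procodile.cr
done
```

The other stages should show about the same timings regardless of the worker count, since only codegen is parallel.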

Please note that without preview_mt, LLVM codegen on Unix systems already runs in multiple processes. That was the case before #14748 and is still the case after it. The change only affects how parallel codegen works with preview_mt, and it enables parallel codegen on systems that don’t have fork, i.e. Windows.
On a Unix system, there will likely not be a hugely noticeable difference with preview_mt.