Choosing CPU for fastest compilation

I am interested in reducing the compilation time in debug mode as much as possible. (To speed up development iteration.)
What properties matter the most when choosing hardware?
I guess the thread count does not matter much, as the compiler uses only 1 thread in most stages of compilation.
What should I look out for: per-thread speed, specific instruction set extensions, or L1 cache size?

Will the newly started effort to make multithreading the default improve compilation speed? How much of the compilation process is parallelizable? Once this feature is released, will it be worth buying a CPU with lower per-thread speed but more cores? Or would even a multithreaded compiler be unable to make use of them?

Are there any cloud computing providers who offer VPS or similar with good compilation speed?

It could matter more in the future. See the pull request “Compiler: parallel codegen with MT” by ysbaddaden (Pull Request #14227, crystal-lang/crystal on GitHub).


The main parameter here seems to be “Single Thread Rating” / turbo frequency (and the CPU architecture has to be taken into account as well).

Current Ryzen 7/9 CPUs (Zen 4) score over 4000 points in Single Thread Rating.
https://www.cpubenchmark.net/cpu.php?cpu=AMD+Ryzen+7+7700&id=5169

Apple M3 Pro is close to the 5000 point mark in Single Thread Rating.
https://www.cpubenchmark.net/cpu.php?cpu=Apple+M3+Pro+12+Core&id=5750

My colleague’s real-world experience: an Apple M1 Pro is on par with (or a little faster than) an Intel 14700K in a --release build. In Crystal compilation, all those extra 14700K cores are pretty much useless. Note: his tests on the Mac ran under macOS from the fast internal SSD, while the tests on the 14700K ran under Fedora 39 from a slower external USB SSD, so the comparison is probably a little distorted, but still…

Cloud: server-side CPUs focus on the number of cores rather than on frequency and “single thread rating” :(

So a powerful desktop computer with a high-frequency CPU is probably the answer.


You’ll want a fast single core for every pass (semantic, …) except one: LLVM codegen.

In development builds, that is, without --single-module (which --release implies), you’ll benefit from many cores during that pass (see --threads), unless you’re on Windows, because the parallel codegen depends on fork. We hit thread safety issues in LLVM with the MT version (sadly).

I’m afraid that won’t change for the time being. We may eventually get some parallelism in some passes (semantic?), but I don’t see how the Crystal codegen pass, for example, could be parallelized, because of the aforementioned thread safety issues in LLVM :sob:
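To put rough numbers on the "fast core vs. many cores" trade-off, here is a minimal sketch using Amdahl's law, assuming only the LLVM codegen pass scales with core count and everything else stays on one thread. The function name and the 0.4 share are hypothetical, picked purely for illustration; they are not measurements of the Crystal compiler.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Overall speedup when only `parallel_fraction` of the runtime scales with `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Amdahl's law: the serial part is unchanged, the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # ASSUMPTION: 0.4 is a made-up share of build time spent in the parallel
    # codegen pass. Replace it with a share measured on your own project.
    assumed_codegen_share = 0.4
    for cores in (1, 2, 4, 8, 16, 32):
        speedup = amdahl_speedup(assumed_codegen_share, cores)
        print(f"{cores:>2} cores: {speedup:.2f}x")
```

Under that assumed 40% share, the speedup can never exceed 1 / 0.6, roughly 1.67x, however many cores you add, which is why per-thread speed stays the first thing to look at. To estimate the real share for a given project, the per-stage timings from `crystal build --stats` should be a reasonable starting point.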


Thanks a lot. (I would mark this answer as the solution alongside @pfischer’s, as both comments include useful info, but unfortunately this forum only allows selecting one comment.)

5 posts were split to a new topic: CPU Compile Speed Benchmarks