Compile Performance Guidelines

I was looking into optimizing a program's runtime and it started me down a rabbit hole of compile-time performance. I have to admit I totally don't understand this or have a framework to really know what the trade-offs are when writing a program. I am just discovering more of this, but I wanted to see if anyone had any blogs or ideas on how to know what the compile-time performance is when writing Crystal?

As some fun numbers, here is a breakdown of compiling Crystal and my webapp on my laptop.

This might be because of the volume of usage rather than the actual impact per use, but the top offenders seem to be:

  • Semantic (main)
  • Macros
  • Codegen (bc+obj)
  • Codegen (crystal)
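(For anyone wanting to reproduce a breakdown like this: these phase names come from the compiler's `--stats` output. A minimal sketch, assuming the app's entry point is `src/app.cr` — the path is made up:)

```console
$ crystal build --stats src/app.cr
```

This prints the wall-clock time spent in each compiler phase, including the Semantic and Codegen phases listed above.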

I assume this means Macro runs? Last time I checked macros themselves don’t actually have as big of an impact as people think. Macro runs on the other hand compiles and executes a Crystal program inline while compiling another program, which by its own nature can have a big impact. But it’s not the same thing as macros themselves.
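To illustrate the difference (a minimal sketch — the file names and values here are made up): the `run` macro method compiles and executes a separate Crystal program while the main program is being compiled, and the helper's stdout is pasted into the caller as macro output.

```crystal
# power.cr — a standalone program, compiled and executed at compile time:
#
#     puts 2 ** ARGV[0].to_i
#
# main.cr — `run` builds and runs power.cr during compilation of main.cr;
# whatever it prints becomes the macro expansion:
VALUE = {{ run("./power", "10") }}
```

The cost the stats attribute to macro runs is that inline build-and-execute step, which is separate from ordinary macro expansion like `{% if %}` or `{{ ... }}` interpolation.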

However, macro run results can be reused, so you only incur the impact on the first cold run.

They’re even compiled in release mode if I remember correctly :see_no_evil:

I am not trying to blame macros. I am just trying to figure out what the impact of some code is.

Right, just wanting to clarify that your Macros label is actually Macro Runs, and not actually macro expansion in your application. This part of things is currently unattributed in the stats output, but you could rebase Comparing crystal-lang:master...straight-shoota:feature/compiler-stats-macro-time · crystal-lang/crystal · GitHub onto master to see that included as an extra data point.

Is there a way to attribute compilation time to a line of code?

As I write this I realize it might be a pipe dream and I can already see problems with this.

If you squint, it’s probably possible but I’m not sure yet how much effort it would take. I say “if you squint” mostly because the unit the compiler cares most about appears to be more coarse than lines of code.

I tried at one point to instrument codegen to find out which methods were taking the longest to compile. I managed to get it working-ish (even included it in the output of --stats), but it seemed that the codegen time of a method also included the codegen times of all methods downstream of it. There was also the fact that the compiler parallelizes compilation in forked processes by default. It becomes difficult to track pretty quickly.

So just tracking codegen is fraught, but total compilation time for a method also includes the semantic phase before codegen. I'm not sure how to track anything there yet, but since the semantic phase determines which codegen needs to run at all, it could really swing the metrics around.