I investigated where the Crystal compiler is spending time using classic print debugging.
module Crystal
class Compiler
def compile(source : Source | Array(Source), output_filename : String) : Result
source = [source] unless source.is_a?(Array)
# 1 new_program
program = new_program(source)
# 2 parse
node = parse program, source
# 3 semantic
node = program.semantic node, cleanup: !no_cleanup?
# 4 codegen
units = codegen program, node, source, output_filename unless @no_codegen
# 5 cleanup
# ... omission ...
Result.new program, node
end
end
end
I measured the time spent on each part of the Compiler#compile method.
- new_program
- parse
- semantic
- codegen
- cleanup
I also timed the external LLVM functions.
LibLLVM.run_passesLibLLVM.target_machine_emit_to_file
See the patch file for the actual code used.
After compiling and installing the Crystal compiler with changes to display elapsed time, I compiled the unmodified Crystal compiler.
| Stage | Time (seconds) |
|---|---|
| new_program | 0.000388207 |
| parse | 0.000065000 |
| semantic | 12.552620028 |
| codegen | 355.245409133 |
| - LibLLVM.run_passes | 252.340241198 |
| - LibLLVM.target_machine_emit_to_file | 93.280652845 |
| cleanup | 0.000013180 |
| total | 367.798495548 |
The results show that the two LLVM functions consume most of the time (93.97%).
This fact might be common knowledge for those who have been using Crystal for a long time or are part of the core team, but for someone like me who started using Crystal later, it is very interesting.
