IO seems to be slower with `--release` flag

Hi folks :wave:

I’m not sure if anyone has come across something like this before, but Mint (GitHub - mint-lang/mint: A refreshing programming language for the front-end web.) runs slower with the --release flag - at least parts of it do - than without.

With the --release flag (built statically linked in an Alpine Docker container; the same happens with a build on the developer machine):

❯ time mint build
Mint - Building for production
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚙ Ensuring dependencies... 0μs
⚙ Clearing the "dist" directory... 3.392ms
⚙ Copying "public" folder contents... 14.85ms
⚙ Compiling your application:
  ➔ Parsing 153 source files... 18.442s
  ➔ Type checking: 132.299ms
  ➔ Compiling: 206.04ms
⚙ Writing index.html... 445μs
⚙ Writing manifest.json... 51μs
⚙ Generating icons... 120.198ms
⚙ Creating service worker... 17.826ms
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All done in 22.701s!
./mint build  23,23s user 0,20s system 103% cpu 22,717 total

Without the --release flag (built on the development machine):

❯ time mint-dev build
Mint - Building for production
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚙ Ensuring dependencies... 0μs
⚙ Clearing the "dist" directory... 790μs
⚙ Copying "public" folder contents... 3.201ms
⚙ Compiling your application:
  ➔ Parsing 153 source files... 8.391s
  ➔ Type checking: 218.748ms
  ➔ Compiling: 823.635ms
⚙ Writing index.html... 697μs
⚙ Writing manifest.json... 145μs
⚙ Generating icons... 342.277ms
⚙ Creating service worker... 12.597ms
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All done in 10.825s!
mint-dev build  11,59s user 0,46s system 105% cpu 11,377 total

It seems to me that IO-related tasks (clearing the dist folder, copying the public folder, parsing files) are slower, while CPU-heavy tasks (type checking, compiling) are quicker.

I don’t remember this being the case before, but I haven’t been checking regularly.

I have two questions:

  1. Does anyone know why it is this way?
  2. Has anyone done in-depth profiling of Crystal applications? I’m interested in per-method durations to see which parts are the bottlenecks. A flamegraph, maybe?

Thanks

Try compiling statically on Alpine with and without --release, then do the same on the developer machine directly. That way we will know more precisely in which cases the binary is slower.

Does anyone know why it is this way?

It would be nice if you could put together a small benchmark program that shows this in action; otherwise it’s pretty much impossible for others to figure out why this is happening.

Has anyone done in-depth profiling of Crystal applications? I’m interested in per-method durations to see which parts are the bottlenecks. A flamegraph, maybe?

If you are on a Mac you can use Xcode’s Instruments “Time Profiler”. But you’ll need to put the offending code in a loop to get good results.
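A minimal harness for that might look like the sketch below, where `suspect_operation` is just a hypothetical stand-in for the code under suspicion, not anything from Mint:

```crystal
# Hypothetical stand-in for the code you suspect is slow;
# swap in the real hot path here.
def suspect_operation : Int32
  (1..100).sum
end

# Repeat the work many times so a sampling profiler collects
# enough samples to attribute time to individual methods.
total = 0_i64
100_000.times do
  total += suspect_operation
end

puts total
```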

1 Like
Timings for the `mint build` command with different builds.

# 0.36.1 (alpine in docker)

Build args: --static
Timings: 14,47s user 0,21s system 106% cpu 13,730 total

Build args: --static --release
Timings: 25,18s user 0,23s system 102% cpu 24,677 total

# 0.35.1 (alpine in docker)

Build args: --static
Timings: 16,47s user 0,24s system 107% cpu 15,612 total

Build args: --static --release
Timings: 14,57s user 0,19s system 105% cpu 13,963 total

# 0.36.1

Build args: --error-on-warnings --error-trace --progress
Timings: 10,48s user 0,23s system 109% cpu 9,749 total

Build args: --error-on-warnings --error-trace --progress --release
Timings: 9,86s user 0,24s system 110% cpu 9,148 total

# 0.36.0

Build args: --error-on-warnings --error-trace --progress
Timings: 11,61s user 0,26s system 106% cpu 11,197 total

Build args: --error-on-warnings --error-trace --progress --release
Timings: 28,24s user 0,19s system 101% cpu 27,928 total

# 0.35.1

Build args: --error-on-warnings --error-trace --progress
Timings: 10,00s user 0,20s system 109% cpu 9,289 total

Build args: --error-on-warnings --error-trace --progress --release
Timings: 10,57s user 0,26s system 109% cpu 9,866 total

Locally it seems that the issue was with 0.36.0, since it’s not happening on 0.36.1 any more, but on Alpine it still seems to be the case.

If you are on a Mac you can use Xcode’s Instruments “Time Profiler”. But you’ll need to put the offending code in a loop to get good results.

Thanks, but I’m on Linux, and I want to find the offending code :smiley:

What’s the result with the Ubuntu Docker image?

I’m using Elementary OS (which is based on Ubuntu 18.04 LTS (bionic)); should the Docker-based images be different?

I guess you’re right.

The only way to be sure with your issue is to profile the binary, for example with Valgrind (I’ve no experience with this).

IO is tricky to profile with Crystal. You can profile CPU OK, though: Performance - Crystal
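Once a suspect method is identified, Crystal’s built-in `Benchmark.ips` can at least compare candidate implementations head-to-head. A minimal sketch (both functions below are illustrative examples, not from Mint’s codebase):

```crystal
require "benchmark"

# Two illustrative ways to count comma-separated fields.
def count_fields_split(input : String) : Int32
  input.split(',').size
end

def count_fields_chars(input : String) : Int32
  count = 1
  input.each_char { |char| count += 1 if char == ',' }
  count
end

input = "a,b,c,d,e"

# Benchmark.ips reports iterations per second for each block,
# which makes the relative cost of the two approaches obvious.
Benchmark.ips do |x|
  x.report("String#split") { count_fields_split(input) }
  x.report("each_char") { count_fields_chars(input) }
end
```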

Thanks for pointing me to Valgrind, I’ve managed to record the calls made, which is helpful.

It might not be IO, so in theory what can make programs slower in --release mode?

Maybe it hits the IO harder in release mode, which causes more contention on it, pulling the whole thing down…hard to tell…

I tried this code

50000.times do
  f = File.tempfile
  f << "hello world!"
  f.flush
  f.seek(0)
  puts f.gets_to_end
  f.delete
end

And it is not slower in release mode. In fact it is a bit faster: ~1.7s vs 1.5s on my computer.

1 Like

So we finally figured out why it was slow: it wasn’t IO, but exceptions:

Basically, Mint’s parser used exceptions to exit early from functions; refactoring that into nil checks caused a massive speedup in parsing.
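The difference can be illustrated with a small sketch (hypothetical code, not Mint’s actual parser): raising an exception captures a callstack on every failure, while returning nil is just a plain union-typed return value.

```crystal
class ParseError < Exception
end

# Exception-based early exit: every failed attempt raises, which
# captures a callstack; this is expensive when failure is the
# common case, as in a backtracking parser.
def parse_digit_raising(char : Char) : Int32
  raise ParseError.new("not a digit") unless char.ascii_number?
  char.to_i
end

# Nil-based early exit: a failed attempt just returns nil,
# a cheap union-typed value (Int32 | Nil).
def parse_digit_nilable(char : Char) : Int32?
  return nil unless char.ascii_number?
  char.to_i
end

# The caller checks for nil instead of rescuing:
if value = parse_digit_nilable('7')
  puts value
end
```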

With this and some other optimizations, the parsing of the same code base went from the earlier-reported 25.18s to 499.643ms.

I still don’t know why using the --release flag doubled the parsing time, but I guess it does something to exceptions :man_shrugging: if anyone has ideas, let me know, I’m curious :slight_smile:

5 Likes

Exceptions are slower in release? Any micro-benchmarks to prove it? You could optionally profile it to try to find out: Performance - Crystal

Given this code:

class SomeError < Exception
end

# Raise and rescue an exception a million times to measure the overhead.
1000000.times do
  begin
    raise SomeError.new
  rescue
  end
end

Running it in different scenarios:

❯ crystal build benchmark.cr
❯ time ./benchmark           
./benchmark  4,00s user 0,04s system 111% cpu 3,636 total

❯ crystal build benchmark.cr --release
❯ time ./benchmark                    
./benchmark  3,02s user 0,10s system 111% cpu 2,793 total

❯ crystal build benchmark.cr --release --no-debug
❯ time ./benchmark                               
./benchmark  2,73s user 0,06s system 113% cpu 2,460 total

So it doesn’t seem to be the case that exceptions by themselves slow it down; maybe it’s a combination of exceptions and overloaded functions, but unfortunately I don’t have more time to investigate it.

If anyone wants to dig deeper into this, this is the PR that led to the improvement in parsing times:

2 Likes