Crystal 0.34.0 slower?

Thanks for the details @gummybears, that looks like a very cool project.

There’s not much to go on from your methods, as you mentioned, but I’m not convinced this is just a problem with your program. I modified Kostya’s crystal-benchmarks-game repo to compare Crystal versions instead (https://github.com/mgomes/crystal-bench).

Here are my results:

### Binarytrees

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 1.77      | 49.0       |
| 0.34.0  | 1.83      | 48.9       |

### Fannkuchredux

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 2.92      | 2.9        |
| 0.34.0  | 3.63      | 2.8        |

### Fasta

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 2.22      | 3.8        |
| 0.34.0  | 4.51      | 4.0        |

### Mandelbrot

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 8.40      | 2.9        |
| 0.34.0  | 8.27      | 3.0        |

### Meteor

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 0.14      | 3.4        |
| 0.34.0  | 0.13      | 3.3        |

### Nbody

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 1.13      | 3.2        |
| 0.34.0  | 1.44      | 3.3        |

### Pidigits

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 9.52      | 8.2        |
| 0.34.0  | 8.96      | 7.4        |

### Regexdna

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 1.14      | 20.0       |
| 0.34.0  | 1.22      | 19.3       |

### Revcomp

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 1.89      | 365.6      |
| 0.34.0  | 2.25      | 339.0      |

### Spectralnorm

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 1.75      | 4.3        |
| 0.34.0  | 1.82      | 4.3        |

### Threadring

| Version | Time, sec | Memory, MB |
| ------- | --------: | ---------: |
| 0.30.0  | 0.37      | 5.2        |
| 0.34.0  | 0.51      | 5.0        |

Not all the tests are slower, but some, like Fasta, are quite a bit slower. I suspect the answer is in there somewhere. I still have to look more closely at the slower programs.

If you compile fasta with Crystal 0.33.0 and run it with n=15000000, you get:

```
3.04 real         2.24 user         0.12 sys
```

If you compile it with `-D disable_overflow` and run it, you get:

```
2.19 real         1.97 user         0.13 sys
```

From this, it’s pretty obvious that the slowness comes from overflow checking.
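For reference, the comparison can be reproduced roughly like this (assuming the fasta.cr from the benchmarks repo, which takes n as a command-line argument; exact paths may differ):

```
$ crystal build fasta.cr --release
$ time ./fasta 15000000 > /dev/null

$ crystal build fasta.cr --release -D disable_overflow
$ time ./fasta 15000000 > /dev/null
```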

@gummybears Even if you changed your code to use &+ instead of +, I’m sure you didn’t change every occurrence of + and - in the standard library. You’d have to change it everywhere, which is of course impossible.
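To make the distinction concrete, here’s a minimal sketch of the two operator families:

```crystal
x = Int32::MAX

p(x &+ 1) # => -2147483648: &+ wraps around silently, never checked
p(x + 1)  # raises OverflowError: plain + is checked by default
```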

My conclusion is one of these:

  1. Get used to the new performance. The old performance was deceptive, because programs might have been silently wrong due to unchecked overflows.
  2. It might still be good for Crystal to provide a flag to disable overflow checking. Rust removes overflow checks when compiling in release mode; the idea is that once you’ve made sure in development that nothing overflows, you disable the runtime checks for greater performance.

I believe doing 2 is very, very important if Crystal wants to succeed in more math-heavy territories, where performance is critical. Overflow itself is less of a concern there, because you usually use wide types like Int64 or UInt64 and the values are bounded.

I wouldn’t do it exactly like Rust, though: I would keep overflow checking on by default but still provide a flag to disable it. The core team should basically revert the PR that removed -D disable_overflow, or better, reintroduce it under another name.

4 Likes

Actually, re-running fasta with 0.33.0 with and without overflow checking gives me a difference of 0.3 seconds. I don’t know why I got 3.04 the first time.

There’s something else going on: 0.34.0 is significantly slower than 0.33.0. In my tests I see 2.41 vs 3.56 seconds, similar to what @mgomes reported.

Oh, well, never mind… I guess for fasta the issue is that it uses puts and print a lot. Setting STDOUT.flush_on_newline = false makes it faster again.
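That is, something along these lines at the top of an output-heavy program (a sketch; the gain depends on how much the program prints):

```crystal
# Buffer output instead of issuing a write for every newline.
STDOUT.flush_on_newline = false

100_000.times { |i| puts i }
STDOUT.flush # the runtime also flushes at exit; being explicit doesn't hurt
```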

There is a difference of 0.3s between overflow and no-overflow, so that might be it.

2 Likes

@gummybears Can you run your benchmark with 0.33.0 with and without -D disable_overflow?

@HankAnderson I will post the results ASAP.

@HankAnderson

I built the executable with

```
$ crystal build main.cr -o crasyvm_nooverflow.out --release --no-debug -D disable_overflow
```

and ran each test 3 times; the minimum execution time is shown, not the average.
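(For anyone who wants a self-contained stand-in for the workload: the benchmark is a Fibonacci program running in my interpreter, but a naive native fib exercises the same kind of hot integer arithmetic that overflow checks tax. The sketch below is hypothetical, not my actual code.)

```crystal
# Hypothetical stand-in for the "Fibonacci, n = 35" workload: naive
# recursive fib is almost pure Int32 arithmetic, so the per-operation
# cost of checked + and - shows up directly in the run time.
def fib(n : Int32) : Int32
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

elapsed = Time.measure { puts fib(35) } # prints 9227465
puts elapsed
```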

Observations

With Crystal 0.31.1, disabling overflow checks has a significant impact: the test is only 0.62 seconds slower than with Crystal 0.30.

From Crystal 0.32.1 up to 0.34.0, disabling overflow checks has some impact (0.25 seconds on average), but compared to Crystal 0.30 the test is on average 1.42 seconds slower.

That leaves roughly 1.17 seconds, on average, of extra work the test program is doing that is not integer overflow checking but something else.

Summary (23-05-2020)

| Crystal | 0.30 | 0.31.1 | 0.32.1 | 0.33.0 | 0.34.0 |
| --- | ---: | ---: | ---: | ---: | ---: |
| Fibonacci, n = 35 (sec) | N/A | 2.55 | 3.34 | 3.43 | 3.28 |
| Delta t, no overflow checks, compared with 18-05-2020 | | -0.88 | -0.28 | -0.28 | -0.20 |
| Delta t compared with Crystal 0.30 | | 0.62 | 1.41 | 1.50 | 1.35 |
| Compile time (sec) | N/A | 180.01 | 178.23 | 175.83 | 147.45 |
| Executable size (bytes) | N/A | 4270784 | 4097760 | 4106032 | 3929864 |


Summary (18-05-2020)
No changes were made in my codebase.

| Crystal | 0.30 | 0.31.1 | 0.32.1 | 0.33.0 | 0.34.0 |
| --- | ---: | ---: | ---: | ---: | ---: |
| Fibonacci, n = 35 (sec) | 1.93 | 3.43 | 3.62 | 3.71 | 3.48 |
| Compile time (sec) | 105.09 | 144.28 | 147.32 | 145.88 | 154.12 |
| Executable size (bytes) | 4135464 | 3960576 | 3897608 | 3913952 | 3914176 |

1 Like

Any more progress here? I’m curious as to what could have caused the performance change.

I think I’ve done all I can. I thought about bisecting the GitHub repo starting at 0.30, but decided to continue work on my interpreter and stay with Crystal 0.34.

Well, if you don’t want to bisect it and also won’t let anyone else bisect it, this is a pretty big waste of time.
I’m honestly so annoyed by this thread. Why did all these people continue the guessing game for days when there is a direct and reliable solution that’s just being dismissed?

It was a productive thread; we got a lot of different benchmarks testing the overflow feature. As for the bisecting, @gummybears isn’t referring to bisecting his app; he has said that his code has not changed. What needs to be bisected is Crystal itself between these versions, so anyone can do that work.

Based on the benchmark results, it still seems that something else caused a slowdown between versions.

I am still wondering whether anyone from the core team has any info on what performance benchmarks are run between Crystal releases. If there are none, maybe that is something we could develop? Performance is a touted and treasured feature of the language, so it would be great if it were measured between releases.

1 Like

I don’t think you understand. I was talking about perhaps bisecting the Crystal compiler repo, but decided against it because I am not familiar with the Crystal compiler’s code.

Perhaps you should read things more carefully before commenting in such a way. I am working on an interpreter. As far as I can tell, the performance problem is not with my code but with the Crystal compiler. There is no guessing game here; I did what people asked of me and provided the information.

If we’re putting things that way… no, I don’t think you understand.

You don’t need to be familiar with any code at all; that’s the great part about bisecting.

But since you’re claiming that the slowdown particularly affects your code, any measurement needs both your code (where the slowdown is reproducible) and the Crystal compiler code. So what would be measured is your code, but what would be bisected based on that is the Crystal compiler repo.
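For what it’s worth, a bisect session would look roughly like this (a sketch, assuming a reproducible benchmark such as a fib.cr and that the compiler builds at every revision; older revisions may need an older bootstrap compiler):

```
$ git clone https://github.com/crystal-lang/crystal && cd crystal
$ git bisect start 0.34.0 0.30.0    # bad revision, then good revision
$ make clean crystal                # rebuild the compiler at this revision
$ ./bin/crystal build fib.cr --release -o fib && time ./fib
$ git bisect good                   # or `git bisect bad`, then repeat
```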

Yeah, “anyone” who has both the Crystal compiler code and the code affected by the slowdown, which currently is just @gummybears.

Yes, a reproducible sample would be nice…