Crystal 0.34.0 slower?

gummybears · May 18, 2020, 12:11am

Hi,

Finally upgraded from Crystal 0.30 to Crystal 0.34 after fixing my code. I had to make some minor changes to my code (integer division mainly). I ran all my tests and finally they are all green.
I am using the // operator to do integer divisions.

Next thing I did, was ran a simple test to compare performance. The test computes the fibonnaci number recursively for n = 35. I am a little bit disappointed as it seems my bytecode interpreter runs slower when compiled with Crystal 0.34.

Crystal 0.30 time 1.89 seconds
Crystal 0.34 time 3.29 seconds

The code is compiled with the following command

crystal build main.cr -o ../../crasy --release --no-debug

Blacksmoke16 · May 18, 2020, 12:13am

0.31.0 enabled overflow checking by default. This would most likely be caused by the overhead of that check. You could switch to the wrapping operators (pre 0.31.0 behavior) using like &* or &+, etc.

EDIT: &// isn’t a thing, updated to use a different operator.

gummybears · May 18, 2020, 12:39am

I changed the relevant code (+ into &+, - into &- etc) but I am not seeing any improvement.
My fibonacci test now takes 4.16 seconds.

I undid the changes (+ into &+ etc) and ran the test again, and the test completes
in 3.33 seconds with overflow checking enabled.

I don’t think the overflow checking is the problem.

Blacksmoke16 · May 18, 2020, 12:42am

Are you doing anything IO related? There was also a change related to that. If possible, maybe share your code and we can see if we can figure it out.

gummybears · May 18, 2020, 12:59am

I am reading 3 files, so I am doing some IO, using File.each_line to read these files.
Total lines of code I am reading is less than a 1000 lines.

Blacksmoke16 · May 18, 2020, 1:04am

Writing anything to STDOUT?

gummybears · May 18, 2020, 1:05am

Yes, 1 line of text

fibonnaci(035) : 9227465

Blacksmoke16 · May 18, 2020, 1:07am

Hm, yea I’m out of ideas then. IO change was really only related to writing output, and the overflow check is the only other thing I could think of. Hard to say without seeing the code.

gummybears · May 18, 2020, 1:16am

My project is an interpreter, consisting of a tokenizer, lexer, parser, symbol table, semantics, type checker, a bytecode compiler and a virtual machine.
Sharing code is a bit of problem, as there is a lot of code I wrote. I know from experience that I should not tinker with the virtual machine part. That is the reason I am using this recursive fibonacci test as I can quickly determine the impact of a change to my virtual machine.

I am also out of ideas, but thanks for your help.

straight-shoota · May 18, 2020, 11:16am

Without seeing the actualy code, everything’s just a wild guess.
It would also help to identify the cause if you could test your code with versions between 0.30.0 and 0.34.0 to see where exactly the performance hit occurs.

gummybears · May 18, 2020, 11:33am

I understand that not seeing the code is not helpful but not sure that posting 91000+ lines of code (source and spec code) will also be helpful.

I ran all test specs without any changes except the version of the Crystal compiler. I did not see any difference in performance there. The time it takes to complete these test specs are the same for both versions, 0.30 vs 0.34, around the 2 minute mark.

So I than I decided to run my fibonacci test and there is where I saw the performance difference. Yesterday I used valgrind and it seems more time is spent /usr/share/crystal/src/int.cr (0.34) as compared to 0.30 but I am not certain yet as it is hard to compare the different valgrind runs.

Hopefully I will find the problem myself, but this will take some time.

HankAnderson · May 18, 2020, 12:07pm

when I find myself in times of trouble
father turing comes to me
speaking words of wisdom
profile it

profile it, profile it
profile it, profile it
yeah there will be an answer
profile it

straight-shoota · May 18, 2020, 2:49pm

Not sure if it was clear: I meant testing performance with Crystal 0.31.0, 0.32.0, 0.33.0 in order to narrow it down where it changes.

gummybears · May 18, 2020, 3:03pm

I understood you the first time, it is a bit of work.

This is the plan, I will start downloading all the binaries from 0.30 up to 0.34 and per version

compile my code and report the compile time and size of executable
run and time all my test specs
run and time the fibonacci test

Is this a good enough plan ?

HankAnderson · May 18, 2020, 3:16pm

Not sure if it was clear: if you profile it (on mac you can use Instruments, on other systems maybe something else), you will immediately know the cause of the slowness. No need to test against several versions, you will do it for 0.34.0 and then you will know what’s slow.

jwoertink · May 18, 2020, 3:19pm

I would love to be able to do this for some of my code, but I’ve never profiled any code ever. Do you have any tutorials on how to do that with crystal code? What apps do you need to have, and commands you need to run?

Blacksmoke16 · May 18, 2020, 3:24pm

A PR to add a Profiling section to Guides - Crystal would be .

gummybears · May 18, 2020, 3:26pm

I understood what you were saying and that is what I was doing yesterday with valgrind. I also reported my initial findings but understanding valgrind’s output is not that easy, at least for me.

gummybears · May 18, 2020, 3:31pm

Not sure if your question is directed at me, but on Linux I am using valgrind.
There is not a lot I found on the internet on how to best interpret
the output of valgrind but here are the commands I use

$ valgrind --tool=callgrind ./name_of_exe

Note, valgrind is quit slow, but it produces a file
callgrind.out.PID which you can view and inspect with

$ kcachegrind callgrind.out.PID

jwoertink · May 18, 2020, 3:47pm

I was more asking @HankAnderson since he mentioned to profile your code using instruments. I’m on mac, but I don’t know where to begin with any of that.

I’ve never used valgrind either. I know I was pointed to it before for some issues I was running in to, but to me it was like “oh, you want to fix a light switch in your house? here’s a book on electrical engineering and all of the city codes”. Yeah, I got a bit overwhelmed and noped right out lol.

Thanks for that little snippet, it’s already a little easier to ingest than what I’ve found looking around online. As @Blacksmoke16 mentioned, if anyone reading this has knowledge on this stuff, a PR to the reference guides would be (and preferably something written from the viewpoint of a person that’s never done this stuff )

Topic		Replies	Views
Crystal 0.34.0 has been released! Official	41	1905	April 13, 2020
Build with --release performance is slow than the 2017 crystal version? Help & Support	16	645	June 13, 2022
BigInt is really really slow when run on linux, Or just Boehm GC issue? Help & Support	31	894	September 26, 2022
Ruby out performs Crystal significantly on this numerical algorithm Help & Support	18	1274	April 29, 2023
Very slow build speeds for hello world Help & Support	42	919	August 18, 2024

Crystal 0.34.0 slower?

Related topics