Digging into struct initialization performance

Crys · September 10, 2024, 1:15pm

@ysbaddaden started toying with BLAKE3, trying to assess how much faster it was compared to SHA256 in Crystal, and see if it would bring some free improvement…

What started as a simple benchmark led to the discovery of a performance issue related to the initialization of Crystal structs, which stems from Crystal's Ruby-based syntax.

This is a companion discussion topic for the original entry at https://crystal-lang.org/2024/09/10/digging-into-struct-initialization-performance

kojix2 · September 12, 2024, 4:54am

Very interesting post!

When creating an instance of a struct, does Crystal always copy the data in stack memory?

I couldn’t figure that out because the post says that LLVM sometimes optimizes the copy away and sometimes doesn’t.

And is the point of using “uninitialized” and then calling “init” to prevent a second copy?

jgaskins · September 12, 2024, 7:56pm

The Crystal compiler will always copy the structs, yes. It passes structs by value. LLVM optimizes the code that Crystal generates and might optimize the pass-by-value to a pass-by-reference if and only if it has an optimization for that calling pattern.
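To make the pass-by-value semantics concrete, here is a minimal sketch. `Counter` and `bump` are hypothetical names made up for this example, not anything from the blog post. It only shows the language-level behavior (the callee works on its own copy); whether LLVM actually elides the copy in a given build is a separate question.

```crystal
# Hypothetical struct, used only for illustration.
struct Counter
  getter value : Int32

  def initialize(@value : Int32)
  end

  # Mutates the struct this method is called on.
  def increment : Nil
    @value += 1
  end
end

# The argument is a copy of the caller's struct,
# so the mutation below is not visible to the caller.
def bump(counter : Counter) : Int32
  counter.increment
  counter.value
end

original = Counter.new(1)
bumped = bump(original)

puts bumped         # => 2 (the callee's copy was incremented)
puts original.value # => 1 (the caller's struct is unchanged)
```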

It sounds like Julien was saying that if the ivar is a reference (he mentions Pointer, but I don’t know if that means specifically instances of Pointer(T) or any Reference type) then LLVM has a great optimization for that calling pattern — the entire struct is inlined to the point where there’s no difference in the generated code between using the struct and not using it. It’s 100% free at runtime.

This aligns with my experience with simple structs where even my most micro of microbenchmarks showed no performance difference between using the wrapper struct and performing the same operations on the wrapped class instance.

ysbaddaden · September 19, 2024, 12:52pm

During codegen there are no more differences between Reference and Pointer. For LLVM it’s a mere pointer.

What I meant is that when we wrap a pointer-sized value in a struct (nothing more) then LLVM codegen will optimize the struct away, and the generated assembly will be identical to passing the pointer/reference directly. The struct becomes a nice abstraction with zero cost.
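As a rough sketch of that case, assuming hypothetical `Buffer` and `BufferView` types: the struct holds a single reference and nothing else. One way to check the claim on your own machine would be to build with something like `crystal build --release --emit asm` and compare the code generated for `direct` and `wrapped`; the exact output will depend on the compiler and LLVM versions.

```crystal
# Hypothetical types, for illustration only.
class Buffer
  getter size : Int32

  def initialize(@size : Int32)
  end
end

# A struct that wraps a single reference (pointer-sized) and nothing more.
struct BufferView
  def initialize(@buffer : Buffer)
  end

  def size : Int32
    @buffer.size
  end
end

# Takes the reference directly.
def direct(buffer : Buffer) : Int32
  buffer.size
end

# Takes the wrapper struct by value.
def wrapped(view : BufferView) : Int32
  view.size
end

buffer = Buffer.new(16)
puts direct(buffer)                  # => 16
puts wrapped(BufferView.new(buffer)) # => 16
```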

But as soon as we wrap something else (e.g. a StaticArray), the generated assembly will start to differ, even if the struct is still pointer-sized; I think that, because of alignment, it must copy each internal value. At that point, the struct might not be a zero-cost abstraction anymore.
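For comparison, a sketch of the other case, using a hypothetical `EightBytes` struct. On a 64-bit target it is the same size as a pointer, yet it wraps a StaticArray rather than a pointer or reference. The code only demonstrates the layout; it makes no claim about what assembly a particular LLVM version produces for it.

```crystal
# Hypothetical struct, for illustration only: 8 bytes of inline data,
# i.e. pointer-sized on a 64-bit target, but not a pointer.
struct EightBytes
  def initialize(@bytes : StaticArray(UInt8, 8))
  end

  def first : UInt8
    @bytes[0]
  end
end

# Takes the wrapper struct by value.
def first_byte(value : EightBytes) : UInt8
  value.first
end

# Fill the array with 1, 2, ..., 8.
bytes = StaticArray(UInt8, 8).new { |i| (i + 1).to_u8 }

puts sizeof(EightBytes)                # => 8
puts first_byte(EightBytes.new(bytes)) # => 1
```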

Now, unless the struct is significantly large, the cost should be hardly noticeable.
