Crystal parallelism and Kemal

I was trying to find a good resource on how to have Kemal benefit from Crystal’s parallelism support.
I did some googling, but I couldn’t find anything on the topic.
Do you know about any good material that covers this?

Thanks

Kemal uses the standard library’s HTTP::Server module under the hood. All you should have to do is build your binary with -D preview_mt and that should be it. The number of worker threads defaults to 4 but can be controlled by the CRYSTAL_WORKERS env var.
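For reference, here’s a minimal sketch of what that looks like (the file name, route, and worker count are just placeholders):

```crystal
# app.cr -- a minimal Kemal app, used only to illustrate the build flags
require "kemal"

get "/" do
  "Hello from Kemal"
end

Kemal.run

# Build with the experimental multithreading flag:
#   crystal build app.cr --release -D preview_mt
# Run with a custom number of worker threads (default is 4):
#   CRYSTAL_WORKERS=8 ./app
```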

Just note that multithreading is still considered experimental, and some portions of Kemal may not be thread safe.

Thanks a lot. I will give it a try.
I want to update the version running on https://www.techempower.com/benchmarks/

The TechEmpower benchmarks already run in multiple processes via reuse_port: true. It’s highly unlikely the results would improve.
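For context, that reuse_port setup looks roughly like this in Kemal (the port and handler here are placeholders; one copy of the compiled binary is then started per core):

```crystal
require "kemal"

get "/" do
  "Hello"
end

Kemal.run do |config|
  # Bind the same port in every process; with reuse_port the kernel
  # distributes incoming connections across all listening processes.
  server = config.server.not_nil!
  server.bind_tcp "0.0.0.0", 3000, reuse_port: true
end
```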

If you do update the benchmarks, please update all Crystal frameworks. It’s good to see an apples-to-apples comparison.

This is very true. In fact, not only would they not improve, but they may very well end up worse. Crystal is much more efficient across fibers in single-threaded mode than it is when multithreaded. It would use more CPU time to do the same amount of work with preview_mt enabled.

Depending on your production infrastructure, either one of these could be more realistic. On Heroku, it’s easier to run a single multithreaded process to use the full computing power of the box you’re running on. On Kubernetes, it’s easy enough to assign multiple processes to a single box that you don’t have to use either multithreading or the fork+reuse_port trick.

All that to say, I think it’s a good idea to have both benchmarks — multithreading and multi-process.

In my benchmarks, once you get past about 4 simultaneous workers it starts to bottleneck on the GC, FWIW…

You don’t even need to look at Kemal. The stdlib’s HTTP::Server is already not thread-safe. Multithreading is really just a technology preview and not recommended for any production use. It probably works fine most of the time, but only very few parts of the standard library (mostly the concurrency primitives) have been revisited to make sure they’re thread-safe (and even that might not be finished).

As @jgaskins explained, multithreading is unlikely to outperform multi-process deployment. If it did show better results, there would be a flaw somewhere.
I agree it’s still worth showing both options, but given the current preview state, I’d advise against posting any benchmarks in such a comparison. Multithreading support is a work in progress, and it still requires quite some effort to finish. Until a somewhat stable release of multithreading support is in view, there’s no point in comparing performance. It would only suggest that multithreading is ready to use, and probably paint an incorrect picture of its performance characteristics.

Aaah, I see your point.
I will wait, then, until this is no longer a preview flag.
There is no point in adding noise to those benchmarks.

Thanks!