Crystal performance....HELP!

Mmh, I can’t spot any major red flags, only small or structural things.

I don’t know how optimized halite is; it might be worth swapping it for the stdlib HTTP::Client or even a libcurl binding that uses libcurl’s multi interface. Having a pool of HTTP::Client instances that keep connections open to the same hosts might also help. If you do a lot of requests against the same host, have more than one instance per host; if you do requests to a lot of different hosts, make sure to limit the number of connections you keep open.
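To make the pool idea concrete, here is a minimal sketch of a fixed-size pool for a single host, built on a buffered Channel. ClientPool, with_client and the host name are made up for illustration; only HTTP::Client and Channel come from the stdlib, and I haven't run this against your scraper.

```crystal
require "http/client"

# Hypothetical fixed-size pool of HTTP::Client instances for one host.
# Each client keeps its own connection open between requests.
class ClientPool
  def initialize(host : String, size : Int32, tls : Bool = true)
    @clients = Channel(HTTP::Client).new(size)
    size.times { @clients.send HTTP::Client.new(host, tls: tls) }
  end

  # Borrows a client, yields it, and always hands it back to the pool.
  # Fibers block here while every client is in use.
  def with_client
    client = @clients.receive
    begin
      yield client
    ensure
      @clients.send client
    end
  end
end

pool = ClientPool.new("example.com", size: 4)
# Inside a worker fiber (assumed usage):
#   body = pool.with_client { |client| client.get("/some/path").body }
```

For many different hosts you would need something on top of this that caps the total and closes idle clients; the sketch doesn't attempt that.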

https://gist.github.com/thelinuxlich/c459ca5cd77718307a58a4a3e3c335c5#file-scraper-cr-L65 Avoid allocations where you can. For example, here you can avoid all the intermediate arrays just by swapping the [] for {} within the map, which gives you tuples instead. You can avoid the bigger intermediate array by swapping the map for to_h { |match| {match[1], match[2]} }, and you can avoid the hash altogether by using scan with a block and a little case statement inside it.
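Roughly what I mean, as three steps. The input string, the pattern and the variable names are placeholders I made up, not what's in your gist, so adapt them to the real regex.

```crystal
text = "width=640 height=480" # hypothetical input
pattern = /(\w+)=(\d+)/       # hypothetical pattern with two capture groups

# Step 1: tuples instead of arrays, so no small heap-allocated array per match.
pairs = text.scan(pattern).map { |match| {match[1], match[2]} }

# Step 2: to_h with a block builds the hash directly, skipping the array that map creates.
attrs = text.scan(pattern).to_h { |match| {match[1], match[2]} }

# Step 3: scan with a block and a case statement, so no hash is built either.
width = height = nil
text.scan(pattern) do |match|
  case match[1]
  when "width"  then width = match[2]
  when "height" then height = match[2]
  end
end
```

The last variant only makes sense when you know up front which keys you care about.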

https://gist.github.com/thelinuxlich/c459ca5cd77718307a58a4a3e3c335c5#file-scraper-cr-L86 Avoid calling the regex engine if possible. For example, this one can be written as .delete("^0-9"), which doesn’t call the regex engine but uses Char#in_set? internally.
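A small comparison, assuming the line strips everything but digits from a string. The sample value is invented.

```crystal
raw = "R$ 1.234" # hypothetical scraped value

# Regex version: goes through the regex engine on every call.
with_regex = raw.gsub(/[^0-9]/, "")

# Character-set version: "^0-9" means "everything except 0 through 9".
with_delete = raw.delete("^0-9")
```

Both produce the same string; the second one just skips the regex machinery.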

https://gist.github.com/thelinuxlich/c459ca5cd77718307a58a4a3e3c335c5#file-scraper-cr-L87-L88 Avoid doing conversions twice: x = v if v = x.to_i?
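Spelled out as a helper, it could look like the sketch below. parse_count is a made-up name and I'm assuming the intent is "use the integer if it parses, otherwise keep the string".

```crystal
# Hypothetical helper: parses once and reuses the result.
def parse_count(raw : String) : Int32 | String
  # to_i? returns nil instead of raising when raw is not an integer,
  # so the assignment doubles as the check.
  if number = raw.to_i?
    number
  else
    raw
  end
end

parse_count("42")  # the Int32 42
parse_count("n/a") # the original String
```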

https://gist.github.com/thelinuxlich/c459ca5cd77718307a58a4a3e3c335c5#file-scraper-cr-L105 Does this happen frequently? If so, it might be worth avoiding the raise and calling next instead to restart the loop.
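The shape I have in mind, with invented data since I don't know what the real failure condition is. Raising builds an exception object and unwinds the stack, while next just jumps to the following iteration.

```crystal
bodies = ["<html>ok</html>", "", "<html>ok</html>"] # hypothetical responses
processed = 0
skipped = 0

bodies.each do |body|
  # Instead of `raise "empty body"` plus a rescue around the loop body:
  if body.empty?
    skipped += 1
    next
  end

  processed += 1
end
```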

It might also be worth preparing the SQL statements outside the loop (see http://crystal-lang.github.io/crystal-db/api/0.8.0/DB/SessionMethods.html#prepared(query)-instance-method). I would also play with wrapping things into transactions, with a commit every thousand iterations or so. Having a pool of DB and Redis connections, rather than all the workers fighting over a single one, might also be beneficial.

Collecting some timings with Benchmark.measure and calculating averages for sections of the program in some global state might help in understanding what is actually a bottleneck and should be focused on first. If you extract things to methods, you might even have a chance to learn something about that using standard perf tools.
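Something along these lines would do for a first pass. TIMINGS, timed and report are names I made up; Benchmark.measure and the real field on its result are from the stdlib. Note that a plain global hash like this is not safe to share across threads if you later enable multithreading.

```crystal
require "benchmark"

# Hypothetical global state: wall-clock samples in seconds, grouped by section name.
TIMINGS = Hash(String, Array(Float64)).new { |hash, key| hash[key] = [] of Float64 }

# Runs the block and records how long it took under the given section.
def timed(section : String)
  tms = Benchmark.measure { yield }
  TIMINGS[section] << tms.real
end

# Prints the sample count and average duration per section.
def report
  TIMINGS.each do |section, samples|
    average = samples.sum / samples.size
    puts "#{section}: #{samples.size} samples, avg #{(average * 1000).round(3)} ms"
  end
end

# Assumed usage: wrap each stage of the worker loop, then call report at exit.
timed("parse") { "a=1 b=2".scan(/(\w+)=(\d+)/) }
timed("parse") { "c=3 d=4".scan(/(\w+)=(\d+)/) }
report
```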

Finally you might want to toy with -Dpreview_mt.

I realize these are mostly not simple changes and each requires testing and verification, but such is performance optimization anywhere :) Also keep in mind that this problem is mostly IO bound, so applying the same level of optimization to your Node.js implementation will likely give you comparable speed.
