Hi guys! I was hoping some people could share their experiences and pitfalls when writing system services, web APIs, or anything in general in actual production with Crystal.
When I was at Plezi, we had to build a service that would take over the two most-called routes from our existing app.
It took in data to recognize which client it might be, then served that client’s custom form.
The whole process took 800ms in Ruby and dropped to 300ms in Crystal.
This saves our Ruby app from the massive long calls (about 70% of our calls), adds some resilience (if the app goes down, this continues to work), and improves user experience.
Nowadays I am building an app that will take AI and serve it to the outside world.
My company is using Crystal for our authorization service. Our main requirement for it was that it be low-latency under load, and Crystal provides that better than a Ruby web service can for a couple of reasons. The obvious reason is that Crystal apps tend to just be several times faster than Ruby apps.
The other big reason is that Crystal encourages lower-latency conventions. Ruby web services often run in a static thread pool with Puma (so, for example, if you have 16 threads, you can’t even start handling your 17th concurrent request until one of the other 16 has completed), whereas a Crystal web service can spin up as many fibers as it needs to handle all incoming requests concurrently.
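To make the thread-pool point concrete, here is a rough sketch. It is in Python rather than Crystal or Ruby, purely as an analogy: an `asyncio.Semaphore` stands in for a fixed pool of 16 workers, and unrestricted tasks stand in for fibers. The function names and the 10 ms sleep are made up for illustration.

```python
import asyncio


async def handle(state, limiter=None):
    """Simulate one I/O-bound request and record peak concurrency."""

    async def work():
        state["in_flight"] += 1
        state["peak"] = max(state["peak"], state["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for a DB or network call
        state["in_flight"] -= 1

    if limiter is None:
        await work()  # fiber-like: start immediately
    else:
        async with limiter:  # pool-like: wait for a free slot
            await work()


async def peak_concurrency(n_requests, pool_size=None):
    state = {"in_flight": 0, "peak": 0}
    limiter = asyncio.Semaphore(pool_size) if pool_size else None
    await asyncio.gather(*(handle(state, limiter) for _ in range(n_requests)))
    return state["peak"]


def run(n_requests, pool_size=None):
    """Return the highest number of requests that were in flight at once."""
    return asyncio.run(peak_concurrency(n_requests, pool_size))
```

With 17 simultaneous requests, `run(17, pool_size=16)` peaks at 16 in flight (the 17th waits for a slot), while `run(17)` peaks at 17.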
Another convention is that Ruby effectively (not literally, but effectively) requires you to build up an entire response body before you send your first byte, whereas with Crystal it’s so much easier to begin streaming your response far earlier, so streamable formats like HTML or MessagePack let the client start processing earlier. This is especially nice if you use a DB whose results are streamable (like Neo4j).
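The difference between buffering and streaming can be sketched like this. Again this is a Python analogy, not Crystal or Ruby code, and `fetch_rows` is a hypothetical stand-in for a DB driver that yields rows as they arrive. The log records the order in which rows are fetched and bytes are sent.

```python
def fetch_rows(n, log):
    """Hypothetical DB cursor that yields rows one at a time."""
    for i in range(n):
        log.append(f"fetch {i}")
        yield {"id": i}


def buffered_response(rows, log):
    """Build the entire body first, then send it in one piece."""
    body = "".join(f"<li>{row['id']}</li>" for row in rows)
    log.append("send")
    yield body


def streamed_response(rows, log):
    """Send each fragment as soon as its row is available."""
    for row in rows:
        log.append("send")
        yield f"<li>{row['id']}</li>"


def trace(make_response, n=3):
    """Run a response builder and return (event log, full body)."""
    log = []
    chunks = list(make_response(fetch_rows(n, log), log))
    return log, "".join(chunks)
```

Both variants produce the same body, but `trace(buffered_response)` logs every fetch before its single send, while `trace(streamed_response)` logs a send right after the first fetch, which is what lets the client start processing earlier.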
Don’t use an SSL server in production (until this is fixed: https://github.com/crystal-lang/crystal/issues/8108), or you can monkey patch around it like I do at present :)
I agree that I wouldn’t recommend using Crystal’s SSL server in production. It mostly works, but besides the obvious issues, there are some rough edges and AFAIK there has never been a security audit. For projects without strong security requirements this might be fine, but I’d always use a proxy to terminate TLS sessions in front of my Crystal apps. That is really easy to set up and also offers some other features out of the box, so I don’t feel this needs immediate action on Crystal’s side.
My company has been running almost entirely on Crystal in production for the last 2 years. It’s been mostly pretty solid, and has given us a ton of benefits like deploying apps to smaller EC2 instances than what we needed when using Rails. Our apps perform ~3x faster with a quarter of the memory and CPU usage of the Rails apps they were rebuilt from.
The downside has been when a new release comes out, and you want to use all the latest stuff to fix some bugs, but then other stuff stops working. Crystal 0.33 felt super solid for us, but moving to Crystal 0.34 has been a lot less stable as we see more downtime. I’m confident it’ll get fixed, and it’s not really a show stopper in our case. As more people use this and help to catch weird bugs, things will start to even out just as 1.0 comes around the corner.
Oh yeah I forgot one, RAM usage increases slowly until it levels out at about 200MB. The GC isn’t really tunable yet. Not a big deal but thought I’d mention it.
@jwoertink what problems? Have they been reported? Cheers!
bdwgc is very tunable: https://github.com/ivmai/bdwgc/blob/master/doc/README.environment
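For example, a launcher could set a few of the variables documented in that README before starting the app. This is only a sketch in Python: `GC_INITIAL_HEAP_SIZE`, `GC_MAXIMUM_HEAP_SIZE` and `GC_FREE_SPACE_DIVISOR` come from the bdwgc README, but the binary path and the values are hypothetical, and whether a given Crystal build honors them depends on how its bdwgc was compiled.

```python
import os
import subprocess


def gc_env(initial_heap_bytes, max_heap_bytes, free_space_divisor, base=None):
    """Return a copy of the environment with bdwgc tuning variables set."""
    env = dict(os.environ if base is None else base)
    env["GC_INITIAL_HEAP_SIZE"] = str(initial_heap_bytes)
    env["GC_MAXIMUM_HEAP_SIZE"] = str(max_heap_bytes)
    # Larger divisors make the GC collect more often and keep the heap smaller.
    env["GC_FREE_SPACE_DIVISOR"] = str(free_space_divisor)
    return env


def launch(binary, env):
    """Start the (hypothetical) Crystal binary with the tuned environment."""
    return subprocess.run([binary], env=env, check=True)


# Example call, assuming a binary named ./my_crystal_app exists:
# launch("./my_crystal_app", gc_env(64 * 1024**2, 256 * 1024**2, 4))
```

Measuring memory before and after a change like this is the only way to tell whether it helps a particular app.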
@jwoertink I hope the newly merged debugger support will help you tremendously, as it helped me when I was debugging the Crystal compiler itself to validate debug info generation (sounds like recursion, eh?!)
When I initially started working on debugging support, it was a PITA, with a lot of puts lines in the code.
Once it started to show at least some of the information, it became much easier to debug. So I hope it will help you guys identify possible issues in your code.
The issue I’ve seen is mainly in production inside of Docker containers. I don’t think I’ll be able to use the debugger for that. I’m also not really sure what I’d be looking for. But I’m still really stoked for all the work you put into the debugger! That’ll help with development so much!
Yeah, the error we’re mostly seeing is the one I reported here. They say it’s a valid error, and I’m not sure if this is even really the cause. My apps live on Elastic Beanstalk, which looks at the % of requests in the 500 range. Over a threshold, EB puts the container into a “degraded state”. On Crystal 0.33 we were in a degraded state about 5min a day. On 0.34, we sit in a degraded state about 3 hours a day or more.
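To put those two figures on the same scale, a quick back-of-the-envelope calculation (Python, using only the "5 min" and "3 hours" numbers above; since the second figure is "3 hours or more", the ratio is a lower bound):

```python
MINUTES_PER_DAY = 24 * 60


def degraded_fraction(minutes_per_day):
    """Fraction of a day spent in the degraded state."""
    return minutes_per_day / MINUTES_PER_DAY


before = degraded_fraction(5)       # Crystal 0.33: about 5 min/day
after = degraded_fraction(3 * 60)   # Crystal 0.34: about 3 hours/day

print(f"0.33: {before:.2%} of the day degraded")  # 0.35%
print(f"0.34: {after:.2%} of the day degraded")   # 12.50%
print(f"ratio: {after / before:.0f}x")            # 36x
```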
Would be very interesting to somehow do a git bisect and figure out what (if anything) caused the difference…