Hi guys! I was hoping some people could share their experiences, and any pitfalls, from writing system services, web APIs, or anything else in actual production with Crystal.
Hello!
When I was at Plezi, we had to build a service that would take over the two most-called routes from our existing app.
It took data to recognize which client it might be, then served that client's custom form.
The whole process took 800ms in Ruby and dropped to 300ms in Crystal.
This saved our Ruby app from a massive number of long calls (about 70% of our calls), added some resilience (if the app goes down, this continues to work), and improved the user experience.
Nowadays I am building an app that will take AI and serve it to the outside world.
My company is using Crystal for our authorization service. Our main requirement for it was that it be low-latency under load, and Crystal provides that better than a Ruby web service can, for a couple of reasons. The obvious reason is that Crystal apps tend to just be several times faster than Ruby apps.
The other big reason is that Crystal encourages lower-latency conventions. Ruby web services often run in a static thread pool with Puma (so, for example, if you have 16 threads, you can't even start handling your 17th concurrent request until one of the other 16 has completed), whereas a Crystal web service can spin up as many fibers as it needs to handle all incoming requests concurrently.
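To put a number on the 17th-request example, here is a toy model in Python (used only as a neutral sketch; this is not Puma or Crystal code). It assumes all requests arrive at once and each takes the same fixed time. The function name and the 50 ms service time are made up for illustration.

```python
import heapq

def start_times(num_requests, pool_size, service_ms):
    """Return the time (ms) at which each request starts being handled.

    Toy model: all requests arrive at t=0 and each takes service_ms.
    pool_size=None models "one fiber per request" (no fixed worker limit)
    and ignores CPU contention entirely.
    """
    if pool_size is None:
        return [0] * num_requests
    # Min-heap of the times at which each worker becomes free.
    free_at = [0] * pool_size
    heapq.heapify(free_at)
    starts = []
    for _ in range(num_requests):
        t = heapq.heappop(free_at)  # earliest available worker
        starts.append(t)
        heapq.heappush(free_at, t + service_ms)
    return starts

pooled = start_times(17, pool_size=16, service_ms=50)
unbounded = start_times(17, pool_size=None, service_ms=50)
print(pooled[16], unbounded[16])
```

In this model the 17th request waits a full 50 ms before any work starts on it with a pool of 16, and starts immediately without a fixed pool. Real fibers still share the CPU, so the gain applies mostly to I/O-bound handlers.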
Another convention is that Ruby effectively (not literally, but effectively) requires you to build up an entire response body before you send your first byte, whereas with Crystal it's so much easier to begin streaming your response far earlier, so streamable formats like HTML or MessagePack let the client start processing earlier. This is especially nice if you use a DB whose results are streamable (like Neo4j).
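The buffered-versus-streamed difference can be sketched with plain generators. This is a minimal Python illustration, not the code of any Ruby or Crystal framework; the function names and the `<li>` markup are assumptions chosen for the example. It counts how many rows have to be read from the data source before the first chunk of the response exists.

```python
def render_buffered(rows):
    """Build the whole body first, then hand it over as a single chunk."""
    body = "".join(f"<li>{row}</li>" for row in rows)
    yield body

def render_streamed(rows):
    """Yield one chunk per row so the client can start parsing early."""
    for row in rows:
        yield f"<li>{row}</li>"

def rows_consumed_before_first_chunk(render, n):
    """Count how many rows are pulled from the source before the first chunk."""
    consumed = 0

    def source():
        nonlocal consumed
        for i in range(n):
            consumed += 1
            yield i

    next(render(source()))  # ask only for the first chunk
    return consumed

print(rows_consumed_before_first_chunk(render_buffered, 1000))
print(rows_consumed_before_first_chunk(render_streamed, 1000))
```

The buffered version has to read all 1000 rows before it can send anything, while the streamed version can emit its first chunk after a single row. Both produce the same bytes in the end.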
Stack:
- Kubernetes (creating Docker images for Crystal apps is surprisingly easier than for Rails apps)
- gRPC (Crystal shard) for intra-cluster communication
- Neo4j (Crystal shard) as its primary data store
- Redis for caching (Crystal shard, but we use my fork that uses crystal-db's connection pool)
Don't use an SSL server in production (until this is fixed: https://github.com/crystal-lang/crystal/issues/8108), or you can monkey patch around it like I do at present :)
I agree that I wouldn't recommend using Crystal's SSL server in production. It mostly works, but besides the obvious issues, there are some rough edges and AFAIK there has never been a security audit. For projects without strong security requirements this might be fine, but I'd always use a proxy to terminate TLS sessions in front of my Crystal apps. That is really easy to set up and also offers some other features out of the box, so I don't feel this needs immediate action on Crystal's side.
My company has been running almost everything on Crystal in production for the last 2 years. It's been mostly pretty solid, and has given us a ton of benefits, like deploying apps to smaller EC2 instances than we needed when using Rails. Our apps perform ~3x faster with a quarter of the memory and CPU usage of the Rails apps they were rebuilt from.
The downside has been when a new release comes out, and you want to use all the latest stuff to fix some bugs, but then other stuff stops working. Crystal 0.33 felt super solid for us, but moving to Crystal 0.34 has been a lot less stable, as we see more downtime. I'm confident it'll get fixed, and it's not really a show stopper in our case. As more people use this and help to catch weird bugs, things will start to even out just as 1.0 comes around the corner.
Oh yeah, I forgot one: RAM usage increases slowly until it levels out at about 200MB. The GC isn't really tunable yet. Not a big deal, but I thought I'd mention it.
@jwoertink what problems? Have they been reported? Cheers!
bdwgc is very tunable: https://github.com/ivmai/bdwgc/blob/master/doc/README.environment
@jwoertink I hope the newly merged debugger support will help you tremendously, as it helped me when I was debugging the Crystal compiler itself to validate debug info generation (sounds like recursion, eh?!)
When I initially started working on debugging support, it was a PITA with a lot of puts lines in the code.
Once it started to show at least some of the information, it became much easier to debug. So I hope it will help you guys to identify possible issues in your code.
The issue I've seen is mainly in production inside of Docker containers. I don't think I'll be able to use the debugger for that. I'm also not really sure what I'd be looking for. But I'm still really stoked for all the work you put into the debugger! That'll help with development so much!
Yeah, the error we're mostly seeing is the one I reported here. They say it's a valid error, and I'm not sure if this is even really the cause. My apps live on Elastic Beanstalk, which looks at the percentage of requests that return a 500-range status. Over a threshold, EB puts the container into a "degraded state". On Crystal 0.33 we were in a degraded state about 5 minutes a day. On 0.34, we sit in a degraded state about 3 hours a day or more.
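For a sense of scale, the two figures above can be turned into a share of the day. This is just arithmetic on the numbers quoted in the post, sketched in Python; the helper name is made up, and "3 hours or more" is treated as exactly 3 hours, so the 0.34 figure is a lower bound.

```python
MINUTES_PER_DAY = 24 * 60

def degraded_share(minutes_degraded):
    """Fraction of a day spent in the degraded state."""
    return minutes_degraded / MINUTES_PER_DAY

on_033 = degraded_share(5)       # about 5 minutes a day on 0.33
on_034 = degraded_share(3 * 60)  # about 3 hours a day on 0.34 (lower bound)

print(f"0.33: {on_033:.2%}, 0.34: {on_034:.2%}, ratio: {on_034 / on_033:.0f}x")
```

That is roughly 0.35% of the day on 0.33 versus 12.5% on 0.34, or about 36 times as much time in the degraded state.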
Would be very interesting to somehow do a git bisect and figure out what (if anything) caused the difference…