IdleGC shard (reduce memory usage) & CRYSTAL_LOAD_DEBUG_INFO=0

I just published IdleGC v1.0.0 (the idle-gc shard), which runs garbage collection when your Crystal process is otherwise idle.

Just add require "idle-gc" and call IdleGC.start to start a background Fiber, which will periodically poll for idle periods, garbage collecting as needed.

You may also add calls to IdleGC.background_collect, such as near the end of your web request handler. This won’t block your response from completing; the collection only runs once the process is idle.
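A minimal sketch of how the two calls above might be wired into a server. It assumes the idle-gc shard is listed in shard.yml and installed, and it uses the standard library's HTTP::Server rather than any particular framework; the address, port, and response body are arbitrary placeholders.

```crystal
# Assumes the idle-gc shard is declared in shard.yml and installed.
require "http/server"
require "idle-gc"

# Start the background fiber that polls for idle periods.
IdleGC.start

server = HTTP::Server.new do |context|
  context.response.content_type = "text/plain"
  context.response.print "Hello from Crystal\n"

  # Request a collection near the end of the handler. As described above,
  # the collection itself is deferred until the process is idle.
  IdleGC.background_collect
end

server.bind_tcp "127.0.0.1", 8080 # placeholder address and port
server.listen
```

Only IdleGC.start and IdleGC.background_collect come from the shard; everything else is plain standard library.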

Benchmark results indicate memory savings of 9% to 38% at the cost of a small (unnoticeable to me!) bit of CPU time. Please try it for yourself and see if it lowers your application’s memory usage!

IdleGC is in production use on Total Real Returns, which was mentioned here a few weeks ago.

Also, I noticed huge memory spikes unless I ran my server with the undocumented CRYSTAL_LOAD_DEBUG_INFO=0 environment variable. The spikes were caused by Exceptions trying to print their backtrace, which forces debug symbols to load even though I compiled with --no-debug. I’d recommend that others running in memory-constrained environments consider the CRYSTAL_LOAD_DEBUG_INFO=0 setting.

Great that it helps improve your memory consumption. But I’m doubtful of this concept.

Testing if the process is idle uses a very explorative approach, basically doing Fiber.yield and measuring how long it takes for the fiber to resume. When it returns quickly, the process is considered idle.
This has a number of problems. It’s quite wasteful, since it injects a couple of extra fiber swap cycles. If the process is actually under heavy load, this will make things worse. And the meaning of such time measurements depends very much on each system (hardware capabilities and configuration, overall system load, etc.).
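For readers unfamiliar with the technique, a yield-and-measure probe of the kind described above could be sketched roughly like this. It is not IdleGC's actual code: the method name and the 1 ms threshold are assumptions made up for illustration, and the threshold is exactly the system-dependent part.

```crystal
# Hypothetical threshold; a meaningful value depends on the system.
IDLE_THRESHOLD = 1.millisecond

# Yields to the scheduler and reports whether control came back quickly.
# A quick return suggests no other fibers were waiting to run.
def probably_idle?(threshold : Time::Span = IDLE_THRESHOLD) : Bool
  elapsed = Time.measure { Fiber.yield }
  elapsed < threshold
end

puts probably_idle?
```

Each probe costs at least one fiber switch, which is the overhead mentioned above.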

Injecting a GC run at idle time would work much better if it integrated directly with the scheduler, which is in a far better position to tell whether the process is idling.

Additionally, BDWGC has a lot of configuration options for tuning it towards specific use cases. The Crystal runtime uses the default configuration which is good for generic applications. But your web server use case may benefit from adjusting these settings to a purpose-specific configuration.

Nice! If I understand correctly, the GC is still being run by the allocator, right?

Regarding CRYSTAL_LOAD_DEBUG_INFO, should we consider that to be 0 when a program is compiled with --no-debug? I think we could know that at compile-time.

To clarify: my use case is running a Kemal-based web server, with one process per Kubernetes Pod. Having a small (and, more importantly, fairly predictable) memory footprint is key to defining reasonable Pod memory requests and limits, allowing me to pack more replicas onto a single small (inexpensive) Node to handle concurrent requests.

I found that doing explicit GC.collect got me lower memory usage AND kept a lid on growth-over-time. Doing it in the background when idle made it work without slowing down user requests.
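A quick way to see what an explicit collection does is to compare GC.stats before and after, as in this rough sketch using only the standard library. The throwaway allocation loop is just there to create garbage. Note that these numbers describe the GC heap, not RSS, and the heap size itself will not necessarily shrink after a collection.

```crystal
# Allocate some short-lived strings so there is garbage to reclaim.
100_000.times { "x" * 64 }

stats_before = GC.stats
GC.collect # explicit, blocking full collection
stats_after = GC.stats

puts "heap size:  #{stats_before.heap_size} -> #{stats_after.heap_size} bytes"
puts "free bytes: #{stats_before.free_bytes} -> #{stats_after.free_bytes} bytes"
```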

@straight-shoota Yes, if there’s a more direct way to determine “idle” from the Fiber scheduler, I’m very open to it! 🙂 CPU overhead seems to be quite minimal for IdleGC::IdleDetection.process_is_idle?: a few microseconds per call, and IdleGC::Timer calls it once per second.

@beta-ziliani Yes, IdleGC just calls GC.collect at the appropriate time.

@asterite That makes sense that --no-debug could be known at compile-time. I was quite surprised when my usually-few-MB Kemal program jumped in RSS size by +40MB suddenly, and eventually found that it was happening on any Exception. In my case it was just the exception of an HTTP client disconnecting before the response was finished.

Idle-time garbage collection is academically sound (e.g. Google Chrome: Idle Time Garbage Collection Scheduling), but of course the details matter and this is a simple implementation without getting into any Crystal internals. Also, the Go runtime forces a full GC every 2 minutes: forcegcperiod.

Thanks all! 🎆

Wish workarounds for GC weren’t necessary; hopefully we’ll figure out at some point why processes seem to just grow :)

I think the default GC works decently well, and I don’t know of a more generic way to stop slow memory growth than some form of periodic collection, such as the Go runtime’s every-two-minutes forcegcperiod or what Crystal’s IdleGC does.
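For comparison, a bare-bones fixed-interval collector can be sketched in a few lines of plain Crystal. The helper name is hypothetical, and unlike Go's rule it does not check whether a collection already happened recently; it simply collects on every tick, idle or not.

```crystal
# Hypothetical helper: force a full collection on a fixed interval.
def start_periodic_gc(interval : Time::Span = 2.minutes) : Fiber
  spawn(name: "periodic-gc") do
    loop do
      sleep interval
      GC.collect # blocking full collection, regardless of load
    end
  end
end

start_periodic_gc
```

This only makes sense in a long-running process such as a server, since the fiber ends when the main program exits.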

It probably works well enough that most people don’t care and don’t need to care, and that’s fine.

If you do care about maintaining small and predictable memory usage, I don’t think it’s too much of a burden to the programmer to toss in an IdleGC.start and/or an IdleGC.background_collect call at probably one point in your code. (Or GC.collect if you’re insensitive to latency.)

In exchange for adding a few hints about when it might be a good time to collect, you get all the programmer convenience of memory management / garbage collection, and all the benefits of low memory usage.

In exchange for making it explicit, programs that don’t need it (especially short-lived ones) never have to pay a GC cost.

Go only gets away with no hints by essentially bundling this periodic collection into the runtime. But in exchange for that invisible simplicity, the periodic collection is far less optimal than activity-driven collection.

@rogerdpack Would you be in favor of building the GC-every-two-minutes rule into Crystal?

I’m not sure that I’d be in favor of building that into Crystal, rather than just putting some notes in the docs, pointing memory-sensitive programmers in this direction.