"signals delivery fails constantly" in multi-threaded Crystal app

Craig · May 22, 2023, 3:18pm

Hi Crystal folks – I have a small Crystal 1.8.0 app (on Ubuntu 20.04) that serves some vector tile map data using http/server, the pg shard and crystal-sqlite3. Encoding the vector tile data is somewhat CPU intensive, so I have this running on a 64 core machine with -Dpreview_mt=true and CRYSTAL_WORKERS set to 40.

This generally works pretty well, but, being used in mapping applications, traffic can be quite bursty: users are likely to make a rapid series of requests all within a short period of time as they pan/zoom around the map.

Every now and then the app freezes for a while and then crashes with the error:

Signals delivery fails constantly at GC #1644
Signals delivery fails constantly
Aborted

I run the app in a while true; do ... loop to restart it when it happens, but is there a better way to handle this, or to prevent it? From what I can tell, the “Signals delivery fails constantly” error comes from bdwgc (pthread_stop_world.c at commit 9229da044bbc5f5f131741975c0c35522bed227d of ivmai/bdwgc on GitHub), but what to actually do about it is a bit over my head. There does seem to be a GC_RETRY_SIGNALS environment variable that I can alter to affect how many times (if at all) lost signals are re-sent, but I really have no idea what’s going on here.
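For reference, a minimal sketch of the kind of restart loop described above, with a restart cap and a pause between attempts so a crash loop cannot spin unchecked. The function name run_with_restart and its arguments are hypothetical, not something from the original setup, and this only works around the crash rather than fixing it.

```shell
#!/bin/sh
# Hypothetical supervisor loop: rerun a command whenever it exits non-zero.
# Usage: run_with_restart MAX_RESTARTS DELAY_SECONDS COMMAND [ARGS...]
run_with_restart() {
  max_restarts=$1
  delay=$2
  shift 2
  restarts=0
  while :; do
    rc=0
    "$@" || rc=$?
    # A clean exit (status 0) is treated as an intentional shutdown.
    if [ "$rc" -eq 0 ]; then
      return 0
    fi
    echo "$(date '+%Y-%m-%dT%H:%M:%S') command exited with status $rc" >&2
    # Stop once the restart budget is used up and report the last status.
    if [ "$restarts" -ge "$max_restarts" ]; then
      echo "giving up after $restarts restarts" >&2
      return "$rc"
    fi
    restarts=$((restarts + 1))
    sleep "$delay"
  done
}
```

It could be invoked as run_with_restart 1000 1 ./tile_server, where ./tile_server stands in for the compiled app. In sh and bash on Linux, a process killed by abort() typically shows up as exit status 134 (128 + SIGABRT), so the logged status helps separate these GC aborts from other failures.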

Any ideas about what I might consider?

naqvis · May 22, 2023, 4:01pm

Sounds like an issue in bdwgc itself. See “Signals delivery fails in gctest on Ubuntu Jammy if compiled with TSan” (issue #543 in ivmai/bdwgc on GitHub). You can also try building bdwgc manually and running gctest to see if you hit the same failure.
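A rough sketch of that manual build, assuming the autotools route from the bdwgc README and that git, autoconf, automake and libtool are installed. The function name and the run count are made up for illustration. make check runs the bdwgc test suite, which includes gctest, and it is repeated here because the failure is intermittent.

```shell
#!/bin/sh
# Hypothetical helper: clone bdwgc, build it, and run its test suite
# (which includes gctest) several times to catch an intermittent failure.
# Usage: build_and_check_bdwgc [RUNS]
build_and_check_bdwgc() {
  runs=${1:-10}
  git clone https://github.com/ivmai/bdwgc || return 1
  cd bdwgc || return 1
  ./autogen.sh || return 1
  ./configure || return 1
  make -j 4 || return 1
  i=1
  while [ "$i" -le "$runs" ]; do
    echo "test run $i of $runs" >&2
    # Stop at the first failing run so its output stays at the end of the log.
    make check || return 1
    i=$((i + 1))
  done
}
```

This exercises a freshly built upstream bdwgc, not necessarily the copy your Crystal binary links against, so a clean result here would not rule the GC out.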

HIH

Craig · May 23, 2023, 5:43pm

oh good idea – thanks!
