In my opinion, it’s too easy to have an Unhandled exception in spawn while the rest of the program keeps chugging along. (I ran into this with a long-lived process, with a cleanup Fiber that died due to an unanticipated IO::Error, resulting in consequences later due to that cleanup not running…) I’d generally like to have my program die if there’s an unanticipated and uncaught exception in any Fiber, not just the main one.
I’m proposing spawn! which looks like this:
def spawn!(*args, &block)
  spawn do
    begin
      block.call
    rescue ex
      # Report the failure, then take the whole process down.
      STDERR.print "Unhandled exception in spawn!: "
      ex.inspect_with_backtrace(STDERR)
      STDERR.puts "spawn!: Fatal. Dying..."
      STDERR.flush
      exit(1)
    end
  end
end
Alternatively, the docs might need to more loudly declare that every use of spawn should probably have a catch-all rescue block.
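For comparison, here is a minimal sketch of what that catch-all looks like with plain spawn. The simulated IO::Error and the `handled` and `finished` names are only there for illustration; a real fiber would do its work where the raise is.

```crystal
handled = ""
finished = Channel(Nil).new

spawn do
  begin
    # Stand-in for the real work; simulates the unanticipated failure.
    raise IO::Error.new("simulated failure")
  rescue ex
    # Catch-all: the fiber decides what to do instead of dying quietly.
    handled = ex.message || "unknown error"
  ensure
    # Let the main fiber know this fiber is done.
    finished.send(nil)
  end
end

finished.receive
puts "rescued: #{handled}"
```

Without the rescue, the runtime would only print the exception to STDERR and the rest of the program would carry on.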
The use case certainly demands attention, but I’m not convinced by the proposed solution. The name spawn! does not make clear what it does, and the behaviour is not very intuitive.
Termination of the entire process is only one possible reaction to an exception in a fiber that’s fatal and unrecoverable. There may be other, less grave conclusions in scenarios where the failure only affects part of the program or may be recovered.
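As a rough sketch of one such less drastic reaction, a fiber could hand its exception to whoever started it and let that code decide. The `errors` channel and the simulated failure below are assumptions for illustration, not an existing API.

```crystal
# Hypothetical channel on which fibers report failures to a supervisor.
errors = Channel(Exception).new

spawn do
  begin
    # Stand-in for the real work; simulates a failure in this fiber.
    raise IO::Error.new("simulated failure")
  rescue ex
    # Hand the exception over instead of exiting the process.
    errors.send(ex)
  end
end

# The supervising code decides: retry, degrade, or exit.
failure = errors.receive
puts "worker failed: #{failure.message}"
```

Here the main fiber only prints the failure, but it could just as well restart the worker or call exit(1).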
I see a possible solution to this problem in structured concurrency (cf. “[RFC] Structured Concurrency”, issue #6468 in crystal-lang/crystal on GitHub), which would allow configuring error handling for a concurrency scope.
which includes the spawn! method and macro, as well as some tests. We’re using this in production.
I agree that it usually makes sense to still do some exception handling within the Fiber. In some cases, that may mean intentionally ignoring a specific exception, or even all exceptions. spawn! really just changes what happens on an unhandled exception to a default that I personally prefer.
I’d rather explicitly ignore/recover like this:
spawn! do
  loop do
    begin
      # your code here
    rescue ex : Exception
      Log.warn { "Ignoring exception and restarting this fiber..." }
    end
  end
end
rather than having to remember to explicitly write a catch-all rescue in every spawn.
My operational assumption is that there’s a much higher chance of me seeing the unhandled exception if the process dies than if it’s just in the middle of the logs somewhere.
I’m not very familiar with structured concurrency, but it does sound like it could also be useful as another way of defining exception handling at spawn time!