Fibers, blocking IO, and C libraries

(X-post from Reddit)

I’m currently having trouble getting Crystal to play nicely with external C libraries that block on IO events. Specifically, it seems that Crystal isn’t able to properly switch to other fibers (for example, a Signal#trap handler) while waiting for a C function to return. For example, say I’m using libevdev and trying to write an event handler that has cleanup to run:

@[Link("evdev")]
lib LibEvdev
    # Exact pointer types aren't relevant here
    fun next_event = libevdev_next_event(dev : Void*, flags : LibC::UInt, event : Void*) : LibC::Int
end

Signal::INT.trap do
    puts "pretend there's important cleanup code in here"
    exit
end

loop do
    LibEvdev.next_event(device, flags, out event)
    puts "pretend there's #{event} handling code here"
end

This (hypothetical) program is never able to run the signal trapping code and simply does nothing on Ctrl-C.

What am I missing here? My current thought is that Crystal has no way of knowing the C library is blocking on IO, so it doesn’t know when to switch fibers. Is there some way to mark this call as IO-blocking or do something like select(2) with an IO::FileDescriptor so that Crystal handles this properly?

I’ve asked this a couple of times on Gitter already but haven’t gotten any responses. Any help or advice would be greatly appreciated!

By default, a Crystal program is single-threaded, and since you are invoking a blocking call in the main thread, that call blocks the main thread. The Crystal runtime won’t create or run fibers unless you ask for them via spawn. Also keep in mind that spawning a fiber isn’t the same as spawning a new thread. Parallelism in Crystal is still in preview, and you have to opt in to it at compile time.

So for tasks that you expect to block, you should design your application so that those tasks run in separate fibers, leaving your main thread unblocked and responsive.

A somewhat naive approach to achieve your desired outcome is to move that loop to a separate fiber:

spawn do
    loop do
        LibEvdev.next_event(device, flags, out event)
        puts "pretend there's #{event} handling code here"
    end
end

Then wait for the fiber to complete in your main thread with a call to either Fiber.yield or sleep.

HTH.

In single-threaded mode, moving it to a separate fiber is not going to help either. Fibers, aka coroutines, are a cooperative concurrency model: a thread can’t be stuck in a blocking operation and somehow magically continue to run another coroutine. Crystal’s standard library hides this by being built on top of evented IO using libevent. In effect, it registers a callback with the OS to mark a coroutine as ready to resume once a blocking operation would no longer block, then marks the coroutine as blocked and switches to another coroutine.
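
A minimal sketch of that cooperative behavior, assuming the default single-threaded build (no -Dpreview_mt). LibSketch is a hypothetical binding name introduced only for this illustration; libc's usleep stands in for any blocking C function such as libevdev_next_event.

```crystal
# Hypothetical binding: libc's usleep stands in for any blocking C call.
lib LibSketch
    fun usleep(usec : UInt32) : Int32
end

order = [] of String

# The fiber is only queued here; it runs once the main fiber yields.
spawn do
    order << "fiber ran"
end

# The C call blocks the whole thread for 100 ms. The scheduler never gets
# control during that time, so the queued fiber cannot run.
LibSketch.usleep(100_000_u32)
order << "C call returned"

# Only an explicit yield (or evented IO, Crystal's own sleep, etc.)
# hands control to the scheduler so the fiber can run.
Fiber.yield

puts order.inspect
```

In a single-threaded build the fiber's entry is recorded only after the C call has returned. Replacing the C call with Crystal's own sleep would let the fiber run first, because that sleep goes through the scheduler.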

If the library you are working against offers a way to register a callback that notifies you when a potentially blocking call will not block, there are ways to integrate it with Crystal’s runtime scheduler, although those APIs are internal and still subject to change.

To add to that: if the library does not offer a non-blocking, callback-based API, there is no other solution than to call that API in a dedicated thread. The call will block that single thread, but the rest of the program, including the signal handler (which btw already runs in a fiber of its own), can continue.

I know Go does this automatically for any C call: create a Thread and run the C function there so it doesn’t block the rest of the things.

Maybe we can introduce something similar with an annotation on top of the C call, something like @[DontBlock] fun .... It could maybe even be done in single-threaded mode, because all that needs to be done is creating a thread, running the C code, then calling join on it (I think), so it would be independent of the actual threads used by the multithreaded runtime.

Assuming that I have access to the library’s internal fd, is there a way to hook into this evented IO for a similar effect?

If and only if the library can deal with O_NONBLOCK being set on the FD. Then you can just wrap it into an IO::FileDescriptor.

Interesting - in that case, is there a clean way to get Crystal signal handlers to run immediately? Directly using LibC.signal doesn’t allow Proc closures and I don’t see any way to do the Box trick.

I love this idea. Currently it’s not possible to use fibers (without -Dpreview_mt) in an app that uses the GLib main loop (a.k.a. a GTK application) because of this. The current alternative is to add a Glib.idle_add { Fiber.yield } and watch your process eat all the CPU.

Even with -Dpreview_mt, problems happen because IIRC fibers share a fixed number of threads, so you need to call Thread.new anyway, or one of your fibers could end up in the same thread that is stuck inside the GLib main loop.

With @[DontBlock] we would be sure that a spawn call would never create a fiber in the same thread where the GLib main loop is running.

BTW, just today I faced an issue with this, hehe. I had paid no attention to the fact that the default logger setting is to be async using Crystal fibers. In my case I have a GTK UI to show notifications, and I implemented notifications as Log entries with metadata saying the entry is also a notification. So… some fiber ended up in the same thread the GLib main loop was running in, and I received the notifications only when I exited the GLib main loop, which in my case meant that the application had finished. For some reason (or coincidence) this issue only showed up when I ported the application to Crystal 1.0.0.

I found this and remembered this topic… it’s people discussing this approach of creating a thread for blocking calls in the ffi gem so that it works with Ruby 3 fibers.

Would it be reasonable to expose some interface to the event loop (or is it the scheduler?) for cases when the file descriptor is known?

Usage might look something like this:

Fiber.yield_until(fd: my_great_fd)

It is already exposed.

If you instantiate a FileDescriptor from your raw fd, then you can use wait_readable or wait_writable which will wait until the condition is fulfilled.

Note though that I don’t think the necessary functions for that exist on Windows, if that is needed.
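
A rough sketch of that suggestion, under several assumptions: a pipe stands in for the C library's file descriptor so the example is self-contained, and wait_readable is called as described above. It is not part of the documented public API, so its availability may differ between Crystal versions. The names lib_io and log are made up for this illustration.

```crystal
# A pipe stands in for the descriptor a C library would expose.
reader, writer = IO.pipe
raw_fd = reader.fd

# Wrap the raw fd in a second IO::FileDescriptor. close_on_finalize: false
# because `reader` already owns the fd in this sketch; with a library-owned
# fd, the library would be responsible for closing it.
lib_io = IO::FileDescriptor.new(raw_fd, close_on_finalize: false)

log = [] of String

spawn do
    log << "other fiber ran"
    writer.write_byte(1_u8) # makes the fd readable
    writer.flush
end

# Suspends only the current fiber until the fd becomes readable, so the
# spawned fiber above gets to run in the meantime.
lib_io.wait_readable
log << "fd readable"

puts log.inspect
```

With a real library, the non-blocking C call would follow wait_readable, which only works if the library tolerates O_NONBLOCK on its descriptor, as noted earlier in the thread.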

However, that doesn’t really help with what this thread is about.