Hi, I was playing with the GLib main loop to try to make it integrate well with Crystal fibers, so people could write GTK applications using Crystal fibers.
So far, so good. I’m basically using the approach I learned from an @asterite comment in this forum, i.e. create a thread; call the blocking function; Fiber.yield; return.
If I compile my example to an executable it works. However, if I wrap it in a test case it hangs and never returns unless I pass -Dpreview_mt.
For the sake of simplicity I removed any GLib binding code and called the C functions directly.
require "spec"
@[Link("glib-2.0", pkg_config: "glib-2.0")]
lib LibGLib
  fun g_main_context_new : Pointer(Void)
  fun g_main_loop_new(context : Pointer(Void), is_running : LibC::Int) : Pointer(Void)
  fun g_main_loop_run(this : Void*) : Void
  fun g_main_loop_quit(this : Void*) : Void
  fun g_main_loop_is_running(this : Void*) : LibC::Int
end
def main_loop_run(loop : Pointer(Void))
  channel = Channel(Nil).new
  # Need to use thread.new to be sure it will have its own thread and not
  # share it with other fibers.
  Thread.new do
    puts "main loop run on thread #{Fiber.current.name}!"
    LibGLib.g_main_loop_run(loop)
    channel.send(nil)
    puts "main loop thread finished"
    Fiber.yield
  end
  puts "waiting for channel msg from main loop thread"
  channel.receive
  puts "Got the msg!"
end

describe "Fiber integration with GLib main loop" do
  it "works" do
    ctx = LibGLib.g_main_context_new
    loop = LibGLib.g_main_loop_new(ctx, 0)

    spawn(name: "quit") do
      while LibGLib.g_main_loop_is_running(loop).zero?
        puts "waiting main loop to start..."
        Fiber.yield
      end
      puts "calling 'g_main_loop_quit' from thread #{Fiber.current.name}!"
      LibGLib.g_main_loop_quit(loop)
      puts "quit fiber finished!"
    end

    main_loop_run(loop)

    puts "main loop call returned!"
    puts "Why I don't quit!! whyyyy!!??? 😭️"
  end
end
Running this with crystal spec fibers_spec.cr -Dpreview_mt works as expected, but if I remove -Dpreview_mt it hangs after the test case finishes:
$ crystal spec fibers_spec.cr
waiting for channel msg from main loop thread
main loop run on thread main!
calling 'g_main_loop_quit' from thread quit!
quit fiber finished!
main loop thread finished
Got the msg!
main loop call returned!
Why I don't quit!! whyyyy!!??? 😭️
.
Last night I ran into a very similar “why can’t this quit?!” issue and found it was because I had Channels with no size. Changing their size to 10 (an arbitrary value) fixed everything. I had forgotten that Channels are not like general purpose message passing queues, and if the Fiber that would normally drain the Channel was already dead, I couldn’t send a new message and things would just hang.
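To make the difference concrete, here is a minimal sketch of buffered versus unbuffered channels. The variable names are only for illustration and the buffer size of 10 mirrors the arbitrary value mentioned above.

```crystal
# Buffered channel: `send` returns immediately while the buffer has room,
# even though no fiber is receiving yet.
buffered = Channel(Int32).new(10)
buffered.send(1)
buffered.send(2)
first = buffered.receive
second = buffered.receive

# Unbuffered channel: `send` suspends the sending fiber until a receiver
# takes the value, so the sender runs in its own fiber here.
unbuffered = Channel(Int32).new
spawn { unbuffered.send(3) }
third = unbuffered.receive

puts "received #{first}, #{second}, #{third}"
```

If the `spawn` around the unbuffered `send` were dropped, the main fiber would wait for a receiver that never arrives, which is the kind of hang described above.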
Don’t mix Threads you have created yourself with anything that involves the event loop. I.e. both channels and yielding/sleeping will not behave the way you want, as there is no run loop and the newly created thread is simply not integrated with the event loop. As soon as anything goes to sleep it won’t wake up again. If scheduled manually, any fibers from those threads would either not come back at all or come back on one of the regular threads. What can be done in manually created threads is very, very limited and they should probably be considered an internal interface. There has also been recent work on making them more private.
There is at least one library that provides threads that have run loops and integrate with the event loop, but it does so by monkey patching a lot of private interfaces, and it can break (and has!) whenever a new release is made.
In the code snippet I create the thread myself because with spawn, even with -Dpreview_mt, it isn’t guaranteed that the block will not share its thread with other fibers. Since I call a blocking C function, any other fiber scheduled on that same thread will not run until the C function returns and I call Fiber.yield. So I believe this is a use case where Thread.new (which was soft-removed from the stdlib) is not only valid but the only way to make this work.
I’m not 100% sure, but this feels like the same reason that will/crystal-pg was ported to use Crystal for the wire protocol instead of continuing to use libpq under the hood. A C library that blocks the thread blocks the entire thread, including the Crystal fiber scheduler, until its blocking condition is met.
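A small sketch of that effect, assuming the default single-threaded runtime. `LibSketch` is a hypothetical binding declared only for this example; it wraps the C library's `usleep` as a stand-in for any blocking C call.

```crystal
# Hypothetical binding for this sketch only: libc's usleep(3) stands in
# for any C function that blocks the calling OS thread.
lib LibSketch
  fun usleep(usec : UInt32) : Int32
end

events = [] of String

# This fiber is only queued here; it cannot run until the main fiber yields.
spawn { events << "fiber ran" }

# Blocks the whole OS thread for 0.2 s. The scheduler lives on this same
# thread, so the queued fiber gets no chance to run in the meantime.
LibSketch.usleep(200_000_u32)
events << "C call returned"

# Hand control back to the scheduler so the queued fiber can finally run.
Fiber.yield

puts events.join(", ")
```

Without -Dpreview_mt, the C call should be recorded before the fiber, because nothing else can be scheduled while the thread is blocked inside the C function.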
When I was working on bcardiff/crystal-fswatch (on GitHub), I found that the fswatch library also needs a dedicated thread. In Crystal’s single-thread mode, one thing that won’t work is using channels to communicate between threads. I think this could be what is affecting your code.
The runtime in single-thread mode does not allow channels to be used in custom threads (unless I missed some updates).
To work around that, I created something called ThreadPortal in the library. It’s a plain wrapper around a channel when MT is enabled, but an IO-based sync for single-thread mode.
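The IO-based idea can be sketched roughly as follows. This is not the actual ThreadPortal code from crystal-fswatch, only an illustration under the assumption that a pipe is used as the signal: the custom thread does a raw write(2), and the Crystal side waits on the read end through the normal event loop. It also relies on Thread.new, which is an internal interface.

```crystal
# Hypothetical sketch, not the crystal-fswatch implementation.
reader, writer = IO.pipe
writer_fd = writer.fd

Thread.new do
  # A real program would run its blocking C call here.

  # Signal completion with a raw write(2), so this thread never touches
  # Crystal's scheduler, channels or event loop.
  byte = 1_u8
  LibC.write(writer_fd, pointerof(byte).as(Void*), 1)
end

# Reading goes through Crystal's evented IO: only the current fiber is
# suspended, and other fibers keep running until the byte arrives.
signal = reader.read_byte
puts "thread signalled: #{signal}"
```

The design point is that the only thing crossing the thread boundary is a file descriptor, which the main thread's event loop already knows how to wait on.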
Exactly, this is why I create the thread using Thread.new, to be sure the Crystal::Scheduler doesn’t play any role with this thread and let it be happy and isolated.
As I said, it works fine with or without -Dpreview_mt, but when I use the spec library it hangs, and it hangs after all my code has finished, as shown in the debug messages. I need to dig into the spec code to find out why…
Yes, because even on single thread Thread.new creates a new real thread no matter what.
If I comment out lines 1, 26, 27, 45 and 46 (the require "spec" line, the describe and it lines, and their closing ends) and run the code with or without -Dpreview_mt, it works. So the issue seems to be inside the Spec code.
$ crystal run spec/fibers_spec.cr
waiting for channel msg from main loop thread
main loop run on thread main!
calling 'g_main_loop_quit' from thread quit!
quit fiber finished!
main loop thread finished
Got the msg!
main loop call returned!
Why I don't quit!! whyyyy!!??? 😭️
hugo ~/src/gi-crystal fibers 2.7.6p219 14:57:24
$ crystal run -Dpreview_mt spec/fibers_spec.cr
waiting main loop to start...
waiting main loop to start...
waiting main loop to start...
waiting main loop to start...
waiting main loop to start...
waiting main loop to start...
waiting main loop to start...
waiting for channel msg from main loop thread
main loop run on thread main!
calling 'g_main_loop_quit' from thread quit!
quit fiber finished!
main loop thread finished
Got the msg!
main loop call returned!
Why I don't quit!! whyyyy!!??? 😭️
hugo ~/src/gi-crystal fibers 2.7.6p219 14:57:33
I revisited this issue today because I need to do some network operations in a GTK application of mine without freezing the UI, and I think I found a solution for integrating the GLib main loop with the Crystal main loop without going into the depths of the Crystal::EventLoop libevent implementation.
@[Link("glib-2.0", pkg_config: "glib-2.0")]
lib LibGLib
  fun g_main_context_default : Void*
  fun g_main_context_iteration(ctx : Void*, may_block : Int32) : Int32
  fun g_timeout_add_seconds(interval : UInt32, func : Void* -> Int32, data : Void*) : UInt32
end

def glib_counter(_data)
  puts "glib counting"
  # Return 1 (TRUE) so GLib keeps the timeout source active.
  1
end

def crystal_counter
  loop do
    sleep(1)
    puts "crystal counting"
  end
end

# Get the default GLib main context.
ctx = LibGLib.g_main_context_default
# Start GLib counter, triggered by GLib main loop events.
LibGLib.g_timeout_add_seconds(1, ->glib_counter(Void*), Pointer(Void).null)
# Start a Crystal counter, triggered by Crystal main loop events.
spawn crystal_counter
# Wait 10 seconds then quit.
spawn { sleep(10); exit }

Thread.new do
  loop do
    # Let the glib main loop run, blocking the thread if there's no event yet.
    LibGLib.g_main_context_iteration(ctx, 1)
  end
end

# Keep the main fiber parked so the spawned fibers can run.
Channel(Nil).new.receive
This POC works for this plain GLib code. The next step is to change it to work with g_application_run.
Nice! It works because threads happen to lazily start their local scheduler + dedicated event loop when you trigger a fiber reschedule or enqueue, which any IO will trigger.
You must enable -Dpreview_mt so the stdlib becomes thread safe (Schedulers, IO, Channel, …). You may disable the multiple worker threads with CRYSTAL_WORKERS=1 if you only need one thread for Crystal + another for GLib / Gtk. I’m doing just that in a tiny Gtk3 app (very short-lived, yet capable of running HTTP requests without blocking the UI).
@hugopl Yes. The in-progress implementation is at ysbaddaden/execution_context on GitHub, and there’s a sample for Gtk+ 3 that uses ExecutionContext::Isolated. It works like a charm :)
GTK4 uses the very same GLib, Gio and GObject libraries, so your in-progress implementation should work on GTK4.
I plan to let rtfm download docsets, so I’ll probably test this with your in-progress implementation to avoid having to write those _async operations in Gio. tijolo also has a bunch of things that must not be in the UI thread and are there right now.