Multithreaded callback

Hello,

As an exercise to learn Crystal, I am trying to write
a demo filesystem using libfuse, interfacing
some Crystal code with the Linux kernel.

In this context, I am wondering what the expected behavior is when I register
some Crystal code as a callback, and later have that callback called simultaneously
from different threads by third-party code (here, in the case of libfuse, the kernel).

Setting aside the need for MT-safe Array & Hash access,
is there any other major issue that I should be aware of in that scenario?

My understanding is that the GC already runs in its own thread.
Will it still behave properly if multiple threads allocate objects?

The first thing to be aware of is: who creates the thread for the callback?
The second one is: do you need IO / Channels / Fiber context switch in the callback?

The GC expects to be the one that creates them. That is what Thread.new does. The GC can be compiled and configured to accept registering additional threads that were not created by itself, but that is not the default.

The GC will stop all the threads to collect, and from there it will traverse the structures in the stacks and the roots to know what to collect. So if a non-registered thread allocates things in the GC that are not reachable from a registered thread or the roots, then they will be collected.

Regarding the Crystal runtime itself, if the callback will not generate IO or a Fiber context switch, you can make it work even in single-thread mode. The multi-thread mode enables Fiber context switches, channels, etc. Yet, the GC MT support was already present. So if the callback threads only use memory or Thread::Mutex (not ::Mutex!), then you can make it work without -Dpreview_mt.

I do encourage using -Dpreview_mt, but I think that explaining the limits of single-thread mode can offer some insight in this case.

If you have some more details of what you expect to do in the callback then there might be something more concrete to say.

Thanks for your answer, it already gives me a lot of what I was looking for.
I will try to give more context below.

The first thing to be aware of is: who creates the thread for the callback?

To summarize, the threads are created outside of Crystal, and the callbacks are then called concurrently.

Libfuse is a C library + Linux kernel module that makes it possible to write user-space filesystems. To do so, a program calls libfuse with a struct that gives FUSE a set of callbacks the kernel can call to run said filesystem. So the callbacks provided have to mimic open, readdir, fstat and so on.

So in my case I am implementing the callbacks in Crystal, and using C bindings to call libfuse and register them.

The libfuse C function I call to register the callbacks becomes the main execution loop and (nearly) never returns. The callbacks are called by threads created by the kernel/libfuse (I don’t know which exactly) whenever they are needed to operate the registered filesystem. Two different processes accessing the same mount point will likely be served by two different threads, and so call the same set of callbacks from two different threads. There are options to constrain libfuse to use only one thread, which is fine for debugging or prototyping (as I am doing), but it is not a goal in itself.
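To make that threading model concrete, here is a minimal C sketch of the situation, not libfuse itself. It assumes a made-up library entry point, `run_demo`, that creates the threads on its own and invokes a registered callback from each of them; `demo_callback`, `demo_state`, `on_request` and `worker` are all hypothetical names for illustration. The shared state is guarded with a plain pthread mutex, which is the kind of protection a global hashmap would need.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical callback type: stands in for one entry of the callback struct. */
typedef void (*demo_callback)(void *user_data);

/* Shared state touched by the callback, guarded by a mutex. */
struct demo_state {
    pthread_mutex_t lock;
    long calls;
};

/* The "filesystem" callback: it may run on several library threads at once. */
static void on_request(void *user_data) {
    struct demo_state *st = user_data;
    pthread_mutex_lock(&st->lock);
    st->calls++;                      /* critical section */
    pthread_mutex_unlock(&st->lock);
}

struct worker_args {
    demo_callback cb;
    void *user_data;
    int requests;
};

/* Thread body owned by the "library": it just calls the callback repeatedly. */
static void *worker(void *arg) {
    struct worker_args *w = arg;
    for (int i = 0; i < w->requests; i++)
        w->cb(w->user_data);
    return NULL;
}

/* Stand-in for the library's main loop (hypothetical, not a libfuse function).
 * It creates the threads itself, so the callback's author does not control them.
 * Returns the number of callback invocations recorded, or -1 on error. */
long run_demo(int nthreads, int requests_per_thread) {
    enum { MAX_THREADS = 16 };
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    struct demo_state st = { .calls = 0 };
    pthread_mutex_init(&st.lock, NULL);

    struct worker_args args = { on_request, &st, requests_per_thread };
    pthread_t threads[MAX_THREADS];
    int started = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL, worker, &args) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&st.lock);
    return started == nthreads ? st.calls : -1;
}
```

Called from a `main` and built with `-pthread`, `run_demo(4, 1000)` returns 4000 because every increment happens under the lock. Without the mutex the count could come out lower. The sketch only covers shared-state safety; it says nothing about whether a garbage collector knows about those threads.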

The second one is: do you need IO / Channels / Fiber context switch in the callback?

As the goal is to implement the various operations needed for a filesystem, the callbacks need to do IO, but do not necessarily use Channels or Fibers themselves.

For the filesystem I am considering using SQLite and regular IO to implement the callbacks. I also need some global hashmap (but this is another story).

The GC expects to be the one that creates them. That is what Thread.new does. The GC can be compiled and configured to accept registering additional threads that were not created by itself, but that is not the default.

The GC will stop all the threads to collect, and from there it will traverse the structures in the stacks and the roots to know what to collect. So if a non-registered thread allocates things in the GC that are not reachable from a registered thread or the roots, then they will be collected.

That is the kind of issue I was looking for when asking this question, so many thanks for sharing this.

With this in mind, I am currently looking at the FUSE wrappers for Go and D, wondering what they do to manage that behavior. I am not so sure it is handled properly.

I’m glad you find it useful.

You got it right, doing IO is not the same as doing IO via the Crystal runtime.

Regarding the GC, you might need to play with https://github.com/ivmai/bdwgc/blob/ee900ca80085032156573517998c6cbfa1494377/include/gc.h#L1547 or be careful about how those objects can be reached. Maybe you can add some structs as roots. I’m not sure which will be better.

If you need to compile the GC you will need this patch for using Crystal MT.

Thanks for these pointers.

I happened to find this gem in the FUSE documentation:

How should threads be started?
Miscellaneous threads should be started from the init() method. Threads started before fuse_main() will exit when the process goes into the background.

fuse_main() being the function I am calling to register the callbacks.

So about the Crystal GC, does it work the same way as the Dlang GC? That is, a collection is triggered by a new allocation, and as such the GC uses the currently running Crystal thread to stop the others?

In which case I should be fine, provided I am using the patch you mentioned.

Yes, it seems more or less the same. The GC will start its own threads though. I’m not sure what hijacking the current thread means in the dlang reference. In bdwgc all “user threads” will be paused before collecting IIRC.