In my app I have an HTTP server fiber and a worker fiber.
I observed that under heavy worker load (reading/parsing/writing files in a loop), the HTTP server is not responding. Right after the worker fiber finishes its job, the HTTP server responds.
I'm aware of the possibility to pass control back to the fiber scheduler with Fiber.yield.
I inserted Fiber.yield in the worker loop, but the situation didn't change: the HTTP server still responded only after the worker job.
How does the fiber scheduler decide which fiber is going to run next?
Do you have some example code? Otherwise, my best guess at the moment is that the worker fiber is doing something that prevents execution from switching back to the main fiber that is running the HTTP server. That's a bit surprising if you say it's doing file operations, as normally I would have thought the scheduler would run other fibers while waiting on I/O when reading from the files. But it's hard to say for sure without some code to look at.
start_channel = Channel(Nil).new(1)

spawn name: "http_server" do
  HTTP::Server.new do |context|
    case context.request.path
    when /^start$/
      start_channel.send nil
      context.response.status = HTTP::Status::OK
      context.response.content_type = "text/json"
      context.response << data.to_json
    else
      context.response.respond_with_status HTTP::Status::NOT_FOUND
    end
  end
end

spawn name: "worker" do
  loop do
    select
    when start_channel.receive
      io = IO::Memory.new
      get_big_file_list.each do |file|
        Compress::Zip::Writer.open(io) do |writer|
          writer.add_file File.basename(file), File.read(file)
        end
        ...
      end
      ...
    end
  end
end

sleep
Sounds like a case of not all I/O operations being the same. File I/O isn't blocking in the same way that socket I/O is. Blocking occurs when there isn't data ready yet, and this is never the case for local files. Round trips to disk are often measured in microseconds (or less, when served from the page cache), so the CPU is never yielded.
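To make this concrete, here's a minimal sketch (fiber names and the 3-iteration counts are made up for illustration) of two fibers on a single thread that interleave only because each one calls Fiber.yield inside its loop:

```crystal
results = [] of String

spawn name: "ticker" do
  3.times do
    results << "tick"   # stand-in for serving an HTTP request
    Fiber.yield         # reschedule point: let other fibers run
  end
end

spawn name: "cruncher" do
  3.times do
    results << "crunch" # stand-in for reading/parsing/writing a file
    Fiber.yield         # without this, the loop hogs the CPU
  end
end

sleep 10.milliseconds   # yield the main fiber so the others can finish
```

Remove the Fiber.yield calls and each fiber runs its whole loop to completion before the other one ever gets the CPU, which is exactly the starvation pattern described above.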
This can even be the case for sockets when data is coming in over the wire as fast as or faster than you're processing it. Since reading from a TCPSocket is really reading from a buffer in memory, if every TCPSocket#read requests less data than you currently have in the buffer (keeping in mind that the kernel also has its own socket buffers in addition to Crystal's IO::Buffered ones), you're never actually waiting on the socket. So in some scenarios you never yield the CPU during socket I/O, either.
Parallelism would mostly improve throughput of your CPU-bound worker tasks. That might be meaningful if you need to run multiple workloads in parallel.
But multithreading is absolutely not necessary to have a snappy response from the HTTP server. The server and worker fibers should be perfectly able to coordinate sharing CPU time under single-threaded concurrency.
If there are no reschedule points in a long-running task, that may require some strategically placed Fiber.yield calls to implement cooperative sharing.
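As a sketch of what "strategically placed" could look like for the zip workload above (the zip_files helper and its paths parameter are made up for illustration, and this uses Compress::Zip::Writer#add): open the writer once, then yield after each file so the HTTP server fiber gets a turn between files:

```crystal
require "compress/zip"

# Hypothetical helper: zip a list of files into an in-memory buffer,
# handing control back to the scheduler after each file is added.
def zip_files(paths : Array(String)) : IO::Memory
  io = IO::Memory.new
  Compress::Zip::Writer.open(io) do |writer|
    paths.each do |path|
      writer.add File.basename(path), File.read(path)
      Fiber.yield # reschedule point between files
    end
  end
  io.rewind
  io
end
```

Note this also opens the writer once for the whole list, rather than reopening it inside the loop for every file as in the snippet above.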
You mentioned that you did already try that. Where did you put Fiber.yield in that example?
Fiber.yield will ensure that your other fibers get a chance to run — it literally pushes the current fiber onto the end of the scheduler’s queue. Since we don’t see all of the code in your worker fiber, and since you’re putting explicit Fiber.yield calls into the loop, it sounds like what’s hogging the CPU might be elsewhere.
If your first thought was that you were getting stuck on file I/O, it sounds like you’re shoveling a lot of data into that IO::Memory instance. That would probably consume a lot of RAM, so it might even be worth checking whether you’re dipping into swap.
Is it possible it’s getting past that code and the lines you’ve omitted are what’s actually CPU-bound?
Are you doing anything with the IO::Memory buffer after all the files have been shoveled into it that could be CPU-bound?
Does that select statement have an else in the actual code? If so, it won't block at all and will basically just be executing loop { }, which will definitely not let your HTTP server work until it comes upon something that yields the CPU. It might be worth sticking a Fiber.yield just inside the loop block as well. Or, if the select does contain an else clause, replace it with when timeout(1.second) to long-poll the start_channel.
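For illustration, here's a hedged sketch of the difference (the channel and the timing values are made up): with when timeout, the waiting fiber blocks in the event loop between polls instead of spinning, so other fibers keep running:

```crystal
start_channel = Channel(Nil).new(1)

spawn do
  sleep 5.milliseconds
  start_channel.send nil # simulate the /start request arriving later
end

started = false
until started
  select
  when start_channel.receive
    started = true
  when timeout(1.millisecond)
    # No message yet: this branch blocks the fiber for up to 1ms and
    # yields the CPU, unlike an `else` branch, which returns
    # immediately and turns the loop into a busy-wait.
  end
end
```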