Fiber.yield

In my app I have an HTTP server fiber and a worker fiber.

I observed that under heavy worker load (reading, parsing and writing files in a loop) the HTTP server does not respond. As soon as the worker fiber finishes its job, the HTTP server responds again.

I am aware that I can pass control back to the fiber scheduler with Fiber.yield.

I inserted a Fiber.yield call in the worker loop, but the situation didn’t change. The HTTP server still responded only after the worker job had finished.

How does the fiber scheduler decide which fiber runs next?

Do you have some example code? Otherwise, my best guess at the moment is that the worker fiber is doing something that prevents execution from switching back to the fiber running the HTTP server. That is a bit surprising if it’s doing file operations, as I would normally expect the scheduler to process other fibers while waiting on I/O when reading from the files. But it’s hard to say for sure without some code to look at.

Perhaps obvious question: are you compiling with -Dpreview_mt to enable multi-threading support for fibres?

No. I thought that even without multi-threading the fiber scheduler should be able to switch to another fiber while waiting on I/O operations.

What I forgot to mention in my first post: I have to use Windows. Is multi-threading also supported on Windows?

What exactly does this mean?
Could you maybe share some code to show what you’re actually doing.

This should all work as you expect. Concurrency is fully supported on Windows.

My app has in principle this structure:

start_channel = Channel(Nil).new(1)

spawn name: "http_server" do
  HTTP::Server.new do |context|
    case context.request.path
    when "/start" # request.path includes the leading slash
      start_channel.send nil
      context.response.status = HTTP::Status::OK
      context.response.content_type = "application/json"
      context.response << data.to_json
    else
      context.response.respond_with_status HTTP::Status::NOT_FOUND
    end
  end
end

spawn name: "worker" do
  loop do
    select
    when start_channel.receive
      io = IO::Memory.new
      get_big_file_list.each do |file|
        Compress::Zip::Writer.open(io) do |writer|
          writer.add_file File.basename(file), File.read(file)
        end
        ...
      end
      ...
    end
  end
end

sleep 

Sounds like a case of not all I/O operations being the same. File I/O doesn’t block in the same way that socket I/O does. A fiber yields on I/O only when the data isn’t ready yet, and that is essentially never the case for local files: a regular file always counts as readable, and reads are often served from the OS page cache in microseconds or less, so the CPU is never yielded.

This can even be the case for sockets when data is coming in over the wire as fast as or faster than you’re processing it. Reading from a TCPSocket is really reading from a buffer in memory. If every TCPSocket#read requests less data than is currently in the buffer (keeping in mind that the kernel has its own socket buffers in addition to Crystal’s IO::Buffered ones), you’re never actually waiting on the socket, so in some scenarios you never yield the CPU in socket I/O either.

I understand. Thus I have to use the multi-threading option in order to keep the http server reachable while the worker fiber is doing its job.

Parallelism would mostly improve throughput of your CPU-bound worker tasks. That might be meaningful if you need to run multiple workloads in parallel.
But multithreading is absolutely not necessary to have a snappy response from the HTTP server. Server and worker fiber should be perfectly able to coordinate sharing CPU time in single-thread concurrency.
If there are no reschedule points in a long-running task, it may need some strategically placed Fiber.yield calls to implement cooperative sharing.
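As a rough illustration of such cooperative sharing, here is a minimal single-threaded sketch. It is not your app: the "server" fiber only records an event and yields (standing in for waiting on the next request), and the "worker" does a made-up pure computation per item (standing in for parsing one file). The names `events` and `done` are invented for the example.

```crystal
# Records the order in which the two fibers get to run.
events = [] of String
done = Channel(Nil).new

spawn name: "server" do
  3.times do |i|
    events << "server #{i}"
    Fiber.yield # stand-in for waiting on the next request
  end
  done.send nil
end

spawn name: "worker" do
  3.times do |i|
    # Stand-in for one work item: pure computation, no reschedule point.
    sum = 0_u64
    1_000_000.times { |n| sum &+= n.to_u64 }
    events << "worker #{i}"
    Fiber.yield # give other runnable fibers a turn before the next item
  end
  done.send nil
end

# Wait for both fibers to finish, then show who ran when.
2.times { done.receive }
puts events.join(", ")
```

Without the Fiber.yield in the worker, nothing inside its loop would give the scheduler a chance to switch, so all three worker items would run back to back. With it, the server entries should show up between the worker entries.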

You mentioned that you did already try that. Where did you put Fiber.yield in that example?

As the last line in the get_big_file_list.each block.

Fiber.yield will ensure that your other fibers get a chance to run — it literally pushes the current fiber onto the end of the scheduler’s queue. Since we don’t see all of the code in your worker fiber, and since you’re putting explicit Fiber.yield calls into the loop, it sounds like what’s hogging the CPU might be elsewhere.

  • If your first thought was that you were getting stuck on file I/O, it sounds like you’re shoveling a lot of data into that IO::Memory instance. That would probably consume a lot of RAM, so it might even be worth checking whether you’re dipping into swap.
  • Is it possible it’s getting past that code and the lines you’ve omitted are what’s actually CPU-bound?
  • Are you doing anything with the IO::Memory buffer after all the files have been shoveled into it that could be CPU-bound?
  • Does that select statement have an else in the actual code? If so, it won’t block at all and will basically just be executing loop { }, which will definitely not let your HTTP server work until it comes upon something that yields the CPU. It might be worth sticking a Fiber.yield just inside the loop block as well, or, if the select does contain an else clause, replacing it with when timeout(1.second) to long-poll the start_channel.
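To make the last point concrete, here is a small sketch of a worker loop that uses when timeout(1.second) instead of an else branch, so the fiber is suspended while it waits rather than spinning. The `done` channel and the `jobs` and `idle_ticks` counters are invented for the example; only start_channel comes from your code.

```crystal
start_channel = Channel(Nil).new(1)
done = Channel(Int32).new # hypothetical: reports how many jobs were handled

spawn name: "worker" do
  jobs = 0
  idle_ticks = 0
  loop do
    select
    when start_channel.receive
      jobs += 1
      done.send jobs
    when timeout(1.second)
      # Nothing arrived within a second. The fiber was suspended for that
      # time, so other fibers were free to run. Loop around and wait again.
      idle_ticks += 1
    end
  end
end

start_channel.send nil
result = done.receive
puts result
```

With an else branch in place of the timeout, the select would return immediately whenever the channel is empty and the loop would keep the CPU busy without ever reaching a reschedule point.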