Balance use between fibers

In a TCP server we create a fiber for each connection, but if the client writes data to the socket continuously, our IO.read never blocks. That means this single fiber runs forever: no other TCP connection can be accepted, and no other existing connection gets any time to read. The way we solve it today is to run Fiber.yield every X bytes we’ve read, but are there other, smarter ways to achieve “fair use” between fibers?
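
For illustration, the workaround looks roughly like this (a sketch; YIELD_AFTER_BYTES and drain are made-up names, not our actual code):

require "socket"

# Yield back to the scheduler after every YIELD_AFTER_BYTES read from
# one socket, so a connection that never blocks can’t starve the rest.
YIELD_AFTER_BYTES = 1024 * 1024

def drain(client : TCPSocket)
  buf = Bytes.new(4096)
  bytes_since_yield = 0
  while (cnt = client.read(buf)) > 0
    # ... process buf[0, cnt] ...
    bytes_since_yield += cnt
    if bytes_since_yield >= YIELD_AFTER_BYTES
      bytes_since_yield = 0
      Fiber.yield
    end
  end
end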

When you have a tight loop, Fiber.yield gives the scheduler a chance to switch to a different fiber. So in general, this is the intended solution.
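
A minimal sketch of what that means, assuming a single scheduler thread: two CPU-bound fibers that never block, where the explicit Fiber.yield is the only thing letting them interleave:

2.times do |n|
  spawn do
    loop do
      # ... CPU-only work, nothing here ever blocks ...
      print n
      Fiber.yield # without this, the other fiber may never be scheduled
    end
  end
end

sleep 1.second # let the spawned fibers run for a while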

But your example is not a tight loop. Reading from an IO is usually super slow compared to CPU performance and thus involves waiting until data is received. This is a natural breaking point allowing a switch to a different fiber.

It’s very unlikely that reading from IO fully saturates the CPU.

Could you explain your implementation in more detail, maybe with some code?

Ok, yeah, here’s an example:

require "socket"

def handle(client, i)
  file = File.tempfile
  file.sync = true
  buf = uninitialized UInt8[4096]
  loop do
    cnt = client.read(buf.to_slice)
    break if cnt.zero?
    file.write(buf.to_slice[0, cnt])
    file.fsync
    print i
    # Fiber.yield
  rescue Errno
    break
  end
ensure
  client.close
  file.try &.delete
end

s = TCPServer.new("localhost", 3000)
puts "Started"
i = 0
while client = s.accept?
  spawn handle(client, i += 1)
end

Then point two clients at it, each in its own terminal:

cat /dev/zero | nc localhost 3000

And you’ll see that only 1s are printed at first; 2s only start to show up when you close the first client.

This is of course a super simple, artificial example, but we basically have the same problem in our app: client.read never blocks or automatically yields to another fiber, because the TCP buffer is always full.

Our problem is how to fairly balance how long a fiber can run. We could of course yield after each iteration, but that kills performance. Right now we do something like Fiber.yield if (iteration += 1) % 1000 == 0, but if there are tens of thousands of connections, yielding “every 1000th iteration” is too seldom, and if there are only one or two connections it’s too often.
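
For illustration, a sketch of one way such a threshold could adapt to the number of open connections (ConnCount and BASE_ITERATIONS are hypothetical names, not what we actually run):

# Track open connections with an atomic counter and shrink the
# per-fiber yield interval as the count grows.
module ConnCount
  @@count = Atomic(Int32).new(0)

  def self.inc
    @@count.add(1)
  end

  def self.dec
    @@count.sub(1)
  end

  def self.value
    @@count.get
  end
end

BASE_ITERATIONS = 10_000

iteration = 0
loop do
  # ... read and process one chunk ...
  threshold = Math.max(1, BASE_ITERATIONS // Math.max(1, ConnCount.value))
  Fiber.yield if (iteration += 1) % threshold == 0
end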

Just wondering if anyone else has come up with a smart idea to balance latency and performance when yielding.

I don’t think it’s a solution, but once we can run multiple threads, at least there will be other threads able to do stuff while that client is blocking one thread.


If I understand @carlhoerberg correctly, this just delays the problem until number_of_cores == number_of_client_intensive_reads - or not?

Isn’t this example a little too artificial?
Reading from /dev/zero is faster than any “normal” IO.

Under which circumstances does this happen in your app?

As soon as you do anything computational with the data that you read, this will happen. When you have 1-10 Gbit/s ingress you can’t do much before the TCP receive buffer is full again.

Right, that is a lot.
Did you have a look at nginx or others that should have the same problem?
Even if you add threads and processes, you should eventually run into the same problem there, as @pfischer says, so these other projects must handle it somehow.

My guess would be some kind of “can I go on, or does anyone else want some processing time? No? Alrighty then.”

That might use the system more evenly under low load, because you don’t need to set the switching threshold too high.

(I am really just guessing here, I have never done or researched anything like this)

The language could introduce some Fiber.yield in some loops… I don’t know what would be the criteria for that, though. And it could make code that doesn’t require that slower.

So I think manually introducing Fiber.yield in this case is fine. It gives you the control you need.

Not sure what the issue is here? Maybe more related to local IPC with a socket tempfile?

The TCPServer example can handle thousands of connections no problem, all while reading/writing to the stream.

The fiber will close and automatically be cleaned up (in the sense you don’t need to worry about a lingering fiber)

The 2s shows up, because it’s another fiber handling the new connection

I think the thing is that the 2s don’t show up until the first client disconnects: the client sends so much, so fast, that the IO is never waiting to read, so the fibers never get switched → the second client never gets read.

I think there is an infinite loop issue with the example; it should allow both clients to connect and communicate in the fibers, hmm…

I’m reviving an old topic, but I just stumbled upon it.

There is an alternative to yielding every N iterations: use Time.monotonic to yield only after some time has elapsed. The call ain’t too expensive (~22ns on my old laptop, or 44 million calls per second).

class Timer
  def initialize(@interval : Time::Span)
    @timer = Time.monotonic
  end

  # Yields to the scheduler only if @interval has elapsed since the
  # last yield; otherwise returns immediately.
  def yield
    now = Time.monotonic
    return if now - @timer < @interval

    @timer = now
    Fiber.yield
  end
end

timer = Timer.new(10.milliseconds)

loop do
  # ...

  timer.yield
end

The issue is that it’s oblivious to whether the fiber already yielded within the loop, so the event loop might wait on read, and then the busy loop will yield anyway because 10ms have elapsed while waiting (oops: double yield).
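
One possible mitigation, as a sketch: reopen the class with a reset that the read loop calls right after any call that may have blocked (and therefore already gave other fibers a chance to run):

class Timer
  # Forget time spent blocked in the event loop, so a read that already
  # switched fibers doesn’t immediately trigger a second yield.
  def reset
    @timer = Time.monotonic
  end
end

The loop would then call timer.reset right after client.read, so only actual CPU time counts toward the interval.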

Maybe the fiber scheduler could help, with a conditional Fiber.yield?, but calling Time.monotonic on each and every fiber switch is probably too expensive :thinking:
