How many fibers can I run in parallel + hang if there are too many fibers

Hello!

I have this simple test program for getting URLs in parallel using fibers:

require "http"

NUM_FIBERS = 10

urls = [
  "https://www.apple.com/",
  "https://www.google.com/",
  "https://www.ibm.com/",
  "https://www.oracle.com/",
  "https://www.intel.com/",
  "https://www.sap.com/",
  "https://www.nytimes.com/",
  "https://cnn.com/",
  "https://www.nasa.gov/",
  "https://www.spacex.com/",
]

urls = urls * (NUM_FIBERS / urls.size).to_i32 # just make more urls...
puts "Getting #{urls.size} URLs..."
results_channel = Channel(HTTP::Client::Response | Exception).new # urls.size
urls.each do |url|
  spawn do
    begin
      response = HTTP::Client.get url
      results_channel.send response
    rescue ex
      results_channel.send ex
    end
  end
end

puts "Waiting for results channel..."
urls.size.times { results_channel.receive }
puts "Done"

When the number of fibers is low, for example 10 or 100, everything is OK, but with a larger number of fibers, for example 200, 300 or 1000:

NUM_FIBERS = 300
...

the program never ends and gets stuck, with this as the last output:

Getting 300 URLs...
Waiting for results channel...

Tried on Ubuntu 23.10 and Alpine 3.19 (via Docker). The behaviour is the same.

➤  crystal --version
Crystal 1.11.0 [95d04fab4] (2024-01-08)

LLVM: 15.0.7
Default target: x86_64-unknown-linux-gnu

Where is the problem, please? Thanks!

Can’t reproduce on an M2 with 1000 fibers (took 11 seconds).

I can reproduce. It almost finishes: in my trial, 297 fibers completed, then it blocked waiting for the remaining 3.
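
To check that it's really just those last receives that block (rather than a deadlock in the channel itself), you can put a timeout on the wait loop. A rough sketch (the 30 seconds is arbitrary):

puts "Waiting for results channel..."
urls.size.times do |i|
  select
  when result = results_channel.receive
    # a response or an exception arrived from one of the fibers
  when timeout(30.seconds)
    # no fiber reported back within 30 seconds; stop waiting
    puts "Gave up after receiving #{i} results"
    break
  end
end
puts "Done"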

I retested it, replacing the real-world URLs with a simple Kemal web app of my own running on localhost.
Now I am able to run this test with 10 000 URLs/fibers without a problem (no hang, pretty fast).
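
The local target was nothing fancy, roughly a Kemal hello-world like this (the route and response body don't matter for the test; Kemal listens on port 3000 by default):

require "kemal"

# Trivial local endpoint: responds immediately, so no remote server can throttle the test
get "/" do
  "ok"
end

Kemal.run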

The problem with real-world URLs is probably that some servers detect excessive requests and throttle or stall them (they might delay responses, plus other network stuff like that, or my internet connection + ISP is a piece of…).
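
One way around that would be to cap the number of requests in flight with a fixed pool of worker fibers pulling from a channel of URLs. A rough sketch (the pool size of 20 and the shortened URL list are arbitrary, just for illustration):

require "http"

POOL_SIZE = 20 # arbitrary cap on concurrent requests

urls = [
  "https://www.apple.com/",
  "https://www.google.com/",
] * 50 # same idea as above: just make more urls

url_channel     = Channel(String).new
results_channel = Channel(HTTP::Client::Response | Exception).new

# Fixed pool of worker fibers; each pulls URLs until the channel is closed
POOL_SIZE.times do
  spawn do
    while url = url_channel.receive?
      begin
        results_channel.send HTTP::Client.get(url)
      rescue ex
        results_channel.send ex
      end
    end
  end
end

# Feed the URLs from a separate fiber so the main fiber can drain results meanwhile
spawn do
  urls.each { |url| url_channel.send url }
  url_channel.close
end

urls.size.times { results_channel.receive }
puts "Done"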

I realized that HTTP::Client has no default timeouts (meaning the client waits forever, right?), so I added timeouts to the HTTP::Client instance, and voilà: I can now run it with 10 000 URLs/fibers (just out of curiosity) and it's stable (memory usage was 6.5 GB of RAM/RES).

So:

require "http"

NUM_FIBERS = 1000
TIMEOUT_SEC = 5

urls = [
  "https://www.apple.com/aaa?a=1",
  "https://www.google.com/aaa",
  "https://www.ibm.com/",
  "https://www.oracle.com/",
  "https://www.intel.com/",
  "https://www.sap.com/",
  "https://www.nytimes.com/",
  "https://cnn.com/",
  "https://www.nasa.gov/",
  "https://www.spacex.com/",
]

urls = urls * (NUM_FIBERS / urls.size).to_i32
puts "Getting #{urls.size} URLs..."
results_channel = Channel(HTTP::Client::Response | Exception).new # urls.size
urls.each do |url|
  spawn do
    begin
      # The block form closes the client automatically when the block exits
      HTTP::Client.new(URI.parse url) do |cli|
        # HTTP::Client has no default timeouts, so set them all explicitly
        cli.connect_timeout = cli.dns_timeout = cli.read_timeout = cli.write_timeout = TIMEOUT_SEC
        response = cli.get URI.parse(url).request_target
        results_channel.send response
      end
    rescue ex
      results_channel.send ex
    end
  end
end

puts "Waiting for results channel..."
urls.size.times { results_channel.receive }
puts "Done"