HTTP::Client and fibers

Xen · October 9, 2023, 9:30pm

So I’m fiddling with making a Zulip client (oh I wish there was a Crystal Zulip chat, oh well).

“Listening” is just making a HTTP request, which the server will then block for a while if there’s no new events.

So, I’ve made a class that just creates a HTTP::Client and loops over requests, works smoothly. But I spotted this in the docs:

WARNING A single HTTP::Client instance is not safe for concurrent use by multiple fibers.

Which I should take note of, as the idea is that I’ll be making other requests to the server while listening for new events. Of course, I could just use HTTP::Client.get, but it seems a bit excessive to create a new client for each request?

It does make sense that a single client instance can only handle one request at a time, but is that the gist of the warning, or is there something that makes it unsafe to use a client that’s been previously used in another fiber, even if that request is completed?

As the Zulip client will be firing of HTTP requests left and right, I would like some HTTP client reuse, to keep things tidy.

So what would the cool kids do?

Blacksmoke16 · October 9, 2023, 9:59pm

Cannot perform concurrent HTTPS GET requests from a single client session · Issue #12412 · crystal-lang/crystal · GitHub covers some of the reasoning for adding that warning.

Otherwise you could prob leverage GitHub - j8r/pool.cr: A simple thread-safe generic pool. to have a pool of HTTP::Client that are available to handle requests. I.e. so that each request would have its own client, while still being able to reuse them for efficiency reasons.

bcardiff · October 9, 2023, 10:01pm

crystal-db allows you to use a pooled http client also. Allow DB::Pool to be used a generic connection pool by jgaskins · Pull Request #131 · crystal-lang/crystal-db · GitHub

jgaskins · October 10, 2023, 12:20am

Seconding @bcardiff’s suggestion of crystal-db here — I put in the PR that enabled that usage specifically because DB::Pool is the single best connection pool implementation I’ve seen in Crystal.

Thread-safe if you need that
Customizable elasticity via max_idle_pool_size, so you aren’t stuck holding onto hundreds of connections just because your service had a brief surge in work to do
Disposing of connections is simply a matter of closing them

The first is a matter of mutex usage, but I don’t think any other implementation in Crystal meets those last 2 needs.

It was designed for RDBMS clients, but I also use it for redis and elasticsearch. I also have a couple other clients for Google and Microsoft APIs (which I still need to extract from one of my apps) that use it.

Blacksmoke16 · October 10, 2023, 12:27am

Wonder if it would make sense to split it out into its own shard? Just feels kinda weird requiring a DB shard just to use it to create a connection pool.

Xen · October 10, 2023, 6:27pm

So if pools work, it’s not that it’s fundamentally incompatible with fibers, it just can’t handle more than one request at a time. Which makes sense as there’s only one IO in there.

Oh, excellent. I was thinking of doing some sort of pooling, but this makes it pretty simple.

That’s handy.

Seconded. Would also improve visibility.

Blacksmoke16 · October 10, 2023, 7:41pm

I created Extract `pool` implementation to dedicated shard · Issue #190 · crystal-lang/crystal-db · GitHub to track that idea.

Xen · October 10, 2023, 8:40pm

I must be missing something obvious here, but take this example:

require "http/client"
require "pool"

class Test
  @pool : Pool(HTTP::Client)

  def initialize
    @pool = Pool.new {
      HTTP::Client.new("crystal-lang.org", tls: true)
    }

  end

  def test
    @pool.get { |client|
      res = client.get("/")
    }

    res.body
  end
end

p Test.new.test

This fails, because the compiler obviously can’t know that the @pool.get block is always executed, so it aren’t sure of the type of res. But if we move the res.body inside the block, test doesn’t return anything because @pool.get doesn’t have a return type. And it can’t have as there’s no way for it to know.

I figure that

client = @pool.get
res = client.get("/")
@pool.add client

Ought to work, but that’s a bit unpretty. And I could monkeypatch in my own variant of get that has a return type (or sub-class it, that would be a tad prettier). Or just a helper on the client using class.

Am I missing something, or is this just the price you have to pay for static typing?

rogerdpack · October 10, 2023, 9:00pm

What failure message? :)

Blacksmoke16 · October 10, 2023, 9:01pm

Why not just do everything in the #get block? Or maybe try the DB shard based one.

Xen · October 11, 2023, 6:14am

 19 | res.body
      ^--
Error: undefined local variable or method 'res' for Test

Defining the before the get block just exchanges that for Error: read before assignment to local variable 'res'. Which makes sense.

I believe that the DB one would behave similarly. And, well, doing everything in the #get block would hold on to the client for the duration of processing, which goes a bit against the original goal of efficiency. Not to mention that you’re pushing your business logic from the outside (ask for something and work with it) to the indside (which blocks helps out with, of course). And this is getting more theroretical, but you’re right, if we’re getting to the point where that becomes a factor, you’d probably do just the minimal handling of the result in the block and stuff something in a queue for further processing.

But a little helper method did wonders:

    protected def client(&)
      client = @pool.get
      res = with client yield
      @pool.add client
      res
    end

Which gives a nice little micro DSL:

      response = client do
        post("/api/v1/register", headers: @headers)
      end

straight-shoota · October 11, 2023, 8:11am

The DB pool implementation returns the value of the block.
I don’t know why GitHub - j8r/pool.cr: A simple thread-safe generic pool. doesn’t do that as well. It arbitrarily restricts the return type to Nil, but that seems to be completely unnecessary.

Xen · October 11, 2023, 8:34am

Oh, you’re right.

Oh, but I do, I was experimenting a bit. Because it uses rescue ... else, the return value of the begin block is actually the return value of add resource. The comment explains why it’s not using ensure but think this should work, though it’s not as pretty:

    begin
      res = yield resource
      add resource
      res
    rescue ex
      @total_resources.delete resource
      raise ex
    end

Topic		Replies	Views
Limit concurrency for HTTP::Client Help & Support	4	506	December 20, 2020
Generic connection pool - merge projects? Community	2	680	May 15, 2020
Pool.cr - A generic pool library News	5	849	January 27, 2021
Fibers, channels and blocking Help & Support	3	1498	July 8, 2020
Making curl-to-crystal tool Community	19	1381	October 7, 2023

HTTP::Client and fibers

Related topics