TL;DR: HTTP connection pooling is rad and API clients should do it automatically, so I wrote a shard for it that you can use anywhere you’d use a plain HTTP::Client. The idea is that you can add the shard, remove 2 : characters from your code (HTTP::Client becomes HTTPClient), and get connection pooling effectively for free.
Happy birthday to the ground
So many shards I’ve seen that talk to third-party services over HTTP seem to either set up only a single HTTP::Client inside their client or use one-shot requests. I don’t think either of these is the right approach for general use. Using a single HTTP::Client instance means that the application becomes responsible for handling (or not handling) concurrent requests. Executing one-shot requests takes extra time and resources: a fresh TCP connection (including slow start) and TLS negotiation don’t seem like a big deal at low volume, but at scale they can add a lot of friction.
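To make the two approaches concrete, here is a minimal sketch using only Crystal’s standard library. The class names OneShotAPI and SingleConnectionAPI are hypothetical and aren’t taken from any particular shard.

```crystal
require "http/client"
require "uri"

# Hypothetical API client that uses one-shot requests.
class OneShotAPI
  def initialize(@base_url : String)
  end

  def get(path : String) : HTTP::Client::Response
    # The class-level HTTP::Client.get sets up a fresh connection for this
    # one request, so every call pays for TCP (and TLS) setup again.
    HTTP::Client.get("#{@base_url}#{path}")
  end
end

# Hypothetical API client that holds a single HTTP::Client.
class SingleConnectionAPI
  def initialize(uri : URI)
    @http = HTTP::Client.new(uri)
  end

  def get(path : String) : HTTP::Client::Response
    # Reuses one connection, which leaves it to the application to
    # coordinate access when several fibers make requests at once.
    @http.get(path)
  end
end
```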
One thing I’ve been doing in every HTTP API client I build, to avoid all of these problems, is giving it a connection pool. Here’s a surprisingly incomplete list:
- AWS
- Kubernetes
- GitHub
- Elasticsearch
- Anthropic (Claude AI)
- Voyage AI (embeddings for semantic search, RAG, and other AI)
There are several more, some of which are still vendored into apps and haven’t been extracted yet (Stripe, Postmark, MS Outlook, etc.). Turns out I’ve written a lot of shards that talk to third-party services over HTTP.
To be clear, I don’t blame the authors of any API-client shards that don’t implement connection pooling. It is honestly irritating to write the boilerplate manually each time and, if you click through to those shards, you may notice that I haven’t implemented it consistently across them. There hasn’t been a good abstraction to use yet.
There’s been discussion around adding connection-pooling support (among other features) to HTTP::Client, but it hasn’t materialized yet, so I’ve implemented it in a shard I’m un-creatively calling HTTPClient. It wraps a DB::Pool(HTTP::Client) (this entire pattern was the purpose of this PR) and transparently delegates requests to a connection in the pool.
There’s an example in the repo that demonstrates sending two concurrent requests (including an experiment that automatically deserializes responses).
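For a rough feel of the "wrap a pool and delegate" pattern, here is a minimal sketch. It is not the shard’s code: it swaps in a buffered Channel from the standard library where the shard uses DB::Pool, so that it runs without dependencies. The PooledClient class, its checkout method, and the size parameter are all hypothetical names.

```crystal
require "http/client"
require "uri"

# Hypothetical stand-in for a pooled client, NOT the HTTPClient shard's code.
# A buffered Channel plays the role that DB::Pool plays in the shard.
class PooledClient
  def initialize(uri : URI, size : Int32 = 5)
    @idle = Channel(HTTP::Client).new(size)
    # HTTP::Client connects lazily, so no sockets are opened here.
    size.times { @idle.send HTTP::Client.new(uri) }
  end

  # Borrows a client for the duration of the block, then returns it.
  # When every client is in use, the calling fiber waits here.
  def checkout(&)
    client = @idle.receive
    begin
      yield client
    ensure
      @idle.send client
    end
  end

  # Same call shape as HTTP::Client#get, delegated to a pooled connection.
  def get(path : String, headers : HTTP::Headers? = nil) : HTTP::Client::Response
    checkout { |client| client.get(path, headers: headers) }
  end
end
```

With something like this, two fibers calling get at the same time each borrow their own client, and the application never sees the pool. A real pool such as DB::Pool does more on top of this, such as checkout timeouts and growing or shrinking the number of connections.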