Pooling HTTP connections automatically

jgaskins · July 21, 2024, 8:55pm

TL;DR: HTTP connection pooling is rad, API clients should do it automatically, so I wrote a shard for it that you can use anywhere you’d use a plain HTTP::Client. The idea is you can add the shard, remove the two ":" characters from your code (HTTP::Client becomes HTTPClient), and get connection pooling effectively for free.

GIF showing a cake being thrown on the ground face-down. Text on the image reads "Happy birthday to the ground".

So many shards I’ve seen that talk to third-party services over HTTP seem to either set up only a single HTTP::Client inside their client or use one-shot requests. I don’t think either of these is the right approach for general use. Using a single HTTP::Client instance means that the application becomes responsible for handling (or not handling) concurrent requests. Executing one-shot requests takes extra time and resources: a fresh TCP connection (including slow start) and TLS negotiation don’t seem like a big deal at low volume, but at scale they can add a lot of friction.

One thing I've been doing in every HTTP API client I've been building to avoid all of these problems is to let it support a connection pool. Here is a surprisingly incomplete list:

AWS
Google
Kubernetes
GitHub
Elasticsearch
Anthropic (Claude AI)
Voyage AI (embeddings for semantic search, RAG, and other AI)

There are several more, some of which are still vendored into apps and I haven’t yet extracted (Stripe, Postmark, MS Outlook, etc). Turns out I’ve written a lot of shards that talk to third-party services over HTTP.

To be clear, I don’t blame the authors of any API-client shards that don’t implement connection pooling. It is honestly irritating to write the boilerplate manually each time and, if you click through to those shards, you may notice that I haven’t implemented it consistently across them. There hasn’t been a good abstraction to use yet.

There’s been discussion around adding connection-pooling support (among other features) to HTTP::Client, but it hasn’t materialized yet, so I’ve implemented it in a shard I’m un-creatively calling HTTPClient. It wraps a DB::Pool(HTTP::Client) (this entire pattern was the purpose of this PR) and transparently delegates requests to a connection in the pool.
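To make the "pool of HTTP::Client instances" idea concrete, here is a minimal sketch that uses only the standard library. It is not the shard's implementation (which wraps DB::Pool); SimplePool and concurrent_get are hypothetical names invented for illustration, and a real pool such as DB::Pool also handles things this sketch ignores, like timeouts, retries, and idle limits.

```crystal
require "http/client"

# Hypothetical, deliberately tiny pool: NOT the HTTPClient shard's code.
# A buffered Channel holds the idle resources; checking one out is a
# receive, and returning it is a send.
class SimplePool(T)
  def initialize(capacity : Int32, &factory : -> T)
    @idle = Channel(T).new(capacity)
    # Create every resource up front. For HTTP::Client this is cheap,
    # because the socket is only opened when the first request is made.
    capacity.times { @idle.send factory.call }
  end

  # Yields an idle resource, waiting for one if all are checked out,
  # and always puts it back afterwards. Returns the block's value.
  def checkout(&)
    resource = @idle.receive
    begin
      yield resource
    ensure
      @idle.send resource
    end
  end
end

# Sends one GET per path, each in its own fiber, sharing the pool.
# Status codes come back in completion order, not in path order.
# Error handling is omitted: a failed request would leave this waiting.
def concurrent_get(pool : SimplePool(HTTP::Client), paths : Array(String)) : Array(Int32)
  results = Channel(Int32).new
  paths.each do |path|
    spawn do
      status = pool.checkout { |http| http.get(path).status_code }
      results.send status
    end
  end
  paths.map { results.receive }
end
```

Assuming a placeholder host such as example.com, a pool of two clients could be created with SimplePool(HTTP::Client).new(2) { HTTP::Client.new(URI.parse("https://example.com")) } and passed to concurrent_get along with a list of paths. With more fibers than clients, the extra fibers simply wait in checkout until a client is returned, which is the part an application would otherwise have to coordinate by hand around a single HTTP::Client.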

There’s an example in the repo that demonstrates sending two concurrent requests (including an experiment that automatically deserializes responses).

Xen · July 21, 2024, 9:19pm

How about PooledHTTPClient? A few more characters, but I think having two things so closely named is a bit dangerous.

I have a few projects that could use it, just on principle.

jgaskins · July 21, 2024, 10:42pm

HTTPClient wasn’t my first choice. I considered that exact name and several others. It was just the one that bothered me the least.

IMO, a client already pools connections if the wire protocol can’t multiplex a single TCP connection^* so calling it a pooled client would be redundant from that perspective. I acknowledge that the similarity to HTTP::Client isn’t great, but also I just wanted to publish it so I could stop rewriting the same irritating boilerplate. Otherwise, I would’ve been stuck on the name so long I would just never publish it.

^* The context behind my perspective is that a “client” isn’t just an implementation of the wire protocol on the client side of a client/server conversation. We have other words for that (driver, connection, etc), and we too often conflate the idea of a TCP connection with a client, which is higher-level — usually a more complete solution. With that perspective, I think HTTP::Client is actually a misnomer, albeit an understandable one. If I need to make 10 concurrent requests to a server, a single client can make them, even if it requires 10 TCP connections.

zw963 · November 13, 2024, 7:16am

I think the stdlib’s built-in HTTP::Client should support this feature.

jgaskins · November 13, 2024, 10:49pm

Agreed. I proposed extracting DB::Pool to its own shard a few years ago. It’s by far the most robust connection pool in the Crystal ecosystem.
