I’m trying to iterate quickly on an app I’m working on in production. Deploying it was taking 5-6 minutes every time I made a change, which is actually pretty decent for continuous deployment but I’m impatient.
Since the `--release` option is just `-O3 --single-module`, I decided to check what impact `--single-module` had versus plain `-O3` on this app, for both compilation and runtime. I know LLVM can perform more optimizations if you put everything into a single module, but I've never seen what the magnitude of that is on the kinds of apps I write (usually web services and infrastructure tooling). Ary showed some numbers here, which are great, but his examples aren't the kinds of apps that I work on, so I wasn't sure how impactful it would be.
Using `-O3` instead of `--release` brought my build times on GitHub Actions from 5-6 minutes on paid runners to below 3 minutes (builds and deploys are not yet being gated behind CI; I'm running specs locally):
Build times this fast are worth a moderate tradeoff in runtime performance, especially since this app is relatively new. Latency between my house and the server makes generating synthetic load difficult (it takes a lot of concurrency to saturate the gaps created by that latency), so I ran it against the app running locally. I copied a request as a `curl` command from my browser dev tools and converted it to `wrk`, hitting an endpoint for an authenticated user that shows all organizations the user is a member of, which makes queries to both Redis and Postgres. I have live data in my local DB, so these are realistic requests.
For `-O3`:
```
$ wrk 'http://localhost:3201/organizations' \
    [HEADERS REMOVED FOR BREVITY]
Running 10s test @ http://localhost:3201/github_organizations
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.10ms  314.42us   6.54ms   98.37%
    Req/Sec     4.63k   131.63     4.81k    75.74%
  93073 requests in 10.10s, 456.41MB read
Requests/sec:   9214.29
Transfer/sec:     45.18MB
```
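As a quick sanity check (mine, not part of the original run), the summary line is consistent with the totals `wrk` reports:

```python
# Sanity-check wrk's summary for the -O3 build using the totals above.
requests = 93_073   # total requests in the run
duration_s = 10.10  # wrk's reported test duration (rounded)

throughput = requests / duration_s
print(f"{throughput:.2f} requests/sec")
# Lands within a request or so of wrk's reported 9214.29 -- the small gap
# comes from wrk using a more precise elapsed time than the rounded 10.10s.
```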
Over 9k requests per second is pretty solid, especially for requests that are running several real Redis and Postgres queries. I was like “wait, how much better could `--release` be?”
Turns out, it’s pretty significant:
```
$ wrk 'http://localhost:3202/organizations' \
    [HEADERS OMITTED FOR BREVITY]
Running 10s test @ http://localhost:3202/github_organizations
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    560.38us   1.53ms   40.94ms   97.71%
    Req/Sec    11.76k    692.11    12.68k    89.11%
  236375 requests in 10.10s, 1.13GB read
Requests/sec:  23402.55
Transfer/sec:    114.76MB
```
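Putting the two runs side by side (just arithmetic on the numbers above, nothing more):

```python
# Compare the -O3 and --release (-O3 --single-module) wrk runs above.
o3_rps = 9_214.29
release_rps = 23_402.55

print(f"{release_rps / o3_rps:.2f}x throughput")  # -> 2.54x throughput

# Average latency improved too, though not by the same factor:
o3_avg_ms = 1.10          # 1.10ms
release_avg_ms = 0.56038  # 560.38us
print(f"{o3_avg_ms / release_avg_ms:.2f}x lower average latency")  # -> 1.96x lower average latency
```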
About 2.5x as fast for a web app running real database queries, just by adding `--single-module`. I knew LLVM performed some great optimizations, but that's quite a bit more than I expected. I assumed it would be around 50% faster, which still would've been worth it for apps that need the performance, but 2.5x is incredible.