LLM-friendliness as a metric: porting 20 languages with an LLM

kostya · March 13, 2026, 3:48pm

Added an article: LLM-friendliness as a metric: porting 20 languages with an LLM. Spoiler: Crystal is top 1.

paulocoghi · March 13, 2026, 5:58pm

@kostya so much valuable information in your article. What a pleasant read! IMHO it deserves a post of its own here on the forum, so it isn't lost as an additional reply in an existing thread.

kostya · March 13, 2026, 6:01pm

Thanks, I tried to post on Reddit, but they always ban me. People don’t like AI-related content.

paulocoghi · March 13, 2026, 6:06pm

In this case, Reddit is the one losing. Here, in the star constellation of the Crystal community, your excellent work has made this constellation shine brighter.

I apologize because English is not my first language, and perhaps my expression here is not very good.

straight-shoota · March 13, 2026, 8:37pm

I appreciate that Crystal comes out on top, but I have a couple of concerns with this analysis.

It’s very astonishing that Golang is ranked among the most expressive languages.
It’s infamous for its verbosity. One of the most obvious features is the explicit error handling. But the comparison implementation doesn’t use much error handling at all.
I see that as an indicator that this is not representative of general Golang code bases. And the same probably applies to all other language ports: They are only representative of code bases that primarily implement algorithms.
The introduction states that the original implementation was in Crystal (I presume written by hand?) and an LLM ported it to the other languages. So the result is not necessarily idiomatic or ideal code. It’s a common observation that LLMs produce bloated code of subpar quality. So unless actual developers familiar with the respective languages have reviewed the ported code bases, I don’t think it’s fair to consider them representative of the language. They’re representative of whatever the LLM generates for that language (possibly also influenced by the original Crystal implementation).
The size divided by gzipped size can serve as a heuristic for repetitive texts. But I have doubts about how accurate it is as a metric for boilerplate in source code. Compression works only lexically, and source code has a lot of lexical variability. And there is no accounting for semantic boilerplate.
A big factor in the comparison is not just the expressiveness of a language, but also the availability of libraries. A language with a big standard library or easily available optional dependencies typically requires less custom code for implementing algorithms and such. Language ecosystems also have different cultures on build vs. buy (or implement vs. import) decisions.
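
To make the compression point concrete, here is a minimal sketch, assuming the metric is simply raw byte size divided by gzipped byte size. The function name and the two sample snippets are hypothetical and are not taken from the article or the LangArena repo.

```python
import gzip


def compression_ratio(source: str) -> float:
    """Raw size / gzipped size. Higher means more lexically repetitive text."""
    raw = source.encode("utf-8")
    if not raw:
        raise ValueError("source must not be empty")
    return len(raw) / len(gzip.compress(raw))


# The same Go-style boilerplate block repeated verbatim 50 times.
identical = "if err != nil {\n\treturn err\n}\n" * 50

# The same boilerplate shape, but each copy uses a different identifier.
# Semantically this is just as repetitive, yet lexically it is not.
renamed = "".join(
    f"if err{i} != nil {{\n\treturn err{i}\n}}\n" for i in range(50)
)

print(f"identical: {compression_ratio(identical):.1f}")
print(f"renamed:   {compression_ratio(renamed):.1f}")
```

Both samples contain the same amount of boilerplate in the semantic sense, but the verbatim copies should compress noticeably better than the renamed ones, which is the kind of gap a purely lexical ratio cannot account for.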

kostya · March 13, 2026, 8:57pm

Regarding Go, I found it surprising too, but the numbers show exactly that. Yes, the code doesn’t have many if err != nil — simply because the code is more about algorithms rather than real business logic where many subsystems return errors. Here it’s mostly math — get data, compute, get result. There aren’t many if err != nil cases. But this applies to all other languages as well.
This is addressed in the AI Critic section: ‘Topic 2: Lost in Translation?’
Yes, the metric is completely subjective — but I found that it aligns very well with relative comparisons. For example, Java and TypeScript are around zero — average languages, and the progression Java → Kotlin → Scala shows a clear trend.
Perhaps it’s not even the standard libraries, but rather the level of abstraction — the higher it is, the easier it is for an LLM to work with.

straight-shoota · March 13, 2026, 10:12pm

Yes, it applies to all other languages, but in different ways, because different languages have vastly different approaches to error handling and propagation.
And this applies to basically every other language feature that’s underrepresented in primarily math-focused algorithms.
These different code bases only represent a subset of all language features.

Where do I find the AI Critic section?

kostya · March 13, 2026, 10:17pm

LangArena - Programming Languages Benchmark Comparison, AI Critic tab.

zw963 · March 14, 2026, 5:33am

I’m surprised by this too. In my view, Go feels like a rather redundant language.

Fryguy · March 21, 2026, 9:10pm

This is pretty cool. However, I’m surprised you didn’t do Ruby, especially considering that it’s Crystal’s spiritual ancestor.

Fryguy · March 21, 2026, 9:14pm

Oh, I just saw that the LangArena repo says:

Languages like Python, Ruby, or PHP are intentionally excluded to maintain a focused comparison within a similar performance bracket.

but then, Python was in the repo anyway.

kostya · March 21, 2026, 10:21pm

Python was added because it has a fast runtime — PyPy — which is ~6 times slower than C. If there’s something similar for Ruby, I can add it.

Fryguy · March 22, 2026, 10:43pm

Ah ok, that makes sense… I didn’t notice it was PyPy specifically.

I guess the closest thing on the Ruby side is CRuby with YJIT enabled, or JRuby/TruffleRuby.
