ONNX Runtime for Crystal

Hi

Today I used VSCode and Cline (powered by Anthropic’s Claude 3.7 Sonnet) to generate a Crystal binding for ONNX Runtime.

ONNX Runtime is Microsoft’s AI inference engine.
I’ve wanted to create a Crystal binding since around 2022, but gave up several times. The API is very Microsoft-style — using structs of function pointers and other compatibility-focused patterns — which makes it hard to auto-generate bindings with Clang/LLVM-based tools.
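For context, nearly every call in the C API goes through a table of function pointers (`OrtApi`) that you fetch at runtime from `OrtGetApiBase()`, which is exactly the pattern that defeats naive header-to-binding generators. A minimal sketch of what the entry point looks like from Crystal (the C side is `OrtApiBase` from `onnxruntime_c_api.h`; the Crystal-side names and the two-field struct are my own abbreviation, not the full binding):

```crystal
# Illustrative sketch only: the real OrtApi struct reached via get_api has
# hundreds of function-pointer fields; just the entry point is shown here.
@[Link("onnxruntime")]
lib LibOnnxRuntime
  struct ApiBase
    get_api : UInt32 -> Void*            # const OrtApi* GetApi(uint32_t version)
    get_version_string : -> LibC::Char*  # const char* GetVersionString()
  end

  fun OrtGetApiBase : ApiBase*
end

api_base = LibOnnxRuntime.OrtGetApiBase.value
# Every further API call would go through the struct returned by get_api.
puts String.new(api_base.get_version_string.call)
```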

Even if I had built the low-level bindings, writing a usable high-level wrapper would’ve required deep knowledge and a lot of time, which I didn’t have.

Recently, I gave it another shot using Cline, an AI coding assistant in VSCode. I provided the C headers and asked it to implement a Crystal binding. After a few hours (and a few iterations), it worked!

That said, it wasn’t free — I ended up spending around $50, topping up $10 each time I hit usage limits.

I don’t understand the generated code at all, but I totally knew what I was doing. (Not really.)

As far as I know, this might be the first practical AI-related binding for Crystal.

I also created an example with Kemal that lets you draw digits in your browser and classify them using an MNIST model.
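The server side of such a demo can be sketched in a few lines of Kemal. This is my own stripped-down guess at the shape of it, not the actual demo code — the route name, the JSON payload, and the model path are assumptions; the `predict` call and the MNIST input/output names (`Input3`, `Plus214_Output_0`) come from the sample discussed later in this thread:

```crystal
require "kemal"
require "json"
require "onnxruntime"

model = OnnxRuntime::Model.new("./src/models/mnist.onnx")

# Convert the 784 grayscale values posted by the browser into Float32.
def to_input(pixels : Array(Float64)) : Array(Float32)
  pixels.map(&.to_f32)
end

post "/predict" do |env|
  pixels = Array(Float64).from_json(env.request.body.not_nil!.gets_to_end)
  result = model.predict({"Input3" => to_input(pixels)}, ["Plus214_Output_0"],
    shape: {"Input3" => [1_i64, 1_i64, 28_i64, 28_i64]})
  scores = result["Plus214_Output_0"].as(Array(Float32))
  env.response.content_type = "application/json"
  {digit: scores.index(scores.max)}.to_json
end

Kemal.run
```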

Most ONNX models should work, although some adjustments will be needed.


On macOS, the following error appears when the program exits:

```
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
Program received and didn't handle signal ABRT (6)
```

But everything runs fine until then.

(Translated from Japanese by ChatGPT)


I created a demo web application using Cline + Claude.
The app is deployed on Koyeb and will remain accessible for a few days until it’s removed.


GitHub - kojix2/mnist.cr

Maybe we could create a Crystal binding for llama.cpp in the same way…

Actually, it has already been done.

It seems that creating a binding for llama.cpp was much easier for Claude than making one for ONNX Runtime. However, ONNX Runtime is more stable — if you specify a version, it will stay compatible in the future. On the other hand, llama.cpp does not guarantee this kind of compatibility. So, if we want to keep maintaining the binding (possibly with the help of AI), we may need to think about a good long-term strategy.

GitHub - https://github.com/kojix2/llama.cr/


Wow this looks amazing. I’ll definitely check it out. Thanks for spending the time to create the bindings.


Today I asked Cline to write code to run YOLOv7 using crystal-vips and onnxruntime.cr.
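The preprocessing half of that pipeline might be sketched like this. Note the hedges: `new_from_file` and `thumbnail_image` follow the ruby-vips naming that crystal-vips mirrors, so check the shard’s docs for the exact spelling, and the 640x640 size, the file name, and the `normalize` helper are my own illustrative choices:

```crystal
require "vips"

# Pixel normalization used below: raw 0..255 bytes to Float32 in 0..1,
# the value range most YOLO-family ONNX models expect.
def normalize(bytes : Bytes) : Array(Float32)
  Array(Float32).new(bytes.size) { |i| bytes[i].to_f32 / 255.0_f32 }
end

# Load and resize with libvips via crystal-vips (method names assumed to
# match ruby-vips).
image = Vips::Image.new_from_file("dog.jpg")
image = image.thumbnail_image(640, height: 640)

# write_to_memory yields interleaved RGB bytes (HWC); YOLO models usually
# expect NCHW, so a transpose step would still be needed before predict.
input = normalize(image.write_to_memory)
```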

The same thing has been possible in Ruby for a long time with ankane’s gem, and now we can finally do the same thing in Crystal. This is a post I wrote in 2022. Maybe I’ll write a Crystal version soon.


This is a great project @kojix2!

> The same thing has been possible in Ruby for a long time with ankane’s gem, and now we can finally do the same thing in Crystal. This is a post I wrote in 2022. Maybe I’ll write a Crystal version soon.

If you decide to write this post, we would be happy to help you publish it in the official Crystal blog :slight_smile:


Thank you.

I hope to share something on the official blog someday.
However, since onnxruntime.cr is purely a product of “vibe coding,” I still need to watch carefully to see if it actually works well.

As for “vibe coding” itself, it’s still unclear whether it will disrupt existing ecosystems like Python’s, or if things will move in the opposite direction—where competition becomes about how much energy (electricity) can be spent, and big companies further strengthen existing ecosystems. It kind of reminds me of blockchain, where power often comes from sheer energy use.

Maybe both trends will happen at the same time.
I imagine there will be situations where Crystal’s speed and low memory usage become a significant advantage.
We’ll see!


Great job!

I just tested the MNIST sample. It works for me with some modifications.
Actually, the `predict` method doesn’t work without passing the shape param (in code, `shape: {"Input3" => [1_i64, 1_i64, 28_i64, 28_i64]}`) and raises:

```
ONNXRuntime Error: Invalid rank for input: Input3 Got: 1 Expected: 4 Please fix either the inputs/outputs or the model. (INVALID_ARGUMENT)
```
```crystal
require "onnxruntime"

# Load the MNIST model
model = OnnxRuntime::Model.new("./src/models/mnist.onnx")

# Create a dummy 28x28 input (every 14th pixel set to 1.0)
input_data = Array(Float32).new(28 * 28) { |i| (i % 14 == 0 ? 1.0 : 0.0).to_f32 }

# Run inference (the shape param is required, see above)
result = model.predict({"Input3" => input_data}, ["Plus214_Output_0"], shape: {"Input3" => [1_i64, 1_i64, 28_i64, 28_i64]})

# Get the output probabilities
probabilities = result["Plus214_Output_0"].as(Array(Float32))

# Find the digit with the highest probability
predicted_digit = probabilities.index(probabilities.max)

# Explicitly release resources
model.release
OnnxRuntime::InferenceSession.release_env
```
PS: model https://github.com/onnx/models/blob/main/validated/vision/classification/mnist/model/mnist-12.onnx
