Vibe-coding in Crystal

Yeah, I do invest not only my time (which is quite pricey at my current rate), but also some money on the models (I have ChatGPT Pro and Claude Max 100 at the same time). But developers’ time is more expensive, so if we count how much time we invest in open source, then Linux would be one of the most expensive projects of all time :smiley:

2 Likes

Yes, I can try many ideas and only follow up on the ones that are still promising after the POC stage. That used to take, let’s say, a week per idea, and now it’s so, so much faster.

4 Likes

Be careful, though. It worked great for me for implementing all the new features, but when applying the code to real-world scenarios, it often breaks. Some errors it can fix, but for a persistent error it will try to cut corners and implement hard-coded hacks instead. This happened with both Gemini 3 Pro (high) and Opus 4.5.

Though if you understand the code and the problem clearly, and can provide the right keywords for the AI to implement, then it will work as expected.

1 Like

True that. I have a ton of ideas for tiny tools that could support me or my friends, tools that would usually take a weekend to make. Now, with only two or three prompts, the AI can write them for you.
Compared to the old days (Claude 3.5, Gemini 2, ChatGPT 4), the models have no trouble writing Crystal code anymore (I still prefer Crystal to Rust/Python due to readability and performance).

1 Like

I always keep an eye on their hardcoding attempts and explain why it is bad and what the best practices are.

Some parts still need to be refactored, but that will be done when the main codebase is stabilized.

2 Likes

I’ve been sceptical, but I’m trying to figure it out. I haven’t used it for Crystal yet; it kinda defeats the point. I enjoy writing Crystal, so why would I hand that over to the LLM?

But I’ve tried a bit with Opencode and Gemini. Well, not “vibe” (the original meaning, where you don’t look at the code), but asking for a script, looking it over and going “yeah, that’ll do”. Sometimes you just need a quick solution to a problem, even if it’s a bit ugly.

I read a post today that argued that real vibe-coding is just a new way for bad programmers to fail more spectacularly.

I haven’t used AI assistance that much yet, but I think it shows promise once I figure out how to apply it best. But I foresee some challenges ahead. I’ve had discussions in pull requests that can be summarised as: “Do X, as it fixes Y.” “But why do we need to do X to fix Y?” “I dunno, but it works?” “I don’t care that it works, I want to know WHY it works, as it makes no sense that X should fix Y.” (Which of course often ends with discovering why doing X in the first place was a bad idea.)

That’s what I think, anyway…

3 Likes

I always vibe code in Crystal. In fact, it has heavily influenced everything I do, so that I can improve the tooling to make it easier for others to vibe code.

Today, I made a fork of shards and added some features that let me distribute and manage versions of skills, subagents, and other Claude Code supporting functionality, to help a user’s coding assistant work with my library.

I called it shards-alpha because I suck at naming, so I let Claude come up with something quick and easy. This is just a proof of concept, tbh, but it works.

FYI: I totally suck at explaining super complex things, so I described this post to Claude and asked it to help explain how to try it out. I think this is a fun way to play around with how we could improve our tooling to embrace AI coding assistants across the ecosystem.

——— Claude’s explanation of how to use shards-alpha

Great thread. I want to add something from a different angle — not “how do I vibe code better” but “how do I make my libraries easier for other people to vibe code with.”

A lot of the friction people are describing here comes down to context. The agent doesn’t know your library’s conventions. It doesn’t know the right way to set up a route, or which method to call, or what the config format looks like. You can paste docs into the chat, but that doesn’t scale across dependencies — and it definitely doesn’t survive shards update.

I’ve been building a fork of shards called shards-alpha that, among other things, lets shard authors ship AI context alongside their library code. When someone runs shards install, their AI assistant automatically gets the skills, agents, and documentation that the library author wrote — namespaced per shard so nothing conflicts.

Here’s what that looks like in practice. Say you maintain a Crystal library. You add a CLAUDE.md to your repo root:


````markdown
# My Library

This library provides X. Here's how to use it:

## Setup

```crystal
require "my_library"

client = MyLibrary::Client.new(api_key: ENV["API_KEY"])
```

## Common patterns

...
````

That’s it. When someone depends on your shard and runs shards install, that file gets installed as .claude/skills/my_library--docs/SKILL.md in their project. Claude Code picks it up automatically. No configuration on the consumer’s side.
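For context, the consumer side is just an ordinary dependency declaration. It might look something like this hypothetical shard.yml (names invented for illustration):

```yaml
# Consumer's shard.yml — "my_library" is a hypothetical shard name
name: my_app
version: 0.1.0

dependencies:
  my_library:
    github: example/my_library
```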

For richer integration, you can ship full skill directories (.claude/skills/getting-started/SKILL.md) with step-by-step workflows, or agent definitions (.claude/agents/my-tool-expert.md) that know how to use your library’s tools. These all get namespaced — your getting-started skill becomes my_library--getting-started in the consumer’s project so it doesn’t collide with anyone else’s.

The idea is simple: if vibe coding is going to be part of how people use Crystal, then the package manager should distribute the context that makes it work. Your dependencies should make your agent smarter as a side effect of shards install.

Separately, shards-alpha also adds supply chain compliance tools (vulnerability auditing against OSV, license scanning, SBOM generation, policy enforcement) and an MCP server that exposes all of that to AI agents. But the AI docs distribution is the part most relevant to this thread.

If you want to try it:


```shell
# Install from source
git clone https://github.com/crimson-knight/shards.git
cd shards
crystal build src/shards.cr -o bin/shards-alpha --release
# Copy bin/shards-alpha somewhere on your PATH

# Or via Homebrew
brew tap crimson-knight/shards
brew install shards-alpha
```

Then in your project:


```shell
# Set up Claude Code with compliance skills + agents
shards-alpha assistant init

# Use it like normal shards — everything is compatible
shards-alpha install
```

And if you’re a shard author who wants to make your library vibe-coding-friendly, just add a CLAUDE.md to your repo. That’s the lowest-effort thing you can do that has the highest impact for people using AI to work with your code.

The whole thing was built with Claude Code as a pair programmer, which I think is its own kind of proof that the workflow works when you set it up right. Happy to answer questions about any of it.

2 Likes

I’m positive about AI-generated code, but I think it’s time to reconsider how open-source communities handle collaboration.

From my own experience, I can trust code I generate with AI because I understand the intent behind it. But when someone else uses AI to generate code, I can’t see their process, which makes it harder to trust. I suspect many developers feel the same way.

If this trend continues, forks may multiply and projects could become fragmented. The traditional pull request system may not be enough for moving AI-generated code between forks.

One approach is to have AI compare forks and break down differences into meaningful units. For each unit, AI could label the type of change—new feature, bug fix, or refactoring—and provide information to help humans decide. This could make sharing code between forks smoother.

The current GitHub ecosystem is quite centralized. But with such a system, projects might be seen not as a single official repository, but as a loose collection of related forks.

Of course, this is just one possibility. Reaching consensus still requires human discussion, and I don’t think that will change no matter how much technology advances.

(Translated from Japanese using Claude)

4 Likes

I have an answer to this, and it’s something that I’ve been using across my own projects.

I’ll see if I can make a plug-in marketplace and then link to it later today. But basically, how it works is that it creates a memory structure: structured data migrations that define what your database should look like are idempotent and tracked in a folder, along with the actual memories, for both the SQL database and the associated embeddings. This way, you have a memory that tells you about the decisions you’ve made, when they were made, and the branch they were related to. The understanding of the codebase on that branch can then be reconstructed or deconstructed from these files. This allows you to work on multiple branches in parallel, and as the information gets merged into your main branch or trunk, your assistants always remain up to date with the context of the changes that have been made. Every developer stays up to date when rebasing, in both the code and the explanation of what was done, why it was done, and the technical decisions behind those implementations.

It’s almost entirely focused on Claude skills, because it relies on the agentic behavior of that harness, which, depending on the adoption of skills and other tools, will probably become a little more universal.
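As a rough sketch of the idempotent-migration idea described above (this is not the actual plug-in; all names are hypothetical, and Python is used purely for illustration), the core mechanism could be as simple as tracking which migration files have already been applied inside the database itself:

```python
import sqlite3
from pathlib import Path

def apply_memory_migrations(db_path: str, migrations_dir: str) -> list[str]:
    """Apply each .sql file in migrations_dir exactly once, in sorted order.

    Applied file names are recorded in the database, so re-running the
    function is safe (idempotent) — nothing is applied twice.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    newly_applied = []
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        if path.name in applied:
            continue  # already recorded: skip on re-run
        conn.executescript(path.read_text())
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (path.name,))
        newly_applied.append(path.name)
    conn.commit()
    conn.close()
    return newly_applied
```

The migration files and the recorded decisions can then live in the repo alongside the code, so an agent on any branch can rebuild the same memory database.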

Yeah, but context isn’t infinite. In fact, it’s far from enough right now. Even Gemini, which supports 1M tokens of context, starts failing as the context fills up.

This history/log you mention will grow ever larger and LLMs will struggle to make sense of it.

And, if we compress it, we lose the details, which are important.

This is, definitely, a problem to solve.

Since we are talking a little bit about workflow, this is how I’m working with agents at the moment. I tend to use spec-driven development (with openspec) or a todo/task tracker (wedow/ticket) to create long-running tasks, major features, or refactors. Then, before I’m done, I have the agent write its findings to a markdown file or two to retain memory. I also use qdrant code indexing of the project I’m working on, plus a massive indexed folder containing the various libraries and compilers I use, and an MCP server that lets the agent search the qdrant databases quickly.

It seems to work well enough; I’m still polishing up some projects, but they are getting pretty usable. I sometimes add other MCPs if I’m using another language. I lock everything down in its own VM and manage things with a hub-and-spoke Nix flake setup so I can use different groups of dependencies. It’s been pretty wild.

I also end up learning about tools I’d never heard of before; I’m really digging the make replacement called just for automating build and test tasks.
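For anyone who hasn’t seen just: it’s a command runner with make-like recipe syntax but without make’s file-dependency semantics. A minimal justfile for a Crystal project (the recipe contents here are just an example, not from this thread) might look like:

```just
# list available recipes by default
default:
    just --list

# run the test suite
test:
    crystal spec

# build a release binary
build:
    shards build --release
```

Running `just test` then runs `crystal spec`, and `just` alone prints the recipe list.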

1 Like

I’m glad that you brought up trusting code that other people have vibe coded, because this is a world we are wandering into very steadily, whether we like it or not. The reality is that we have to come to terms with using things where we don’t know what someone’s process was, and trusting them to work as expected because we have a process that validates them consistently.

See, there are many things in life where you don’t know what the process was or what the engineering underneath looks like, but you use them anyway. You use modern engineering marvels all the time that you have no idea how they work; they are a complete black box. Yet as software engineers, we are very close to how things function, and we have a lot of control and influence over how software is written. So it seems very strange to have acquired such a superpower and then begin delegating it to an AI agent.

As far as my compiler project goes, I think I’m going to maintain this fork, because now that I’ve set up a process for Claude to functionally achieve a working fork of both shards and the compiler, there isn’t a huge reason for me not to. And maybe I can upstream some of these improvements and the community as a whole will be able to accept that.

I think the biggest challenge is going to be accepting work that is “not the way I would do it”, as someone from the core team might put it. Because, you know, the way that I vibe code and achieve some of these things is definitely not the way somebody would normally do it. But I don’t think we should hold back from that, because it’s working.

2 Likes

The trick is not to try to stuff your entire project into 1 million tokens. You’re trying to use and structure tools in a way that allows an agent to answer its own questions using methods that drop out of its main context window. For example, with Claude Code, when you make a tool call, all of the actions that happen during the tool usage get dropped from the main context, so only the original invocation and the answer it arrived at are retained. This eliminates a lot of steps. The other thing to do is make sure everything gets clearly documented into files, because writing to a file is another tool call: the agent keeps part of what it was doing without losing as much context to the act of writing, and it can still read the file back later.

What’s nice about that is that Opus 4.6 specifically excels at needle-in-a-haystack data retrieval. So when it goes to read a file and needs to look for something specific, it’s the best model on the market right now for finding a single specific bit of information and returning it to the main conversation thread.

This is a completely new skill that is totally different from how we ever had to manage context before.

I say that because the history I’m talking about here is specifically designed around this optimization process. I use it, and it works, on mega codebases across lots of projects, and it is extremely effective. We’re talking about entries in a relational database plus a vector store that would let someone rebuild the database, which the agent can then query using tool calls. It can then extract only the relevant information from those tool loops, without flooding its context window, in order to answer questions or make plans.

It’s built around the same way we work when we think about things. If you ever slow down and experiment, you’ll notice that when you have a to-do list, you think about what you need to do to perform an action on that list, go execute it, get the result, and go back to your main to-do list. You don’t write down every single fiddly step you took to accomplish that item; you just strike it off as done, maybe note the result, and that’s it. That same kind of compression works here.

Some basic concepts and methodologies we all took for granted are now showing their value.

Locality!
TDD!
Modularity!
Separation of concerns!

All those save context space.

1 Like

I think an issue we really want to try to solve here is collaboration.

AI agents code really fast. For any given project, you wait for a week and they have a thousand different commits already.

It makes it close to impossible to collaborate with current tools.

And, at least in our community, it has spawned duplicate projects (the alpha Crystal compilers), which only works against us due to fragmentation.

Now, it’s our burden to overcome our egos and learn to collaborate even if we think we know better.

Besides, if we see it in terms of investment, we want to divide the tasks among several users so that they can invest their paid-for tokens into the project, working on specific tasks.

Planning is always important. Not rushing and finding ways to collaborate is very important.

Otherwise, we will create silos.

5 Likes

I couldn’t agree more. Pushing AI to the limit and further looks fun… but leads nowhere.

Having concrete, focused changes that bring one feature, fully owned by the developer (you own the change; the AI is just a tool you used), and that could become a pull request after a human takeover and rework, would be :heart:

7 Likes

Pushing AI to the limit and further looks fun… but leads nowhere.

Completely disagree and I think there is more than enough objective proof that this statement is false.

You may disagree with the current limitations of AI being useful because it doesn’t solve things your way or as completely as you expect, but that does not mean that progress made is as irrelevant as you are implying.

What objective proof are you thinking of?

Personally, I think pushing AI to the limit is meaningful. Reaching that point — even once — becomes a foundation for moving forward. And as renich mentioned, deciding what to have AI do is a kind of investment, and investment in the Crystal community should be welcomed.

It’s also somewhat like scientific exploration. By probing the edges and learning from what happens there, the community’s technical reach expands.

That said, scientific results only become broadly useful once they’re organized into a shareable form through review and discussion. Many experiments lead nowhere near a product in themselves. Crystal is a practical language people use every day, not just an experimental playground.

So I think the gap between “pushing the limits” and “shaping something shareable” still needs to be bridged.

(Translated from Japanese with Claude)

1 Like

Man, with all due respect, you keep reminding me of Tetsuo (Akira).