I know vibe-coding is, at the same time, a popular and unpopular topic among programmers. Many see it as some new kind of zero-code programming, others as a powerful “autocompletion on steroids” tool.
In any case, I’d like to know what people here think about it: whether you’ve used it, and which problems you’ve faced.
I’ve done it a lot, honestly. I have many projects in Crystal that I’ve tried to develop using opencode.ai and gemini-cli.
My first pain point is that Crystal is not a mainstream language yet. That makes things complicated, especially getting agents to write idiomatic Crystal.
That said, I have some projects that have “taken off”, meaning that they actually work, even if I’m the only one using them. Others have been just impossible to code.
So, what’s up? Have you tried doing this? Is vibe-coding the devil in disguise? What have you stumbled upon?
I was against this for the longest time. I’ve been listening and reading a lot though, and it’s been helpful in getting people who are burnt out on coding, or semi-retired, back in. I have struggled with focus and haven’t kept projects going for a while, and AI-assisted coding is giving me new excitement lately. I’ve been honestly amazed at what I’m getting done in Crystal with this stuff.
I have three projects that it’s assisted me with so far. I took an abandoned static site generator and almost got it to where it’s a drop-in replacement for Jekyll now (still a lot of paper cuts I’m addressing). I had to make a new Sass library for Crystal, since the C-bindings-based one can’t be used now that libsass is no longer active; it helped me get a new Dart-based one going and working (it works on Mac, Linux arm/amd64, and FreeBSD). Then there’s an app I’ve dreamed of making for a few months, a dashboard-type thing for RSS reading, and it’s becoming kind of crazy how well it’s worked.
Point being, none of this would be anywhere without it. It took me a few weeks to get the workflow working and find the right apps. For now at least, Zed editor + Claude Code + beads is my current setup; beads helps keep memory and context along with tasks. It was a lot of struggle before that; I had tried Cline and Aider, but they just didn’t work. There are some new tools out, but they don’t support the cheaper OpenAI-compatible models I use. Once they do, I’ll try them.
It’s been a wild ride, and I’ll probably be attempting some more projects soon. Some might be Elixir-based, unless I try my hand at Lucky or another Crystal framework, as I have a pretty complicated site I want to try. Not sure how my workflow transfers to other things; static compilation probably helps a ton right now.
“Vibe coding” means different things to different people — anything from “input a problem description to the LLM and commit whatever comes out the other side” to “an LLM was used in the writing of the code but a human modified and is ultimately responsible for it”. Simon Willison has been beating the drum that only the former end of the spectrum should be considered “vibe coding”. I think he’s losing that battle to a massive and vocal group of people who only care about using a cool-sounding term, though. Either way …
If vibe coding to you means YOLOing LLM output into your project, then I can imagine that being useful for a proof of concept or as a way to scratch a personal itch, but I wouldn’t want to maintain it. I’ve worked with engineers that would respond to bugs in code written this way with more of the same “vibe coding” style and I’ve seen it spiral out of control quickly. The code ends up far more complicated than it has any right to be.
If you mean simply using any AI-generated code at all, I’m more open to that in a production environment, but at least one person on the team needs to be comfortable with taking responsibility for that code during an incident at 3am.
A huge part of this perspective is that I’ve been working in SRE for almost a decade and I almost always need to see the system as a whole rather than individual services, let alone specific endpoints or classes. Any nontrivial production code needs to have someone responsible for it because, during an incident, you need to be able to bring in someone with expertise. But who’s an expert in code that nobody wrote? Who understands why a specific pattern was used when nobody chose it? I’ve got a pretty decent track record when it comes to inferring context from human-written code, but code written by an LLM is often more verbose and/or more complicated, which makes this more difficult, especially during an incident.
Ultimately, it can be a lot like hiring a consultant to come in and build out a single feature. Once they leave, all context leaves with them unless they wrote documentation. The less effort someone puts into incorporating AI-written code, the less likely they also incorporated the LLM’s prose into documentation. Your team (and even the LLM) are stuck maintaining code with little to no insight into how it came to be in its current state. That’s not a place I like to be.
As far as Crystal in particular, I’ve had almost nothing but good outcomes so far. I don’t stray very far from small, well-defined tasks with my AI use, though. It has struggled a little bit with macros, and I can only imagine it’d be much worse with heavy macros that create methods with dynamic names, since it largely discovers the code base with grep.
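For a concrete picture of the grep problem, here’s a toy sketch (hypothetical names, not taken from any real project) of a macro that defines methods whose names never appear literally in the source:

```crystal
# A record type and a macro that generates finder methods at compile time.
record Item, name : String, id : Int32

class Registry
  # Expands to `find_by_name` and `find_by_id`; an agent grepping
  # for "find_by_name" will find nothing, because the name only
  # exists after macro expansion.
  macro define_finders(*fields)
    {% for field in fields %}
      def find_by_{{field.id}}(value)
        @items.find { |item| item.{{field.id}} == value }
      end
    {% end %}
  end

  def initialize(@items : Array(Item))
  end

  define_finders name, id
end

reg = Registry.new([Item.new("alpha", 1), Item.new("beta", 2)])
p reg.find_by_name("beta") # prints the Item with id 2
```

`crystal tool expand` can show the generated code, but an agent working from grep alone never sees it.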
I ran into this with macros as well, but I have a ton of conventions and guides for Crystal and Elixir that I use now. They keep guardrails up; most issues come from the LLM being overloaded, at which point it gets stupid. I still try to use those guides to keep it away from macros, though. Once the load is lighter, it fixes things.
The pace of AI-assisted coding is moving faster than I anticipated. While I was a skeptic a year ago, I’ve had to revise my stance after seeing the impact on demanding tasks like debugging complex bugs, maintaining documentation, and refactoring legacy codebases.
I recently used a ‘vibe coding’ approach to port a JSON validation gem from Ruby to Crystal. While it’s not production-ready yet, it compressed what would have been months of manual porting into a few hours. I’d highly recommend experimenting with LLM agents on your side projects—we’re only scratching the surface of how these tools can simplify our workflows.
Same here; skeptic at first, adopter now. Less voluminous though it may be, there seems to be enough Crystal material for the LLMs to be adequately trained.
I use Emacs in combination with gptel-aibo + GPT-4o, and it’s saved me tons of work. I now rarely look at the recommendations and just hit apply…
In other words, I’ve been gradually settling into my vibe coder’s chair without remorse, and at remarkably low (not sustainable in the long term?) expense: $20 worth of tokens bought a year ago.
This was really my first major use case for playing with it. I’m reworking the workflow a bit, so I’ve slowed down a little, but I think I’m getting things where they need to be, with fewer thinking loops now. I really can’t wait to see if this will help bring stuff that was missing in Crystal or elsewhere and allow me to do more than before. I’ve also been playing with it in Elixir, and since that language is getting more typed, I think it’s helping there as well. Statically typed languages just seem to fail faster, but if it compiles, there are more guarantees that it’s going to work.
And ten hours later there is a Crystal Starlark interpreter which, AFAIK, works just fine. I need to get the official test suite and check it harder, but… it does seem to work.
(No, it’s not the whole language, this is not magic)
Ok, maybe it is magic: it now passes all the acceptance tests, and I have the agent working on optional features like floats and sets.
Great take. I’m in DevOps, so I relate to this a lot.
Using LLMs to write code, whether that’s entire features or smaller chunks, is completely fine as long as the human behind the keyboard is willing to take full ownership of the outcome. That’s the part people gloss over.
I also think a lot of the discourse misses an important distinction. There is a massive difference between how someone with no engineering background “vibe codes” and how experienced engineers use LLMs as part of a disciplined workflow.
In practice, for people who take their careers seriously, LLMs are tools, not authors of record.
For example, I use Claude Code, but I review the architecture and design decisions myself, even if not every single line. Then I have Codex run a thorough local review. On top of that, my GitHub pipeline includes Codecov, SonarQube, CodeRabbit, and roughly fifteen to twenty GitHub Actions running different forms of static analysis, security checks, tests, and quality gates to make sure nothing insane ever makes it to main.
That is very different from YOLOing LLM output into production and hoping for the best.
At the end of the day, the standard has not changed. Someone still needs to be comfortable owning that code during a three a.m. incident. If you cannot explain why a system looks the way it does or confidently debug it under pressure, then it does not matter whether a human or an LLM typed it.
LLMs can absolutely accelerate good engineers. They just do not replace responsibility.
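To make the “quality gates” point above concrete, a stripped-down sketch of one such GitHub Actions job (illustrative names only; a real pipeline like the one described would have many more steps) might look like:

```yaml
# .github/workflows/quality.yml (illustrative fragment)
name: quality
on: [pull_request]
jobs:
  lint-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Crystal
        uses: crystal-lang/install-crystal@v1
      - run: crystal tool format --check  # formatting gate
      - run: bin/ameba                    # static analysis gate
      - run: crystal spec                 # tests must pass before merge
```

The point is not the specific tools; it’s that the LLM’s output never reaches main without passing gates a human configured and understands.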
This is key, and I think about it very often. One needs to be accountable for whatever “you” produce, whether via an agent or by yourself.
A common mistake is to ask the agent to completely develop something without ever taking the time to review the code or implementation. Another is not paying attention to the thinking process, and not trying to read, even lightly, the agent’s output.
I’ve caught my agent doing stupid things, like silencing ameba checks, even when I tell it not to.
That said, I relate to what you say. They’re tools, and a professional will use them to accelerate the workflow but will check and vouch for the generated code.
100%. I get the appeal for non-coders, but IMO if you are a coder, half of the fun should be in understanding the nitty-gritty implementation details! I take pride in my code, ngl.
I’ve caught it doing that too. I wrote a rule in .claude/rules/, just 30-40 lines, telling it to never edit the .ameba.yml file, never use magic comments to bypass ameba violations, etc. It started behaving better after that.
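Such a rules file doesn’t need to be fancy. A minimal sketch (hypothetical wording and filename; adapt to your own setup) could be:

```markdown
# .claude/rules/ameba.md

- Never edit `.ameba.yml`; linter configuration is owned by the human.
- Never add `# ameba:disable Rule/Name` magic comments to silence violations.
- When ameba flags an issue, fix the underlying code, or stop and ask.
```

The `# ameba:disable` magic comment is ameba’s real inline-suppression syntax, which is exactly why it’s the first thing an agent reaches for when told to “make the linter pass”.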
It works great for writing binding libraries: just give it the original project plus some examples of how Crystal bindings work, and it will finish the job.
It works decently for porting small utilities from another language to Crystal; you just need to tell it to extract all the original test cases. Sometimes it does try to cheat by hardcoding fixes for particular test cases, but tell it again to “follow the original implementation” and it will work properly.

Making small, well-defined input/output scripts in large projects also works, as long as you point out which files it needs to look at.

Letting it add more features to a large code base does not work; the result is a hot mess that will surely bite you in the future.

Overall it saves me a lot of time implementing trivial libraries that would otherwise still cost me one or two weekends, and that adds up.

I still don’t trust its output, though; the models are great at cheating. Always back it with rigorous tests.
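On the bindings point above: the reason it works so well is probably that a Crystal binding is mostly mechanical translation of C signatures. A minimal sketch, using libm’s `cbrt` as a stand-in target:

```crystal
# Bind a single function from the C math library.
@[Link("m")]
lib LibM
  fun cbrt(x : Float64) : Float64
end

puts LibM.cbrt(27.0) # => 3.0
```

A real binding library is just this pattern repeated for dozens or hundreds of functions, which is exactly the kind of repetitive, well-specified work where an agent shines given one worked example.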
The problem is with the SWE agents and the models themselves. Some models are just not that good at this if they were not trained for it specifically.
I haven’t tried opencode.ai, but I’ve worked with Gemini and Gemini CLI, and they were making mistakes all the time.
On the other hand, I can say that Claude Code and Codex CLI are really great when you use Claude Opus 4.6 and GPT 5.3 Codex xhigh. These guys work wonders, and I rely on them.
Even with them, you still need your custom LLM protocol to teach them how to think properly.
For me it works great for writing a Crystal LSP and then continuing to work on a codegen. As of now it generates working binaries in --no-prelude mode, and I have squashed most of the bugs in monomorphization and function lowering for some edge cases.
99.5% of the time they do it with no issues at all, and only sometimes do I have to give them a hint.
Also, I vibe-coded a Crystal TUI in about 10 hours, and I see that you are already using it in your projects. It’s quite fascinating how fast I can convert my wild ideas into working code that people start using.