Figured I would leave an update on this. A lot has changed since I first tried this back in April 2023. Nearly three years later, model capabilities have more than tripled, and because of that radical increase, models have not needed to learn Crystal as precisely as I originally thought. Training methods have also probably just improved. So I do think the models' depth of knowledge about Crystal itself has increased, and not necessarily from any extra effort on our part.
That being said, I currently use Claude Opus 4.5 and am having a radically improved experience, which leads me to believe we are going to see a general increase in the accessibility of all coding agents over the next six months. Generally, Opus is one of the frontier models that leads the way and sets a standard, and within three or four months of its release, other models start catching up.
Even so, I still don't have a formal training data set organized in a way that has been meaningfully successful for fine-tuning a local model. Instead, I have made considerable progress through prompt engineering and agent orchestration.
We have spoken a lot about prompt engineering, but very little is being said about agent orchestration. I think that is a big miss, because it is the next step, and frankly it's going to be the step in 2026 that spreads like wildfire.
The reason for this is simple. By the time you're orchestrating multiple agents, your assistant and coding agent have escaped the chat window. That means you no longer have to sit there and work with them interactively for them to accomplish tasks.
For example, I have set up a process where I upload voice memos and recordings to a folder on my iCloud Drive. Whisper then runs and transcribes each audio file into a raw transcript. Next, Claude runs: an intake agent reads the transcript, breaks it into topics, summarizes each one against my current projects, and performs other organizational steps that consistently and reliably maintain a knowledge base. If the intake agent sees that a task was requested of another agent we have available, it starts that agent on the task it needs to accomplish.
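The watch-and-dispatch part of that pipeline can be sketched roughly like this, in Python rather than Crystal for brevity. To be clear, the folder layout, the intake prompt, and the exact `whisper` and `claude` CLI invocations are illustrative stand-ins for my actual setup, not the real thing:

```python
import subprocess
from pathlib import Path

AUDIO_EXTS = {".m4a", ".mp3", ".wav"}

def pending_recordings(inbox: Path, transcripts: Path) -> list[Path]:
    """Audio files in the watched folder that have no transcript yet."""
    done = {p.stem for p in transcripts.glob("*.txt")}
    return sorted(p for p in inbox.iterdir()
                  if p.suffix.lower() in AUDIO_EXTS and p.stem not in done)

def process(recording: Path, transcripts: Path) -> None:
    """Transcribe one recording, then hand the transcript to the intake agent."""
    # Whisper writes <stem>.txt into the transcripts folder.
    subprocess.run(["whisper", str(recording), "--output_format", "txt",
                    "--output_dir", str(transcripts)], check=True)
    transcript = (transcripts / f"{recording.stem}.txt").read_text()
    # "Intake agent" prompt is a placeholder; swap in however you invoke Claude.
    subprocess.run(["claude", "-p",
                    f"You are the intake agent. File this transcript:\n\n{transcript}"],
                   check=True)
```

A cron job or launchd task then just loops `pending_recordings` and calls `process` on each hit, so nothing runs twice.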
I no longer have to be physically present at my desk for Claude to begin working on a project, and frankly, it's quite wonderful. If any of my agents get stuck and need help, they can use a communication agent that gives me a call, talks through the issue with me, and relays my answer back to the agent working on my local device, which can then hopefully get unstuck and continue on.
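The stuck-agent handoff is roughly this shape. This is only a sketch: `place_call` stands in for whatever telephony integration you wire up, which I won't pretend to specify here:

```python
from typing import Callable

def escalate(agent_name: str, issue: str,
             place_call: Callable[[str], str]) -> str:
    """Relay a blocked agent's issue to a human over the phone and hand
    the human's answer back so the agent can continue.

    `place_call` is a stand-in for a real telephony service: it should
    read the question aloud on a call and return the transcribed reply.
    """
    question = (f"Agent '{agent_name}' is stuck: {issue} "
                f"How should it proceed?")
    answer = place_call(question)
    # The returned guidance gets injected into the blocked agent's context.
    return f"Human guidance for {agent_name}: {answer}"
```

The important design choice is that the blocked agent never talks to me directly; the communication agent owns the call and just hands guidance back as plain text.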
These days, I work almost exclusively in Crystal, especially when I'm using my own coding agent. However, I do periodically branch out to other languages to get a better idea of how effective the more popular ones, with more content out there, really are. And I have to say I'm quite impressed. When a language has good tooling and a lot of content out there, it's kind of dangerous, actually.
The quality of the content the models are trained on is very important. I consistently find that content around languages like Rust, Go, or C is significantly higher quality, and models can tackle harder concepts in them than in JavaScript. JavaScript has been flooded by YouTube bros peddling how to start an agency, and it shows in the quality of the code models generally write.
So I think that leads us down a path where we can each write whatever we want, as long as we understand that what we write and publish is eventually going to end up in these models' memories and influence what they do.
If you set up an agent with access to memory like I have, you can very easily build long, deep preferences that the agent is capable of following nearly indefinitely.
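As a concrete (and very simplified) sketch of what I mean by memory: imagine preferences live in a flat file that gets prepended to every prompt. The path and format here are hypothetical, not my actual layout:

```python
from pathlib import Path

# Hypothetical location; adjust to wherever your agents keep shared state.
MEMORY_FILE = Path("~/agents/memory/preferences.md").expanduser()

def remember(preference: str) -> None:
    """Append a durable preference the agents should follow from now on."""
    MEMORY_FILE.parent.mkdir(parents=True, exist_ok=True)
    with MEMORY_FILE.open("a") as f:
        f.write(f"- {preference}\n")

def build_prompt(task: str) -> str:
    """Prepend every accumulated preference to each task prompt."""
    prefs = MEMORY_FILE.read_text() if MEMORY_FILE.exists() else "(none yet)\n"
    return f"Standing preferences:\n{prefs}\nTask: {task}"
```

Because every prompt carries the whole preference list, a preference you state once keeps applying across sessions for as long as the file exists, which is what "nearly indefinitely" means in practice.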
I think the biggest thing we need to solve is how we distribute the libraries we write so that people and their coding assistants are enabled from the start. I just don't know what that tooling looks like yet.