Using Software Engineering agents

Have you used SWE agents?
If so, which agent and model are your favourites?
I found that GPT 5.2 Codex is much better than Claude Opus 4.5, but they work quite well together when they take turns.


Due to the region I’m in, most of those SWE agents aren’t available where I live.

I did try GitHub Copilot in VS Code a long time ago (GPT, Claude, etc.). It was pretty dumb: bad suggestions, no real understanding of context. I ended up disabling Copilot completely.

I see.

Yes, at the moment I can only recommend GPT 5.2 (if you use Copilot) or Claude Opus 4.5, as all the other models are pretty dumb, I’d say.

I personally use GPT 5.2 Codex xhigh (via Codex CLI) and Claude Code (with Claude Opus 4.5) as my backup, as GPT 5.2 shows the best results for me.

But I am paying quite a lot for ChatGPT Pro and Claude Max 100 subscriptions, so I can work in parallel on multiple big projects, where GPT and Claude recheck each other.


I’ve been using the new Gemini 3 with Crystal, but I mostly chat back and forth with it when I need to explore fixes or new features. I’m not letting it run as an agent since I don’t want to deal with token exhaustion. It is for sure a lot better than 2.5. I’ve only been running into a few things it keeps wanting to do. It sucks at macros, though, but I would expect that.

I’ll tell you the truth: I built my own LLM protocol that helps models reason properly, because I was tired of Gemini’s sycophancy and BS. It helped Gemini a little, but surprisingly it worked much better for Claude Opus, and GPT benefited the most.

I am having a lot of fun with GLM from z.ai and it’s super cheap.


I am building my own SWE agent, and I use Grok 4.1 Fast reasoning, which is also super cheap ($0.20/$0.05/$0.50) but quite powerful.
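
To give a feel for what prices like these mean per agent run, here is a rough sketch. It assumes the three numbers are USD per million tokens for input, cached input, and output, in that order; the post does not say so, so treat that reading as a guess. The `Pricing` class, the `estimate_cost` helper, and the token counts are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Assumed pricing in USD per one million tokens (hypothetical reading of the post)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def estimate_cost(pricing: Pricing, input_tokens: int,
                  cached_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    cached_tokens is the part of input_tokens served from the prompt cache,
    so it is billed at the cached rate instead of the regular input rate.
    """
    if min(input_tokens, cached_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * pricing.input_per_m
             + cached_tokens * pricing.cached_input_per_m
             + output_tokens * pricing.output_per_m)
    return total / 1_000_000


# Assumption: $0.20 input / $0.05 cached input / $0.50 output per 1M tokens.
ASSUMED_PRICING = Pricing(0.20, 0.05, 0.50)

# Made-up agent step: 200k tokens of context, 150k of them cached, 20k generated.
step_cost = estimate_cost(ASSUMED_PRICING, 200_000, 150_000, 20_000)
print(f"estimated cost per step: ${step_cost:.4f}")
```

If that reading of the prices is right, even a long agent session with hundreds of such steps stays in the range of a few dollars, which is presumably what "super cheap" refers to.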