It honestly has all kinda felt like more of the same ever since maybe GPT4? New ...

ifwinterco · 2026-04-24T07:04:18 1777014258

For coding, Opus 4.5 in Q3 2025 was still the best model I've used.

Since then it's just been a cycle of the old model being progressively lobotomised and a "new" one coming out that if you're lucky might be as good as the OG Opus 4.5 for a couple of weeks.

Subjective, but as far as I can tell there's been no progress in almost a year, which is a lifetime in 2022-25 LLM timelines.

_air · 2026-04-25T05:09:29 1777093769

Opus 4.5 was released on Nov 24 last year. It’s only been 5 months!

ifwinterco · 2026-04-25T07:18:12 1777101492

Wow you're right, okay not so bad then.

That brief two week period when Opus could eat entire tickets was simultaneously fantastic and a bit alarming

dannyw · 2026-04-24T12:28:18 1777033698

Another annoyance (more for API use) is summarized/hidden reasoning traces. It makes prompt debugging and optimization much harder, since you literally don't have much visibility into the real thinking process.

hnfong · 2026-04-25T02:32:48 1777084368

I don't trust the benchmarks either, so I maintain a set of benchmarks myself. I'm mostly interested in local models, and for the past 2 years they have steadily gotten better.

Can't argue with subjective experience, but if there were some tasks that you thought LLMs couldn't do two years ago, maybe try again today. You might be surprised.