They can work really well if you put sufficient upfront engineering into your architecture and its guardrails, such that neither agents nor humans can really produce incorrect code in the codebase. If you just let them rip without that, they require very heavy babysitting. With it, they're a serious force multiplier.


They don't work really well even on relatively small things, even with a virtually impractical amount of upfront engineering: https://news.ycombinator.com/item?id=47752626

They just make a lot of mistakes that compound and that they fail to identify. They currently need to be very closely supervised if you want the codebase to keep evolving for any significant amount of time. They do work well when you detect their mistakes and tell them to revert.


We've done the impractical upfront engineering, and they're working well for us :)


Since Anthropic weren't able to make them work even for something as simple and familiar as a C compiler, I would guess that:

1. You're supervising the agents closely, or

2. Your projects are very simple, simpler even than a C compiler, or

3. They're not really working well; the catastrophic problems just haven't surfaced yet.



