
No, it is evidence for that point. You could just rattle off every possible vulnerability class and have the cheap model scan for each one in the harness, in a loop.

Note that I say cheap, not small: small models may lack the reasoning needed, but some models are cheap and still retain enough reasoning (à la Sonnet 3.7+).
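
For what it's worth, here is a rough sketch of the kind of loop I mean. Everything in it is an assumption for illustration: `VULN_CLASSES`, `scan`, and the prompt wording are made up, and `fake_model` is only a stand-in so the snippet runs without any API; a real harness would plug in an actual cheap-model call as `ask_model`.

```python
from typing import Callable, Iterable

# Hypothetical list; a real run would enumerate far more classes.
VULN_CLASSES = ["SQL injection", "path traversal", "use-after-free"]


def scan(
    files: dict[str, str],
    ask_model: Callable[[str], str],
    vuln_classes: Iterable[str] = VULN_CLASSES,
) -> list[tuple[str, str, str]]:
    """Ask the model about every (file, vulnerability class) pair."""
    classes = list(vuln_classes)  # materialize so it can be reused per file
    findings = []
    for path, source in files.items():
        for vuln in classes:
            prompt = (
                f"Does this code contain a {vuln} bug? "
                f"Answer NONE or describe it.\n\n{source}"
            )
            answer = ask_model(prompt).strip()
            # Anything other than the literal NONE counts as a finding.
            if answer.upper() != "NONE":
                findings.append((path, vuln, answer))
    return findings


def fake_model(prompt: str) -> str:
    """Stand-in for a cheap model call; flags one obvious pattern only."""
    question = prompt.splitlines()[0]
    if "SQL injection" in question and "+ user_id" in prompt:
        return "query built by string concatenation"
    return "NONE"
```

The cost is one model call per (file, class) pair, which is why the per-call price matters more than the model's size here.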



That's not what they did.


They could write a post demonstrating that you can do that and surface the same bugs in the same codebases.

It would be way more informative than this one, which didn't do that.



