Hacker News

Re: open-source harnesses, ForgeCode appears to be pretty good (currently #1 on Terminal Bench 2.0 -- https://www.tbench.ai/leaderboard/terminal-bench/2.0 ). Re: open models, Kimi K2.6 might be a good place to start, though admittedly I'm not sure how it'd compare to Sonnet / Opus for your use case.


It's terrible; don't waste your time like I did. That bench means nothing, by the way.



