Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I find Claude significantly better than 3.5. I’d love to be able to make the case for that with data…


Since Chatbot Arena Leaderboard https://lmsys.org/blog/2023-05-10-leaderboard/ agrees with you, it's not just you.


There are 2 main claude models. I'm guessing it's claude-v1.3 aka claude plus that you find much better than 3.5 ? That tracks if so.


I've found for my use case that both claude-instant-* and claude-* are roughly on par with each other and gpt-3.5. claude-* seems to be the least inaccurate, but we also haven't put it into production like gpt-3.5, so it's hard to say for sure.

In either case, the claude models are very good. I think they'd do fine in a real product. But there's definitely issues that they all have (or that my prompt engineering has).




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: