For my use case, I've found that claude-instant-* and claude-* are roughly on par with each other and with gpt-3.5. claude-* seems to be the least inaccurate of the three, but we also haven't put it into production the way we have gpt-3.5, so it's hard to say for sure.
In any case, the claude models are very good, and I think they'd do fine in a real product. But there are definitely issues that all of them have (or that my prompt engineering has).