
I find it funny how people say GPT-5 "bombed". I noticed a significant improvement in maths and coding with GPT-5. To quantify where I've found the models useful:

- GPT 3.5: Good for finding reference terms. I could not trust anything it said, but it could help me find some general terms in fields I was unfamiliar with.

- GPT 4: Good for cached, obscure knowledge. I generally could trust the stuff it said to be true, but none of its logic or conclusions.

- GPT 4.5: Good for reference proofs/code. I cannot trust its proofs or code, but I can get a decent outline for writing my own.

- GPT 5: Good for directed thinking. I cannot trust it to come up with the best solution on its own, but if I tell it what I'm working on, it's pretty decent at using all the tricks in its repertoire (across many fields) to get me a correct solution. I can trust its proofs or code to be about as correct as my own. My main issue is that I cannot trust it to point out confusion or to ask me, "is this actually the problem we should be solving here?" My guess is this is mostly a byproduct of shallow human feedback rather than an actual issue with intelligence (as it will often ask me, after spending a bunch of computation, whether I want to try something mildly different).

For me, GPT 5 is way more useful than the previous models, because I don't have a lot of paper-pushing problems I'm trying to solve. My guess is the wider public may disagree because it's hard to tell the difference between something better at the task than you, and something much better.



> I find it funny how people say GPT-5 "bombed".

I used scare quotes for a reason. It didn't "bomb" in the sense of failing [insert metric]; it bombed in the sense that OpenAI needed it to generate exponentially more hype and it just didn't. (And on a lesser level, GPT-5 was supposed to cut OpenAI's costs but has failed to do so.)

> I can trust its proofs or code to be about as correct as my own.

I have little to say about this, as I find such claims to be broadly irreplicable. GPT-5 scores better on the metrics, but still has the same "classes" of faults.


Gemini 2.5 was the first breakthrough model; people didn't know how to use it, but it's incredibly powerful. GPT5 is the second true breakthrough model; its ability to deal with math/logic/etc. complexity and its depth of knowledge in engineering/science are amazing. Every time I talk to someone who stans Claude and is down on GPT5, I know they're building derivative CRUD apps with simple business logic in Python/TypeScript.



