pllbnk's comments

My experience is that with every new release it's getting slower but not necessarily better. I have some projects where I review everything the agents write, and these projects look generally fine because I keep them in line. There are also a few projects that I just vibe code, where I focus on the result and don't look at the code (sometimes I want to pull my hair out because of the constant stream of stupid bugs).

Well, today I gave Fable a try on one of the vibe-coded projects. It simply had to write a couple of Python scripts, 400-500 lines each. It did, and they worked after a few iterations, but I decided to look at the code it produced. There were weird constants that might (and will) break the code when the requirements change. The code itself is unreadable and a total mess. If it wrote well-structured code in the first place, I believe it would also be more efficient at working with that code.

I have serious doubts about how far I will be able to go with pure vibe coding. My projects are small one-person projects and so far I am able to push through, but I can hardly see how far I will get before the technical debt outgrows the value the code produces.

I fondly remember the times of Opus 4.5, when it was still (as far as I remember) reasonably fast and malleable.


I’ve found that agents are obsessed with adding more lines of code. Even when you ask them to simplify, they’ll remove 50 lines of code and then add 100 more. You have to explicitly tell them you want fewer lines of code. So I just do that after iterating on a task for a few steps.

I think the problem is that agents are inherently stochastic. Their idea of simplification changes from message to message because whatever objective they’re operating on internally is opaque and keeps shifting. No matter how much you prompt them, eventually they’re not going to do what you want them to do.

I built https://github.com/thempatel/mdlr for precisely this reason: externalize the objective and force the agent to meet it.
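
A minimal sketch of what externalizing one such objective could look like, assuming the objective is simply "a simplification must not add code lines". This is not a description of how mdlr works; the function names and the line-counting rule are hypothetical and only for illustration.

```python
def count_code_lines(source: str) -> int:
    """Count lines that are neither blank nor pure '#' comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def meets_budget(before: str, after: str) -> bool:
    """Hypothetical objective: the rewritten code must not have more code lines."""
    return count_code_lines(after) <= count_code_lines(before)


# Example inputs: the "after" version is shorter, so the check passes.
before = "x = 1\n\n# comment\ny = 2\nz = x + y\n"
after = "z = 1 + 2\n"
print(meets_budget(before, after))
```

A check like this could run after each agent step and reject the change when it fails, so the objective stays fixed even if the agent's own idea of "simpler" drifts.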


Interesting, I'll be testing your tool on my repos. You should publish to crates.io!

Thank you so much! I am open to any and all feedback. Please file an issue or discussion if you have things you'd like to share.

Getting this onto crates.io is a great suggestion, I will look into that!


I have been wondering whether Anthropic are just gaslighting everyone with new model releases while in reality it's the same base model with some internal knobs turned up a bit further with every release to produce longer and longer thinking threads and outputs.

My speculative assumption is that these long thinking threads and self-checking tend to produce somewhat better output, at the cost of huge price increases due to the token burn.


I imagine it's the same foundation model on the 4 series, with Fable 5/Mythos being a new or upgraded foundation model. Then the point releases are fine-tuning plus post-training alignment with desired outcomes. The "thinking" can involve multiple steps, e.g. first asking the model what it thinks the user wants to do and why, rewriting the prompt to generate better outcomes, working out how it should do it, coming up with a plan, etc. So when they announce each point release like Opus 4.8, they're probably adding new layers of thinking to try and get good results on benchmarks. And that of course has cost and speed implications.

Then Sonnet/Haiku are just attempts to quantise/distil down to an acceptable performance/cost ratio. The cynic in me says we probably won't see any more of those until post-IPO, keep people addicted to the most costly models to pump a quarter or two of revenue figures, unless a competitor starts seriously undercutting them on price/performance. Hence the recent requests to slow down model training worldwide with their competitors.

Of course it could be that Fable "5" is just a marketing bump to the version, not a new foundation model...


> Then Sonnet/Haiku are just attempts to quantise/distil down to an acceptable performance/cost ratio. The cynic in me says we probably won't see any more of those until post-IPO, keep people addicted to the most costly models to pump a quarter or two of revenue figures, unless a competitor starts seriously undercutting them on price/performance. Hence the recent requests to slow down model training worldwide with their competitors.

I'm guessing there'll be a Sonnet/Haiku 5 release just around IPO, to keep the news cycle going, and so that user numbers will get a boost.


I'm pretty sure Anthropic have hired people with an Industrial Organisation background, and so have OAI.

If you read a decent text and look at the actions both firms have taken, you'll quickly see it's literally textbook.


Can you expand a bit for people unfamiliar with Industrial Organisation planning?

I have been building a small web app for my family recently. I was planning to host it on my own server and not do any fancy reactive and asynchronous stuff. It was a simple multi-page app with simple forms and links. And it sucked because we didn’t know who was doing what live, and we had to refresh pages needlessly just to see if something had changed. Funny that it seemed fine while I was the only user testing the app, but once more family members started using what seemed like a “production ready” app, it was immediately obvious that it needed interactivity.


I see where they are going but have doubts about the long-term success. Currently I use LLMs (definitely not Google's) and, when I care, search (mostly Google) to verify what the LLMs say by finding trusted sources.

Maybe it will work in the beginning, until non-technical users realize that LLMs hallucinate very often (unless Google has solved it somehow, but they probably haven't, because they would have said so). Then they will lose trust in the results and go back to good old indexed search engines.

Maybe I am coping, but I am speaking from my own experience.


I don't know what it is but I feel there is some sort of logical fallacy here.

Ed Zitron is an analyst. His viewpoint is that AI is bad for whatever reasons, and he does his job by trying to uncover those reasons and does solid work. He presents a lot of insider knowledge that would otherwise be left unheard.

What are his alternatives? To stop claiming that AI is bad and pivot to "AI is good" writing? To quit writing entirely? To continue writing but list at the beginning of each article the things he was wrong about in the past? What if it's too early for the things he seems to have predicted incorrectly to materialize, and in the end he turns out to be correct?

I think it's a benefit for society to hear the other side. There are plenty of pro-AI advocates.


He could stop confidently opining about things he clearly doesn't have even a surface-level understanding of. He also employs a tactic beloved of Internet trolls: he writes extremely long posts to stud his bogus claims in; his readers only need the "vibe" of his pieces to get the value they came for, but actually discussing them requires you to get a pickaxe and shovel and start digging. It would be one thing if he'd evinced technical competence over the last year, but he has done the opposite: some of what he's written about software development makes it really clear he's got basically no exposure to it.

It's a bad combination. There are better AI skeptics to follow. Endorsing Zitron, though, has become a "tell".


An analyst shouldn't have a conclusion carved in stone and work backwards to support said conclusion, for starters.


That’s not an analyst, that’s a pundit. An analyst can have a clear point of view that is different from yours and very far off the consensus in any direction. But the value of an analyst is that they have a consistent point of view that they apply to any situation, and they flag it when that point of view evolves.

A pundit starts from a pre-declared conclusion and works backwards to generate the argument. An analyst lets the conclusion be dictated by the analysis.


I wonder if the song they used for the video is also AI-generated. It's pretty catchy.


Dude it's Depeche Mode


It was sarcasm. They are using a song from the 1980s to advertise their AI-everything 2020s dystopia.


> capital will spread like plasmodium fungus into every unoccupied crag and niche in the economy not yet touched by AI

I guess it's a metaphor for the "hugely successful" trickle-down economics we have been witnessing.


Never said it would trickle down. Just that the economy wouldn't collapse.


Not necessarily. The government controls prices, the government assigns _everyone_ some work no matter how meaningless it is; so instead of one street sweeper we would get 10. Everybody is paid just enough to live and is assigned work so that they don't slack (I don't believe in a utopia where money is paid for doing nothing, because this utopia sounds extremely dystopian). I know just the place where people lived like that for 40-something years - the Soviet Union! It wasn't terrible for the majority, it wasn't great either. Also, it didn't last that long because it was unsustainable.

Our AI overlords think that they will be able to just prompt their LLMs to optimize this regime and make it last, but their stupid LLMs can't yet figure out whether to take the car to the carwash or go on foot, so they are not even close to that.

What's bad for us is that they are now wealthy enough to keep dragging us into this dystopia for a while until something changes.


Simple - you ask an LLM to fix it. You would have the same hard dependency on a programmer if you hired someone to write the code for you, as they would need to maintain it and that would cost you. LLMs might well be swapped out more easily than human engineers.


I think it's simpler than that and it isn't talked about much. Ukraine has been on a direct path to joining the European Union. Russians and Ukrainians have had significant ties: parts of families living in one country and parts in the other, marriages, a shared language, given that all Ukrainians know Russian and a lot of them even spoke Russian at home, at least until the war broke out.

Putin couldn't let Ukrainians join the EU, start getting all the EU fund money and actually start living like Europeans. The Russian population would see that at a large scale and start asking questions. He couldn't regain influence over the country diplomatically, so he resorted to terror.

Edit: I also wanted to add that this is the reason Putin and other Russian propagandists have been calling Ukrainians a brotherly nation (to show how much they care about them) and nazis (to show that their government is harmful), and claiming that Ukraine doesn't even exist as a country (to show that they should all be the same people within the same borders).


Not trying to defend Russia in the least, but isn't their fear more about Ukrainian accession into NATO rather than the possibility of joining the EU?

EU membership isn't the golden ticket it used to be. Russia basically had an inside man in there for years with the Orban administration in Hungary. Member nations like Greece, Malta and Bulgaria also seem to have experienced more brain drain to the higher-income countries in the bloc than economic and industrial development.


My guess is as good as anyone's. But I think NATO was used as an excuse for the war because it's a military (although defensive) alliance. It would be impossible to justify a war over a country joining the EU.

As for the golden ticket metaphor, I agree, but when a country is so far behind the rest of the EU economically and institutionally, membership would still benefit it a lot. All the Eastern countries experienced big emigration, but a lot of the citizens who had emigrated are now returning.


Claude has a research mode. I tried using it multiple times in domains that I know quite well. Basically, I used it in the hope of saving time by having it aggregate the information I needed. I tried different approaches and it never did anything useful. The output was full of factually incorrect and outdated information. I know that I could never hope to even slightly trust it for anything I don't have knowledge in.

