It is really, really genuinely concerning how many people think there are profound measurable differences between these things.
Like yeah tonally I guess there are. But with regard to references and information? You’re literally just using three different slot machines and claiming one is hot.
I suppose though I shouldn’t be that surprised then since Vegas and every other casino on Earth has been built on duping people in that exact way.
> You’re literally just using three different slot machines and claiming one is hot.
It's a fair point. I haven't tested many queries across them all and checked their answers, but if I want to ask one of them a question, right now it's Grok, just because I trust its answers more.
It's not a methodology problem, it's a testability problem. LLMs are not deterministic. You can ask the same question to the same LLM five times and you'll likely get at least three different answers.
You can meaningfully test whether one slot machine hits the jackpot more often than another; it's just that the methodology should involve a large number of repeats rather than a few anecdotes. There are some LLM leaderboard sites that do exactly that with blind comparisons.
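To make the "large number of repeats" point concrete: here's a toy sketch (with made-up hit rates, not real benchmark data) of why a handful of anecdotes can't distinguish two models, but a big sample can. It uses a basic two-proportion z-test on graded answers:

```python
import math

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z-statistic for H0: both models/machines have the same success rate."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    p_pool = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# With only 10 questions each, a 70% vs 60% gap is indistinguishable from noise
# (|z| well below the usual 1.96 threshold):
print(two_proportion_z(7, 10, 6, 10))
# With 1000 questions each, the same gap is clearly a real difference:
print(two_proportion_z(700, 1000, 600, 1000))
```

Same observed gap, wildly different conclusions — which is exactly why "I asked it a few things and liked the answers" tells you almost nothing.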
Yeah that seems to be their primary use case, if I'm honest. It's possible to use them ethically and responsibly, much in the same way it's possible to write one's own code, and more broadly, do one's own work. Most people however, especially in our current cultural moment and with the perverse incentives our systems have created, are not incentivized to be ethical or responsible: they are incentivized to produce the most code (or most writing, most emails, whatever), and get the widest exposure and attention, for the least effort.
Hence my position from the start: if you can't be bothered to create it, I'm not interested in consuming it.
I would argue it goes even a step further: any org the size of Microsoft struggles to maintain the quality of... well, anything. And, added to that, Microsoft seems exceptionally bad at doing fucking anything now. Azure is a complete mess, Windows is an utter dumpster fire, the office suite feels like it gets just slightly worse with every update, Copilot is a fucking joke compared to every other AI on offer (and hilariously, will agree with everything I've said here!), they won't even use their own frameworks to develop software anymore!
Microsoft is literally too big to fail, and that's their sole asset at this point. When companies like GitHub get bought by Microsoft, I just put a clock on the wall in my mind. It's just a matter of time before the shit seeps in.
They can't help it. They are organizationally unable to function. It's so much worse than misaligned incentives and redundant management (though those are factors): they seem culturally, institutionally, unable to just... DO ANYTHING. Everything they do is 1 step forward and 4-20 steps back. They are too big and they should be broken up for their own good as well as the good of every user of their software.
> That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily make that worth it for most. If the labs shave their margins ultimately to more like 20-30%, you'd have ~$15/hr in costs to use the services, and nearly every white collar job is way over 30k/yr to employ. If your salary is 80k, you probably cost the company 200k all in, so making you 15% more productive offsets the $15/hr cost.
Nobody, including the linked article, is arguing that this can never be profitable. People are saying "there is no way this admittedly quite interesting tool is going to make back all of this money," and I think they are completely right to say that.
You can absolutely make money with this stuff, just not at this scale. The buildout for this shit has been certifiably crazy and a number of the involved firms are overleveraged for tens and even hundreds of billions of dollars.
How in the sweet fuck are you paying that off, plus giving investors dividends, selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone, let's say it's 5 million, to be generous, and each of them is subbed to this for 8 hours per day, continuously. That's 600 million per year in revenue. If you took ALL that revenue, and put it towards paying down this debt, not leaving any for employee salaries, upkeep, ongoing development, it would take DECADES to pay down what OpenAI already owes.
And yes, I'm sticking strictly to code, because that's the only thing I've seen it be really good at. Are we really proposing that every knowledge worker on earth, and every manager of such workers, is going to have an autonomous agent running all the time!? To do what, make sure they don't have to read or write email? Even just that example brings in a fucking mess of legal, compliance, and security violations, because LLMs are not intelligent and are not capable of being properly secured.
Like I'm sorry, I cannot take this industry seriously when even the most basic back-of-napkin math is saying, nay, screaming from the rooftops that they are FUCKED.
> selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone, let's say it's 5 million, to be generous, and each of them is subbed to this for 8 hours per day, continuously. That's 600 million per year in revenue
That math is not mathing. At $15/hour/user, with 5M devs, 8 hours a day, and 240 working days per year, that's $144B in revenue.
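For anyone following along, the corrected arithmetic (using the thread's own assumed inputs, which are hypotheticals, not real figures) works out like this:

```python
# Back-of-napkin revenue math, redone with the thread's assumptions.
price_per_hour = 15        # $/hour/user (assumed subscription price)
devs = 5_000_000           # the thread's "generous" US developer count
hours_per_day = 8
working_days = 240         # working days per year

daily = price_per_hour * devs * hours_per_day   # dollars per working day
yearly = daily * working_days                   # dollars per year
print(f"${daily / 1e6:.0f}M/day -> ${yearly / 1e9:.0f}B/year")
```

So the $600M figure was per working day, not per year; annualized it's $144B.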
Quite right, and honestly I'm not sure how I fucked that up so badly, but I'll own it. Okay, so all we need is every coder in the United States plus another 0.6 million or so, subscribed to this for 8 hours a day, and the business model can work.
That still feels incredibly optimistic given how split the community at large seems to be about how good this tech is, and it assumes all those developers also all work for firms large enough to pay for all of that.
However, we are still very much in back-of-napkin territory. We haven't even gone into what it costs to provide these services: how much all these datacenters are going to cost to build, how much electricity and water they'll rip through, the providers' own employees and basic overhead, and all the rest. So IMO we've now elevated it from "hopeless" to "this could work if a whole lot of other things line up really well."
It's not just developers who are using this. My economist friends are. I bet most business analysts and general administration folks are or will be soon. Every normal person I know in my neighborhood is using AI for this thing or that. 50M people are currently subscribed to ChatGPT and it would be very surprising if this number goes down in the future.
I dunno I think about the language some people are using about AI investment and it is reminiscent of the many years where people were saying Amazon was a bad buy because they never turned a profit. Admittedly AI companies are investing more than the money they've already brought in, but I would be very hesitant to predict that it's all froth given the usefulness I've gleaned from the tools.
Don't get me wrong, I'm not unconcerned, but I think there are good reasons to suspect that at least some of the AI companies are making sound investments.
My fiancée's company has no developers, yet everyone has a paid subscription to LLMs. Certainly not at $15/hour, and I don't think they'll ever pay that for everyone, but I don't find it hard to picture the aggregate cost of subscriptions on a global basis far exceeding $600m/day, between far more people on subscriptions cheaper than $15/hour but more expensive than today's, and companies ending up paying far more than $15/hour averaged over their developers for additional use. E.g. I already run agents 24/7 just for me. I couldn't yet justify $15/hour, but the amount I'm spending is steadily increasing as I manage to squeeze returns from more and more things.
Sure, it's back of napkin math, and I also think that several of the companies we see today won't survive and/or will only survive due to consolidation, but I also think the spend is going to be immense.
With respect to the datacentres, I expect we'll see inference costs crash over the coming years. We're only seeing the beginning of what dedicated ASICs will do to inference, and of what work on model efficiency will do to the need for the very largest models. While that might drive down the spend on individual subscriptions, I think it will drive up the total spend dramatically as cheaper models become capable enough to put them "everywhere".
But, yeah, ultimately we're guessing. I'm happy to put my guesses on the record, though, and I look forward to looking back and seeing how wrong I got it in a couple of years.
You wrote an entire wall of text when you could have just taken 10 seconds to review what you call the "most basic back-of-napkin math" and realized you were off by two and a half orders of magnitude.
Yes, the GP wrote the wrong unit in one place. The corrected figure still supports their conclusion that the pay-off would take decades; if the revenue really were $600M per year, it would take several centuries.
The only reason it hasn't is the sheer amount of credit being thrown at this tech. Both that and the valuations of the firms in question are stratospherically over-hyped and over-valued.
This tech has uses. It has quite a lot of them in fact. However there is no usage of ChatGPT or Claude that makes OpenAI or Anthropic worth anything fucking close to what they're valued at right now, and both firms are scrambling to figure out how to get down from the top of the AI house of cards without detonating in the process.
Meanwhile DeepSeek is coming out with more capable models that run on far less onerous hardware, with far lower compute requirements, and that do basically exactly what the vast majority of users actually want.
This is going to be a financial bloodbath. Not for anyone actually responsible for it, of course, they'll be fine. It'll be everyone else getting soaked which is the only reason I give two shits.
> Now one dude in India can flood multiple sock puppet media accounts with right wing content/images (actual example) at a scale previously unimaginable.
I have the faintest possible hope that such things are going to be the death knell of social media. Yeah, a lot of credulous idiots are happily giving AI thirst traps their money for stroking their confirmation bias, but that's just who's left at this point. It feels like every social media app I use is gradually bleeding the users who aren't hopelessly addicted to the dopamine treadmill, because what's left is just plain unappealing to them. That selects for the people most vulnerable to AI slop, which is far from ideal, but it also means those platforms are composed ever more of that vulnerable population and nobody else. And the problem for all these businesses going through that is that without a diverse, growing audience, you just become InfoWars, slinging the same slop to the same people every day. Every ounce of that slop is great for what's left of your audience, but absolute garbage for bringing anyone new in. And it just goes on that way until you sputter out and die (or harass the wrong group of parents, I guess).
I wish all social media sites a very haha die in a fire.
Mate, you're on a social media site right now that often has AI-generated content displayed at the top of what's "trending". Sure, the general user base does a better job here of flagging that sort of stuff, as AI seems to be a shared interest in much of the community, but it still sneaks its way by.
You’re technically right, but I think we can all agree HN is significantly different from the major players. The vast majority of us see the same posts and comments, for starters. The churn of posts is also much slower. You log on 2-3 times spread out over a day and you see 90% of the main posts. Top posts regularly linger for 24-48 hours.
No media uploading, memes are few and far between (usually punished), etc.
Hell, you mean a decade ago? I still see businesses running losses left, right, and center saying that they're gonna monetize user data, any day now.
Related "monetizing user data" seems to just mean ads. Ads on everything, forever, until the userbase gets fed up and moves to a new service that definitely won't do that, and the cycle repeats about every 3 years.
Whenever people describe the living conditions of your average person all I can think is what a colossal failure our system is, to imprison so many millions in such an utterly shit existence.
Like yes it's cool to have air conditioning and basically any food anywhere at any time, and many have transportation that can take us across the country at a moment's notice. There are marvels now that our ancestors would die of shock trying to comprehend. That said, it seems still that we've made a remarkably awful place for the vast majority of people to live and work in, more the latter than the former, while a handful of people basically live in a never-ending theme park.
You probably can’t sell tractors forever, but that’s short-sighted: you can sell reasonably priced parts and service. People don’t refuse to buy OEM parts on principle; they do it because the prices are often outrageous, and/or the procedure sucks, and/or it's arbitrarily restrictive, like needing dealer licenses or what have you.
And just because a tractor is low tech and designed to run forever doesn’t mean it won’t still need parts and service. Time comes for us all and that includes your wheel bearings, bushings and seals.
What's funny is I had personally settled on Anthropic as... the best of a bad situation, I guess? I found the tech useful even if I still deeply hate the industry and hype machine around it. Now though I can't get through a full discussion with Claude before the usage restrictions kick in, which has done a far better job getting me to kick the habit than anything else.
I still VERY occasionally use it (as I'm friggin able to, anyway) but it's definitely nowhere near my previous usage. And I refuse to give them money, and besides which I have no goddamn notion of whether it would even be worth it on the lowest paid tier.
Ah well. The free ride was fun but I knew it had a shelf life.
See the thing is their storefront is so fucking vague. Right now I hit usage limits after about 4-6 messages during the day, depending on length. They say the low tier is 5x usage, so does that mean I can send 20-30 messages? Because that's not remotely worth $20 a month to me.
It used to be. But I consistently got to use more than that.
Funny thing (or maybe I just imagined it): when I used ChatGPT for studying, it was quite generous about letting me go over the usage limits.
When I was just messing around, testing where the guardrails are or trying to get it to generate sexual prose about my siblings to send to them for laughs, the limits were held much more strictly.
I remember when it went up from 25/3h to 50/3h. And I was like meh, because I've already used it over that limit multiple times.