“Hey Claude, you have a bunch of skills defined, some mcps, and memory filled with useful stuff. I want to use you on a machine accessible over SSH at <host>, can you clone yourself over?”
Why are people thanking Google? That’s like another slap in the face of Epic, who burned through their millions to put a (soft) end to Google and Apple’s dominance. They still get to keep a significant cut.
Who cares, this is how the American justice system works. AFAIK - only DAs and govt (sometimes) look out for people’s interests, and the occasional class action lawsuits. Many civil liberties cases were one black man or black woman, or a woman, fighting against some company/establishment/govt and that ended up benefiting all others (equal rights, right to vote, right to abortion, etc).
In this case, all other businesses get the same terms as Epic. In my eyes that’s a win, better than the system that existed before.
Yep. Spot on. And you can tell it’s true because one of the primary arguments, that App Store fees drive up prices for customers, hasn’t held up: once the fees were removed, prices for customers did not come down.
It's just big billion dollar corporations deciding on who keeps what cut.
I'm hardly a fan of Epic, but considering inflation and rising supply chain costs, a price that remains flat may be a price that would have otherwise risen.
They might also direct the money towards funding more exclusives. Epic's funding has enabled some games to be made that wouldn't have been otherwise, or that wouldn't have been as full featured without that up-front cash.
They sell gambling to children via lootboxes; I'm not saying they're the good guy corp. But removing Apple and Google's monopoly over phone apps and app stores would only be a good thing, in my opinion.
Sure but it's not just Epic. I've seen other services, ranging from Netflix to Spotify increase subscription prices.
I don't disagree with your point about inflation, but we also can't really run the counterfactual, and I'm personally not inclined to give the benefit of the doubt here. As an aside we generally have some level of inflation and so while this argument may have been more convincing during a period of rapid inflation, it becomes less convincing over time.
I think the reality is these services have massive margins and so there was never any intent on the part of Epic at least, to lower prices. It was always to just capture more value for their company. I don't blame them for doing that, I just find the "we're the good guys" approach to be suspicious at best.
Apple's monopoly (because I have an iPhone) has been of incredible value to me so I prefer that the monopoly continue to exist. As we remove that monopoly I see more consumer harm done than good.
> considering inflation and rising supply chain costs
I just can't for the life of me figure out where this money goes. People bought the same type of things 10 years ago, and the cost now isn't proportional to the cost 10 years ago.
Who cares? Their lawsuit made it way better for mobile devs in the US, including me, for selling apps. Epic can do whatever they want for their own stores as far as I care.
I wouldn't die on this hill. Epic is about as unsympathetic a character as you'll find anywhere in the videogame space. Epic wasn't trying to be altruistic.
That sounds awfully similar to what Opus 4.6 does on my tasks sometimes.
> Blah blah blah (second-guesses its own reasoning half a dozen times, then goes) Actually, it would be simpler to just ...
Specifically on Antigravity, I've noticed it doing that trying to "save time" to stay within some artificial deadline.
It might have something to do with the system messages and the reinforcement/realignment messages that are interwoven into the context (but never displayed to end-users) to keep the agents on task.
As someone that started using Co-work, I feel like I am going insane with the frequency that I have to keep telling it to stay on task.
If you ask it to do something laborious, like reviewing a bunch of websites for specific content, it will constantly give up, providing you information on how you can continue the process yourself to save time. It's maddening.
That’s pretty funny when compared with the rhetoric like “AI doesn’t get tired like humans.” No, it doesn’t, but it roleplays like it does. I guess there is too much reference to human concerns like fatigue and saving effort in the training.
This is what happens when a bunch of billionaires convince people autocomplete is AI.
Don't get me wrong, it's very good autocomplete and if you run it in a loop with good tooling around it, you can get interesting, even useful results. But by its nature it is still autocomplete and it always just predicts text. Specifically, text which is usually about humans and/or by humans.
You are not wrong, but after having started working with LLMs, I have this feeling that many humans are simply autocomplete engines too. So LLMs might be actually close to AGI, if you define "general" as "more than 50% of the population".
Humans are absolutely auto-complete engines, and regularly produce incorrect statements and actions with full confidence that they are precisely correct.
Just think about how many thousands of times you've heard "good morning" after noon both with and without the subsequent "or I guess I should say good afternoon" auto-correct.
Well, the essence of software engineering is taking these complex real-world tasks and breaking them down into simpler parts until they can be done by (conceptually) simple digital circuits.
So it's not surprising that eventually autocomplete can reach up from those circuits and take on some tasks that have already been made simple enough.
I think what's so interesting is how uneven that reach is. At some tasks it is better than at least 90% of devs, maybe even superhuman (by which I mean better than any single human; I've never seen an LLM do something that a small team couldn't do better given a reasonable amount of time). In other cases, actual old-school autocomplete might do a better job: the extra capabilities added up to negative value and its presence was a distraction.
Sometimes there is an obvious reason why (solving a problem with lots of example solution online vs working with poorly documented proprietary technologies), but other times there isn't. They certainly have raised the floor somewhat, but the peaks and valleys remain enormous which is interesting.
To me that implies there is both lots of untapped potential and challenges the LLM developers have not even begun to face.
Yep. The veil of coherence extends convincingly far by means of absurd statistical power, but the artifacts of next-token prediction become far more obvious when you're running models that can work on commodity hardware.
> As someone that started using Co-work, I feel like I am going insane with the frequency that I have to keep telling it to stay on task.
Used to have the same thing happening when using Sonnet or Opus via Windsurf.
After switching to Claude Code directly though (and using "/plan" mode), this isn't a thing any more.
So, I reckon the problem is in some of these UIs/tools, and probably isn't in the models they're sending the data to. Windsurf, for example, we no longer use due to its inferior results.
In my experience all of the models do that. It's one of the most infuriating things about using them, especially when I spend hours putting together a massive spec/implementation plan and then have to sit there babysitting it going "are you sure phase 1 is done?" and "continue to phase 2"
I tend to work on things where there is a massive amount of code to write but once the architecture is laid down, it's just mechanical work, so this behavior is particularly frustrating.
I hope you will excuse my ignorance on this subject, so as a learning question for me: is it possible to add what you put there as an absolute condition, that all available functions and data are present as an overarching mandate, and it’s simply plug and chug?
Recently it seems that even if you add those conditions, the LLMs will tend to ignore them. So you have to repeatedly prompt them. Sometimes strong or emphatic language will help them keep it “in mind”.
I found it better to split the work into smaller tasks from a first overall analysis, have it do only that subtask, and have it give me the next prompt once finished (or feed that to a system of agents). There is a real threshold beyond which quality is lost.
Yeah that happened to me with Claude code opus 4.6 1M for the first time today. I had to check the model hadn’t changed. It was weird. I was imagining that maybe anthropic have a way of deciding how much resource a user actually gets and they had downgraded me suddenly or something.
But how do you see the current thinking level and how do you change it? I’ve been clicking around and searching and adding “effortLevel”:”high” to .claude/settings.json but no idea if this actually has any effect etc.
Haha yeah I've had this happen to me too (inside copilot on GitHub). I ask it to make a field nullable, and give it some pointers on how to implement that change.
It just decided halfway that, nah, removing the field altogether means you don't have to fix the fallout from making that thing nullable.
Opus 4.6 found in my documentation how to flash the device and wanted to be clever and helpful and flash it for me after doing a series of fixes. I'd got used to approving commands and missed that one. So it bricked it.
Then I wrote extra instructions saying flashing of any kind is forbidden. A few days later it did it again and apologised...
> Rust happens to be an extremely good tool.
Sir (or ma’am), you stole literally the line I came to write in the comments!
To anyone new picking up Rust, beware of shortcuts (unwrap() and expect() when used unwisely). They are fine for prototyping but will leave your app brittle, as it will panic whenever things do not go the expected way. So learn early on to handle all pathways in a way that works well for your users.
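A minimal sketch of the difference (the function names here are made up for illustration, not from any real codebase): the brittle version panics the moment input is malformed, while returning a `Result` lets the caller decide how to recover.

```rust
use std::num::ParseIntError;

// Brittle: panics on any input that isn't a valid u16,
// taking the whole app down with it.
fn parse_port_brittle(s: &str) -> u16 {
    s.trim().parse().unwrap()
}

// Robust: the `?` operator propagates the error to the caller,
// which can then report it to the user or fall back to a default.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let port: u16 = s.trim().parse()?;
    Ok(port)
}

fn main() {
    // Note the typo: a letter O instead of a zero.
    match parse_port("8O80") {
        Ok(p) => println!("listening on port {p}"),
        Err(e) => eprintln!("invalid port: {e}"),
    }
    let _ = parse_port_brittle("8080"); // fine here, but one bad input away from a panic
}
```

The `?`-based version costs one extra line and a `Result` in the signature, and in exchange the "unexpected input" path becomes a value you handle instead of a crash.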
Also, if you’re looking for a simpler experience (like Rust but less verbose), Swift is phenomenal. It does not have a GC; it uses ARC automatically. I spent months building a layer on top of Rust that removed ownership and borrow considerations, only to realize Swift already does it, and really well! Swift also has a stable ABI, making it great for writing apps with compiled dynamic components such as plugins and extensions. Its cross-platform story is much better today, and you can expect similar performance on all OSes.
For me personally, this relegates Rust to single-threaded tasks, as I would happily take the 20% performance hit with Swift for the flexibility I get when multithreading. My threads can share mutable references without fighting the borrow checker, because that’s just a bad use case for Rust (one it was not designed for). Part of my work is performance-critical, and that often becomes a bottleneck for me. But it shouldn’t be a problem for anyone else using Arc<RwLock<…>>. Anyway, they’re both great languages, and for a CLI tool or utility you can’t go wrong with either.
Are you guys affiliated with Meta’s ex-CTO in any way? I remember he famously implied that LLMs are overhyped. The demos are very impressive. Does this use an attention-based mechanism too? Just trying to understand (as a layman) how these models handle context and whether long contexts lead to weaker results. That could be catastrophic in the real world!
I think in the long run, we may need something like a batch job that compresses context from the last N conversations (in LLMs) and applies that as an update to weights. A looser form of delayed automated reinforcement learning.
Or make something like LoRA mainstream for everyone (probably scales better for general use models shared by everyone).
If it’s any consolation, it was able to one-shot a UI & data sync race condition that even Opus 4.6 struggled to fix (across 3 attempts).
So far I like how it’s less verbose than its predecessor. Seems to get to the point quicker too.
While it gives me hope, I am going to play it by ear. Otherwise it’s going to be: Gemini for world knowledge/general intelligence/R&D, and Opus/Sonnet 4.6 to finish it off.
UPDATE: I may have spoken too soon.
> Fixing Truncated Array Syncing Bug
> I traced the missing array items to a typo I made earlier!
> When fixing the GC cast crash, I accidentally deleted the assignment..
> ..effectively truncating the entire array behind it.
These errors should not be happening! They are not the result of missing knowledge or a bad hunch. They are coming from an incorrect find/replace, which makes them completely avoidable!
For me it's Opus 4.6 for researching code/digging through repos, gpt 5.3 codex for writing code, gemini for single hardcore science/math algorithms and grok for things the others refuse to answer or skirt around (e.g. some security/exploitability related queries). Get yourself one of those wrappers that support all models and forget thinking about who has the best model. The question is who has the best model for your problem. And there's usually a correct answer, even if it changes regularly.
Interesting, I've had similar issues. It seems to be very clumsy when using its internal tooling. I've seen diffs where it accidentally garbled significant amounts of code, which it then had to go in and manually fix. It's also introduced bugs into features that it wasn't supposed to be touching, and when I asked why it was making changes to the other code, it answered that it had failed to copy-paste some large blocks of code correctly.
Yeah, I wholeheartedly agree with this. Even Codex does this sometimes, although it has been consistently much better than the others at following instructions.
The problem is again that you can’t ever fully trust an agent did exactly what you asked for and in the exact manner that you had hoped.
It works just like dealing with a human companion. Trust takes time to build. Over time you learn the other individual’s weaknesses and support them there.
What makes it a bit challenging right now is the pace of innovation. By the time we get used to a model’s personality, a new update comes out that alters it in unknown ways. Now you’re back to square one.
I’ve been experimenting with asking one frontier model to check on another’s work. That’s proven to be better than doing nothing. Usually they’ll have some genuinely useful feedback.