If smaller models can find these things, that doesn’t mean mythos is worse than we thought. It means all models are more capable.
Also if pointing models at files and giving them hints is all it takes to make them find all kinds of stuff, well, we can also spray and pray that pretty well with llms can’t we.
It just points to us finding a lot more stuff with only a little bit more sophistication.
Hopefully the growing pains are short and defense wins
No one seems to have actually read the system card all the way through.
The reason they didn't publish it was that it's orders of magnitude more successful at writing exploits vs Opus 4.6, which only managed it something like 2% of the time.
Sure, I think it’s reasonable to tell Anthropic the barn door is already open.
Though, like, I guess I expect that when this comes out, all the opus traffics will move over. It does appear to be much more capable, just jury is out about how much more capable
If smaller models can find these things, that doesn’t mean mythos is worse than we thought. It means all models are more capable.
Also if pointing models at files and giving them hints is all it takes to make them find all kinds of stuff, well, we can also spray and pray that pretty well with llms can’t we.
It just points to us finding a lot more stuff with only a little bit more sophistication.
Hopefully the growing pains are short and defense wins