I like to dunk on Meta as much as the next guy, but I think this makes sense: deterministic verification like this is not, and should never be, the LLM’s job. The tools it has access to should enforce the permissions layer, ensuring that the LLM can never perform actions the user themselves should not be allowed to perform. In this case, the tool failed to do that.
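To make that concrete, here's a rough sketch of what I mean by the tool enforcing the permissions layer. Everything in it is made up for illustration (`Session`, `reset_password_tool`, `PermissionDenied`, the account IDs); I have no idea what the actual system looks like. The point is only that the check is plain code keyed on the authenticated session, so nothing the LLM is told in chat can change the outcome.

```python
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Raised when the authenticated user may not act on the target account."""


@dataclass(frozen=True)
class Session:
    # Hypothetical: populated by the login system, never by the model or chat text.
    user_id: str
    owned_accounts: frozenset


def reset_password_tool(session: Session, account_id: str) -> str:
    """Hypothetical tool exposed to the agent; account_id is chosen by the LLM."""
    # Deterministic check: no prompt involved, so there is nothing to persuade.
    if account_id not in session.owned_accounts:
        raise PermissionDenied(f"{session.user_id} may not reset {account_id}")
    return f"reset link sent for {account_id}"
```

With `Session("alice", frozenset({"acct-1"}))`, a call for `"acct-1"` goes through and a call for `"acct-2"` raises `PermissionDenied`, no matter how convincing the conversation that led the agent to try it.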
> But when humans handled it, this was not as much of a problem.
In fact it's arguably a feature. The ability of support staff to short-circuit nitpicky rules when there's an obvious external validation happening (e.g. you're on the phone with a user who's presenting ID in real time and you can correlate it with previous use of the account) makes for better data quality and happier customers.
Obviously, yes, you can then social-engineer an authentication breach. But that was very difficult, because people are "common-sense careful" in a way we haven't been able to tease out of AI yet.
Maybe that’s because I work with agentic AI in my day job, but this seems utterly obvious to me: no reasonable person would ever claim that LLMs are better at keeping secrets or enforcing rules than human employees.
This notice is not about comparing humans and LLMs. It seems that the system was designed in the only reasonable way: with a deterministic permissions layer separate from the agent. But that layer failed to work properly.
So the notice is comparing the difference between how the system was supposed to work and how it actually worked in reality. Normal post-mortem stuff.
It helps set expectations for the fix. "The bug was in an external system that has now been fixed" means it's probably fine going forward. "The LLM got tricked but we are gonna train it super hard not to do that again" means it will break again and again as people find new angles to convince it.