I didn’t read the whole thing, but the first part, about Tom Cruise and his mother, seems flawed: of course the LLM could learn the reverse relationship if it had better data in which the reverse relationship occurred often. The point of the reversal-curse argument (as I understand it) is that the model should be able to learn the reverse relationship from entirely different examples [1], but seemingly does not.
LLM training teaches it that ‘Mary’, ‘Lee’, and ‘Pfeiffer’ are words that appear in association with ‘Tom’, ‘Cruise’, and ‘mother’, and probably also with ‘son’ and ‘family’.
But the association between the words ‘Mary’, ‘Lee’, ‘Pfeiffer’ and ‘son’ or ‘mother’ points to other words more strongly than it does to ‘Tom’ or ‘Cruise’. Sure, Tom Cruise is probably in there, but so is John Henry Kelley (Michelle Marie Pfeiffer’s son), and George Washington Custis Lee (whose father was Robert E. Lee and whose mother was called Mary).
A random piece of text containing the words Mary Lee Pfeiffer is, based on its training, just not likely to be about Tom Cruise. There’s nothing anchoring those words particularly to ‘Tom’ or ‘Cruise’.
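The asymmetry can be seen with a toy sketch of my own (not how GPT actually works, just an illustration): a next-word model trained purely on forward-direction sentences completes the forward prompt but has no entry at all for the reversed one. The tiny corpus below is invented for the demo.

```python
from collections import Counter, defaultdict

# Toy corpus: forward-direction facts only (tokens pre-split for simplicity).
corpus = [
    "Tom Cruise 's mother is Mary Lee Pfeiffer",
    "Michelle Pfeiffer 's son is John Henry Kelley",
    "Mary Custis Lee 's son is George Washington Custis Lee",
]

# Count which word follows each prompt prefix seen in training.
completions = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(1, len(words)):
        prefix = " ".join(words[:i])
        completions[prefix][words[i]] += 1

def next_word(prefix):
    counts = completions.get(prefix)
    return counts.most_common(1)[0][0] if counts else None

# Forward works: this exact prefix occurred in training.
print(next_word("Tom Cruise 's mother is"))      # -> Mary
# Reverse fails: this prefix never occurred, so there is nothing to recall.
print(next_word("Mary Lee Pfeiffer 's son is"))  # -> None
```

A real LLM generalizes far beyond exact prefixes, of course, but the underlying point stands: the training signal flows in the direction the text was written.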
It’s possible that the ‘Pfeiffer’ in there confuses it further, too: it’s as if the model vaguely knows there’s a Hollywood connection, but crosses its wires and figures it’s probably Michelle Pfeiffer.
Be honest: if a pub quiz question came up asking “which Hollywood superstar’s mother is Mary Lee Pfeiffer?”, you probably wouldn’t guess Tom Cruise either.
But here’s the thing: once we go beyond training into actual completions, if you ask GPT ‘who is Tom Cruise’s mother?’, it answers correctly; and once it has that ‘A is B’ in context, you can ask ‘who is Mary Lee Pfeiffer’s son?’ and of course it completes ‘B is A’.
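A rough way to see why the in-context case is easy (again a toy sketch of mine, not GPT’s actual mechanism): once the forward fact sits in the context window, answering the reverse question is closer to pattern matching over the context than to recall from training-time associations.

```python
import re

# The forward fact is now present in the prompt context itself.
context = "Q: Who is Tom Cruise's mother? A: Tom Cruise's mother is Mary Lee Pfeiffer."

def reverse_lookup(context, parent_name):
    # Find "<child>'s mother is <parent_name>" in the context and return the child.
    pattern = r"([A-Z][\w.]*(?: [A-Z][\w.]*)*)'s mother is " + re.escape(parent_name)
    match = re.search(pattern, context)
    return match.group(1) if match else None

print(reverse_lookup(context, "Mary Lee Pfeiffer"))  # -> Tom Cruise
```

Attention over the context can exploit the stated fact in either direction, which is exactly the asymmetry between parametric recall and in-context reasoning the comment is pointing at.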
So I definitely agree with the piece here - there’s no ‘reversal curse’, there’s just asymmetric information relevance.
1. That is, the LLM should be able to learn that “A is the son of B, who is female” implies that “B is the mother of A”, regardless of who A and B are. It should then be able to apply this pattern to A = “Tom Cruise” and B = “Mary Lee Pfeiffer” and deduce “Mary Lee Pfeiffer is the mother of Tom Cruise” without even a single example.