This presumes that the value was created by the authors and not the people who found a way to use the structure in the training set to create intelligence. Like crediting Wi-Fi router owners with creating the Wi-Fi Positioning System instead of the people who realized they could wardrive around to create maps and extract a new kind of value that would never have existed without them. Or Egyptians for deciphering hieroglyphics instead of the Frenchman who realized the Rosetta Stone they were using to hold up a wall could actually be used to do much more.
My thinking here comes from the paper "From Entropy to Epiplexity"[1], which partly discusses why you can train on synthetic data: it's the structure of the data that enables learning, not just the amount of "information". Authors of images and videos may have worked just as hard as authors of text training sets, but they didn't contribute to AI as much because there just wasn't the same kind of structure to discover there. It's the people who found the usable structure, not the people who accidentally generated it, who created the value.
> This presumes that the value was created by the authors and not the people who found a way to use the structure in the training set to create intelligence.
This presumes it's binary.
> Or Egyptians for deciphering hieroglyphics instead of the Frenchman who realized the Rosetta Stone they were using to hold up a wall could actually be used to do much more.
Yes, I think a lot more than one person deserves credit for years of painstaking research.
> It's the people who found the usable structure, not the people who accidentally generated it, who created the value.
You're naively assuming the structure is accidental.