Language models transmit behavioural traits through hidden signals in data | Hacker News

		Language models transmit behavioural traits through hidden signals in data (nature.com)
		4 points by armcat 34 days ago | 2 comments

zahra_lahrsson 34 days ago

Related to this: https://www.nature.com/articles/d41586-026-00906-0 (LLMs can subliminally learn malicious behavior through distilling)

pop_mccoy 34 days ago

Explains the high performance of distilled models then (e.g. Chinese ones).
