Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Q, K and V are a way of filtering the relevant aspects for the task at hand from the token embeddings.

"he was red" - maybe color, maybe angry, the "red" token embedding carries both, but only one aspect is relevant for some particular prompt.

https://ngrok.com/blog/prompt-caching/



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: