Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Image patches are projected directly into an embedding that goes into the decoder Transformer. The same thing could be done for audio.


Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: