
Right, but no separate image encoder + half the size could be very helpful for many applications.


The 7B LLaVA model is smaller, even counting the image encoder (CLIP-L).



