
Question is why even use these small models?

When you have Google Flash, which is lightning fast and cheap.

My brother implemented it in option-k: https://github.com/zerocorebeta/Option-K

It's near-instant. So why waste time on small models? They're going to cost more than Google Flash.
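
For reference, a rough sketch of what that approach might look like with the google-generativeai Python SDK (the GEMINI_API_KEY environment variable and the prompt wording are my assumptions, not anything from option-k):

    # Sketch: convert HTML to Markdown by calling Gemini Flash.
    import os
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed env var
    model = genai.GenerativeModel("gemini-1.5-flash")

    html = "<article><h1>Hello</h1><p>Some <b>content</b>.</p></article>"
    resp = model.generate_content(
        "Convert this HTML to clean Markdown. Output only the Markdown.\n\n" + html
    )
    print(resp.text)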



Sometimes you don’t want to share all your data with the largest corporations on the planet.


What is Google Flash? Do you mean Gemini Flash? If so, the article points out that general-purpose LLMs are worse than this specialized LLM at Markdown conversion.


In this case it is not, though. As much as I'd like a self-hostable, cheap, and lean model for this specific task, what we have instead is a completely inflexible model that I can't just prompt-tweak to behave better even in not-so-special cases like the one above.

I'm sure there are good examples of specialised LLMs that do work well (like ones trained on specific sciences), but here the model doesn't have enough language comprehension to understand plain English instructions. How do I tweak it without fine-tuning? With a traditional approach to scraping this is trivial, but here it's infeasible for the end user.


Small models often do a much better job when you have a well-defined task.
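
For example, here's a minimal sketch of running a small task-specific model locally with Hugging Face transformers; the model id is a placeholder, not the actual model from the article:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder id -- substitute whatever small HTML-to-Markdown model you use.
    model_id = "your-org/small-html-to-markdown"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    html = "<article><h1>Hello</h1><p>Some <b>content</b>.</p></article>"
    inputs = tokenizer(html, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=512)

    # Decode only the newly generated tokens, i.e. the Markdown output.
    markdown = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print(markdown)

Once the weights are downloaded, nothing leaves your machine, which is the whole point for the privacy and connectivity arguments below.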


Privacy, Cost, Latency, Connectivity.



