Building LLM Applications for Production (huyenchip.com)
3 points by sebg on April 13, 2023 | hide | past | favorite | 1 comment


This is a very insightful and useful article. Quoting the section on cost and latency:

"The impossibility of cost + latency analysis for LLMs: The LLM application world is moving so fast that any cost + latency analysis is bound to become outdated quickly. Matt Ross, a senior manager of applied research at Scribd, told me that the estimated API cost for his use cases has gone down two orders of magnitude over the last 6 months. Latency has significantly decreased as well. Similarly, many teams have told me they feel like they have to redo the feasibility estimation and the buy (using paid APIs) vs. build (using open source models) decision every week."




