
Sounds like you're describing mixture of experts (MoE), the architecture used in Mistral's Mixtral series of models and reportedly in OpenAI's GPT-4.


Not really. MoE is trained all at once, and the "experts" don't have pre-defined specializations. They end up being more like a "punctuation expert" and a "pronoun expert" than a "math expert" and a "French expert".
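To make that concrete, here is a rough sketch of top-k gating for a single token, in plain Python. Everything in it (the names `moe_layer`, `gate_weights`, the toy experts, `top_k=2`) is made up for illustration and is not taken from any particular model. The point is that routing happens per token from a learned gate score, not per topic, which is why the experts don't line up with human-meaningful subjects.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a short list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def moe_layer(token, gate_weights, experts, top_k=2):
    """Send one token vector through its top_k highest-scoring experts."""
    # One gate score per expert. In a real model the gate is trained
    # jointly with the experts; here the weights are fixed toy values.
    scores = [dot(w, token) for w in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Renormalize over the selected experts only.
    weights = softmax([scores[i] for i in top])
    out = [0.0] * len(token)
    for w, i in zip(weights, top):
        y = experts[i](token)
        out = [o + w * v for o, v in zip(out, y)]
    return out, top

# Hypothetical toy experts: each is just a function on the token vector.
def double(t):
    return [2 * x for x in t]

def add_one(t):
    return [x + 1 for x in t]

def zero(t):
    return [0.0 for _ in t]

gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
experts = [double, add_one, zero]
out, chosen = moe_layer([1.0, 0.5], gate_weights, experts)
```

Nothing here tells expert 0 to be "the math one". Whatever split emerges comes from training the gate and the experts together.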


Haven't tried any yet, but it sounds like parent may be interested in an LLM router. https://github.com/lm-sys/RouteLLM
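For contrast with MoE, a router sits in front of separate, independently trained models and picks one per request. A toy sketch of the idea follows. To be clear, this is not RouteLLM's API: the model names, the keyword sets and the `route` function are all hypothetical, and a real router would presumably use a learned classifier instead of keyword matching.

```python
# Hypothetical specialist models and the keywords that select them.
SPECIALISTS = {
    "math-model": {"integral", "prove", "equation", "solve"},
    "french-model": {"bonjour", "traduire", "french", "français"},
}

def route(prompt, specialists, default="general-model"):
    """Pick the model whose keyword set overlaps the prompt the most."""
    words = set(prompt.lower().split())
    best, best_hits = default, 0
    for model, keywords in specialists.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = model, hits
    # Falls back to the default when no specialist matches at all.
    return best

choice = route("Solve this equation for x", SPECIALISTS)
```

The difference from MoE is that the decision is made once per request, outside the models, and each model can be trained and swapped independently.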



