Hacker News

Even if you pin the seed and spin up your own local LLM, a change to continuous batching at the vLLM level, or even just a different CUDA driver version, will break bitwise float reproducibility. Bitwise reproducibility in ML generation is largely a myth; in prod we only work with the final output anyway.
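A minimal sketch of the underlying cause (plain Python, not vLLM itself): floating-point addition is not associative, so any change in reduction order, which is exactly what a different batch composition or CUDA kernel introduces, can change the bits of the result even with the seed pinned.

```python
# Floating-point addition is not associative: summing the same values in a
# different order can produce a different result. This is why batch-size or
# kernel changes break bitwise reproducibility regardless of the RNG seed.
vals = [1e16, 1.0, -1e16, 1.0]

# Left-to-right: 1e16 + 1.0 rounds back to 1e16 (the 1.0 is below one ulp),
# then the -1e16 cancels, leaving 1.0.
s1 = sum(vals)

# Reordered: the large terms cancel first, so both 1.0s survive.
s2 = (vals[0] + vals[2]) + vals[1] + vals[3]

print(s1)  # 1.0
print(s2)  # 2.0
print(s1 == s2)  # False
```

Same inputs, same "math", different bits, and no seed anywhere in sight. On a GPU the reduction order inside a matmul or attention kernel depends on tile sizes and thread scheduling, so the effect shows up in every layer, not just in a toy sum.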



