When I personally use ChatGPT and friends, I never see any slowdowns, which suggests their servers handle the load just fine. So why are these companies spending so much building new capacity if the current capacity is enough?
Frontier labs' flagship models are ~2T parameters at the moment, but they intend to ship 10T-parameter models like Claude Mythos, which would require substantial datacenter expansion. Back of the envelope: at roughly 2 bytes per parameter, a 10T model is ~20 TB of weights alone, i.e. hundreds of 80 GB accelerators per inference replica before you even count KV cache. Same thing for training.
I assume it's to save on resources: even if your algorithm isn't much more taxing on silicon, maybe the designers at Intel and AMD just didn't think optimizing split locks was worth it.
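
For anyone curious what a split lock actually is: it's a locked read-modify-write whose operand straddles two cache lines, so the core can't just lock one line and falls back to locking the whole bus, which stalls every other core. A minimal C sketch (the buffer layout and offset are my own, purely for illustration; the misaligned access is technically undefined behavior in C, but on x86 with GCC it demonstrates the hardware path):

    #include <stdio.h>

    _Alignas(64) static char buf[128];  /* buffer starts on a cache-line boundary */

    int main(void) {
        /* Place a 4-byte int at offset 62 so it spans bytes 62-65,
           crossing the 64-byte cache-line boundary. */
        int *p = (int *)(buf + 62);

        /* On x86-64 GCC this compiles to a `lock xadd`. Because the
           operand crosses a cache line, the CPU can't take an ordinary
           per-line lock and asserts a bus-wide lock instead -- the slow,
           unoptimized path. On Linux kernels booted with
           split_lock_detect this may warn or even kill the process. */
        __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);

        printf("value after split-locked add: %d\n", *p);
        return 0;
    }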
There is a limit on how much Copilot can do in one request. It's pretty generous, but after some time VS Code will say "this request is taking very long, do you want to continue?", and continuing counts as a separate request.