That article shows that it takes about 50x as long to train GPT-3 with Intel's offering vs NVIDIA's. At least in the current environment, if you are training LLMs, I think almost no amount of cost savings can justify that.
That 50x holds only if you can afford one thousand NVIDIA H100s.
There can be no more than a handful of companies in the entire world that could afford such a price (tens of millions of dollars).
Compared with a still extremely expensive cluster of 64 NVIDIA H100s, the speed difference shrinks to only two to three times, and paying several times less for the entire training run becomes very attractive.
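Quick back-of-the-envelope in Python (the 3x price ratio and the amortize-over-runs model are my assumptions, not numbers from the article):

```python
# Back-of-the-envelope: cost attributable to one training run scales with
# cluster price * wall-clock time, and wall-clock time scales with 1/speed.
# The 3x price ratio below is an assumption, not a figure from the article.

def relative_run_cost(price_ratio: float, speed_ratio: float) -> float:
    """Per-run cost of cluster A relative to cluster B, where A costs
    `price_ratio` times as much and trains `speed_ratio` times faster."""
    return price_ratio / speed_ratio

PRICE_RATIO = 3.0  # assumed: H100 cluster ~3x the price of a same-size Gaudi2 cluster

# At the ~50x speed gap of the 1000-H100 scale, H100 wins decisively:
print(relative_run_cost(PRICE_RATIO, 50.0))  # 0.06 -> H100 ~17x cheaper per run

# At the ~2-3x gap of a 64-accelerator cluster, it flips:
print(relative_run_cost(PRICE_RATIO, 2.5))   # 1.2 -> Gaudi2 ~17% cheaper per run
```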
The problem is not merely having that much money available. Such a big expense only makes sense for a company where spending that amount would bring in hundreds of millions of dollars of additional revenue.
I doubt that any of the companies that have already spent such amounts have recovered even a small part of their expenses. More likely they are betting on future revenues, but it remains to be seen who will succeed in achieving that.
Kinda; companies of that scale regularly spend more than that on (often random) R&D.
Sure, if there is a plausible ROI, they'd have no issue dropping that much money (actually far more). Revenues for Fortune 500s are in the tens of billions anyway, and it wouldn't be hard to argue that a random AI project could increase revenue by a couple percent, or cut costs by a couple percent, which would more than provide that ROI.
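To spell out that arithmetic (all three numbers below are made-up assumptions for illustration, not company data):

```python
# Toy ROI check for the "couple percent of tens of billions" argument.
# Every figure here is an assumption chosen for illustration.

revenue = 30e9        # assumed annual revenue of a Fortune 500
uplift = 0.02         # assumed "couple percent" revenue increase
project_cost = 50e6   # assumed all-in AI project cost (tens of millions)

extra_revenue = revenue * uplift
roi = (extra_revenue - project_cost) / project_cost

print(f"extra revenue: ${extra_revenue / 1e6:.0f}M")  # extra revenue: $600M
print(f"ROI: {roi:.1f}x")                             # ROI: 11.0x
```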
Their biggest issue is usually having anyone in leadership with enough of a clue to even propose something plausible, let alone put together a team to give it a serious go.
The funny thing is that this fact has been shown inadvertently by NVIDIA:
https://www.servethehome.com/nvidia-shows-intel-gaudi2-is-4x...