Anything that reduces the footprint of LLMs is welcome, however...
- making LLM compute cheaper in datacentres won't mean lower total power/cooling/space/water consumption. Like adding lanes to a road and getting more traffic, it will just mean more usage as it gets cheaper (and a short-term bump in margins for the LLM owners)
- these are still highly dedicated chips that are always going to be bound up in the mega-scale datacentre deployments
- what happens if there is a paradigm shift in the exact compute architecture? Loads of junk servers and no applications able to make use of such a glut
- these do nothing to push LLMs out of the datacentre and into non-corporate hands, which is the only spot where we might see fewer privacy concerns, less corporate control, etc.
If we're stuck with the current compute/corporate paradigms, at least having alternatives nibbling at the unhealthy dominance of nVidia and the cloud giants is some small benefit.