Earlier today Google announced a new large language model that will compete with ChatGPT, DeepSeek and others. It highlights a trend of huge efficiency gains, with newer models needing fewer computer chips to do the same work as prior models.
It’s called Gemma 3 and is really a collection of models of various sizes that can be run locally or on a single Nvidia H100 GPU.
“Gemma 3 delivers state-of-the-art performance for its size,
outperforming Llama-405B, DeepSeek-V3 and o3-mini in preliminary human
preference evaluations on LMArena’s leaderboard. This helps you to
create engaging user experiences that can fit on a single GPU or TPU
host,” the release says.
The nightmare scenario for Nvidia and other chipmakers is that throwing huge amounts of compute at LLMs yields only marginal improvements, something that seems to be the case with GPT-4.5. Meanwhile, much cheaper and smaller models are proving capable of doing most of the things that users want.
That could mean a cliff in demand for GPUs and a decline in Nvidia's revenue. Nvidia shares are up 4.3% today, down from a gain of 7% earlier in the session, and have tumbled to $113 from a high of $153 in early January.