Google developing inference AI chips to rival Nvidia

Google $GOOGL is developing new chips dedicated to AI inference in partnership with Marvell Technology, positioning Alphabet to more directly compete with Nvidia $NVDA in a semiconductor category driven by surging demand for AI software, according to Bloomberg.

After a model is trained, inference is the stage where it actually does its job: fielding queries and producing outputs. Google plans to announce a new generation of its tensor processing units, known as TPUs, at the Google Cloud Next conference in Las Vegas this week, with inference-focused chips expected to follow.
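For readers outside the field, the distinction is mechanical: training computes gradients and updates model weights, while inference is a forward pass only. A minimal PyTorch sketch (purely illustrative; the toy model and data here are placeholders, not anything from Google's stack) shows why the two workloads stress hardware differently:

```python
import torch
import torch.nn as nn

# Toy model standing in for a large neural network.
model = nn.Linear(16, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(8, 16)       # a batch of inputs
target = torch.randn(8, 4)   # training labels

# Training step: forward pass, backpropagation, weight update.
optimizer.zero_grad()
loss = loss_fn(model(x), target)
loss.backward()              # gradient computation -- training-only work
optimizer.step()

# Inference step: forward pass only, no gradients and no weight updates.
model.eval()
with torch.no_grad():
    prediction = model(x)    # the "fielding queries" stage
```

Because inference skips gradient computation and optimizer state, it favors chips tuned for low-latency, high-throughput forward passes rather than the heavy memory and interconnect demands of training.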

“The battleground is shifting towards inference,” Gartner analyst Chirag Dekate told Bloomberg. Google Chief Scientist Jeff Dean said in an interview that as AI demand grows, “it now becomes sensible to specialize chips more for training or more for inference workloads.”

Amin Vahdat, who oversees Google's AI infrastructure and chip work, declined to comment on specific inference chip plans but said more details would likely be shared “in the relatively near future.”

According to Partha Ranganathan, a vice president and engineering fellow at the company, Google weighed the idea of distinct training and inference chips in its early days before ultimately deciding against it. That approach may be changing as the broader AI spending cycle shifts from training toward inference workloads.

Entering the inference market, Google can draw on advantages built over years of in-house chip development, substantial revenue from its search business, and an unusually close relationship with the AI models its hardware is meant to run. No other leading AI developer manufactures its own chips at comparable volume, a structural edge that tightens the loop between the people building Google's models and those designing the silicon they run on.

Demand for Google's TPUs has grown substantially. Meta $META struck a multibillion-dollar agreement to procure TPUs via Google Cloud, and Santosh Janardhan, who leads Meta's infrastructure operations, said that initial results point to possible performance gains on inference tasks. Anthropic, which expanded its TPU access to as many as 1 million chips, also signed a separate deal with Broadcom $AVGO, Google's TPU manufacturing partner, for chips enabling roughly 3.5 gigawatts of computing power starting in 2027.

A person familiar with the matter told Bloomberg that Google has been piloting an arrangement under which enterprise customers, Anthropic among them, could deploy TPU hardware on-premises instead of relying solely on Google's cloud infrastructure. The company has also opened TPU access to outside tools such as PyTorch, moving away from a purely proprietary software environment.
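As a concrete illustration of that interoperability, the snippet below uses the open-source torch_xla bridge, which is how PyTorch programs target TPUs today (a minimal sketch assuming a TPU-attached environment with torch_xla installed; the model is a toy placeholder):

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm  # PyTorch/XLA bridge for TPUs

device = xm.xla_device()           # resolves to a TPU core when one is attached
model = nn.Linear(16, 4).to(device)

x = torch.randn(8, 16).to(device)
with torch.no_grad():
    y = model(x)                   # ops are staged as an XLA graph
xm.mark_step()                     # compile and execute the graph on the TPU

print(y.cpu())
```

XLA traces the PyTorch operations into a graph and compiles it for the accelerator, which is what lets a framework Google does not control run on its in-house silicon.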

Nvidia is still the leader in AI chips, especially for training. Nvidia CEO Jensen Huang said at the company’s GTC conference earlier this year that its chips can handle applications “you can’t do with TPUs.” Google uses both TPUs and Nvidia GPUs for its own AI projects.

Supply constraints may complicate Google's ambitions. An unnamed startup executive described chip scarcity as a real obstacle, telling Bloomberg the company had little access to TPUs. Google DeepMind CEO Demis Hassabis, for his part, confirmed that available supply is being steered toward leading AI organizations, the cohort he described as “the more elite teams.”
