Google unveils eighth-generation TPUs at Cloud Next 2026
Google designed the TPU 8t specifically for massive frontier model training tasks
Google has accelerated its custom silicon strategy with the announcement of its eighth-generation Tensor Processing Units (TPUs) at the Google Cloud Next 2026 event.
For the first time in the product line's history, the company has split its TPUs into two distinct chips: the TPU 8t, engineered for massive frontier model training, and the TPU 8i, a "reasoning engine" purpose-built for high-scale inference.
Google Cloud CEO Thomas Kurian described the move as a natural evolution necessitated by power efficiency constraints as AI continues to scale.
Both chips, which follow the seventh-generation "Ironwood" series, are expected to be available for customers later this year.
The TPU 8t is designed to shorten development cycles for the largest AI models, offering nearly three times the compute performance per pod compared to its predecessor.
Meanwhile, the TPU 8i directly addresses the "memory wall" bottleneck by tripling on-chip SRAM to 384 MB, allowing AI agents to reason and act with significantly lower latency.
Both chips are hosted by Google's custom Arm-based Axion CPUs, a pairing the company says reflects a structural shift toward supporting "agent swarms" and continuously running AI applications.
While Google continues to expand its custom offerings, it remains a key partner for Nvidia, announcing plans to provide access to the upcoming Vera Rubin GPU series.
However, the introduction of TPU-compatible tooling for Nvidia-friendly frameworks such as PyTorch suggests an intensifying rivalry.
This announcement comes as the industry shifts its economic focus from training models to running them at scale, a market recently heated by Nvidia’s $20 billion licensing agreement with inference rival Groq.
