
Scaling AI Up, Out, and Across: A Decision Framework for Networking Transitions Reshaping AI Infrastructure Economics
How leading AI operators move from raw GPU spend to model-advancing utilization through next-generation network design
As AI infrastructure spending crosses $700 billion in 2026, operator success is no longer defined by GPU volume but by the ability to move data efficiently across scale-up, scale-out, and scale-across fabrics, each with different physics, economics, and failure modes.
This 60-minute virtual panel examined how operators raise effective GPU utilization by treating network observability as a first-class concern, distinguishing fault tolerance from failure recovery, and designing fabrics around the divergent demands of pre-training, fine-tuning, and inference workloads.
The Panelists

Suresh Vasudevan
CEO, Clockwork.io
Previously CEO of Sysdig and Nimble Storage (acquired by HPE) — both category-defining infrastructure companies. Now leads Clockwork's software-driven AI fabric platform.


Balaji Prabhakar
Co-Founder, Clockwork.io; Professor of CS and EE, Stanford University
VMware Founders Professor at Stanford. Co-invented QCN — the congestion control protocol underpinning today's RoCEv2 AI fabrics — and the Huygens clock synchronization system that forms the foundation of Clockwork's FleetIQ platform.
