Scaling AI Up, Out, and Across: A Decision Framework for Networking Transitions Reshaping AI Infrastructure Economics

How leading AI operators move from raw GPU spend to model-advancing utilization through next-generation network design

As AI infrastructure spend crosses $700 billion in 2026, operator success is defined less by GPU volume than by the ability to move data efficiently across scale-up, scale-out, and scale-across fabrics, each with its own physics, economics, and failure modes.

This 60-minute virtual panel examined how operators raise effective GPU utilization by treating network observability as a first-class concern, distinguishing fault tolerance from failure recovery, and designing fabrics around the divergent demands of pre-training, fine-tuning, and inference workloads.

Watch On-Demand


The Panelists

Suresh Vasudevan

CEO, Clockwork.io

Previously CEO of Sysdig and Nimble Storage (acquired by HPE) — both category-defining infrastructure companies. Now leads Clockwork's software-driven AI fabric platform.

 
 
 

Roy Chua

Founder & Principal Analyst, AvidThink

Lead author of the 2026 Data Center Networking Report (third annual edition). Previously co-founder and Head of Research at SDxCentral. 20+ years in enterprise cloud infrastructure and networking.

Balaji Prabhakar

Co-Founder, Clockwork.io; Professor of CS and EE, Stanford University

VMware Founders Professor at Stanford. Co-invented QCN — the congestion control protocol underpinning today's RoCEv2 AI fabrics — and the Huygens clock synchronization system that forms the foundation of Clockwork's FleetIQ platform.

