Quackmaster Industries LLC · quackmaster.io · Reference Architecture · Network TCO

The Switch Tax Is Paid In Optics

A monolithic 3-tier fat tree buys the lowest switch count at every scale, because a folded Clos is mathematically minimal in switches. It then pays that saving back many times over in super-spine transceivers. The Fordist model replicates self-contained Dragonfly+ pods linked by passive, co-located DAC: more switches, yet far fewer optics, zero east-west storage contention, and lower latency.

SWITCH · NVIDIA Q3400-RA (72×1.6T OSFP)
GPU · NVIDIA B200 | GB200 | GB300 | VR200 (72/RACK)
PRICING · FS / NADDOD / FiberMall, May 2026
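The switch-count side of the argument can be sketched with textbook folded-Clos counting. This is a minimal sketch, not the calculator behind this page: the function name `fat_tree_counts` is hypothetical, the radix of 72 treats each 1.6T OSFP cage as one port, and the assumption of two transceivers per aggregation-to-core link holds only if every one of those links is optical.

```python
from math import ceil


def fat_tree_counts(hosts: int, radix: int) -> dict:
    """Count switches and core links in a non-blocking 3-tier folded Clos.

    Assumes every leaf and aggregation switch splits its ports evenly:
    half facing down, half facing up. Core switches face down only.
    """
    half = radix // 2
    leaf = ceil(hosts / half)          # each leaf serves radix/2 hosts
    agg = leaf                         # balanced build: as many aggs as leaves
    core_links = agg * half            # every agg uplink lands on a core port
    core = ceil(core_links / radix)    # a core switch uses all ports downward
    return {
        "leaf": leaf,
        "agg": agg,
        "core": core,
        "switches": leaf + agg + core,
        "core_links": core_links,
        # Assumption: two pluggable optics per agg-to-core link.
        "core_transceivers": 2 * core_links,
    }


if __name__ == "__main__":
    # 32,256 GPUs is the scale used on this page; radix 72 is an assumption.
    for key, value in fat_tree_counts(32_256, 72).items():
        print(f"{key:>18}: {value:,}")
```

The point of the sketch is the shape of the result: the core tier is the smallest in switches, but the link count into it grows with the full host count, which is where the optics bill accumulates.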

Total Fabric Cost of Ownership

[Chart: total fabric cost at 32,256 GPUs · 8 pods. Y-axis: $100M to $400M. Bars: Monolithic 3-Tier Fat Tree (single fault domain) vs. Fordist DF+ Replicated Pods (N isolated pods). Stacked line items: Switches, 1.6T Transceivers, Global DAC (0 optics), Breakout ACC+DAC, 800G VAST Optics, Fiber Patch.]

Live Assumptions

Adjust and the chart recomputes. Defaults are street estimates.

$60K to $200K
$1.5K to $6K
$150 to $900
$800 to $4K
$40 to $400
$400 to $2.5K
0% (all passive) to 100%
Line item | Fat Tree qty | DF+ qty | Fat Tree $ | Fordist $ | Winner
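A line-item table with these columns can be recomputed from quantities and unit prices alone. The sketch below shows one way to do that; the `LineItem` type and the `compare` and `totals` helpers are hypothetical names, and the quantities and prices in the usage section are placeholders chosen for illustration, not figures from this page.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    name: str
    unit_price: float   # one price applied to both designs
    fat_tree_qty: int
    df_qty: int


def compare(items):
    """Return rows of (name, ft_qty, df_qty, ft_cost, df_cost, winner)."""
    rows = []
    for it in items:
        ft_cost = it.unit_price * it.fat_tree_qty
        df_cost = it.unit_price * it.df_qty
        if ft_cost < df_cost:
            winner = "Fat Tree"
        elif df_cost < ft_cost:
            winner = "Fordist"
        else:
            winner = "Tie"
        rows.append((it.name, it.fat_tree_qty, it.df_qty,
                     ft_cost, df_cost, winner))
    return rows


def totals(rows):
    """Sum the two cost columns across all line items."""
    return sum(r[3] for r in rows), sum(r[4] for r in rows)


if __name__ == "__main__":
    # Placeholder values only: NOT the quantities or prices behind the chart.
    demo = [
        LineItem("Switches", 100_000, fat_tree_qty=10, df_qty=12),
        LineItem("1.6T Transceivers", 3_000, fat_tree_qty=400, df_qty=100),
    ]
    rows = compare(demo)
    for row in rows:
        print(row)
    print("totals:", totals(rows))
```

Even with placeholder numbers the pattern in the headline shows up: a design can lose the switch line and still win the total when the optics line dominates.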

The Cost the Switch Count Hides

A monolithic fabric has no good place to put storage.

VAST CNodes must terminate somewhere, and the comparison above already equalizes the storage optics on both sides. But where the monolith lands that storage is a forced choice between two penalties, neither of which the Fordist design pays.

Monolith · Option A
Leaf-local on the compute fabric

CNodes hang off compute leaves. Synchronous, bursty checkpoint and dataset-staging I/O then crosses the same super-spine core that carries the east-west all-reduce collectives. In a non-blocking 3-level fat tree built from Q3400-RA switches with 72 GPUs per rack, this is not even possible, because every leaf-local port is already full.

→ tail-latency spikes, depressed MFU, idle GPU-hours
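The port-budget problem in Option A can be checked with simple arithmetic. This is a minimal sketch under stated assumptions: each 1.6T OSFP cage is split into two 800G ports (144 per switch), each GPU takes one 800G leaf port, there is one leaf per rack, and the figure of 8 storage ports is a hypothetical example. The helper names are illustrative.

```python
def free_leaf_downlinks(switch_ports: int, gpus_per_rack: int) -> int:
    """Downlinks left for storage on a non-blocking leaf.

    Non-blocking means half the ports face down and half face up.
    """
    downlinks = switch_ports // 2
    return downlinks - gpus_per_rack


def oversubscription(switch_ports: int, gpus_per_rack: int,
                     storage_ports: int) -> float:
    """Down:up ratio if storage ports are taken from the uplink half."""
    down = gpus_per_rack + storage_ports
    up = switch_ports - down
    return down / up


if __name__ == "__main__":
    ports = 144  # assumption: 72 OSFP cages x 2 ports of 800G each
    print("free downlinks:", free_leaf_downlinks(ports, 72))
    # Hypothetical: squeeze 8 CNode ports onto the same leaf anyway.
    print("oversubscription:", oversubscription(ports, 72, 8))
```

Under these assumptions the leaf has no spare downlinks, so any CNode port has to come out of the uplink half, and the fabric stops being non-blocking.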
Monolith · Option B
Separate Ethernet storage fabric

Stand up an entirely parallel network: its own VXLAN overlay, routing convention, and lossless PFC/ECN tuning, plus a second control plane that Slurm and Kubernetes must reconcile against the compute fabric.

→ another fabric, another failure domain, huge capex, permanent opex tax
Dragonfly+ · Resolved
DF3 leafless storage group

Storage terminates inside the same InfiniBand fabric, on a separate DF+ group that also serves as the UGAL pressure-relief path. It uses otherwise idle bandwidth and does not contend with east-west training collectives.

→ one fabric, one convention, no east-west storm
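The UGAL behaviour referred to above can be illustrated with the textbook decision rule from the Dragonfly literature: compare queue depth times hop count on the minimal path against the same product on a non-minimal (Valiant) path through an intermediate group. This is a sketch of that rule only, not the switch's actual adaptive-routing implementation; the function name, the `bias` parameter, and the sample queue depths are assumptions.

```python
def ugal_choose(q_min: int, hops_min: int,
                q_val: int, hops_val: int, bias: int = 0) -> str:
    """Pick a path using the textbook UGAL comparison.

    q_min / q_val: queue occupancy at the first hop of each candidate path.
    hops_min / hops_val: hop count of the minimal and Valiant paths.
    bias: optional threshold that favours the minimal path.
    """
    if q_min * hops_min <= q_val * hops_val + bias:
        return "minimal"
    return "valiant"


if __name__ == "__main__":
    # Lightly loaded minimal path: stay minimal.
    print(ugal_choose(q_min=1, hops_min=3, q_val=1, hops_val=5))
    # Congested minimal path: divert through an intermediate group.
    print(ugal_choose(q_min=4, hops_min=3, q_val=1, hops_val=5))
```

In this reading, a lightly loaded group becomes the intermediate hop that the second case diverts through, which is the sense in which it relieves pressure.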