Quackmaster Industries LLC · quackmaster.io · Reference Architecture · Network TCO

The Switch Tax Is Paid In Optics

A monolithic 3-tier fat tree buys the lowest switch count at every scale, because a folded Clos is mathematically minimal in switches. It then pays that saving back many times over in super-spine transceivers. The Fordist model replicates self-contained Dragonfly+ pods linked by passive, co-located DAC: more switches, yet far fewer optics, zero east-west storage contention, and lower latency.

SWITCH · NVIDIA Q3400-RA (72×1.6T OSFP)

GPU · NVIDIA B200 | GB200 | GB300 | VR200 (72/RACK)

PRICING · FS / NADDOD / FiberMall, May 2026

Total Fabric Cost of Ownership

At 32,256 GPUs · 8 pods

[Chart: total fabric cost, vertical axis $100M to $400M, at a selectable scale. Bars: Monolithic 3-Tier Fat Tree (single fault domain) versus Fordist DF+ Replicated Pods (N isolated pods). Stacked components: Switches, 1.6T Transceivers, Global DAC (0 optics), Breakout ACC+DAC, 800G VAST Optics, Fiber Patch.]

Live Assumptions

Adjust and the chart recomputes. Defaults are street estimates.

Q3400-RA switch: $120K (range $60K to $200K)

1.6T XDR transceiver: $3,600 (range $1.5K to $6K)

1.6T passive DAC: $400 (range $150 to $900)

800G VAST optic: $2,000 (range $800 to $4K)

MPO fiber patch cord: $120 (range $40 to $400)

1.6T ACC active cable: $1,100 (range $400 to $2.5K)

Intra-rack active (ACC) share: 30% (range 0%, all passive, to 100%)
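
The assumptions above feed a per-line-item cost roll-up. The sketch below shows one plausible way to compute it. The unit prices are the defaults listed above; the quantity keys, the function names, and the rule that the ACC share blends the price of intra-rack breakout cables are assumptions made for illustration, not the page's actual calculator logic.

```python
# Default unit prices in USD, copied from the Live Assumptions list.
PRICES = {
    "switch": 120_000,          # Q3400-RA switch
    "xdr_transceiver": 3_600,   # 1.6T XDR transceiver
    "passive_dac": 400,         # 1.6T passive DAC
    "vast_optic": 2_000,        # 800G VAST optic
    "fiber_patch": 120,         # MPO fiber patch cord
    "acc_cable": 1_100,         # 1.6T ACC active cable
}
ACC_SHARE = 0.30  # intra-rack active (ACC) share


def blended_intra_rack_price(acc_share=ACC_SHARE, prices=PRICES):
    """Average price of one intra-rack cable.

    Assumption: a fraction `acc_share` of intra-rack cables are ACC and
    the remainder are passive DAC.
    """
    return acc_share * prices["acc_cable"] + (1 - acc_share) * prices["passive_dac"]


def fabric_cost(qty, acc_share=ACC_SHARE, prices=PRICES):
    """Return per-line-item cost plus a total for one design.

    `qty` maps hypothetical line-item names to counts; missing items
    count as zero.
    """
    unit = {
        "switch": prices["switch"],
        "xdr_transceiver": prices["xdr_transceiver"],
        "global_dac": prices["passive_dac"],
        "intra_rack_cable": blended_intra_rack_price(acc_share, prices),
        "vast_optic": prices["vast_optic"],
        "fiber_patch": prices["fiber_patch"],
    }
    lines = {item: qty.get(item, 0) * price for item, price in unit.items()}
    lines["total"] = sum(lines.values())
    return lines


if __name__ == "__main__":
    # Placeholder quantities only; they are not figures from this page.
    example = {"switch": 2, "xdr_transceiver": 10, "intra_rack_cable": 4}
    for item, cost in fabric_cost(example).items():
        print(f"{item:18s} ${cost:,.0f}")
```

Running the same function once per design with that design's quantities gives the two columns the line-item table compares.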

Line item	Fat Tree qty	DF+ qty	Fat Tree $	Fordist $	Winner

The Cost the Switch Count Hides

A monolithic fabric has no good place to put storage.

VAST CNodes must terminate somewhere, and the comparison above already equalizes the storage optics on both sides. But where the monolith lands that storage is a forced choice between two penalties, neither of which the Fordist design pays.

Monolith · Option A

Leaf-local on the compute fabric

CNodes hang off compute leaves. Synchronous, bursty checkpoint and dataset-staging I/O then crosses the same super-spine core that carries all-reduce east-west collectives. In a non-blocking 3-level fat tree with 72 GPUs per rack and Q3400-RA switches, this option is not really available in the first place, because all leaf-local ports are already full.
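
The port-exhaustion claim can be checked with simple arithmetic. The sketch below is illustrative and rests on assumptions not stated on this page: each 1.6T OSFP cage is broken out into two 800G ports, each GPU takes one 800G leaf port, each rack is served by one leaf, and "non-blocking" means half of the leaf's ports face down and half face up. The function names are hypothetical.

```python
def leaf_port_budget(osfp_cages=72, ports_per_cage=2, gpus_per_rack=72):
    """Count free downlink ports on a non-blocking leaf.

    Assumptions: 2 x 800G breakout per 1.6T cage, one 800G port per GPU,
    one leaf per rack, and a 1:1 split between downlinks and uplinks.
    """
    total_ports = osfp_cages * ports_per_cage
    downlinks = total_ports // 2          # non-blocking: half face the rack
    uplinks = total_ports - downlinks     # the other half face the spine
    return {
        "total_ports": total_ports,
        "downlinks": downlinks,
        "uplinks": uplinks,
        "free_downlinks": downlinks - gpus_per_rack,
    }


def pod_shape(total_gpus=32_256, pods=8, gpus_per_rack=72):
    """Split the headline scale into GPUs and racks per pod."""
    gpus_per_pod = total_gpus // pods
    return {"gpus_per_pod": gpus_per_pod, "racks_per_pod": gpus_per_pod // gpus_per_rack}


if __name__ == "__main__":
    print(leaf_port_budget())
    print(pod_shape())
```

Under these assumptions the leaf has no downlink left for a CNode unless a GPU port or the non-blocking ratio is given up.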

→ tail-latency spikes, depressed MFU, idle GPU-hours

Monolith · Option B

Separate Ethernet storage fabric

Stand up an entirely parallel network: its own VXLAN overlay, routing convention, and lossless PFC/ECN tuning, plus a second control plane that Slurm and Kubernetes must reconcile against the compute fabric.

→ another fabric, another failure domain, huge capex, permanent opex tax

Dragonfly+ · Resolved

DF3 leafless storage group

Storage terminates inside the same InfiniBand fabric, on a separate DF+ group that also serves as the UGAL pressure-relief path. It uses otherwise idle bandwidth and does not contend with east-west training collectives.

→ one fabric, one convention, no east-west storm
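
For readers unfamiliar with UGAL (Universal Globally-Adaptive Load-balanced routing), the sketch below shows the generic textbook decision rule: take the minimal path unless its estimated delay, queue depth times hop count, exceeds that of a non-minimal detour. It is a simplified illustration with hypothetical names and values, not a description of how the switch firmware or this design implements adaptive routing.

```python
def ugal_choose(q_min, hops_min, q_nonmin, hops_nonmin, bias=0):
    """Pick a path with the generic UGAL comparison.

    q_min / q_nonmin: local queue occupancy toward each candidate path.
    hops_min / hops_nonmin: hop counts of the two candidate paths.
    bias: optional threshold that favors the minimal path (an assumption;
    implementations differ in whether and how they apply one).
    """
    if q_min * hops_min <= q_nonmin * hops_nonmin + bias:
        return "minimal"
    return "non-minimal"


if __name__ == "__main__":
    # Lightly loaded minimal path: stay minimal.
    print(ugal_choose(q_min=2, hops_min=3, q_nonmin=5, hops_nonmin=5))
    # Congested minimal path: detour through an intermediate group.
    print(ugal_choose(q_min=40, hops_min=3, q_nonmin=5, hops_nonmin=5))
```

In the design described above, the intermediate group on the detour would be the storage group, which is why it is called a pressure-relief path.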