---
id: equations-and-bounds
title: "Equations and Bounds"
status: established
source_sections: "Derived from context files and official specifications"
related_topics: [gb10-superchip, memory-and-storage, ai-workloads, connectivity]
key_equations: [flops-fp4, memory-bandwidth, model-memory-estimate, nvlink-c2c-bandwidth, storage-throughput]
key_terms: [tflops, pflop, bandwidth, throughput, fp4, fp8, fp16, fp32]
images: []
examples: [llm-memory-estimation.md]
open_questions:
  - "Sustained vs. peak TFLOPS under real workloads"
  - "Actual memory bandwidth under mixed CPU+GPU access patterns"
---

# Equations and Bounds

Reference for all quantitative specifications, formulas, and validation ranges for the Dell Pro Max GB10.

## 1. Compute Performance

### Peak TFLOPS by Precision

| Precision | Peak TFLOPS | Source   | Notes                      |
|-----------|-------------|----------|----------------------------|
| FP4       | 1,000       | T0 Spec  | Headline figure, 1 PFLOP   |
| FP8       | ~500        | T3 Infer | Typical 2:1 ratio from FP4 |
| FP16      | ~250        | T3 Infer | Typical 4:1 ratio from FP4 |
| FP32      | ~125        | T3 Infer | Typical 8:1 ratio from FP4 |

*Note: FP8/FP16/FP32 values are inferred from typical Blackwell architecture ratios. Actual values not yet independently confirmed.*

### GPU Cores

- **CUDA cores:** 6,144 (T0 Spec)
- **Tensor Cores:** 5th generation (count TBD)

## 2. Memory

### Bandwidth

- **Memory bandwidth:** 273 GB/s (T0 Spec, LPDDR5X at 9,400 MT/s)
- **NVLink-C2C bandwidth:** 600 GB/s bidirectional (T0 Spec, CPU-GPU interconnect)

### Capacity

- **Total unified memory:** 128 GB LPDDR5X (T0 Spec)
- **Usable for models:** ~109-115 GB (T3 Infer, after OS/framework/KV cache overhead)

## 3. Model Memory Estimation

### Formula: Memory Required for Model Weights

```
Memory (GB) = Parameters (billions) × Bytes_per_parameter
```

| Precision | Bytes/Param | Formula        |
|-----------|-------------|----------------|
| FP4       | 0.5         | Params_B × 0.5 |
| FP8/INT8  | 1.0         | Params_B × 1.0 |
| FP16      | 2.0         | Params_B × 2.0 |
| FP32      | 4.0         | Params_B × 4.0 |

### Total Inference Memory (approximate)

```
Total Memory ≈ Model_Weights + KV_Cache + Activation_Memory + Framework_Overhead
```

Rule of thumb: budget **1.2-1.5x** the raw model weight size for total inference memory.

### Maximum Model Sizes (single unit, 128 GB)

| Precision | Max Params (raw) | Max Params (practical, ~110 GB usable) |
|-----------|------------------|----------------------------------------|
| FP4       | 256B             | ~200B                                  |
| FP8/INT8  | 128B             | ~100B                                  |
| FP16      | 64B              | ~55B                                   |
| FP32      | 32B              | ~27B                                   |

## 4. Networking Bounds

| Interface        | Bandwidth           | Direction        |
|------------------|---------------------|------------------|
| NVLink-C2C       | 600 GB/s            | Bidirectional    |
| LPDDR5X memory   | 273 GB/s            | System memory    |
| QSFP (per port)  | 200 Gbps (25 GB/s)  | Network          |
| QSFP (total)     | 400 Gbps (50 GB/s)  | 2 ports combined |
| 10 GbE Ethernet  | 10 Gbps (1.25 GB/s) | Network          |
| USB-C (per port) | 20 Gbps (2.5 GB/s)  | I/O              |

## 5. Power Bounds

| Parameter      | Value    |
|----------------|----------|
| PSU rating     | 280W     |
| System TDP     | ~140W    |
| Power delivery | USB-C PD |

## 6. Physical Bounds

| Parameter | Value        |
|-----------|--------------|
| Volume    | ~1.15 L      |
| Weight    | 1.31 kg      |
| Footprint | 150 × 150 mm |
| Height    | 51 mm        |

## 7. Validation Rules

When checking calculations:

- Model size estimates should not exceed 128 GB (single) or 256 GB (stacked)
- TFLOPS claims must specify precision — reject unqualified "1 PFLOP" statements
- Memory bandwidth (273 GB/s) is the system memory bus, NOT the NVLink-C2C (600 GB/s)
- Network bandwidth (QSFP) is in Gbps, not GB/s — divide by 8 for bytes
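The weight-memory formula and the 1.2-1.5x rule of thumb from section 3 can be sketched in a few lines of Python. Function names and the default 1.3x overhead factor are illustrative choices, not from any official tool:

```python
# Bytes per parameter by precision, matching the table in section 3.
BYTES_PER_PARAM = {"fp4": 0.5, "fp8": 1.0, "int8": 1.0, "fp16": 2.0, "fp32": 4.0}

def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Raw weight footprint: Params_B x bytes-per-parameter."""
    return params_billions * BYTES_PER_PARAM[precision.lower()]

def inference_budget_gb(params_billions: float, precision: str,
                        overhead: float = 1.3) -> float:
    """Total inference memory via the 1.2-1.5x rule of thumb
    (weights + KV cache + activations + framework overhead)."""
    return weight_memory_gb(params_billions, precision) * overhead

# A 70B model in FP4: 70 x 0.5 = 35 GB of weights,
# ~45.5 GB total at a 1.3x overhead factor.
print(weight_memory_gb(70, "fp4"))     # 35.0
print(inference_budget_gb(70, "fp4"))  # ~45.5
```

The same 70B model in FP16 would need 140 GB of weights alone, exceeding the 128 GB unified memory — which is why the practical ceiling shifts with precision.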
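The last validation rule (network rates are in Gbps, not GB/s) is a common source of 8x errors. A small sketch of the conversion, plus a lower-bound transfer-time estimate; the 35 GB figure is just the FP4 70B example and ignores protocol overhead:

```python
def gbps_to_gigabytes_per_s(gbps: float) -> float:
    """Network bandwidth is quoted in gigabits/s; divide by 8 for bytes."""
    return gbps / 8

qsfp_per_port = gbps_to_gigabytes_per_s(200)  # 25.0 GB/s
ten_gbe = gbps_to_gigabytes_per_s(10)         # 1.25 GB/s

# Lower bound on wall-clock time to move a 35 GB model image:
# size / bandwidth, assuming the link is the only bottleneck.
print(35 / qsfp_per_port)  # 1.4  (seconds over one QSFP port)
print(35 / ten_gbe)        # 28.0 (seconds over 10 GbE)
```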