| id | title | status | source_sections | related_topics |
|---|---|---|---|---|
| open-questions | Open Questions | active | Aggregated from all context files | [gb10-superchip memory-and-storage connectivity dgx-os-software ai-frameworks ai-workloads multi-unit-stacking physical-specs setup-and-config skus-and-pricing] |
Open Questions
Catalog of known unknowns, research gaps, and unresolved questions about the Dell Pro Max GB10.
Hardware
GB10 Superchip
- Q: What are the exact clock speeds for CPU and GPU dies under sustained load?
- Status: Unknown. No official boost/base clocks published.
- Would resolve: Performance prediction, thermal modeling
- Q: What is the detailed per-precision TFLOPS breakdown (FP4/FP8/FP16/FP32/FP64)?
- Status: FP4 = 1,000 TFLOPS (official). FP64 HPL = ~675 GFLOPS (benchmarked). Others inferred.
- Would resolve: Accurate workload performance estimation
Memory
- Q: Is the LPDDR5X soldered or socketed?
- Status: Resolved — see Resolved Questions below.
Storage
- Q: Which specific SSD model/brand is used in Dell vs. DGX Spark units?
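- Status: Unknown. Only the form factor (M.2 2230 or 2242) and interface (PCIe Gen4 NVMe) are confirmed, per the Dell Owner's Manual. Earlier Gen5 claims were incorrect.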
- Would resolve: Drive performance expectations, replacement sourcing
- Q: What are the exact sequential and random IOPS?
- Status: Unknown. No benchmarks published.
- Would resolve: Storage performance expectations
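Until benchmarks exist, the interface itself gives a theoretical ceiling for sequential throughput. The sketch below works that arithmetic from the "PCIe Gen4 NVMe up to 64 GT/s" figure in the Resolved Questions table. It assumes that figure is the aggregate across all lanes (16 GT/s per lane on a x4 link, which the source does not state) and applies PCIe 4.0's 128b/130b line encoding. The function name is hypothetical, and real drives will land below this number because of protocol overhead and NAND limits.

```python
def pcie_gen4_ceiling_gb_per_s(total_gt_per_s: float = 64.0) -> float:
    """Theoretical payload ceiling in GB/s for a PCIe Gen4 link.

    total_gt_per_s is the aggregate transfer rate across all lanes.
    ASSUMPTION: 64 GT/s means 4 lanes x 16 GT/s; the source does not
    state the lane count.
    """
    encoding_efficiency = 128 / 130  # 128b/130b line encoding (PCIe 3.0 and later)
    payload_bits_per_s = total_gt_per_s * 1e9 * encoding_efficiency
    return payload_bits_per_s / 8 / 1e9  # bits -> bytes, then to GB


if __name__ == "__main__":
    print(f"{pcie_gen4_ceiling_gb_per_s():.2f} GB/s")  # prints "7.88 GB/s"
```

Any measured sequential read well above roughly 7.9 GB/s would indicate that the Gen4 x4 assumption is wrong.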
Software
DGX OS
- Q: Can stock Ubuntu 24.04 ARM be installed instead of DGX OS?
- Status: Likely possible but unsupported. Not documented. Requires NVIDIA kernel.
- Would resolve: OS flexibility
- Q: Full list of pre-installed NVIDIA packages and exact versions?
- Status: Major components known (CUDA, cuDNN, Docker, NGC, AI Enterprise, DGX Dashboard). Exact versions not published.
- Would resolve: Development environment baseline
- Q: Update cadence and EOL timeline details?
- Status: 2-year guarantee mentioned (Jeff Geerling). Exact cadence unknown.
- Would resolve: Long-term maintenance planning
AI Frameworks
- Q: TensorFlow support status on ARM GB10?
- Status: Unknown. Official vs. community builds unclear.
- Would resolve: Framework selection for TF users
- Q: Full NGC catalog availability for GB10?
- Status: Unknown. It is unclear which containers have ARM64 builds.
- Would resolve: Software ecosystem breadth
- Q: vLLM or other inference server support on ARM Blackwell?
- Status: Unknown.
- Would resolve: Production inference deployment options
- Q: JAX support status?
- Status: Unknown.
- Would resolve: Framework selection for JAX users
Networking / Multi-Unit
- Q: Can more than 2 units be stacked?
- Status: Only 2-unit stacking is documented. Slurm/K8s support suggests larger clusters may be possible, but this is not confirmed.
- Would resolve: Maximum scaling potential
- Q: Performance overhead of inter-unit communication (quantified)?
- Status: 200GbE RDMA link, but no latency/overhead benchmarks published.
- Would resolve: Stacked vs. single performance expectations
- Q: Actual tokens/sec for 405B models on stacked config?
- Status: Unknown.
- Would resolve: Real-world stacking value proposition
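In the absence of latency benchmarks, line-rate arithmetic at least bounds the cost of moving data between units. The sketch below compares the 200GbE link with local memory bandwidth derived from the 256-bit LPDDR5X 8533 figures in the Resolved Questions table. It ignores protocol overhead, latency, and NCCL behaviour, so the results are best-case numbers rather than predictions. All function names and the 1 GB payload are hypothetical.

```python
LINK_GBIT_S = 200.0   # 200GbE line rate, from the status note above
MEM_MT_S = 8533       # LPDDR5X transfer rate, from Resolved Questions
MEM_BUS_BITS = 256    # memory interface width, from Resolved Questions


def link_ceiling_gb_s(link_gbit_s: float = LINK_GBIT_S) -> float:
    """Raw line rate in GB/s (no protocol overhead subtracted)."""
    return link_gbit_s / 8


def local_memory_gb_s(mt_s: float = MEM_MT_S, bus_bits: int = MEM_BUS_BITS) -> float:
    """Peak theoretical memory bandwidth in GB/s."""
    return mt_s * 1e6 * bus_bits / 8 / 1e9


def min_transfer_seconds(payload_gb: float, link_gbit_s: float = LINK_GBIT_S) -> float:
    """Best-case time to move payload_gb across the link at line rate."""
    return payload_gb / link_ceiling_gb_s(link_gbit_s)


if __name__ == "__main__":
    ratio = local_memory_gb_s() / link_ceiling_gb_s()
    print(f"link ceiling: {link_ceiling_gb_s():.1f} GB/s")
    print(f"local memory is about {ratio:.1f}x faster than the link")
    # Hypothetical 1 GB payload, chosen only for illustration.
    print(f"1 GB transfer takes at least {min_transfer_seconds(1.0) * 1e3:.1f} ms")
```

On these assumptions the link tops out at 25 GB/s, roughly an order of magnitude below local memory. That is why the amount of data exchanged per step matters for stacked workloads.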
Physical / Environmental
- Q: VESA mount compatibility?
- Status: Unknown.
- Would resolve: Mounting options
- Q: Exact heatsink dimensions and material?
- Status: Dual-fan + dense heatsink confirmed, but exact specs unknown.
- Would resolve: Aftermarket cooling or case modding potential
Performance Benchmarks
- Q: Tokens/sec for Llama 3.3 70B specifically?
- Status: Only Llama 3.2 3B (~100 tok/s) and GPT-OSS-120B (~14.5 tok/s) benchmarked.
- Would resolve: Most common use case performance
- Q: Fine-tuning time estimates for common model sizes?
- Status: Partially resolved — scripts and methods documented (Full SFT 3B, LoRA 8B, QLoRA 70B) but wall-clock times not published.
- Would resolve: Training workflow planning
- Q: Stable Diffusion / image generation performance?
- Status: Partially resolved — ComfyUI confirmed working with SD 1.5. Quantitative benchmarks (images/sec) not published.
- Would resolve: Non-LLM AI workload suitability
- Q: Speculative decoding speedup factor?
- Status: EAGLE-3 and Draft-Target methods documented. Quantitative speedup (tokens/sec improvement) not published.
- Would resolve: Inference optimization ROI
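While Llama 3.3 70B numbers remain unpublished, a rough bandwidth-bound ceiling can be worked out from figures already in this file. The sketch below assumes single-stream decoding of a dense model is limited by reading every weight once per token, and derives peak memory bandwidth from the 256-bit LPDDR5X 8533 entry in the Resolved Questions table. It ignores KV-cache traffic, compute limits, and software overhead, so real throughput should fall below these values. It does not apply directly to mixture-of-experts models such as GPT-OSS-120B, where only the active parameters are read per token. The function names and quantization choices are illustrative assumptions.

```python
def peak_memory_bandwidth_gb_s(transfer_rate_mt_s: float = 8533,
                               bus_width_bits: int = 256) -> float:
    """Peak theoretical bandwidth: transfers/s x bytes per transfer."""
    return transfer_rate_mt_s * 1e6 * (bus_width_bits / 8) / 1e9


def decode_upper_bound_tok_s(params_billion: float,
                             bytes_per_param: float,
                             bandwidth_gb_s: float) -> float:
    """Ceiling on tokens/sec for a dense model whose weights are all
    read once per generated token (ASSUMPTION: bandwidth-bound decode)."""
    model_size_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    bw = peak_memory_bandwidth_gb_s()
    print(f"peak bandwidth: {bw:.1f} GB/s")  # 273.1 GB/s
    # Hypothetical quantization levels for a 70B dense model.
    for label, bytes_per_param in [("4-bit", 0.5), ("8-bit", 1.0)]:
        bound = decode_upper_bound_tok_s(70, bytes_per_param, bw)
        print(f"70B {label}: at most ~{bound:.1f} tok/s")
```

Under these assumptions a 70B dense model tops out near 7.8 tok/s at 4-bit and 3.9 tok/s at 8-bit. These are ceilings to check future benchmarks against, not measurements.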
Resolved Questions
| Date | Question | Resolution | Source |
|---|---|---|---|
| 2026-02-14 | Is the M.2 SSD user-replaceable? | Yes — FRU. 4 Torx screws + 1 M2x2. Supports M.2 2230 AND 2242. PCIe Gen4. | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Memory channel configuration? | 256-bit interface, 16 channels LPDDR5X 8533 | NVIDIA DGX Spark User Guide |
| 2026-02-14 | Does DGX OS include Docker? | Yes — Docker + NVIDIA Container Runtime pre-installed | NVIDIA DGX Spark User Guide |
| 2026-02-14 | Operating temperature range? | Dell: 0-35°C; NVIDIA: 5-30°C | Dell Owner's Manual + NVIDIA UG |
| 2026-02-14 | Humidity range? | 10-90% non-condensing | NVIDIA DGX Spark User Guide |
| 2026-02-14 | Operating altitude? | Up to 3,000m (9,843 ft) | NVIDIA DGX Spark User Guide |
| 2026-02-14 | Noise levels under load? | Never exceeded 40 dB at 1-1.5m (non-stress) | ServeTheHome |
| 2026-02-14 | Cooling solution? | Dual-fan + dense heatsink, front-to-back airflow | Jeff Geerling, ServeTheHome |
| 2026-02-14 | BIOS/firmware update procedure? | apt upgrade + fwupdmgr, or DGX Dashboard GUI | ServeTheHome |
| 2026-02-14 | Network boot (PXE)? | Supported via UEFI → Advanced → Network Stack Configuration | NVIDIA DGX Spark User Guide |
| 2026-02-14 | First-boot wizard steps? | 10-step wizard documented (language, account, network, etc.) | NVIDIA DGX Spark User Guide |
| 2026-02-14 | QSFP cables for stacking? | Amphenol NJAAKK-N911/0006 or Luxshare LMTQF022-SD-R | NVIDIA DGX Spark User Guide |
| 2026-02-14 | Software config for stacking? | Netplan + SSH + MPI + NCCL v2.28.3 | NVIDIA DGX Spark User Guide |
| 2026-02-14 | Does stacking appear as single device? | No — 2-node distributed cluster, requires multi-node code | NVIDIA DGX Spark User Guide |
| 2026-02-14 | Can QSFP ports be used for general networking? | Yes — Ethernet config, 200GbE RDMA capable | Jeff Geerling |
| 2026-02-14 | Tokens/sec for common LLMs? | Llama 3.2 3B: ~100 tok/s; GPT-OSS-120B: ~14.5 tok/s | Jeff Geerling, ServeTheHome |
| 2026-02-14 | Thermal throttling behavior (Dell)? | Dell design prevents throttling; quieter than DGX Spark | Jeff Geerling |
| 2026-02-14 | Is LPDDR5X soldered? | Yes — soldered, not upgradeable | Form factor / LPDDR5X standard |
| 2026-02-14 | OTA update mechanism? | apt + fwupdmgr (CLI) or DGX Dashboard (GUI) | ServeTheHome |
| 2026-02-14 | HDMI version on Dell? | HDMI 2.1a (not 2.1b). Max 8K@30. | Dell Owner's Manual Rev A01 |
| 2026-02-14 | SSD interface speed? | PCIe Gen4 NVMe up to 64 GT/s (NOT Gen5 as forums suggested) | Dell Owner's Manual Rev A01 |
| 2026-02-14 | SSD form factors supported? | Both M.2 2230 and M.2 2242 | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Max display resolution? | USB-C DP: 8K@60; HDMI: 8K@30 | Dell Owner's Manual Rev A01 |
| 2026-02-14 | BIOS entry method? | Delete key for BIOS; F7 for one-time boot | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Full BIOS menu structure? | Main/Advanced/Security/Boot/Save&Exit fully documented | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Wireless module model? | AzureWave AW-EM637, 2.4/5/6 GHz, Wi-Fi 7, BT 5.4 | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Network controller model? | Realtek RTL8127-CG (10GbE) + NVIDIA ConnectX-7 (QSFP) | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Processor cache? | 16 MB | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Weight range? | 1.22-1.34 kg depending on configuration | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Power adapter dimensions? | 23 x 78 x 162 mm, multi-voltage output (5V-48V) | Dell Owner's Manual Rev A01 |
| 2026-02-14 | USB-C MST support? | Not supported (single display per port only) | Dell Owner's Manual Rev A01 |
| 2026-02-14 | Service tools required? | Phillips #0, T5 or T8 Torx screwdriver | Dell Owner's Manual Rev A01 |
| 2026-02-14 | CUDA compute capability / SM architecture? | sm_121 (compile with -DCMAKE_CUDA_ARCHITECTURES="121") | build.nvidia.com/spark |
| 2026-02-14 | CUDA toolkit version? | CUDA 13.0 (PyTorch wheels: cu130) | build.nvidia.com/spark |
| 2026-02-14 | DGX Dashboard URL/port? | http://localhost:11000 | build.nvidia.com/spark |
| 2026-02-14 | TensorRT-LLM availability? | Confirmed: container tensorrt-llm/release:1.2.0rc6 | build.nvidia.com/spark |
| 2026-02-14 | Fine-tuning methods supported? | Full SFT (3B), LoRA (8B), QLoRA 4-bit (70B), FSDP multi-node | build.nvidia.com/spark |
| 2026-02-14 | Image generation support? | ComfyUI confirmed (SD, SDXL, Flux) on port 8188 | build.nvidia.com/spark |
| 2026-02-14 | Ollama / Open WebUI support? | Yes — Docker container, port 12000 (Sync) or 8080 (direct) | build.nvidia.com/spark |
| 2026-02-14 | NVIDIA Sync details? | Cross-platform app, SSH key automation, VS Code/Cursor/Dashboard launch, port forwarding | build.nvidia.com/spark |
| 2026-02-14 | PyTorch NGC container? | nvcr.io/nvidia/pytorch:25.11-py3 (ARM64) | build.nvidia.com/spark |
| 2026-02-14 | Speculative decoding methods? | EAGLE-3 (built-in drafting) and Draft-Target (8B+70B) | build.nvidia.com/spark |