--- id: whole-body-control title: "Whole-Body Control" status: established source_sections: "reference/sources/paper-groot-wbc.md, reference/sources/paper-h2o.md, reference/sources/paper-omnih2o.md, reference/sources/paper-humanplus.md, reference/sources/paper-softa.md, reference/sources/github-groot-wbc.md, reference/sources/github-pinocchio.md" related_topics: [locomotion-control, manipulation, motion-retargeting, push-recovery-balance, learning-and-ai, joint-configuration] key_equations: [com, zmp, inverse_dynamics] key_terms: [whole_body_control, task_space_inverse_dynamics, operational_space_control, centroidal_dynamics, qp_solver, groot_wbc, residual_policy, h2o, omnih2o, pinocchio] images: [] examples: [] open_questions: - "Can GR00T-WBC run at 500 Hz on the Jetson Orin NX?" - "Does the stock locomotion computer expose a low-level override interface?" - "What is the practical latency penalty of the overlay (residual) approach vs. full replacement?" --- # Whole-Body Control Frameworks and architectures for coordinating balance, locomotion, and upper-body motion simultaneously. This is the unifying layer that enables motion capture playback with robust balance. ## 1. What Is Whole-Body Control? Whole-body control (WBC) treats the entire robot as a single coordinated system rather than controlling legs (balance) and arms (task) independently. A WBC controller solves for all joint commands simultaneously, subject to: - **Task objectives:** Track a desired upper-body trajectory (e.g., from mocap) - **Balance constraints:** Keep the center of mass (CoM) within the support polygon - **Physical limits:** Joint position/velocity/torque limits, self-collision avoidance - **Contact constraints:** Maintain foot contact, manage ground reaction forces The key insight: balance is formulated as a *constraint*, not a separate controller. This lets the robot move its arms freely while the optimizer automatically adjusts the legs to stay stable. 
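The constraint view can be made concrete with a toy example. The sketch below is a minimal 2-DOF caricature (one "leg" joint, one "arm" joint) with made-up CoM sensitivities, not the G1's real dynamics: the task is a cost, balance is a constraint, and with a single linear constraint the QP has a closed-form solution via weighted projection.

```python
import numpy as np

# Toy 2-DOF model: q = [q_leg, q_arm] (radians).
# Linearized CoM x-offset: com_x = a @ q. All numbers are hypothetical.
a = np.array([0.30, 0.10])    # CoM shift per radian of each joint
com_limit = 0.05              # support-polygon half-width (m)
W = np.diag([1.0, 10.0])      # deviating from the arm task costs 10x more

def wbc_step(q_task, a, com_limit, W):
    """Solve: min (q - q_task)' W (q - q_task)  s.t.  |a @ q| <= com_limit.

    Return q_task if it is already balanced; otherwise project it onto
    the active boundary a @ q = ±com_limit in the metric defined by W."""
    if abs(a @ q_task) <= com_limit:
        return q_task                       # balance constraint inactive
    b = np.sign(a @ q_task) * com_limit     # active constraint boundary
    Winv_a = np.linalg.solve(W, a)
    lam = (a @ q_task - b) / (a @ Winv_a)
    return q_task - lam * Winv_a

q_task = np.array([0.0, 0.8])               # "just move the arm" request
q = wbc_step(q_task, a, com_limit, W)
# The solver bends the leg joint slightly so the CoM stays inside the
# support polygon while the arm still gets most of its requested motion.
print(q, a @ q)
```

A real WBC adds joint/torque limits and contact forces to the constraint set and re-solves every control tick, but the structure is the same as this sketch.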
[T1 — Established robotics paradigm] ## 2. The G1 Architectural Constraint The G1 has a **dual-computer architecture** (see [[locomotion-control]]): [T0] ``` ┌─────────────────────────┐ ┌─────────────────────────┐ │ Locomotion Computer │ │ Development Computer │ │ 192.168.123.161 │ │ 192.168.123.164 │ │ (proprietary, locked) │◄───►│ Jetson Orin NX 16GB │ │ │ DDS │ (user-accessible) │ │ Stock RL balance policy │ │ Custom code runs here │ └─────────────────────────┘ └─────────────────────────┘ ``` This creates two fundamentally different WBC approaches: ### Approach A: Overlay (Residual) — Safer - Keep the stock locomotion controller running on the locomotion computer - Send high-level commands (velocity, posture) via the sport mode API - Add upper-body joint commands for arms via `rt/lowcmd` for individual arm joints - The stock controller handles balance; you only control the upper body - **Pro:** Low risk, stock balance is well-tuned - **Con:** Limited authority — can't deeply coordinate leg and arm motions, stock controller may fight your arm commands if they shift CoM significantly ### Approach B: Full Replacement — Maximum Control - Bypass the stock controller entirely - Send raw joint commands to ALL joints (legs + arms + waist) via `rt/lowcmd` at 500 Hz - Must implement your own balance and locomotion from scratch - **Pro:** Full authority over all joints, true whole-body optimization - **Con:** High risk of falls, requires validated balance policy, significant development effort ### Approach C: GR00T-WBC Framework — Best of Both (Recommended) - Uses a trained RL locomotion policy for lower body (replaceable, not the stock one) - Provides a separate interface for upper-body control - Coordinates both through a unified framework - **Pro:** Validated on G1, open-source, designed for this exact use case - **Con:** Requires training a new locomotion policy (but provides tools to do so) [T1 — Confirmed from developer documentation and GR00T-WBC architecture] ## 3. 
GR00T-WholeBodyControl (NVIDIA) The most G1-relevant WBC framework. Open-source, designed specifically for Unitree humanoids. [T1] | Property | Value | |---|---| | Repository | NVIDIA-Omniverse/gr00t-wbc (GitHub) | | License | Apache 2.0 | | Target robots | Unitree G1, H1 | | Integration | LeRobot (HuggingFace), Isaac Lab | | Architecture | Decoupled locomotion (RL) + upper-body (task policy) | | Deployment | unitree_sdk2_python on Jetson Orin | ### Architecture ``` ┌──────────────────┐ │ Task Policy │ (mocap tracking, manipulation, etc.) │ (upper body) │ └────────┬─────────┘ │ desired upper-body joints ┌────────▼─────────┐ │ WBC Coordinator │ ← balance constraints │ (optimization) │ ← joint limits └────────┬─────────┘ │ ┌──────────────────┼──────────────────┐ │ │ │ ┌──────▼──────┐ ┌──────▼──────┐ ┌───────▼──────┐ │ Locomotion │ │ Arm Joints │ │ Waist Joint │ │ Policy (RL) │ │ │ │ │ │ (lower body) │ │ │ │ │ └─────────────┘ └─────────────┘ └──────────────┘ ``` ### Key Features - **Locomotion policy:** RL-trained, handles balance and walking. Can be retrained with perturbation robustness. - **Upper-body interface:** Accepts target joint positions for arms/waist from any source (mocap, learned policy, teleoperation) - **LeRobot integration:** Data collection via teleoperation → behavior cloning → deployment, all within the GR00T-WBC framework - **Sim-to-real:** Trained in Isaac Lab, deployed on real G1 via unitree_sdk2 - **G1 configs:** Supports 29-DOF and 23-DOF variants ### Why This Matters for Mocap + Balance GR00T-WBC is the most direct path to the user's goal: the locomotion policy maintains balance (including push recovery if trained with perturbations) while the upper body tracks mocap reference trajectories. The two are coordinated through the WBC layer. ### Deployment on Dell Pro Max GB10 — Verified (2026-02-14/15) [T1] GR00T-WBC has been **successfully deployed on a real G1 robot** via Dell Pro Max GB10 (NVIDIA Grace Blackwell, aarch64, Ubuntu 24.04). 
The robot stands and balances autonomously. **Pre-trained ONNX Policies:** - `GR00T-WholeBodyControl-Balance.onnx` — standing balance (15 lower-body joint targets) - `GR00T-WholeBodyControl-Walk.onnx` — locomotion with velocity commands - Both: 516-dim observation → 15-dim action. Pre-trained by NVIDIA (training code not open-sourced). - Training: PPO via RSL-RL in Isaac Lab, domain randomization, zero-shot sim-to-real. Exact reward function and perturbation curriculum not published. - **Training code available separately** via WBC-AGILE (nvidia-isaac/WBC-AGILE) for retraining/fine-tuning **ONNX Policy Details:** - Observation layout (86 dims per history step, 6 steps = 516): - `[0:3]` = velocity commands × cmd_scale - `[3:4]` = height command - `[4:7]` = [roll_cmd, pitch_cmd, yaw_cmd] - `[7:10]` = angular velocity × 0.5 - `[10:13]` = gravity orientation (from IMU quaternion) - `[13:42]` = (joint_pos - defaults) × 1.0 (29 DOF) - `[42:71]` = joint_vel × 0.05 (29 DOF) - `[71:86]` = previous action (15) - Action transform: `cmd_q = action * 0.25 + default_angles` - Action bounds: **No clipping** — policy outputs can exceed [-1,1], this is intentional for push recovery. Do NOT add np.clip(). NVIDIA reference code does not clip. - Policy selection: Balance when `np.linalg.norm(cmd) < 0.05`, Walk otherwise **Performance on GB10:** - ~3.5 ms per control loop iteration at 50 Hz (sync mode) — only 17.5% of time budget - 401% CPU usage (4 cores) — MuJoCo physics dominates - Both Balance and Walk policies load and execute successfully **Critical Fixes Required for Real Robot Deployment:** 1. **IMU pitch offset calibration** — The G1's pelvis IMU has a physical mounting offset (~6°) that sim doesn't model. Must rotate quaternion before gravity computation. See [[sensors-perception]] §4. Without this fix, robot leans backward persistently. 2. **Negative KD bug** — `configs.py` has `MOTOR_KD[14] -= 10` which makes waist_pitch KD negative (-5). Comment out this line. 3. 
**Dynamic mode_machine detection** — GR00T-WBC hardcodes `mode_machine=5`. Apply PR #11 to read it from `rt/lowstate` instead.
4. **GLFW crash on headless** — `simulator_factory.py` eagerly imports mujoco. Make the BaseSimulator import lazy and run Xvfb on display :99.
5. **CYCLONEDDS_URI** — Must set an explicit network interface: `address="192.168.123.100"` for the GB10.
6. **Do NOT clip actions** — ONNX policy outputs intentionally exceed [-1,1]. Clipping saturates the policy at the clip boundaries, leaving no room for balance corrections.

**Critical Fixes Required for GB10 Simulation (aarch64):**
1. **CycloneDDS buffer overflow:** The `` XML section in `unitree_sdk2py/core/channel_config.py` triggers a glibc FORTIFY_SOURCE buffer overflow on aarch64. Fix: remove the `` section entirely. (See [[dev-environment]] for patch details.)
2. **ROS2 Python path:** the venv needs a `.pth` file pointing to `/opt/ros/jazzy/lib/python3.12/site-packages/`
3. **ROS2 shared libraries:** `export LD_LIBRARY_PATH=/opt/ros/jazzy/lib:$LD_LIBRARY_PATH`
4. **Sync mode bug:** `run_g1_control_loop.py` checks for a sim thread in sync mode, where none exists. Patch: add a `not config.sim_sync_mode` guard.

**Keyboard Control (Real Robot via tmux):**
- Keys sent via `tmux send-keys -t groot "key"` from a remote machine
- `]`=activate policy, `o`=deactivate policy
- `w/s`=fwd/back, `a/d`=strafe, `q/e`=rotate, `z`=stop
- `1/2`=height up/down, `5/6`=pitch, `3/4`=roll, `7/8`=yaw
- `9/0`=IMU pitch offset ±1° (custom addition for calibration)

**Visualization (Simulation):**
- The GLFW passive viewer freezes on virtual/remote displays (Xvfb, NoMachine) after a few seconds
- VNC (x11vnc) cannot capture OpenGL framebuffer updates
- Partial workaround: NoMachine virtual desktop (NX protocol) — the viewer starts, but GLFW still stalls after a short time
- Best solution: web-based MJPEG streaming via the MuJoCo offscreen renderer (bypasses all X11/GLFW issues)

## 4. Pinocchio + TSID (Model-Based WBC)

An alternative to RL-based WBC using classical optimization.
[T1 — Established framework]

| Property | Value |
|---|---|
| Library | Pinocchio (stack-of-tasks/pinocchio) |
| Language | C++ with Python bindings |
| License | BSD-2-Clause |
| Key capability | Rigid-body dynamics: forward/inverse kinematics and dynamics, Jacobians, CoM |

### Task-Space Inverse Dynamics (TSID)

Pinocchio + TSID solves a QP at each timestep:

```
minimize    || J_task * qdd + Jdot_task * qd - xdd_desired ||^2   (track task acceleration)
subject to  CoM ∈ support polygon        (balance)
            q_min ≤ q ≤ q_max            (joint limits)
            tau_min ≤ tau ≤ tau_max      (torque limits)
            contact constraints          (feet on ground)
```

- **Pros:** Interpretable, respects physics exactly, no training required
- **Cons:** Requires an accurate dynamics model (masses, inertias), computationally heavier than RL at runtime, less robust to model errors
- **G1 compatibility:** Needs a URDF with accurate dynamics. The MuJoCo Menagerie model or the unitree_ros URDF provides this.

### Use Cases for G1

- Offline trajectory optimization (plan mocap-feasible trajectories ahead of time)
- Real-time WBC if the dynamics model is accurate enough
- Validation tool: check whether a retargeted motion is physically feasible before executing it

## 5.
RL-Based WBC Approaches ### SoFTA — Slow-Fast Two-Agent RL Decouples whole-body control into two agents operating at different frequencies: [T1 — Research] - **Slow agent (lower body):** Locomotion at standard control frequency, handles balance and walking - **Fast agent (upper body):** Manipulation at higher frequency for precision tasks - Key insight: upper body needs faster updates for manipulation precision; lower body is slower but more stable ### H2O — Human-to-Humanoid Real-Time Teleoperation (arXiv:2403.01623) [T1 — Validated on humanoid hardware] - Real-time human motion retargeting to humanoid robot - RL policy trained to imitate human demonstrations while maintaining balance - Demonstrated whole-body teleoperation including walking + arm motion - Relevant to G1: proves the combined mocap + balance paradigm works ### OmniH2O — Universal Teleoperation and Autonomy (arXiv:2406.08858) [T1] - Extends H2O with multiple input modalities (VR, RGB camera, motion capture) - Trains a universal policy that generalizes across different human operators - Supports both teleoperation (real-time) and autonomous replay (offline) - Directly relevant: could drive G1 from mocap recordings ### HumanPlus — Humanoid Shadowing and Imitation (arXiv:2406.10454) [T1] - "Shadow mode": real-time mimicry of human motion using RGB camera - Collects demonstrations during shadowing, then trains autonomous policy via imitation learning - Complete pipeline from human motion → robot imitation → autonomous execution - Validated on full-size humanoid with walking + manipulation ## 6. 
WBC Implementation Considerations for G1 ### Compute Budget - Jetson Orin NX: 100 TOPS (AI), 8-core ARM CPU - RL policy inference: typically < 1ms per step - QP-based TSID: typically 1-5ms per step (depends on DOF count) - 500 Hz control loop = 2ms budget per step - RL approach fits comfortably; QP-based requires careful optimization ### Sensor Requirements - IMU (onboard): orientation, angular velocity — available - Joint encoders (onboard): position, velocity — available at 500 Hz - Foot contact sensing: NOT standard on G1 — must infer from joint torques or add external sensors [T3] - External force estimation: possible from IMU + dynamics model, or add force/torque sensors [T3] ### Communication Path ``` Jetson Orin (user code) ──DDS──► rt/lowcmd ──► Joint Actuators ◄──DDS── rt/lowstate ◄── Joint Encoders + IMU ``` - Latency: ~2ms DDS round trip [T1] - Frequency: 500 Hz control loop [T0] - Both overlay and replacement approaches use this same DDS path ## Key Relationships - Builds on: [[locomotion-control]] (balance as a component of WBC) - Enables: [[motion-retargeting]] (WBC provides the balance guarantee during mocap playback) - Enables: [[push-recovery-balance]] (WBC can incorporate perturbation robustness) - Uses: [[joint-configuration]] (all joints coordinated as one system) - Uses: [[sensors-perception]] (IMU + encoders for state estimation) - Trained via: [[learning-and-ai]] (RL training for locomotion component) - Bounded by: [[equations-and-bounds]] (CoM, ZMP, joint limits)
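The 500 Hz budget discussed in §6 can be sanity-checked with a generic loop-timing skeleton. This is not the GR00T-WBC loop (which ran at 50 Hz in the GB10 tests); it is a sketch of the absolute-deadline pattern, with `do_control_step` as a hypothetical placeholder for read-state / run-policy / publish-command:

```python
import time

CONTROL_HZ = 500
PERIOD = 1.0 / CONTROL_HZ          # 2 ms budget per step

def do_control_step():
    """Placeholder for: read rt/lowstate, run policy/QP, publish rt/lowcmd."""
    pass

def run_loop(n_steps):
    """Run n_steps iterations against absolute deadlines; count budget overruns."""
    overruns = 0
    next_deadline = time.perf_counter()
    for _ in range(n_steps):
        do_control_step()
        next_deadline += PERIOD
        slack = next_deadline - time.perf_counter()
        if slack > 0:
            time.sleep(slack)       # burn the remaining budget
        else:
            overruns += 1           # step exceeded its 2 ms budget
    return overruns

print(run_loop(250), "overruns in 250 steps")
```

Absolute deadlines (rather than sleeping a fixed period after each step) prevent timing error from accumulating; note that `time.sleep` granularity on a non-realtime kernel can itself eat a noticeable fraction of a 2 ms budget.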