--- id: whole-body-control title: "Whole-Body Control" status: established source_sections: "reference/sources/paper-groot-wbc.md, reference/sources/paper-h2o.md, reference/sources/paper-omnih2o.md, reference/sources/paper-humanplus.md, reference/sources/paper-softa.md, reference/sources/github-groot-wbc.md, reference/sources/github-pinocchio.md" related_topics: [locomotion-control, manipulation, motion-retargeting, push-recovery-balance, learning-and-ai, joint-configuration] key_equations: [com, zmp, inverse_dynamics] key_terms: [whole_body_control, task_space_inverse_dynamics, operational_space_control, centroidal_dynamics, qp_solver, groot_wbc, residual_policy, h2o, omnih2o, pinocchio] images: [] examples: [] open_questions: - "Can GR00T-WBC run at 500 Hz on the Jetson Orin NX?" - "Does the stock locomotion computer expose a low-level override interface?" - "What is the practical latency penalty of the overlay (residual) approach vs. full replacement?" --- # Whole-Body Control Frameworks and architectures for coordinating balance, locomotion, and upper-body motion simultaneously. This is the unifying layer that enables motion capture playback with robust balance. ## 1. What Is Whole-Body Control? Whole-body control (WBC) treats the entire robot as a single coordinated system rather than controlling legs (balance) and arms (task) independently. A WBC controller solves for all joint commands simultaneously, subject to: - **Task objectives:** Track a desired upper-body trajectory (e.g., from mocap) - **Balance constraints:** Keep the center of mass (CoM) within the support polygon - **Physical limits:** Joint position/velocity/torque limits, self-collision avoidance - **Contact constraints:** Maintain foot contact, manage ground reaction forces The key insight: balance is formulated as a *constraint*, not a separate controller. This lets the robot move its arms freely while the optimizer automatically adjusts the legs to stay stable. 
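The constraint view can be made concrete with a toy example. The sketch below is a minimal 2-DOF caricature (one "leg" joint, one "arm" joint) with made-up CoM sensitivities, not the G1's real dynamics: the task is a cost, balance is a constraint, and with a single linear constraint the QP has a closed-form solution via weighted projection.

```python
import numpy as np

# Toy 2-DOF model: q = [q_leg, q_arm] (radians).
# Linearized CoM x-offset: com_x = a @ q. All numbers are hypothetical.
a = np.array([0.30, 0.10])    # CoM shift per radian of each joint
com_limit = 0.05              # support-polygon half-width (m)
W = np.diag([1.0, 10.0])      # deviating from the arm task costs 10x more

def wbc_step(q_task, a, com_limit, W):
    """Solve: min (q - q_task)' W (q - q_task)  s.t.  |a @ q| <= com_limit.

    Return q_task if it is already balanced; otherwise project it onto
    the active boundary a @ q = ±com_limit in the metric defined by W."""
    if abs(a @ q_task) <= com_limit:
        return q_task                       # balance constraint inactive
    b = np.sign(a @ q_task) * com_limit     # active constraint boundary
    Winv_a = np.linalg.solve(W, a)
    lam = (a @ q_task - b) / (a @ Winv_a)
    return q_task - lam * Winv_a

q_task = np.array([0.0, 0.8])               # "just move the arm" request
q = wbc_step(q_task, a, com_limit, W)
# The solver bends the leg joint slightly so the CoM stays inside the
# support polygon while the arm still gets most of its requested motion.
print(q, a @ q)
```

A real WBC adds joint/torque limits and contact forces to the constraint set and re-solves every control tick, but the structure is the same as this sketch.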
[T1 — Established robotics paradigm] ## 2. The G1 Architectural Constraint The G1 has a **dual-computer architecture** (see [[locomotion-control]]): [T0] ``` ┌─────────────────────────┐ ┌─────────────────────────┐ │ Locomotion Computer │ │ Development Computer │ │ 192.168.123.161 │ │ 192.168.123.164 │ │ (proprietary, locked) │◄───►│ Jetson Orin NX 16GB │ │ │ DDS │ (user-accessible) │ │ Stock RL balance policy │ │ Custom code runs here │ └─────────────────────────┘ └─────────────────────────┘ ``` This creates two fundamentally different WBC approaches: ### Approach A: Overlay (Residual) — Safer - Keep the stock locomotion controller running on the locomotion computer - Send high-level commands (velocity, posture) via the sport mode API - Add upper-body joint commands for arms via `rt/lowcmd` for individual arm joints - The stock controller handles balance; you only control the upper body - **Pro:** Low risk, stock balance is well-tuned - **Con:** Limited authority — can't deeply coordinate leg and arm motions, stock controller may fight your arm commands if they shift CoM significantly ### Approach B: Full Replacement — Maximum Control - Bypass the stock controller entirely - Send raw joint commands to ALL joints (legs + arms + waist) via `rt/lowcmd` at 500 Hz - Must implement your own balance and locomotion from scratch - **Pro:** Full authority over all joints, true whole-body optimization - **Con:** High risk of falls, requires validated balance policy, significant development effort ### Approach C: GR00T-WBC Framework — Best of Both (Recommended) - Uses a trained RL locomotion policy for lower body (replaceable, not the stock one) - Provides a separate interface for upper-body control - Coordinates both through a unified framework - **Pro:** Validated on G1, open-source, designed for this exact use case - **Con:** Requires training a new locomotion policy (but provides tools to do so) [T1 — Confirmed from developer documentation and GR00T-WBC architecture] ## 3. 
GR00T-WholeBodyControl (NVIDIA) The most G1-relevant WBC framework. Open-source, designed specifically for Unitree humanoids. [T1] | Property | Value | |---|---| | Repository | NVIDIA-Omniverse/gr00t-wbc (GitHub) | | License | Apache 2.0 | | Target robots | Unitree G1, H1 | | Integration | LeRobot (HuggingFace), Isaac Lab | | Architecture | Decoupled locomotion (RL) + upper-body (task policy) | | Deployment | unitree_sdk2_python on Jetson Orin | ### Architecture ``` ┌──────────────────┐ │ Task Policy │ (mocap tracking, manipulation, etc.) │ (upper body) │ └────────┬─────────┘ │ desired upper-body joints ┌────────▼─────────┐ │ WBC Coordinator │ ← balance constraints │ (optimization) │ ← joint limits └────────┬─────────┘ │ ┌──────────────────┼──────────────────┐ │ │ │ ┌──────▼──────┐ ┌──────▼──────┐ ┌───────▼──────┐ │ Locomotion │ │ Arm Joints │ │ Waist Joint │ │ Policy (RL) │ │ │ │ │ │ (lower body) │ │ │ │ │ └─────────────┘ └─────────────┘ └──────────────┘ ``` ### Key Features - **Locomotion policy:** RL-trained, handles balance and walking. Can be retrained with perturbation robustness. - **Upper-body interface:** Accepts target joint positions for arms/waist from any source (mocap, learned policy, teleoperation) - **LeRobot integration:** Data collection via teleoperation → behavior cloning → deployment, all within the GR00T-WBC framework - **Sim-to-real:** Trained in Isaac Lab, deployed on real G1 via unitree_sdk2 - **G1 configs:** Supports 29-DOF and 23-DOF variants ### Why This Matters for Mocap + Balance GR00T-WBC is the most direct path to the user's goal: the locomotion policy maintains balance (including push recovery if trained with perturbations) while the upper body tracks mocap reference trajectories. The two are coordinated through the WBC layer. ### Deployment on Dell Pro Max GB10 — Verified (2026-02-14/15) [T1] GR00T-WBC has been **successfully deployed on a real G1 robot** via Dell Pro Max GB10 (NVIDIA Grace Blackwell, aarch64, Ubuntu 24.04). 
The robot stands and balances autonomously. **Pre-trained ONNX Policies:** - `GR00T-WholeBodyControl-Balance.onnx` — standing balance (15 lower-body joint targets) - `GR00T-WholeBodyControl-Walk.onnx` — locomotion with velocity commands - Both: 516-dim observation → 15-dim action. Pre-trained by NVIDIA (training code not open-sourced). - Training: PPO via RSL-RL in Isaac Lab, domain randomization, zero-shot sim-to-real. Exact reward function and perturbation curriculum not published. - **Training code available separately** via WBC-AGILE (nvidia-isaac/WBC-AGILE) for retraining/fine-tuning **ONNX Policy Details:** - Observation layout (86 dims per history step, 6 steps = 516): - `[0:3]` = velocity commands × cmd_scale - `[3:4]` = height command - `[4:7]` = [roll_cmd, pitch_cmd, yaw_cmd] - `[7:10]` = angular velocity × 0.5 - `[10:13]` = gravity orientation (from IMU quaternion) - `[13:42]` = (joint_pos - defaults) × 1.0 (29 DOF) - `[42:71]` = joint_vel × 0.05 (29 DOF) - `[71:86]` = previous action (15) - Action transform: `cmd_q = action * 0.25 + default_angles` - Action bounds: **No clipping** — policy outputs can exceed [-1,1], this is intentional for push recovery. Do NOT add np.clip(). NVIDIA reference code does not clip. - Policy selection: Balance when `np.linalg.norm(cmd) < 0.05`, Walk otherwise **Performance on GB10:** - ~3.5 ms per control loop iteration at 50 Hz (sync mode) — only 17.5% of time budget - 401% CPU usage (4 cores) — MuJoCo physics dominates - Both Balance and Walk policies load and execute successfully **Critical Fixes Required for Real Robot Deployment:** 1. **IMU pitch offset calibration** — The G1's pelvis IMU has a physical mounting offset (~6°) that sim doesn't model. Must rotate quaternion before gravity computation. See [[sensors-perception]] §4. Without this fix, robot leans backward persistently. 2. **Negative KD bug** — `configs.py` has `MOTOR_KD[14] -= 10` which makes waist_pitch KD negative (-5). Comment out this line. 3. 
**Dynamic mode_machine detection** — GR00T-WBC hardcodes `mode_machine=5`. Apply PR #11 to read it from `rt/lowstate` instead.
4. **GLFW crash on headless** — `simulator_factory.py` eagerly imports mujoco. Make the BaseSimulator import lazy and run Xvfb on display :99.
5. **CYCLONEDDS_URI** — Must set an explicit network interface: `address="192.168.123.100"` for the GB10.
6. **Do NOT clip actions** — ONNX policy outputs intentionally exceed [-1,1]. Clipping saturates the policy at the clip boundaries, leaving no room for balance corrections.

**Critical Fixes Required for GB10 Simulation (aarch64):**
1. **CycloneDDS buffer overflow:** The `` XML section in `unitree_sdk2py/core/channel_config.py` triggers a glibc FORTIFY_SOURCE buffer overflow on aarch64. Fix: remove the `` section entirely. (See [[dev-environment]] for patch details.)
2. **ROS2 Python path:** the venv needs a `.pth` file pointing to `/opt/ros/jazzy/lib/python3.12/site-packages/`
3. **ROS2 shared libraries:** `export LD_LIBRARY_PATH=/opt/ros/jazzy/lib:$LD_LIBRARY_PATH`
4. **Sync mode bug:** `run_g1_control_loop.py` checks for a sim thread in sync mode, where none exists. Patch: add a `not config.sim_sync_mode` guard.

**Keyboard Control (Real Robot via tmux):**
- Keys sent via `tmux send-keys -t groot "key"` from a remote machine
- `]`=activate policy, `o`=deactivate policy
- `w/s`=fwd/back, `a/d`=strafe, `q/e`=rotate, `z`=stop
- `1/2`=height up/down, `5/6`=pitch, `3/4`=roll, `7/8`=yaw
- `9/0`=IMU pitch offset ±1° (custom addition for calibration)

**Visualization (Simulation):**
- The GLFW passive viewer freezes on virtual/remote displays (Xvfb, NoMachine) after a few seconds
- VNC (x11vnc) cannot capture OpenGL framebuffer updates
- Partial workaround: NoMachine virtual desktop (NX protocol) — the viewer starts, but GLFW still stalls after a short time
- Best solution: web-based MJPEG streaming via the MuJoCo offscreen renderer (bypasses all X11/GLFW issues)

## 4. Pinocchio + TSID (Model-Based WBC)

An alternative to RL-based WBC using classical optimization.
[T1 — Established framework]

| Property | Value |
|---|---|
| Library | Pinocchio (stack-of-tasks/pinocchio) |
| Language | C++ with Python bindings |
| License | BSD-2-Clause |
| Key capability | Rigid-body dynamics: forward/inverse kinematics and dynamics, Jacobians, CoM |

### Task-Space Inverse Dynamics (TSID)

Pinocchio + TSID solves a QP at each timestep:

```
minimize    || J_task * qdd + Jdot_task * qd - xdd_desired ||^2   (track task acceleration)
subject to  CoM ∈ support polygon        (balance)
            q_min ≤ q ≤ q_max            (joint limits)
            tau_min ≤ tau ≤ tau_max      (torque limits)
            contact constraints          (feet on ground)
```

- **Pros:** Interpretable, respects physics exactly, no training required
- **Cons:** Requires an accurate dynamics model (masses, inertias), computationally heavier than RL at runtime, less robust to model errors
- **G1 compatibility:** Needs a URDF with accurate dynamics. The MuJoCo Menagerie model or the unitree_ros URDF provides this.

### Use Cases for G1

- Offline trajectory optimization (plan mocap-feasible trajectories ahead of time)
- Real-time WBC if the dynamics model is accurate enough
- Validation tool: check whether a retargeted motion is physically feasible before executing it

## 5.
RL-Based WBC Approaches ### SoFTA — Slow-Fast Two-Agent RL Decouples whole-body control into two agents operating at different frequencies: [T1 — Research] - **Slow agent (lower body):** Locomotion at standard control frequency, handles balance and walking - **Fast agent (upper body):** Manipulation at higher frequency for precision tasks - Key insight: upper body needs faster updates for manipulation precision; lower body is slower but more stable ### H2O — Human-to-Humanoid Real-Time Teleoperation (arXiv:2403.01623) [T1 — Validated on humanoid hardware] - Real-time human motion retargeting to humanoid robot - RL policy trained to imitate human demonstrations while maintaining balance - Demonstrated whole-body teleoperation including walking + arm motion - Relevant to G1: proves the combined mocap + balance paradigm works ### OmniH2O — Universal Teleoperation and Autonomy (arXiv:2406.08858) [T1] - Extends H2O with multiple input modalities (VR, RGB camera, motion capture) - Trains a universal policy that generalizes across different human operators - Supports both teleoperation (real-time) and autonomous replay (offline) - Directly relevant: could drive G1 from mocap recordings ### HumanPlus — Humanoid Shadowing and Imitation (arXiv:2406.10454) [T1] - "Shadow mode": real-time mimicry of human motion using RGB camera - Collects demonstrations during shadowing, then trains autonomous policy via imitation learning - Complete pipeline from human motion → robot imitation → autonomous execution - Validated on full-size humanoid with walking + manipulation ## 6. 
WBC Implementation Considerations for G1 ### Compute Budget - Jetson Orin NX: 100 TOPS (AI), 8-core ARM CPU - RL policy inference: typically < 1ms per step - QP-based TSID: typically 1-5ms per step (depends on DOF count) - 500 Hz control loop = 2ms budget per step - RL approach fits comfortably; QP-based requires careful optimization ### Sensor Requirements - IMU (onboard): orientation, angular velocity — available - Joint encoders (onboard): position, velocity — available at 500 Hz - Foot contact sensing: NOT standard on G1 — must infer from joint torques or add external sensors [T3] - External force estimation: possible from IMU + dynamics model, or add force/torque sensors [T3] ### Communication Path ``` Jetson Orin (user code) ──DDS──► rt/lowcmd ──► Joint Actuators ◄──DDS── rt/lowstate ◄── Joint Encoders + IMU ``` - Latency: ~2ms DDS round trip [T1] - Frequency: 500 Hz control loop [T0] - Both overlay and replacement approaches use this same DDS path ## Key Relationships - Builds on: [[locomotion-control]] (balance as a component of WBC) - Enables: [[motion-retargeting]] (WBC provides the balance guarantee during mocap playback) - Enables: [[push-recovery-balance]] (WBC can incorporate perturbation robustness) - Uses: [[joint-configuration]] (all joints coordinated as one system) - Uses: [[sensors-perception]] (IMU + encoders for state estimation) - Trained via: [[learning-and-ai]] (RL training for locomotion component) - Bounded by: [[equations-and-bounds]] (CoM, ZMP, joint limits)
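The 500 Hz budget discussed in §6 can be sanity-checked with a generic loop-timing skeleton. This is not the GR00T-WBC loop (which ran at 50 Hz in the GB10 tests); it is a sketch of the absolute-deadline pattern, with `do_control_step` as a hypothetical placeholder for read-state / run-policy / publish-command:

```python
import time

CONTROL_HZ = 500
PERIOD = 1.0 / CONTROL_HZ          # 2 ms budget per step

def do_control_step():
    """Placeholder for: read rt/lowstate, run policy/QP, publish rt/lowcmd."""
    pass

def run_loop(n_steps):
    """Run n_steps iterations against absolute deadlines; count budget overruns."""
    overruns = 0
    next_deadline = time.perf_counter()
    for _ in range(n_steps):
        do_control_step()
        next_deadline += PERIOD
        slack = next_deadline - time.perf_counter()
        if slack > 0:
            time.sleep(slack)       # burn the remaining budget
        else:
            overruns += 1           # step exceeded its 2 ms budget
    return overruns

print(run_loop(250), "overruns in 250 steps")
```

Absolute deadlines (rather than sleeping a fixed period after each step) prevent timing error from accumulating; note that `time.sleep` granularity on a non-realtime kernel can itself eat a noticeable fraction of a 2 ms budget.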