Observation Types

June 11, 2025 · View on GitHub

State-based

The default observation (obs_buf) is state-based, which includes the following:

goal specification
scaled joint position (q, |A|-dim)
scaled joint velocity (dq, |A|-dim)
last actions (self.actions, |A|-dim)
scaled root angular velocity (self.base_ang_vel * self.obs_scales.ang_vel, 3-dim)
scaled root euler angle (self.base_euler_xyz * self.obs_scales.quat, 3-dim)

If you want to change the observation structure (and obs/noise scales), make sure to override the corresponding functions in the subclass.

We also support vision-based observation. To use ego-centric color and depth for policy learning, set the following in config:

enable_sensor = True to enable vision observations in the environment and enable vision input in the PPO training. You can also edit camera resolution, camera pose, etc.
policy_class_name = 'ActorCriticVision' or 'ActorCriticHierarchicalVision' for vision-based policies.
num_envs = 4 or other small numbers for training efficiency.