ARX Inference
February 14, 2026
Deploy trained OpenPi policies on ARX-X5 dual-arm robots (RealSense cameras, ROS2). Inference runs by sending observations to a policy server (on a GPU host) and applying the returned action chunks on the robot, with optional temporal smoothing [2] or RTC (real-time chunking) [1].
Prerequisites: ARX-X5 and ROS2 setup
Follow the official ARX-X5 setup so arms and cameras work correctly:
- ARX_X5 (ARX Robotics)
  Clone the repo and follow its docs for:
  - ROS2 workspace (e.g. ROS2/X5_ws), building packages
  - Arm control and feedback topics
  - RealSense camera setup
After the ARX-X5 and robot hardware are set up, install the Python 3.10 inference environment (below) and build the bimanual package in this repo.
Python 3.10 inference environment (one-time setup)
On the machine that runs the inference client (IPC or same host as ROS2), use a dedicated Python 3.10 environment.
1. Create and activate conda env
conda create -n kai0_inference python=3.10
conda activate kai0_inference
2. Install dependencies
Install PyTorch (if needed for local use), OpenCV, NumPy, and other deps. From the repository root you can reuse the Agilex IPC requirements if present:
pip install -r train_deploy_alignment/inference/agilex/requirements_inference_ipc.txt
(or install opencv-python, numpy, pyrealsense2, etc. as needed.)
3. Install OpenPi client (editable)
From the repository root:
cd packages/openpi-client
pip install -e .
cd ../..
The inference scripts use openpi_client to talk to the policy server.
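The request/execute cycle can be sketched as follows. This is a minimal illustration, assuming the policy client exposes an `infer(observation)` method that returns a dict with an `"actions"` chunk (upstream OpenPi's `openpi_client.websocket_client_policy.WebsocketClientPolicy` is believed to follow this shape, but check the version in packages/openpi-client). `StubPolicy`, `run_sync_episode`, the `"state"` observation key, and `action_dim=14` are hypothetical names and values used only for illustration, not the scripts' actual code.

```python
from typing import Callable, Dict, List, Sequence

Action = Sequence[float]


class StubPolicy:
    """Stand-in for a remote policy client (hypothetical, for illustration)."""

    def __init__(self, chunk_size: int, action_dim: int) -> None:
        self.chunk_size = chunk_size
        self.action_dim = action_dim
        self.calls = 0

    def infer(self, observation: Dict) -> Dict:
        # A real client would send the observation to the policy server here.
        self.calls += 1
        chunk = [[float(self.calls)] * self.action_dim for _ in range(self.chunk_size)]
        return {"actions": chunk}


def run_sync_episode(
    policy,
    get_observation: Callable[[], Dict],
    apply_action: Callable[[Action], None],
    max_steps: int,
) -> int:
    """Blocking loop: request a chunk, execute it step by step, repeat."""
    steps = 0
    while steps < max_steps:
        chunk = policy.infer(get_observation())["actions"]
        if not chunk:
            break  # avoid spinning forever on an empty chunk
        for action in chunk:
            if steps >= max_steps:
                break
            apply_action(action)
            steps += 1
    return steps


executed: List[Action] = []
policy = StubPolicy(chunk_size=4, action_dim=14)
total = run_sync_episode(policy, lambda: {"state": [0.0] * 14}, executed.append, max_steps=10)
print(total, policy.calls)
```

On the robot, `get_observation` would read camera frames and joint states from ROS2 and `apply_action` would publish an arm command; the stub only makes the control flow visible.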
4. Build the bimanual package (ARX)
From the ARX inference directory:
cd train_deploy_alignment/inference/arx
./build.sh
This builds the bimanual package (e.g. cd bimanual && ./build.sh). Ensure your environment can load the built libraries (e.g. LD_LIBRARY_PATH as in setup.sh).
5. ROS2 and ARX messages
Source your ROS2 workspace and ensure the arx5_arm_msg (or equivalent) package is built and sourced so that RobotStatus / RobotCmd are available. The inference scripts import arx5_arm_msg.msg.
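A quick way to confirm that the environment is sourced correctly is to ask Python whether it can locate the two packages the scripts rely on. The helper below is a sketch using only the standard library; `check_modules` is a hypothetical name, and a missing entry usually means the ROS2 workspace was not sourced or the client was not installed.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_modules(["openpi_client", "arx5_arm_msg"])
for name, found in status.items():
    hint = "ok" if found else "MISSING (source the workspace or install the package)"
    print(f"{name}: {hint}")
```

This only checks that the packages are discoverable; it does not verify that the RobotStatus / RobotCmd message types were generated.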
Inference setup (startup sequence)
Inference uses two machines: a GPU host (policy server) and the client machine (ROS2 + inference script). Follow the order below.
On the GPU host — start the policy server
From the repository root on the GPU machine:
uv run scripts/serve_policy.py policy:checkpoint --policy.config=<train_config> --policy.dir=<checkpoint_dir> [--port=8000]
Use the same training config and checkpoint as your trained model. For RTC inference, use an RTC config (e.g. pi0_rtc_aloha_sim or pi05_rtc_flatten_fold_inference); see RTC mode below.
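Before moving to the client steps, it can help to confirm that the server port is reachable from the client machine. The sketch below uses only the standard library; `server_reachable` is a hypothetical helper, and a successful TCP connection only shows that something is listening on the port, not that the checkpoint finished loading.

```python
import socket


def server_reachable(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with your policy server IP when running on the client.
    print(server_reachable("127.0.0.1", 8000))
```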
On the client — full startup steps
Before running inference (or DAGGER), CAN must be configured and up (per the ARX official repo; this repo does not provide CAN scripts), and you must enable both master and slave arms. Order:
1. Ensure CAN is configured and up (follow the official ARX_X5 / ARX_CAN setup).
2. Enable both master and slave arms. Start the master and slave controller nodes (e.g. in separate terminals, or use the DAGGER arx_start.sh in dagger/arx to start both):
ros2 launch arx_x5_controller open_remote_master.launch.py
ros2 launch arx_x5_controller open_remote_slave.launch.py
Wait until the nodes are up (e.g. ros2 node list shows the arm nodes).
3. Source ROS2 and (if needed) your conda env:
source /path/to/ros2_ws/install/setup.bash
conda activate <your_inference_env>
4. Source ARX setup (for LD_LIBRARY_PATH so bimanual libs load). From the arx directory:
cd train_deploy_alignment/inference/arx
source setup.sh
5. Run the inference script from the arx directory, with --host set to the GPU host IP:
cd train_deploy_alignment/inference/arx
source setup.sh
cd inference
python arx_openpi_inference_rtc.py --host <policy_server_ip> --port 8000 --rtc_mode --chunk_size 50
Or run another script (see Inference scripts below). Replace <policy_server_ip> with your policy server IP.
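The "wait until nodes are up" step above can be automated by comparing the output of `ros2 node list` against the nodes you expect. The following is a sketch only: `missing_nodes` is a hypothetical helper, and the node names in `REQUIRED` are placeholders, not the names the ARX launch files actually use.

```python
from typing import Iterable, List


def missing_nodes(node_list_output: str, required: Iterable[str]) -> List[str]:
    """Return required node names absent from `ros2 node list` output."""
    running = {line.strip() for line in node_list_output.splitlines() if line.strip()}
    return [name for name in required if name not in running]


# Hypothetical node names; replace with what `ros2 node list` prints on your robot.
REQUIRED = ["/arm_master_l", "/arm_master_r", "/arm_slave_l", "/arm_slave_r"]
sample_output = "/arm_master_l\n/arm_master_r\n/arm_slave_l\n"
print(missing_nodes(sample_output, REQUIRED))  # ['/arm_slave_r']
```

On the robot, the text would come from `subprocess.run(["ros2", "node", "list"], capture_output=True, text=True, check=True).stdout`, polled until the returned list is empty.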
RTC (real-time chunking) mode
RTC stands for real-time chunking [1]. For RTC inference, the policy server must load the RTC model (Pi0RTC). Start the server with an RTC config, e.g.:
uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_rtc_aloha_sim --policy.dir=<path_to_jax_checkpoint> [--port=8000]
Then run the ARX inference script with --rtc_mode (e.g. arx_openpi_inference_rtc.py --rtc_mode). RTC uses JAX checkpoints only.
Prompt and AWBC
Set the language prompt in the inference script to match training. In the scripts, set the global lang_embeddings at the top (e.g. lang_embeddings = "hang the cloth"). For AWBC-trained models, use the same advantage format as in stage_advantage (e.g. "<task>, Advantage: positive").
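To keep the two prompt forms consistent, the string can be built in one place. This is a small sketch, assuming the AWBC format is exactly `<task>, Advantage: <label>` as in the example above; `build_prompt` is a hypothetical helper, and any label other than "positive" should be taken from your stage_advantage setup rather than from this snippet.

```python
from typing import Optional


def build_prompt(task: str, advantage: Optional[str] = None) -> str:
    """Return the plain task prompt, or the AWBC form '<task>, Advantage: <label>'."""
    task = task.strip()
    if advantage is None:
        return task
    return f"{task}, Advantage: {advantage}"


lang_embeddings = build_prompt("hang the cloth", advantage="positive")
print(lang_embeddings)  # hang the cloth, Advantage: positive
```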
Inference scripts
Run from train_deploy_alignment/inference/arx after source setup.sh, then cd inference:
| Script | Description | Example command |
|---|---|---|
| inference/arx_openpi_inference_rtc.py | RTC [1] with --rtc_mode; without it, same as temporal smoothing. Server must use RTC config for RTC. | python arx_openpi_inference_rtc.py --host <IP> --port 8000 --rtc_mode --chunk_size 50 |
| inference/arx_openpi_inference_temporal_smooth.py | Temporal smoothing; async inference + stream buffer. | python arx_openpi_inference_temporal_smooth.py --host <IP> --port 8000 |
| inference/arx_openpi_inference_sync.py | Sync: blocking infer every chunk, then execute step-by-step (like Agilex sync). | python arx_openpi_inference_sync.py --host <IP> --port 8000 |
| inference/arx_openpi_inference_temporal_ensembling.py | Temporal ensembling [2]: --smooth_method naive_async or temporal_ensembling, --exp_weight_m for aggregation. | python arx_openpi_inference_temporal_ensembling.py --host <IP> --smooth_method temporal_ensembling --exp_weight_m 0.01 |
- --host: GPU host IP.
- --port: server port (default 8000).
- lang_embeddings: Set in the script (or in arx_openpi_inference_rtc for scripts that import it) to match training.
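For intuition about what an exponential weight such as --exp_weight_m controls, here is a sketch of temporal ensembling as described in [2]: all predictions made for the current timestep by overlapping chunks are averaged with weights w_i = exp(-m * i), where i = 0 is the oldest prediction. Whether the script uses exactly this convention is an assumption; `ensemble_action` is a hypothetical helper written for illustration.

```python
import math
from typing import List, Sequence


def ensemble_action(predictions: Sequence[Sequence[float]], m: float) -> List[float]:
    """Weighted average of predictions for one timestep, oldest prediction first.

    Weights follow w_i = exp(-m * i), so index 0 (the oldest) gets the largest weight.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    weights = [math.exp(-m * i) for i in range(len(predictions))]
    total = sum(weights)
    dim = len(predictions[0])
    return [
        sum(w * pred[d] for w, pred in zip(weights, predictions)) / total
        for d in range(dim)
    ]


# Three overlapping chunks each predicted an action for the current timestep.
preds = [[0.0, 1.0], [0.3, 1.0], [0.6, 1.0]]
print(ensemble_action(preds, m=0.01))
```

With m = 0 the result is a plain mean; as m grows, the oldest prediction dominates, so newer observations are incorporated more slowly.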
References
[1] Black, K., Galliker, M. Y., & Levine, S. (2025). Real-Time Execution of Action Chunking Flow Policies. arXiv preprint arXiv:2506.07339.
[2] Zhao, T. Z., Kumar, V., Levine, S., & Finn, C. (2023). Learning fine-grained bimanual manipulation with low-cost hardware. arXiv preprint arXiv:2304.13705.
BibTeX:
@misc{black2025realtime,
author = {Black, Kevin and Galliker, Manuel Y. and Levine, Sergey},
title = {Real-Time Execution of Action Chunking Flow Policies},
year = {2025},
eprint = {2506.07339},
archivePrefix = {arXiv},
primaryClass = {cs}
}
@misc{zhao2023learning,
author = {Zhao, Tony Z. and Kumar, V. and Levine, Sergey and Finn, Chelsea},
title = {Learning fine-grained bimanual manipulation with low-cost hardware},
year = {2023},
eprint = {2304.13705},
archivePrefix = {arXiv},
primaryClass = {cs}
}