Dora

April 28, 2026

English | 简体中文

Website | Python API | Rust API | Guide | Discord



Agentic Dataflow-Oriented Robotic Architecture -- a 100% Rust framework for building real-time robotics and AI applications.

User Guide | 用户指南 (中文)

Built and maintained with agentic engineering -- code generation, reviews, refactoring, testing, and commits are driven by autonomous AI agents.



Features

Performance

  • 10-17x faster than ROS2 Python -- 100% Rust internals with zero-copy shared memory IPC for messages >4KB, flat latency from 4KB to 4MB payloads
  • Zenoh SHM data plane -- nodes publish directly via Zenoh shared memory, bypassing the daemon for 35% lower latency and 3-10x higher throughput on large payloads; automatic network fallback for cross-machine
  • Apache Arrow native -- columnar memory format end-to-end with zero serialization overhead; optional Arrow IPC framing for self-describing wire format; shared across all language bindings
  • Non-blocking event loop -- Zenoh publishes offloaded to a dedicated drain task; control commands respond in <500ms even under high data throughput
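The zero-copy idea behind the shared-memory path can be illustrated in plain Python (a conceptual sketch, not Dora's IPC implementation): a memoryview exposes a window into a buffer without duplicating bytes, whereas a plain slice copies.

```python
# Conceptual sketch of zero-copy buffer access (not Dora's actual IPC code).
payload = bytearray(4 * 1024 * 1024)  # a 4 MB message body

copied = bytes(payload[:1024])        # a plain slice copies 1 KB out
view = memoryview(payload)[:1024]     # a memoryview is a zero-copy window

view[0] = 42                          # writes through the view...
assert payload[0] == 42               # ...land in the original buffer
assert copied[0] == 0                 # ...but not in the earlier copy
```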

Developer experience

  • Single CLI, full lifecycle -- dora run for local dev, dora up/start for distributed prod, plus build, logs, monitoring, record/replay all from one tool
  • Declarative YAML dataflows -- define pipelines as directed graphs, connect nodes through typed inputs/outputs, optional type annotations with static validation, override with environment variables
  • Multi-language nodes -- write nodes in Rust, Python, C, or C++ with native APIs (not wrappers); mix languages freely in one dataflow
  • Reusable modules -- compose sub-graphs as standalone YAML files with typed inputs/outputs, parameters, optional ports, and nested composition (compile-time expansion, zero runtime overhead)
  • Hot reload -- live-reload Python operators without restarting the dataflow
  • Programmatic builder -- construct dataflows in Python code as an alternative to YAML
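As a rough illustration of the "dataflow as code" idea (this is NOT Dora's actual builder API; the helper below is invented for the sketch), building a graph programmatically just means constructing the same node/input/output data the YAML would declare:

```python
# Hypothetical mini-builder (NOT Dora's Python builder API; `node` is
# invented here to show graph-as-data construction only).

def node(node_id, path, inputs=None, outputs=None):
    return {"id": node_id, "path": path,
            "inputs": inputs or {}, "outputs": outputs or []}

dataflow = {"nodes": [
    node("camera", "camera.py",
         inputs={"tick": "dora/timer/millis/20"}, outputs=["image"]),
    node("plot", "plot.py", inputs={"image": "camera/image"}),
]}

assert [n["id"] for n in dataflow["nodes"]] == ["camera", "plot"]
```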

Production readiness

  • Fault tolerance -- per-node restart policies (never/on-failure/always), exponential backoff, health monitoring, circuit breakers with configurable input timeouts
  • Distributed by default -- local shared memory between co-located nodes, automatic Zenoh pub-sub for cross-machine communication, SSH-based cluster management with label scheduling, rolling upgrades, and auto-recovery
  • Coordinator HA -- persistent redb-backed state store (default), daemon auto-reconnect with exponential backoff, dataflow records survive coordinator restart (running dataflow reclaim-across-restart is partial, see the open issue tracker)
  • Dynamic topology -- add and remove nodes from running dataflows via CLI (dora node add/remove/connect/disconnect) without restarting
  • Soft real-time -- optional --rt flag for mlockall + SCHED_FIFO; per-node cpu_affinity pinning in YAML; comprehensive tuning guide for memory locking, kernel params, and container deployment
  • OpenTelemetry -- built-in structured logging with rotation/routing, metrics, distributed tracing, and zero-setup trace viewing via CLI
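For intuition, a capped exponential-backoff restart schedule looks like this (parameter names and default values are illustrative, not Dora's actual configuration):

```python
def backoff_delays(base=0.5, factor=2.0, cap=30.0, retries=6):
    """Exponential backoff schedule with a cap (illustrative values;
    parameter names are not Dora's)."""
    delays, delay = [], base
    for _ in range(retries):
        delays.append(min(delay, cap))
        delay *= factor
    return delays

assert backoff_delays() == [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
```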

Debugging and observability

  • Record/replay -- capture dataflow messages to .drec files, replay offline at any speed with node substitution for regression testing
  • Topic inspection -- topic echo to print live data, topic hz TUI for frequency analysis, topic info for schema and bandwidth
  • Resource monitoring -- dora top TUI showing per-node CPU, memory, queue depth, network I/O, restart count, and health status across all machines; --once flag for scriptable JSON snapshots
  • Trace inspection -- trace list and trace view for viewing coordinator spans without external infrastructure
  • Dataflow visualization -- generate interactive HTML or Mermaid graphs from YAML descriptors
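The core of a `topic hz`-style measurement is just averaging message inter-arrival times; a minimal sketch (not the actual TUI implementation):

```python
def estimate_hz(timestamps):
    """Average publish frequency from arrival times, as a `topic hz`-style
    tool might compute it (conceptual sketch, not Dora's code)."""
    if len(timestamps) < 2:
        return 0.0
    mean_interval = (timestamps[-1] - timestamps[0]) / (len(timestamps) - 1)
    return 1.0 / mean_interval

ts = [i * 0.02 for i in range(11)]   # messages spaced 20 ms apart
assert round(estimate_hz(ts)) == 50  # ~50 Hz
```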

Ecosystem

  • Communication patterns -- built-in service (request/reply), action (goal/feedback/result), and streaming (session/segment/chunk) patterns via well-known metadata keys; no daemon or YAML changes required
  • ROS2 bridge -- bidirectional interop with ROS2 topics, services, and actions; QoS mapping; Arrow-native type conversion
  • Pre-packaged nodes -- node hub with ready-made nodes for cameras, YOLO, LLMs, TTS, and more
  • In-process operators -- lightweight functions that run inside a shared runtime, avoiding per-node process overhead for simple transformations
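The service pattern boils down to correlating replies with requests through a request_id metadata key; this toy version shows the bookkeeping (the exact metadata key names Dora uses may differ):

```python
import itertools

# Toy request/reply correlation over a pub/sub link using a request_id
# metadata key (key names are illustrative, not Dora's exact wire format).
_ids = itertools.count()
pending = {}

def send_request(payload):
    rid = next(_ids)
    pending[rid] = None                       # awaiting a reply
    return {"metadata": {"request_id": rid}, "data": payload}

def on_reply(msg):
    pending[msg["metadata"]["request_id"]] = msg["data"]

req = send_request("ping")
on_reply({"metadata": {"request_id": req["metadata"]["request_id"]},
          "data": "pong"})
assert pending[0] == "pong"
```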

Installation

cargo install dora-cli           # CLI (dora command)
pip install dora-rs              # Python node/operator API

From source

git clone https://github.com/dora-rs/dora.git
cd dora
cargo build --release -p dora-cli
export PATH="$PATH:$(pwd)/target/release"

# Python API (requires maturin >= 1.8: pip install maturin)
# Must run from the package directory for dependency resolution
cd apis/python/node && maturin develop --uv && cd ../../..

Platform installers

macOS / Linux:

curl --proto '=https' --tlsv1.2 -LsSf \
  https://github.com/dora-rs/dora/releases/latest/download/dora-cli-installer.sh | sh

Windows:

powershell -ExecutionPolicy ByPass -c "irm https://github.com/dora-rs/dora/releases/latest/download/dora-cli-installer.ps1 | iex"

Build features

| Feature | Description | Default |
| --- | --- | --- |
| `tracing` | OpenTelemetry tracing support | Yes |
| `metrics` | OpenTelemetry metrics collection | Yes |
| `python` | Python operator support (PyO3) | No |
| `redb-backend` | Persistent coordinator state (redb) | Yes |

cargo install dora-cli --features redb-backend

Quick Start

1. Run a Python dataflow

Important: The PyPI package is dora-rs, not dora. The import name is dora (from dora import Node), but pip install dora installs an unrelated package.

cargo install dora-cli            # or use the install script above
pip install dora-rs numpy pyarrow
git clone https://github.com/dora-rs/dora.git && cd dora
dora run examples/python-dataflow/dataflow.yml

This runs a sender -> transformer -> receiver pipeline. Here's what the Python node code looks like:

# sender.py -- sends 100 messages
from dora import Node
import pyarrow as pa

node = Node()
for i in range(100):
    node.send_output("message", pa.array([i]))

# receiver.py -- receives and prints messages
from dora import Node

node = Node()
for event in node:
    if event["type"] == "INPUT":
        print(f"Got {event['id']}: {event['value'].to_pylist()}")
    elif event["type"] == "STOP":
        break

See the Python Getting Started Guide for a full tutorial, or the Python API Reference for complete API docs.

2. Run a Rust dataflow

cd examples/rust-dataflow
dora run dataflow.yml

3. Distributed mode (ad-hoc)

# Terminal 1: start coordinator + daemon
dora up

# Terminal 2: start a dataflow (--debug enables topic inspection)
dora start dataflow.yml --attach --debug

# Terminal 3: monitor
dora list
dora logs <dataflow-id>
dora top

# Stop or restart
dora stop <dataflow-id>
dora restart --name <name>
dora down

4. Managed cluster

# Bring up a multi-machine cluster from a config file
dora cluster up cluster.yml

# Start a dataflow across the cluster
dora start dataflow.yml --name my-app --attach

# Check cluster health
dora cluster status

# Tear down
dora cluster down

See the Distributed Deployment Guide for cluster.yml configuration, label scheduling, systemd services, rolling upgrades, and operational runbooks.

CLI Commands

Lifecycle

| Command | Description |
| --- | --- |
| `dora run <PATH>` | Run a dataflow locally (no coordinator/daemon needed) |
| `dora up` | Start coordinator and daemon in local mode |
| `dora down` | Tear down coordinator and daemon |
| `dora build <PATH>` | Run build commands from a dataflow descriptor |
| `dora start <PATH>` | Start a dataflow on a running coordinator |
| `dora stop <ID>` | Stop a running dataflow |
| `dora restart <ID>` | Restart a running dataflow (stop + re-start) |

Monitoring

| Command | Description |
| --- | --- |
| `dora list` | List running dataflows (alias: `ps`) |
| `dora logs <ID>` | Show logs for a dataflow or node |
| `dora top` | Real-time resource monitor (TUI); also `dora inspect top` |
| `dora topic list` | List topics in a dataflow |
| `dora topic hz <TOPIC>` | Measure topic publish frequency (TUI) |
| `dora topic echo <TOPIC>` | Print topic messages to stdout |
| `dora topic info <TOPIC>` | Show topic type and metadata |
| `dora node list` | List nodes in a dataflow |
| `dora node info <NODE>` | Show detailed node status, inputs, outputs, and metrics |
| `dora node add --from-yaml <FILE>` | Add a node to a running dataflow |
| `dora node remove <NODE>` | Remove a node from a running dataflow |
| `dora node connect <SRC> <DST>` | Add a live mapping between nodes |
| `dora node disconnect <SRC> <DST>` | Remove a live mapping between nodes |
| `dora node restart <NODE>` | Restart a single node within a running dataflow |
| `dora node stop <NODE>` | Stop a single node within a running dataflow |
| `dora topic pub <TOPIC> <DATA>` | Publish JSON data to a topic |
| `dora param list <NODE>` | List runtime parameters for a node |
| `dora param get <NODE> <KEY>` | Get a runtime parameter value |
| `dora param set <NODE> <KEY> <VALUE>` | Set a runtime parameter (JSON value) |
| `dora param delete <NODE> <KEY>` | Delete a runtime parameter |
| `dora trace list` | List recent traces captured by the coordinator |
| `dora trace view <ID>` | View spans for a specific trace (supports prefix matching) |
| `dora record <PATH>` | Record dataflow messages to .drec file |
| `dora replay <FILE>` | Replay recorded messages from .drec file |

Cluster management

| Command | Description |
| --- | --- |
| `dora cluster up <PATH>` | Bring up a cluster from a cluster.yml file |
| `dora cluster status` | Show connected daemons and active dataflows |
| `dora cluster down` | Tear down the cluster |
| `dora cluster install <PATH>` | Install daemons as systemd services |
| `dora cluster uninstall <PATH>` | Remove systemd services |
| `dora cluster upgrade <PATH>` | Rolling upgrade: SCP binary + restart per-machine |
| `dora cluster restart <NAME>` | Restart a dataflow by name or UUID |

Setup and utilities

| Command | Description |
| --- | --- |
| `dora doctor` | Diagnose environment, connectivity, and dataflow health |
| `dora status` | Check system health (alias: `check`) |
| `dora new` | Generate a new project or node |
| `dora graph <PATH>` | Visualize a dataflow (Mermaid or HTML) |
| `dora expand <PATH>` | Expand module references and print flat YAML |
| `dora validate <PATH>` | Validate dataflow YAML and check type annotations |
| `dora system` | System management (daemon/coordinator control) |
| `dora completion <SHELL>` | Generate shell completions |
| `dora self update` | Update dora CLI |

For full CLI documentation, see docs/cli.md. For distributed deployment, see docs/distributed-deployment.md.

Dataflow Configuration

Dataflows are defined in YAML. Each node declares its binary/script, inputs, and outputs:

nodes:
  - id: camera
    build: pip install opencv-video-capture
    path: opencv-video-capture
    inputs:
      tick: dora/timer/millis/20
    outputs:
      - image
    env:
      CAPTURE_PATH: 0
      IMAGE_WIDTH: 640
      IMAGE_HEIGHT: 480

  - id: object-detection
    build: pip install dora-yolo
    path: dora-yolo
    inputs:
      image: camera/image
    outputs:
      - bbox

  - id: plot
    build: pip install dora-rerun
    path: dora-rerun
    inputs:
      image: camera/image
      boxes2d: object-detection/bbox

Built-in timer nodes: dora/timer/millis/<N> and dora/timer/hz/<N>.

Input format: <node-id>/<output-name> to subscribe to another node's output. Long form supports queue_size, queue_policy (drop_oldest or backpressure), and input_timeout. See the YAML Specification for details.
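The two queue policies can be sketched with a bounded queue in a few lines (a conceptual illustration, not the daemon's actual delivery code):

```python
from collections import deque

def deliver(queue, msg, policy, maxlen=3):
    """Bounded input queue under the two policies (conceptual sketch)."""
    if len(queue) < maxlen:
        queue.append(msg)
        return True
    if policy == "drop_oldest":
        queue.popleft()          # evict the stalest message
        queue.append(msg)
        return True
    return False                 # "backpressure": sender must wait

q = deque()
for i in range(5):
    deliver(q, i, "drop_oldest")
assert list(q) == [2, 3, 4]      # oldest messages were dropped
```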

Type annotations: Optionally annotate ports with type URNs for static and runtime validation. See the Type Annotations Guide for the full type library.

nodes:
  - id: camera
    path: camera.py
    outputs:
      - image
    output_types:
      image: std/media/v1/Image
dora validate dataflow.yml                        # static check (warnings)
dora validate --strict-types dataflow.yml         # fail on warnings (CI)
dora build dataflow.yml --strict-types            # type check during build
DORA_RUNTIME_TYPE_CHECK=warn dora run dataflow.yml  # runtime check
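Conceptually, static type validation checks that each annotated input matches the type URN declared on the upstream output it subscribes to; a simplified sketch (not the validator's actual code):

```python
# Simplified sketch of static port-type checking (not Dora's validator).
def check_types(declared_outputs, wired_inputs):
    """declared_outputs: {"node/port": type_urn}
    wired_inputs: {"node/port": (upstream "node/port", expected type_urn)}"""
    errors = []
    for port, (source, expected) in wired_inputs.items():
        actual = declared_outputs.get(source)
        if actual is not None and actual != expected:
            errors.append(f"{port}: expected {expected}, got {actual}")
    return errors

outs = {"camera/image": "std/media/v1/Image"}
ok = check_types(outs, {"detector/image": ("camera/image", "std/media/v1/Image")})
bad = check_types(outs, {"detector/image": ("camera/image", "std/geometry/v1/Pose")})
assert ok == [] and len(bad) == 1
```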

Modules: Extract reusable sub-graphs into separate files with module: instead of path:. See the Modules Guide for details.

nodes:
  - id: nav_stack
    module: modules/navigation.module.yml
    inputs:
      goal_pose: localization/goal
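Compile-time expansion means a `module:` node is replaced by the module's own nodes before the dataflow runs; this toy version shows the idea (the id-prefixing scheme here is illustrative, not necessarily Dora's exact rule):

```python
# Conceptual compile-time module expansion (illustrative, not Dora's code).
def expand(nodes, modules):
    flat = []
    for n in nodes:
        if "module" in n:
            for inner in modules[n["module"]]:
                flat.append({**inner, "id": f"{n['id']}/{inner['id']}"})
        else:
            flat.append(n)
    return flat

modules = {"modules/navigation.module.yml": [{"id": "planner"}, {"id": "controller"}]}
flat = expand([{"id": "nav_stack", "module": "modules/navigation.module.yml"}], modules)
assert [n["id"] for n in flat] == ["nav_stack/planner", "nav_stack/controller"]
```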

Architecture

CLI  -->  Coordinator  -->  Daemon(s)  -->  Nodes / Operators
             (orchestration)  (per machine)    (user code)
| Layer | Protocol | Purpose |
| --- | --- | --- |
| CLI <-> Coordinator | WebSocket (port 6013) | Build, run, stop commands |
| Coordinator <-> Daemon | WebSocket | Node spawning, dataflow lifecycle |
| Daemon <-> Daemon | Zenoh | Distributed cross-machine communication |
| Node <-> Node | Zenoh SHM | Direct zero-copy data plane for messages >4KB |
| Daemon <-> Node | Shared memory / TCP | Control plane + small message delivery |
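A toy decision function mirroring the table above; the real daemon routing is more involved, but the documented >4KB cutover captures the gist:

```python
SHM_THRESHOLD = 4 * 1024   # documented cutover point: messages >4KB

def pick_transport(size_bytes, same_machine=True):
    """Toy routing decision mirroring the layer table (illustrative only;
    the actual daemon logic is more involved)."""
    if not same_machine:
        return "zenoh-network"
    return "shared-memory" if size_bytes > SHM_THRESHOLD else "tcp"

assert pick_transport(16 * 1024) == "shared-memory"
assert pick_transport(512) == "tcp"
assert pick_transport(1024 * 1024, same_machine=False) == "zenoh-network"
```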

Key components

  • Coordinator -- orchestrates dataflow lifecycle across daemons. Persistent redb state store by default; daemons auto-reconnect on coordinator restart.
  • Daemon -- spawns and manages nodes on a single machine. Routes messages and manages Zenoh SHM data plane.
  • Runtime -- in-process operator execution engine. Operators run inside the runtime process, avoiding per-operator process overhead.
  • Nodes -- standalone processes that communicate via inputs/outputs. Written in Rust, Python, C, or C++.
  • Operators -- lightweight functions that run inside the runtime. Faster than nodes for simple transformations.

Workspace layout

binaries/
  cli/                  # dora CLI binary
  coordinator/          # Orchestration service
  daemon/               # Node manager + IPC
  runtime/              # In-process operator runtime
  ros2-bridge-node/     # ROS2 bridge binary
  record-node/          # Dataflow message recorder
  replay-node/          # Recorded message replayer
libraries/
  core/                 # Descriptor parsing, build utilities
  message/              # Inter-component message types
  shared-memory-server/ # Zero-copy IPC
  arrow-convert/        # Arrow data conversion
  recording/            # .drec recording format
  log-utils/            # Log parsing, merging, formatting
  coordinator-store/    # Persistent coordinator state (redb)
  extensions/
    telemetry/          # OpenTelemetry tracing + metrics
    ros2-bridge/        # ROS2 interop (bridge, msg-gen, arrow, python)
    download/           # Download utilities
apis/
  rust/node/            # Rust node API (dora-node-api)
  rust/operator/        # Rust operator API (dora-operator-api)
  python/node/          # Python node API (PyO3)
  python/operator/      # Python operator API (PyO3)
  python/cli/           # Python CLI interface
  c/node/               # C node API
  c/operator/           # C operator API
  c++/node/             # C++ node API (CXX bridge)
  c++/operator/         # C++ operator API (CXX bridge)
examples/               # Example dataflows

Language Support

| Language | Node API | Operator API | Docs | Status |
| --- | --- | --- | --- | --- |
| Rust | dora-node-api | dora-operator-api | API Reference | First-class |
| Python >= 3.8 | pip install dora-rs | included | Getting Started, API Reference | First-class |
| C | dora-node-api-c | dora-operator-api-c | API Reference | Supported |
| C++ | dora-node-api-cxx | dora-operator-api-cxx | API Reference | Supported |
| ROS2 >= Foxy | dora-ros2-bridge | -- | Bridge Guide | Experimental |

Platform support

| Platform | Rust / Python | C / C++ templates |
| --- | --- | --- |
| Linux (x86_64, ARM64, ARM32) | First-class (PR-gated) | First-class (nightly-gated) |
| macOS (ARM64) | First-class (nightly-gated) | Best effort (nightly-gated) |
| Windows (x86_64) | Best effort (nightly-gated) | Best effort (not gated) |
| WSL (x86_64) | Best effort | Best effort (not gated) |

Gate meanings (#1716):

  • PR-gated — every PR to main runs these tests; merge is blocked on failure.
  • Nightly-gated — the daily scheduled run (.github/workflows/nightly.yml) runs these. A failure auto-files a nightly-regression issue but does NOT block PRs.
  • Not gated — no automated CI coverage. Regressions surface via user reports.

Template tests for dora new --lang rust and --lang python run in nightly across all three platforms; the C/C++ variants run in nightly on Linux only. Developers who need cross-platform verification before merge can run make qa-test / make qa-examples / make qa-nightly locally. See docs/testing-matrix.md for the full rationale.

Examples

Core language examples

| Example | Language | Description |
| --- | --- | --- |
| rust-dataflow | Rust | Basic Rust node pipeline |
| python-dataflow | Python | Python sender/transformer/receiver |
| python-operator-dataflow | Python | Python operators (in-process) |
| python-dataflow-builder | Python | Pythonic imperative API |
| c-dataflow | C | C node example |
| c++-dataflow | C++ | C++ node example |
| c++-arrow-dataflow | C++ | C++ with Arrow data |
| cmake-dataflow | C/C++ | CMake-based build |

Composition

| Example | Language | Description |
| --- | --- | --- |
| module-dataflow | Python | Reusable module composition |
| typed-dataflow | Python | Type annotations with dora validate |

Communication patterns

| Example | Language | Description |
| --- | --- | --- |
| service-example | Rust | Request/reply with request_id correlation |
| action-example | Rust | Goal/feedback/result with cancellation |
| streaming-example | Python | Token-by-token generation with session/seq/fin metadata |

See docs/patterns.md for the full guide.
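The streaming pattern's session/seq/fin metadata can be exercised with a toy reassembler (message shapes are assumed from the example description above, not Dora's exact wire format):

```python
# Toy chunk reassembler for session/seq/fin streaming metadata
# (message shapes are assumptions for illustration).
def reassemble(chunks):
    sessions, completed = {}, {}
    for c in chunks:
        m = c["metadata"]
        sessions.setdefault(m["session"], {})[m["seq"]] = c["data"]
        if m.get("fin"):
            parts = sessions[m["session"]]
            completed[m["session"]] = "".join(parts[i] for i in sorted(parts))
    return completed

chunks = [
    {"metadata": {"session": "s1", "seq": 0}, "data": "Hel"},
    {"metadata": {"session": "s1", "seq": 1}, "data": "lo"},
    {"metadata": {"session": "s1", "seq": 2, "fin": True}, "data": "!"},
]
assert reassemble(chunks) == {"s1": "Hello!"}
```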

Dynamic topology

| Example | Language | Description |
| --- | --- | --- |
| dynamic-add-remove | Python | Add/remove nodes from running dataflows |
| dynamic-agent-tools | Python | AI agent with dynamically-added tools |

Advanced patterns

| Example | Language | Description |
| --- | --- | --- |
| python-async | Python | Async Python nodes |
| python-concurrent-rw | Python | Concurrent read-write patterns |
| python-multiple-arrays | Python | Multi-array handling |
| python-drain | Python | Event draining patterns |
| multiple-daemons | Rust | Distributed multi-daemon setup |
| rust-dataflow-git | Rust | Git-based dataflow loading |
| rust-dataflow-url | Rust | URL-based dataflow loading |

Logging

| Example | Language | Description |
| --- | --- | --- |
| python-logging | Python | Python logging integration |
| python-log | Python | Basic Python log output |
| log-sink-tcp | YAML | TCP-based log sink |
| log-sink-file | YAML | File-based log sink |
| log-sink-alert | YAML | Alert-based log sink |
| log-aggregator | Python | Centralized log aggregation via dora/logs |

Performance

| Example | Language | Description |
| --- | --- | --- |
| benchmark | Rust/Python | Latency and throughput benchmark |
| ros2-comparison | Python | Dora vs ROS2 comparison |
| cuda-benchmark | Rust/CUDA | GPU zero-copy benchmark |

ROS2 integration

| Example | Description |
| --- | --- |
| ros2-bridge/rust | Rust ROS2 topics, services, actions |
| ros2-bridge/python | Python ROS2 integration |
| ros2-bridge/c++ | C++ ROS2 integration |
| ros2-bridge/yaml-bridge | YAML-based ROS2 topic bridge |
| ros2-bridge/yaml-bridge-service | YAML ROS2 service bridge |
| ros2-bridge/yaml-bridge-action | YAML ROS2 action client |
| ros2-bridge/yaml-bridge-action-server | YAML ROS2 action server |

Development

Rust edition 2024; MSRV and default workspace package metadata are tracked in [workspace.package] of the root Cargo.toml. Most crates inherit the workspace version via version.workspace = true; a handful (e.g. apis/rust/operator/types, the examples/error-propagation/* samples) pin their own version independently.

Build

# Build all (excluding Python packages which require maturin)
cargo build --all \
  --exclude dora-node-api-python \
  --exclude dora-operator-api-python \
  --exclude dora-ros2-bridge-python

# Build specific package
cargo build -p dora-cli

Test

# Run all tests
cargo test --all \
  --exclude dora-node-api-python \
  --exclude dora-operator-api-python \
  --exclude dora-ros2-bridge-python

# Test single package
cargo test -p dora-core

# Smoke tests (requires coordinator/daemon)
cargo test --test example-smoke -- --test-threads=1

Lint and format

cargo clippy --all
cargo fmt --all -- --check

Run examples

cargo run --example rust-dataflow
cargo run --example python-dataflow
cargo run --example benchmark --release

Quality assurance

Dora ships with a three-tier QA system designed for AI-authored code. Everything runs locally first; CI mirrors the same scripts.

make qa-install        # one-time: install cargo-audit, cargo-deny, cargo-llvm-cov, cargo-mutants, cargo-semver-checks
make qa-fast           # ~15s    -- fmt + clippy + audit + unwrap-budget + typos (pre-commit)
make qa-full           # ~5-10m  -- qa-fast + tests + coverage (pre-push)
make qa-deep           # ~15m    -- qa-full + mutation testing + semver (target Tier 1 gate, stronger than today's CI; alias: qa-tier1)
make qa-nightly        # ~3-4h -- qa-deep + proptest@1000 + miri + example-smoke + ci-nightly-jobs (full parity with .github/workflows/nightly.yml)
make qa-release-gate   #         -- qa-deep + semver (Tier 3 automatable; audit/dogfood are human)
make qa-mutation-audit # ~10-18h -- full-repo cargo-mutants; deliberate test-quality audit
make qa-examples       # ~15-20m -- run all smoke-eligible example dataflows end-to-end (skips CUDA/ROS2/C++/interactive)

On Ubuntu, install ripgrep separately and install typos-cli with Cargo:

sudo apt update
sudo apt install ripgrep
cargo install typos-cli

Gates in place:

  • Supply chain -- cargo-audit + cargo-deny for CVEs, license policy, dependency bans
  • Unwrap ratchet -- counts .unwrap() / .expect( in production code; can only go down (.unwrap-budget)
  • Coverage -- cargo-llvm-cov with diff-coverage gate (70% on PR-touched lines)
  • Mutation testing -- cargo-mutants against critical crates (library crates at package scope, binary crates with test_workspace = true)
  • Property testing -- proptest on wire-protocol types; catches edge cases unit tests miss
  • Miri -- UB detection on pure-Rust unsafe hotspots (e.g., dora-core::metadata)
  • SemVer check -- cargo-semver-checks against the last git tag
  • Adversarial LLM review -- scripts/qa/adversarial.sh runs a different model on your diff to catch single-model blind spots (local today; CI pending API secret)
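The unwrap ratchet from the list above is conceptually tiny: count panicky call sites and enforce a budget that may only decrease. A sketch (not the actual .unwrap-budget script):

```python
# Sketch of the unwrap-ratchet idea (not the actual .unwrap-budget script).
def count_unwraps(source: str) -> int:
    return source.count(".unwrap()") + source.count(".expect(")

def within_budget(source: str, budget: int) -> bool:
    return count_unwraps(source) <= budget

code = 'let x = foo().unwrap();\nlet y = bar().expect("boom");\n'
assert count_unwraps(code) == 2
assert within_budget(code, 2) and not within_budget(code, 1)
```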

Reference docs:

  • Contributor QA Cheat Sheet -- contributor-oriented setup, day-to-day commands, and PR validation checklist
  • QA Runbook -- day-to-day command reference, failure modes, and fixes
  • Agentic QA Strategy -- full three-tier design and rationale
  • POC Report -- case studies, metrics, lessons learned, recommendations for the wider ecosystem

Contributing

We welcome contributors of all experience levels. See the contributing guide to get started.

For non-trivial work, discuss the approach in a GitHub issue, discussion, or Discord thread before implementing it. Before opening or updating a PR, run the QA level appropriate for the change and include the validation you ran in the PR description. The Contributor QA Cheat Sheet is the fastest day-to-day reference; the stricter per-change policy lives in docs/agentic-qa-policy.md.


AI-Assisted Development

This repository is maintained with AI-assisted agentic engineering. Code generation, reviews, refactoring, testing, and commits are driven by autonomous AI agents -- enabling faster iteration and higher code quality at scale.

License

Apache-2.0. See NOTICE.md for details.