Learning-2-Sample

September 19, 2025

RL0 in Evaluation Scenarios

Installation

git clone <this-repository>

Conda

conda create -n learning2sample python=3.11
conda activate learning2sample

The following commands install all necessary dependencies. They also install external/dreamerv3 and external/frenetix-motion-planner as package dependencies, so that any changes made in these projects are reflected directly.

pip install poetry==2.1.3
poetry install

Training

Before training, please check and consider:

I. Before Training

Observation Settings

observation.py

Set the maximum number of obstacles to be considered:

max_elements: int = 3  # maximum number of obstacles in the set

Set the maximum number of laterally adjacent road boundaries between the ego position and the final goal lanelet:

max_lanelets: int = 3 # maximum number of road boundaries (2: just a single lane, 3: two parallel lanes, 4: three parallel ...)
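As a rough illustration of what a cap such as max_elements implies, the sketch below keeps the nearest obstacles and pads the remainder so that the observation always has a fixed size. The function name, the (x, y) obstacle format, and the zero padding with a mask are assumptions made for illustration; observation.py may encode the set differently.

```python
import math


def build_obstacle_set(ego_xy, obstacles_xy, max_elements=3):
    """Hypothetical helper: keep the nearest obstacles and pad to a fixed size."""
    # Sort obstacles by Euclidean distance to the ego vehicle and truncate.
    nearest = sorted(obstacles_xy, key=lambda p: math.dist(ego_xy, p))[:max_elements]
    n_padding = max_elements - len(nearest)
    # The mask marks real entries with 1 and padded entries with 0.
    mask = [1] * len(nearest) + [0] * n_padding
    padded = nearest + [(0.0, 0.0)] * n_padding
    return padded, mask


if __name__ == "__main__":
    print(build_obstacle_set((0.0, 0.0), [(5.0, 0.0), (1.0, 0.0)]))
```

A larger max_elements lets the agent see more obstacles, at the price of a larger observation.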

Generated Trajectory Evaluation

reinforcement_planner_cpp.py

To make the RL agent learn more directly what good trajectories are, consider commenting out the fallback below. It selects the first feasible trajectory whenever no trajectory is drivable until its end.

# ********************************************
# Nice / Hard Mode (Comment Out) - Select first feasible trajectory in case no drivable trajectory exists
# ********************************************
if optimal_trajectory is None and len(feasible_trajectories) > 0:
    optimal_trajectory = feasible_trajectories[0]
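The difference between the two modes can be sketched as a small standalone function. The function name, the nice_mode flag, and the use of strings as stand-ins for trajectories are hypothetical; how reinforcement_planner_cpp.py handles a missing trajectory afterwards is not shown here.

```python
from typing import Optional, Sequence


def select_trajectory(
    optimal: Optional[str],
    feasible: Sequence[str],
    nice_mode: bool = True,
) -> Optional[str]:
    """Return the trajectory to execute; None means nothing was selected."""
    if optimal is not None:
        return optimal
    # Nice mode: fall back to the first feasible trajectory.
    if nice_mode and len(feasible) > 0:
        return feasible[0]
    # Hard mode (fallback commented out): no trajectory is selected.
    return None


if __name__ == "__main__":
    print(select_trajectory(None, ["t0", "t1"], nice_mode=True))
    print(select_trajectory(None, ["t0", "t1"], nice_mode=False))
```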

Wandb

Set the wandb API Key:

export WANDB_API_KEY=<key>

Optionally, add notes for the run (for more environment variables, see the wandb documentation):

export WANDB_NOTES="Very interesting run"
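Before starting a long run, it can help to confirm that the variables are actually set in the current shell. The helper below is a hypothetical pre-flight check, not part of this repository; it only inspects the environment and does not contact wandb.

```python
import os


def missing_wandb_variables(env=None, required=("WANDB_API_KEY",)):
    """Hypothetical check: return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_wandb_variables()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("wandb environment looks complete.")
```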

II. Start Training

The following command assumes that the current working directory is:
./learning2sample/external/dreamerv3

Finally, start training with:

python3 -u ./dreamerv3/main.py --configs frenetix_train size200m  2>&1 | tee dreamer_train_log.txt

III. Continue Training Run

The following command assumes that the current working directory is:
./learning2sample/external/dreamerv3

Execute the same command as used previously for the training and add the logdir to the call:

python3 -u ./dreamerv3/main.py <train-arguments> --logdir <logdir>

Pass the actual log directory of the run, not one of its subdirectories.

Evaluation

The evaluation consists of the inference of the Dreamer Model and the execution of the Baseline Planner.
We also provide a bash script that executes these two steps and permutes the number of sample rings accordingly (see Unified Execution).

To reproduce the paper's qualitative and quantitative results, we provide:

the trained weights of the Dreamer model
all CommonRoad scenarios used, in ./scenarios
this code base

DreamerV3 Inference

First, adapt the evaluation settings in the configs.yaml:

frenetix_eval:
  ...
  
  run: {
    ...
    steps: 300,  # number of evaluation steps - corresponds to number of evaluation scenarios (right now set for the quantitative scenarios)
    ...}

  env.frenetix.sample_rings: 3  # set here the number of sample rings around the ego vehicle [0,1,2,3, ...]  
  env.frenetix.scenario_folder:  "../../scenarios/quantitative"  # set here the scenario folder or leave it "" to use the dynamically generated scenarios during training
  ...
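Since steps should match the number of evaluation scenarios, counting the files in the scenario folder is a quick sanity check before a run. This is a minimal sketch that assumes one scenario per .xml file directly inside the folder; the helper name and the file pattern are assumptions, not part of the repository.

```python
from pathlib import Path


def count_scenarios(folder, pattern="*.xml"):
    """Hypothetical helper: count scenario files directly inside `folder`."""
    return sum(1 for path in Path(folder).glob(pattern) if path.is_file())


if __name__ == "__main__":
    # Path taken from the config above, relative to external/dreamerv3.
    print(count_scenarios("../../scenarios/quantitative"))
```

If the printed count differs from the configured steps value, one of the two probably needs adjusting.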

Then, run the model inference:

The following command assumes that the current working directory is:
./learning2sample/external/dreamerv3

python3 -u dreamerv3/main.py --configs frenetix_train <model_size> frenetix_eval --run.from_checkpoint <checkpoint_folder>

Hint: The <checkpoint_folder> contains the done file.

Base Planner Execution

The Base / Standard Planner can be executed with:

python3 gym_env_basic.py --eval --episodes 300 --scen-folder ./scenarios/<folder> --standard --rings <int>

Unified Execution

To recompute the paper results, run the bash script:

bash ./exec.sh
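The contents of exec.sh are not reproduced here. As a rough sketch of the ring permutation for the baseline planner only, the snippet below builds one command per ring count using the flags from the Base Planner Execution section. The ring values, the scenario folder, and the helper name are illustrative assumptions; the commands are printed, not executed.

```python
import shlex


def baseline_commands(rings, scen_folder="./scenarios/quantitative", episodes=300):
    """Hypothetical helper: build one baseline planner command per ring count."""
    commands = []
    for ring_count in rings:
        commands.append([
            "python3", "gym_env_basic.py", "--eval",
            "--episodes", str(episodes),
            "--scen-folder", scen_folder,
            "--standard",
            "--rings", str(ring_count),
        ])
    return commands


if __name__ == "__main__":
    for command in baseline_commands([0, 1, 2, 3]):
        print(shlex.join(command))
```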

Results Analysis

Finally, the data analysis and plots can be reproduced with the provided Jupyter notebooks.

Visualization

To create a .gif from the *.svg files, you can use svgs2gif.py as follows:

python3 ./svgs2gif.py <folder to svgs> --fps <int>

See --help for more information.

Additional Visualizations

RL (ours) vs RL3 (ours) vs Baseline Planner B800

RL Versions (ours) against Baseline Planner

Evasive Scenario: RL3 (ours) vs FISS+

RL (ours) against FISS+ Planner in Evasive Scenario

FISS+ Scenario: RL3 (ours) vs FISS+

RL (ours) against FISS+ Planner in FISS+ Scenario

Simulation Hints

Curvilinear Projection Domain Constraints

The project depends on the curvilinear coordinate system and all relevant observations are transformed into curvilinear coordinates.

Thus, if the projection domain becomes too narrow around the reference path, consider smoothing the calculated reference path further by restricting its curvature in frenet_interface.py.

Careful: If it is too smooth, the reference path may shortcut steep road curves!

params.processing_option = ProcessingOption.CURVE_SUBDIVISION
params.subdivision.max_curvature = 0.20 # reduce for smoother reference path
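To get a feeling for the value, recall that curvature is the reciprocal of the curve radius. The snippet below converts a curvature limit into the corresponding minimum radius, assuming the limit is given in 1/m; both the unit and the helper name are assumptions for illustration.

```python
def min_curve_radius(max_curvature):
    """Hypothetical helper: minimum curve radius for a given curvature limit."""
    if max_curvature <= 0:
        raise ValueError("max_curvature must be positive")
    return 1.0 / max_curvature


if __name__ == "__main__":
    for kappa in (0.20, 0.10, 0.05):
        print(f"max_curvature={kappa:.2f} -> min radius {min_curve_radius(kappa):.1f}")
```

Under this assumption, 0.20 allows curves down to a radius of about 5 m. Lowering the limit enlarges the minimum radius, which smooths the path but makes it more likely to cut across tight road curves.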