OBD

October 31, 2024

Official repository for the NeurIPS 2024 paper "Offline Behavior Distillation" by Shiye Lei, Sen Zhang, and Dacheng Tao.

Dependencies

  • Python 3.7
  • PyTorch 1.11
  • MuJoCo 2.10
  • d4rl
  • wandb
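
Before running the commands in Quick Start, it can help to confirm that the listed packages are importable. The following is a minimal sketch, not part of the repository; the import names `torch`, `mujoco_py`, `d4rl`, and `wandb` are assumptions about how the dependencies above are exposed in Python, so adjust them to match your installation.

```python
import importlib.util

# Assumed import names for the dependencies listed above
# (illustrative only, not taken from the repository).
ASSUMED_MODULES = ["torch", "mujoco_py", "d4rl", "wandb"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All assumed modules were found.")
```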

Quick Start

Synthesize Behavioral Datasets

  • Av-PBC
python obd_bptt.py --env 'halfcheetah-medium-replay-v2' --match_objective 'offline_policy' --q_weight --save_dir './saved_synset' --seed 0
  • PBC
python obd_bptt.py --env 'halfcheetah-medium-replay-v2' --match_objective 'offline_policy' --save_dir './saved_synset' --seed 0
  • DBC
python obd_bptt.py --env 'halfcheetah-medium-replay-v2' --match_objective 'offline_data' --save_dir './saved_synset' --seed 0
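
The three commands differ only in `--match_objective` and `--q_weight`. As a rough illustration of what the paper's objectives measure, the sketch below computes a decision difference between the actions of a policy trained on the distilled data and a target: actions stored in the offline data (DBC), actions of a near-expert policy (PBC), or the PBC difference weighted per state by action values (Av-PBC). The function name `decision_difference`, the squared-error form, and the toy numbers are illustrative assumptions, not code from `obd_bptt.py`.

```python
def decision_difference(policy_actions, target_actions, weights=None):
    """Mean over states of the (optionally weighted) squared action difference."""
    if weights is None:
        weights = [1.0] * len(policy_actions)
    total = 0.0
    for pred, target, w in zip(policy_actions, target_actions, weights):
        total += w * sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / len(policy_actions)


# Toy batch: two states, two-dimensional actions (made-up numbers).
policy_actions = [[0.0, 0.0], [1.0, 1.0]]   # policy trained on the distilled data
dataset_actions = [[1.0, 0.0], [1.0, 3.0]]  # actions stored in the offline data
expert_actions = [[0.0, 1.0], [1.0, 2.0]]   # actions of a near-expert policy
q_values = [2.0, 0.5]                       # hypothetical action values per state

dbc = decision_difference(policy_actions, dataset_actions)             # 'offline_data'
pbc = decision_difference(policy_actions, expert_actions)              # 'offline_policy'
av_pbc = decision_difference(policy_actions, expert_actions, q_values) # 'offline_policy' + q_weight

print(dbc, pbc, av_pbc)  # 2.5 1.0 1.25
```

In this toy setup DBC and PBC share the same computation and differ only in the target actions, while Av-PBC additionally scales each state's difference by its action value.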

Evaluate Behavioral Datasets

  • Standard evaluation
python evaluate_synset.py --env 'halfcheetah-medium-replay-v2' --match_objective 'offline_policy' --q_weight --eval_freq 1000 --save_dir './saved_synset' --group 'Evaluate' --seed 0
  • Ensemble evaluation
python evaluate_synset.py --env 'halfcheetah-medium-replay-v2' --match_objective 'offline_policy' --q_weight --eval_freq 1000 --eval_ensemble --ensemble_policy_num 10 --save_dir './saved_synset' --group 'Ensemble-Evaluate' --seed 0
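
`--eval_ensemble` with `--ensemble_policy_num 10` evaluates several policies together. One common way to combine deterministic policies is to average their actions for each state; the sketch below shows that idea with toy linear policies. Whether `evaluate_synset.py` combines policies this way is an assumption, and `ensemble_action` and `make_linear_policy` are hypothetical names.

```python
def make_linear_policy(gain):
    """Toy deterministic policy: scales every state dimension by `gain`."""
    def policy(state):
        return [gain * x for x in state]
    return policy


def ensemble_action(policies, state):
    """Average the actions proposed by each policy for one state."""
    actions = [policy(state) for policy in policies]
    return [sum(dim) / len(actions) for dim in zip(*actions)]


# Three toy policies stand in for policies trained on the same synthetic dataset.
policies = [make_linear_policy(g) for g in (1.0, 2.0, 3.0)]
action = ensemble_action(policies, [1.0, -2.0])
print(action)  # [2.0, -4.0]
```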

Cross Arch/Optim Evaluation

python evaluate_cross_arch.py --env 'halfcheetah-medium-replay-v2' --match_objective 'offline_policy' --q_weight --eval_freq 1000 --save_dir './saved_synset' --group 'Cross-Arch-Optim-Evaluate' --seed 0

Citation

@inproceedings{
lei2024offline,
title={Offline Behavior Distillation},
author={Lei, Shiye and Zhang, Sen and Tao, Dacheng},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024}
}

Contact

For any issues, please contact Shiye Lei: leishiye@gmail.com

Acknowledgment