Tutorial on Exporting PyTorch Model to TensorRT
September 24, 2025
These instructions walk through exporting a baseline model to ONNX and building both INT8 and FP32 TensorRT engines.
Export model to ONNX and dump calibration samples
We use the script below to export the PyTorch model into ONNX format and generate calibration samples for quantization.
python opencood/tools/inference_onnx_dump_calibration.py --model_dir ${MODEL_DIR} --dump_npz
Arguments Explanation:
model_dir: path to the trained PyTorch checkpoint.
--dump_npz: saves calibration samples in .npz format for quantization.
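Before quantizing, it can help to check what the dumped samples actually contain. The sketch below is not part of the repository. It assumes only that the samples are standard NumPy .npz archives sitting in one folder, and it makes no assumption about the array names inside them. The helper name summarize_npz_dir is hypothetical.

```python
from pathlib import Path

import numpy as np


def summarize_npz_dir(npz_dir):
    """Map each .npz file name to {array key: (shape, dtype name)}.

    Hypothetical helper: the layout of the dumped samples is not
    documented in this tutorial, so nothing is assumed about key names.
    """
    summary = {}
    for path in sorted(Path(npz_dir).glob("*.npz")):
        # np.load on an .npz returns a lazy archive; arrays are read on access.
        with np.load(path) as sample:
            arrays = {}
            for key in sample.files:
                array = sample[key]
                arrays[key] = (array.shape, str(array.dtype))
            summary[path.name] = arrays
    return summary
```

Calling summarize_npz_dir on the calibration folder and printing the result shows whether every sample has the same keys, shapes, and dtypes, which is a quick sanity check before an engine build.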
Build TensorRT INT8 engine
After exporting the ONNX model and calibration samples, build the INT8 TensorRT engine:
python opencood/tools/build_trt_int8.py --onnx ${ONNX_ENGINE}.onnx --npz_dir ${CALIBRATION_NPZ} --cache ${CACHE_FILE}.cache --engine ${MODEL_INT8}.plan
Arguments Explanation:
onnx: exported ONNX model file.
npz_dir: folder containing calibration .npz samples.
cache: calibration cache file to accelerate future builds.
engine: output TensorRT INT8 engine file (.plan).
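The cache argument speeds up later builds because TensorRT's INT8 calibrator interface (for example trt.IInt8EntropyCalibrator2) exposes read_calibration_cache() and write_calibration_cache(cache) hooks, and when the read hook returns data, calibration is skipped. How build_trt_int8.py implements these hooks is not shown in this tutorial, so the following is only a minimal sketch of the file handling such hooks typically wrap. It is written as plain functions so it runs without TensorRT installed.

```python
from pathlib import Path
from typing import Optional


def read_calibration_cache(cache_path: str) -> Optional[bytes]:
    """Return the cached calibration data, or None if no cache exists yet.

    Returning None from a calibrator's read hook tells TensorRT to run
    calibration on the sample batches instead of reusing a cache.
    """
    path = Path(cache_path)
    if path.is_file():
        return path.read_bytes()
    return None


def write_calibration_cache(cache_path: str, cache) -> None:
    """Persist the calibration data handed over after calibration.

    `cache` is treated as any bytes-like object (an assumption of this sketch).
    """
    path = Path(cache_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(bytes(cache))
```

One consequence of this design: if the ONNX model or the calibration samples change, an old cache file no longer matches them, so it is safer to delete it before rebuilding.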
Build TensorRT FP32 engine
For a full-precision engine without quantization, build the FP32 TensorRT engine:
python opencood/tools/build_trt_fp32.py --onnx ${ONNX_ENGINE}.onnx --npz_dir ${CALIBRATION_NPZ} --engine ${MODEL_FP32}.plan
Arguments Explanation:
onnx: exported ONNX model file.
npz_dir: folder containing calibration .npz samples.
engine: output TensorRT FP32 engine file (.plan).
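The three commands in this tutorial can be chained from one small driver. The sketch below is an illustration, not a script shipped with the repository: the script paths and flags are copied from the commands above, while the function names and the dry_run switch are hypothetical. Every path is passed in by the caller, because the tutorial does not state where the export step writes the ONNX file or the calibration samples.

```python
import shlex
import subprocess

TOOLS = "opencood/tools"


def export_cmd(model_dir):
    """Command for ONNX export plus calibration sample dump."""
    return ["python", f"{TOOLS}/inference_onnx_dump_calibration.py",
            "--model_dir", model_dir, "--dump_npz"]


def int8_cmd(onnx, npz_dir, cache, engine):
    """Command for the INT8 engine build."""
    return ["python", f"{TOOLS}/build_trt_int8.py", "--onnx", onnx,
            "--npz_dir", npz_dir, "--cache", cache, "--engine", engine]


def fp32_cmd(onnx, npz_dir, engine):
    """Command for the FP32 engine build (no calibration cache)."""
    return ["python", f"{TOOLS}/build_trt_fp32.py", "--onnx", onnx,
            "--npz_dir", npz_dir, "--engine", engine]


def run_pipeline(model_dir, onnx, npz_dir, cache, int8_engine, fp32_engine,
                 dry_run=True):
    """Print the three commands in order; execute them unless dry_run is set."""
    commands = [
        export_cmd(model_dir),
        int8_cmd(onnx, npz_dir, cache, int8_engine),
        fp32_cmd(onnx, npz_dir, fp32_engine),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)
    return commands
```

With dry_run=True the function only prints the commands, which makes it easy to confirm the paths before starting an engine build.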