accelerate-methods.md

July 22, 2024

GPU memory

Reduce GPU memory fragmentation:

PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
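The variable can also be set from inside a script. The following is a minimal sketch, assuming the setting should apply to the whole process and that nothing has touched CUDA yet; the allocator reads it when it initializes, so it is safest to set it before importing torch.

```python
import os

# Assumption: no CUDA memory has been allocated yet in this process.
# The caching allocator reads this variable when it initializes,
# so set it before `import torch` to be safe.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

# Import torch only after the variable is set, e.g.:
# import torch
```

From a shell, the same effect comes from prefixing the launch command, for example `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python train.py`, where `train.py` is a hypothetical script name.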

Speedup

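TF32: on NVIDIA Ampere and newer GPUs, allowing TensorFloat-32 for float32 matrix multiplications and convolutions trades a small amount of precision for speed.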
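A minimal sketch of turning TF32 on, assuming a PyTorch version that exposes the `allow_tf32` backend flags. The helper name `enable_tf32` is hypothetical and only groups the two assignments.

```python
import torch


def enable_tf32() -> None:
    """Allow TF32 in cuBLAS matmuls and cuDNN convolutions (hypothetical helper)."""
    # Affects float32 matrix multiplications on CUDA.
    torch.backends.cuda.matmul.allow_tf32 = True
    # Affects float32 convolutions that go through cuDNN.
    torch.backends.cudnn.allow_tf32 = True


enable_tf32()
```

The flags are process-wide, so it is usually enough to call this once at startup. They only change behavior on GPUs that support TF32; elsewhere they are harmless.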

torch.backends.cudnn.enabled: enables cuDNN for operations such as convolution layers and RNNs, which can yield a significant speedup. It defaults to True, so it only needs to be set if it was disabled elsewhere.
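A short sketch for checking and setting the flag, assuming a standard PyTorch install. On a CPU-only build the flag can still be set, but cuDNN is reported as unavailable and the flag has no effect.

```python
import torch

# Whether this PyTorch build can use cuDNN at all.
cudnn_available = torch.backends.cudnn.is_available()

# Make sure cuDNN is not disabled for conv layers and RNNs.
torch.backends.cudnn.enabled = True

print(f"cuDNN available: {cudnn_available}")
print(f"cuDNN enabled:   {torch.backends.cudnn.enabled}")
```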