DeepSpeed Huggingface Inference Examples

February 7, 2025

Setup

The Python dependencies for each example are captured in requirements.txt in the corresponding ML task directory (e.g. ./text-generation).

Install the dependencies from within the task directory using:

pip install -r requirements.txt
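When the install step needs to be scripted, for instance to set up several task directories in turn, the same command can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name `pip_install_command` is hypothetical, and it only builds the command for a given task directory (here ./text-generation) without running it.

```python
import sys
from pathlib import Path


def pip_install_command(task_dir):
    """Build the pip command that installs one task directory's dependencies.

    Hypothetical helper for illustration; it is not part of the examples repo.
    """
    requirements = Path(task_dir) / "requirements.txt"
    # Invoke pip through the current interpreter so the packages are
    # installed into the environment that is currently active.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(..., check=True) to actually perform the install.
    print(" ".join(pip_install_command("text-generation")))
```

Returning the command as a list rather than a single string avoids shell quoting issues if a directory name ever contains spaces.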

For the speech model example ./automatic-speech-recognition/test-wav2vec.py, you may also need to install the libsndfile1-dev system library:

sudo apt-get install libsndfile1-dev

Usage

The DeepSpeed Hugging Face inference examples are organized into directories by ML task (e.g. ./text-generation). Each task directory contains a README.md and a requirements.txt.

Task                                    README    requirements
automatic-speech-recognition            README    requirements
fill-mask                               README    requirements
text-generation                         README    requirements
text-generation/run-generation-script   README    requirements
text2text-generation                    README    requirements
translation                             README    requirements
stable-diffusion                        README    requirements

Most examples can be run as follows:

deepspeed --num_gpus [number of GPUs] test-[model].py
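To make the placeholders concrete, the launch command above can be filled in for a specific script and GPU count. The sketch below is illustrative only: `deepspeed_command` is a hypothetical helper, the GPU count of 2 is an arbitrary assumption, and test-wav2vec.py is the speech example mentioned under Setup. It builds the command line without executing it.

```python
import shlex


def deepspeed_command(script, num_gpus):
    """Build the launcher invocation `deepspeed --num_gpus N script`.

    Hypothetical helper for illustration; it is not part of the examples repo.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["deepspeed", "--num_gpus", str(num_gpus), script]


if __name__ == "__main__":
    # Assumed values: 2 GPUs and the wav2vec speech example script.
    cmd = deepspeed_command("test-wav2vec.py", num_gpus=2)
    # Prints: deepspeed --num_gpus 2 test-wav2vec.py
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from within the task directory, provided the dependencies for that task are installed.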

Additional Resources

Information about DeepSpeed can be found at the deepspeed.ai website.

DeepSpeed Inference

Additional information on DeepSpeed inference can be found here:

Benchmarking

DeepSpeed inference benchmarking can be found in the DeepSpeed repository: