DeepSpeed Huggingface Inference Examples

February 7, 2025

- Setup
- Usage
- Additional Resources
  - DeepSpeed Inference
  - Benchmarking

Setup

The Python dependencies for each example are captured in requirements.txt in the corresponding ML task directory (e.g. ./text-generation).

Python dependencies can be installed using:

pip install -r requirements.txt
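Because each task directory carries its own requirements.txt, it can help to list the install command for every task at once. The following is a minimal sketch, not part of the repository: the helper name `pip_install_commands` is hypothetical, and it assumes the layout described in this README (one requirements.txt per task directory, possibly nested). It only builds the commands and does not run them.

```python
from pathlib import Path
from typing import List


def pip_install_commands(root: str = ".") -> List[List[str]]:
    """Build one `pip install -r` command per requirements.txt found under root.

    Hypothetical helper for illustration; it searches recursively, so nested
    task directories are included (as would any unrelated requirements.txt
    that happens to live under root).
    """
    commands = []
    for req in sorted(Path(root).rglob("requirements.txt")):
        commands.append(["pip", "install", "-r", str(req)])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them.
    for cmd in pip_install_commands("."):
        print(" ".join(cmd))
```

Printing rather than executing keeps the sketch side-effect free; each printed line can be run by hand in the environment of your choice.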

For the ./automatic-speech-recognition/test-wav2vec.py speech model example, you may also need to install the libsndfile1-dev system package:

sudo apt-get install libsndfile1-dev

Usage

The DeepSpeed Hugging Face inference examples are organized by ML task, with one directory per task (e.g. ./text-generation). Each task directory contains a README.md and a requirements.txt.

Task	README	requirements
`automatic-speech-recognition`	`README`	`requirements`
`fill-mask`	`README`	`requirements`
`text-generation`	`README`	`requirements`
`text-generation/run-generation-script`	`README`	`requirements`
`text2text-generation`	`README`	`requirements`
`translation`	`README`	`requirements`
`stable-diffusion`	`README`	`requirements`

Most examples can be run as follows:

deepspeed --num_gpus [number of GPUs] test-[model].py
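As an illustration of how the placeholders in the command above are filled in, here is a small sketch that assembles the launch command from a GPU count and a model name. The function `build_launch_command` is a hypothetical helper, not part of DeepSpeed or this repository; it relies only on the `deepspeed --num_gpus` form shown above and on the `test-[model].py` naming pattern.

```python
import shlex
from typing import List


def build_launch_command(num_gpus: int, model: str) -> List[str]:
    """Assemble `deepspeed --num_gpus N test-<model>.py` as an argument list.

    Hypothetical helper for illustration only.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["deepspeed", "--num_gpus", str(num_gpus), f"test-{model}.py"]


if __name__ == "__main__":
    # Example using the wav2vec script named in the Setup section.
    print(shlex.join(build_launch_command(1, "wav2vec")))
    # prints: deepspeed --num_gpus 1 test-wav2vec.py
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch the example. Check the README of the relevant task directory first, since some scripts may expect additional arguments.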

Additional Resources

Information about DeepSpeed can be found at the deepspeed.ai website.

DeepSpeed Inference

Additional information on DeepSpeed inference can be found here:

Getting Started with DeepSpeed for Inferencing Transformer based Models

Benchmarking

DeepSpeed inference benchmarking can be found in the DeepSpeed repository:

DeepSpeed Inference Benchmarking