nvForest - Highly Optimized Decision Tree Inference
March 2, 2026
nvForest is a highly optimized, lightweight RAPIDS library that enables fast inference for decision tree models on NVIDIA GPUs and on CPUs. It does not train models; it runs inference on models trained elsewhere (e.g., XGBoost, LightGBM, scikit-learn, or cuML).
nvForest uses Treelite as the common format for importing tree models. You can load a model from a file or from an in-memory scikit-learn or Treelite object, then run predictions with a scikit-learn-like API. Setting device="auto" lets you deploy the same script on machines with or without GPUs.
As an example, the following Python snippet loads an XGBoost model and runs inference on GPU:
import nvforest
# Load XGBoost model for GPU inference
fm = nvforest.load_model(
    "/path/to/xgboost_model.ubj", device="gpu", model_type="xgboost_ubj"
)
# Run inference (X can be a NumPy array or CuPy array)
pred = fm.predict(X)
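To run the same script on machines with or without a GPU, the load call can be wrapped in a small helper that passes device="auto". This is a minimal sketch: `load_portable` is a hypothetical helper name, and it assumes `nvforest.load_model` accepts device="auto" with the same arguments as the GPU call above.

```python
def load_portable(path, model_type="xgboost_ubj"):
    """Load a tree model, letting nvForest pick GPU or CPU.

    Hypothetical helper: assumes load_model accepts device="auto"
    alongside the path and model_type arguments shown above.
    """
    # Imported lazily so this module can be imported even where
    # nvForest is not installed; the import happens on first call.
    import nvforest

    return nvforest.load_model(path, device="auto", model_type=model_type)
```

Calling `load_portable("/path/to/xgboost_model.ubj")` would then return a model object whose `predict(X)` is used exactly as in the GPU example.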
Load a scikit-learn random forest model and get class probabilities:
import nvforest
from sklearn.ensemble import RandomForestClassifier
# Train with scikit-learn (or load a saved model)
skl_model = RandomForestClassifier(...)
skl_model.fit(X_train, y_train)
# Load into nvForest for fast GPU inference
fm = nvforest.load_from_sklearn(skl_model, device="gpu")
class_probs = fm.predict_proba(X)
For more examples and the full API, see the Getting started guide and the Python API documentation.
Supported Models
| Source | Formats |
|---|---|
| XGBoost | UBJSON, JSON, legacy binary |
| LightGBM | Text (.txt) |
| scikit-learn | In-memory (RandomForest, ExtraTrees, GradientBoosting) |
| cuML | Via Treelite export |
| Treelite | Checkpoint / in-memory treelite.Model |
Inference Modes
| Method | Description |
|---|---|
| predict(X) | Standard predictions (class labels or regression values) |
| predict_proba(X) | Class probabilities (classification only) |
| apply(X) | Leaf indices per tree |
| predict_per_tree(X) | Prediction from each tree in the ensemble |
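To make the difference between the four methods concrete, here is a toy, pure-Python sketch of what each one computes for a two-tree classifier. It is an illustration only, not nvForest's implementation: the dictionary-based tree format is invented for this example, and averaging per-tree probabilities is the random-forest convention (gradient-boosted models combine tree outputs differently).

```python
# Toy illustration only: plain-Python trees, not nvForest internals.
# A leaf is {"leaf_id": int, "probs": [p_class0, p_class1]};
# a split is {"feature": int, "threshold": float, "left": node, "right": node}.

def find_leaf(node, row):
    """Walk one tree from the root to the leaf that `row` falls into."""
    while "leaf_id" not in node:
        branch = "left" if row[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node

def apply(trees, X):
    """Leaf index reached in each tree, one list per input row."""
    return [[find_leaf(t, row)["leaf_id"] for t in trees] for row in X]

def predict_per_tree(trees, X):
    """Output of each individual tree, before any aggregation."""
    return [[find_leaf(t, row)["probs"] for t in trees] for row in X]

def predict_proba(trees, X):
    """Average the per-tree class probabilities (random-forest style)."""
    out = []
    for per_tree in predict_per_tree(trees, X):
        n_classes = len(per_tree[0])
        out.append([sum(p[c] for p in per_tree) / len(per_tree)
                    for c in range(n_classes)])
    return out

def predict(trees, X):
    """Class label with the highest averaged probability."""
    return [max(range(len(p)), key=p.__getitem__)
            for p in predict_proba(trees, X)]

tree_a = {"feature": 0, "threshold": 0.5,
          "left": {"leaf_id": 1, "probs": [1.0, 0.0]},
          "right": {"leaf_id": 2, "probs": [0.0, 1.0]}}
tree_b = {"feature": 1, "threshold": 2.0,
          "left": {"leaf_id": 1, "probs": [0.5, 0.5]},
          "right": {"leaf_id": 2, "probs": [0.0, 1.0]}}
trees = [tree_a, tree_b]
X = [[0.2, 1.0], [0.9, 3.0]]

print(apply(trees, X))          # [[1, 1], [2, 2]]
print(predict_proba(trees, X))  # [[0.75, 0.25], [0.0, 1.0]]
print(predict(trees, X))        # [0, 1]
```

The actual return types and shapes in nvForest (for example, arrays rather than nested lists) are described in the Python API documentation.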
You can tune performance with layout (e.g., depth_first, breadth_first) and chunk_size; use fm.optimize() to auto-tune.
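As a rough intuition for the layout option, the sketch below shows the order in which the nodes of one small tree would be stored under a depth-first versus a breadth-first ordering. This is a generic illustration that assumes the layout names refer to these standard traversal orders; it does not reproduce nvForest's actual memory format, and the tuple-based tree is invented for the example.

```python
from collections import deque

# Toy tree: each node is (name, left_child, right_child); leaves have None children.
tree = ("root",
        ("L", ("LL", None, None), ("LR", None, None)),
        ("R", ("RL", None, None), ("RR", None, None)))

def depth_first_order(node):
    """Store a node, then its whole left subtree, then its right subtree."""
    if node is None:
        return []
    name, left, right = node
    return [name] + depth_first_order(left) + depth_first_order(right)

def breadth_first_order(root):
    """Store nodes level by level, starting from the root."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            continue
        name, left, right = node
        order.append(name)
        queue.extend([left, right])
    return order

print(depth_first_order(tree))    # ['root', 'L', 'LL', 'LR', 'R', 'RL', 'RR']
print(breadth_first_order(tree))  # ['root', 'L', 'R', 'LL', 'LR', 'RL', 'RR']
```

Which ordering is faster depends on the model and hardware, which is why measuring with fm.optimize() is generally preferable to guessing.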
Installation
See the RAPIDS Release Selector for the command to install nightly or official release nvForest packages via conda, pip, or Docker.
Build/Install from Source
See the build guide.
Contributing
We welcome contributions. For guidelines and how to get started, see the RAPIDS contributing guide.
Contact
Find out more on the RAPIDS site.
Open GPU Data Science
The RAPIDS suite of open source software libraries aims to enable execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, exposing GPU parallelism and high-bandwidth memory through user-friendly Python interfaces.