Lazy Predict

April 26, 2026


Lazy Predict builds dozens of basic models with almost no code, helping you quickly see which model families work best on your data before any parameter tuning.

Features

  • Over 40 built-in machine learning models
  • Automatic model selection for classification, regression, and time series forecasting
  • 20+ forecasting models: statistical (ETS, ARIMA, Theta), ML (Random Forest, XGBoost, etc.), deep learning (LSTM, GRU), and pretrained foundation models (TimesFM)
  • Automatic seasonal period detection via ACF
  • Multiple categorical encoding strategies (OneHot, Ordinal, Target, Binary)
  • Built-in MLflow integration for experiment tracking
  • GPU acceleration: XGBoost, LightGBM, CatBoost, cuML (RAPIDS), LSTM/GRU, TimesFM
  • Support for Python 3.9 through 3.13
  • Custom metric evaluation support
  • Configurable timeout and cross-validation
  • Intel Extension for Scikit-learn acceleration support
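The ACF-based seasonal period detection used by the forecasting side can be illustrated with a short sketch. This is only an illustration of the idea, not the library's internal code, and the helper name `detect_period` is made up:

```python
import numpy as np

def detect_period(y, max_lag=20):
    """Return the lag (1..max_lag) with the strongest autocorrelation."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    acf = [np.corrcoef(y[:-lag], y[lag:])[0, 1] for lag in range(1, max_lag + 1)]
    return int(np.argmax(acf)) + 1  # +1 because lag 0 is excluded

# A clean series with a 12-step cycle is detected correctly
t = np.arange(240)
season = 10 + 3 * np.sin(2 * np.pi * t / 12)
print(detect_period(season))  # → 12
```

A real implementation also has to cope with trend and noise, but the core idea is the same: pick the lag where the series best correlates with a shifted copy of itself.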

Installation

pip (PyPI)

pip install lazypredict

conda (conda-forge)

conda install -c conda-forge lazypredict

Optional extras (pip only)

Install with boosting libraries (XGBoost, LightGBM, CatBoost):

pip install lazypredict[boost]

Install with time series forecasting support:

pip install lazypredict[timeseries]          # statsmodels + pmdarima
pip install lazypredict[timeseries,deeplearning]  # + LSTM/GRU via PyTorch
pip install lazypredict[timeseries,foundation]    # + Google TimesFM (Python 3.10-3.11)

Install with all optional dependencies:

pip install lazypredict[all]

Usage

To use Lazy Predict in a project:

import lazypredict

Classification

Example:

from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=123)

clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

print(models)

Advanced Options

# With categorical encoding, timeout, cross-validation, and GPU
clf = LazyClassifier(
    verbose=1,                          # Show progress
    ignore_warnings=True,               # Suppress warnings
    custom_metric=None,                 # Use default metrics
    predictions=True,                   # Return predictions
    classifiers='all',                  # Use all available classifiers
    categorical_encoder='onehot',       # Encoding: 'onehot', 'ordinal', 'target', 'binary'
    timeout=60,                         # Max time per model in seconds
    cv=5,                               # Cross-validation folds (optional)
    use_gpu=True                        # Enable GPU acceleration
)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

Parameters:

  • verbose (int): 0 for silent, 1 for progress display
  • ignore_warnings (bool): Suppress scikit-learn warnings
  • custom_metric (callable): Custom evaluation metric
  • predictions (bool): Return prediction DataFrame
  • classifiers (str/list): 'all' or list of classifier names
  • categorical_encoder (str): Encoding strategy for categorical features
    • 'onehot': One-hot encoding (default)
    • 'ordinal': Ordinal encoding
    • 'target': Target encoding (requires category-encoders)
    • 'binary': Binary encoding (requires category-encoders)
  • timeout (int): Maximum seconds per model (None for no limit)
  • cv (int): Number of cross-validation folds (None to disable)
  • use_gpu (bool): Enable GPU acceleration for supported models (default False)
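A `custom_metric` is any callable with the `(y_true, y_pred)` signature. A minimal sketch (in upstream lazypredict the extra results column is named after the function; verify against your installed version):

```python
from sklearn.metrics import fbeta_score

def f2_score(y_true, y_pred):
    """F-beta with beta=2: weighs recall twice as heavily as precision."""
    return fbeta_score(y_true, y_pred, beta=2)

# Pass it as LazyClassifier(custom_metric=f2_score); the score is then
# reported per model alongside the built-in metrics.
print(f2_score([1, 0, 1, 1], [1, 0, 0, 1]))  # → 0.714...
```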
| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| LinearSVC | 0.989474 | 0.987544 | 0.987544 | 0.989462 | 0.0150008 |
| SGDClassifier | 0.989474 | 0.987544 | 0.987544 | 0.989462 | 0.0109992 |
| MLPClassifier | 0.985965 | 0.986904 | 0.986904 | 0.985994 | 0.426 |
| Perceptron | 0.985965 | 0.984797 | 0.984797 | 0.985965 | 0.0120046 |
| LogisticRegression | 0.985965 | 0.98269 | 0.98269 | 0.985934 | 0.0200036 |
| LogisticRegressionCV | 0.985965 | 0.98269 | 0.98269 | 0.985934 | 0.262997 |
| SVC | 0.982456 | 0.979942 | 0.979942 | 0.982437 | 0.0140011 |
| CalibratedClassifierCV | 0.982456 | 0.975728 | 0.975728 | 0.982357 | 0.0350015 |
| PassiveAggressiveClassifier | 0.975439 | 0.974448 | 0.974448 | 0.975464 | 0.0130005 |
| LabelPropagation | 0.975439 | 0.974448 | 0.974448 | 0.975464 | 0.0429988 |
| LabelSpreading | 0.975439 | 0.974448 | 0.974448 | 0.975464 | 0.0310006 |
| RandomForestClassifier | 0.97193 | 0.969594 | 0.969594 | 0.97193 | 0.033 |
| GradientBoostingClassifier | 0.97193 | 0.967486 | 0.967486 | 0.971869 | 0.166998 |
| QuadraticDiscriminantAnalysis | 0.964912 | 0.966206 | 0.966206 | 0.965052 | 0.0119994 |
| HistGradientBoostingClassifier | 0.968421 | 0.964739 | 0.964739 | 0.968387 | 0.682003 |
| RidgeClassifierCV | 0.97193 | 0.963272 | 0.963272 | 0.971736 | 0.0130029 |
| RidgeClassifier | 0.968421 | 0.960525 | 0.960525 | 0.968242 | 0.0119977 |
| AdaBoostClassifier | 0.961404 | 0.959245 | 0.959245 | 0.961444 | 0.204998 |
| ExtraTreesClassifier | 0.961404 | 0.957138 | 0.957138 | 0.961362 | 0.0270066 |
| KNeighborsClassifier | 0.961404 | 0.95503 | 0.95503 | 0.961276 | 0.0560005 |
| BaggingClassifier | 0.947368 | 0.954577 | 0.954577 | 0.947882 | 0.0559971 |
| BernoulliNB | 0.950877 | 0.951003 | 0.951003 | 0.951072 | 0.0169988 |
| LinearDiscriminantAnalysis | 0.961404 | 0.950816 | 0.950816 | 0.961089 | 0.0199995 |
| GaussianNB | 0.954386 | 0.949536 | 0.949536 | 0.954337 | 0.0139935 |
| NuSVC | 0.954386 | 0.943215 | 0.943215 | 0.954014 | 0.019989 |
| DecisionTreeClassifier | 0.936842 | 0.933693 | 0.933693 | 0.936971 | 0.0170023 |
| NearestCentroid | 0.947368 | 0.933506 | 0.933506 | 0.946801 | 0.0160074 |
| ExtraTreeClassifier | 0.922807 | 0.912168 | 0.912168 | 0.922462 | 0.0109999 |
| CheckingClassifier | 0.361404 | 0.5 | 0.5 | 0.191879 | 0.0170043 |
| DummyClassifier | 0.512281 | 0.489598 | 0.489598 | 0.518924 | 0.0119965 |

Regression

Example:

from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

diabetes = datasets.load_diabetes()
X, y = shuffle(diabetes.data, diabetes.target, random_state=13)
X = X.astype(np.float32)

offset = int(X.shape[0] * 0.9)

X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

print(models)

Advanced Options

# With categorical encoding, timeout, and GPU
reg = LazyRegressor(
    verbose=1,                          # Show progress
    ignore_warnings=True,               # Suppress warnings
    custom_metric=None,                 # Use default metrics
    predictions=True,                   # Return predictions
    regressors='all',                   # Use all available regressors
    categorical_encoder='ordinal',      # Encoding: 'onehot', 'ordinal', 'target', 'binary'
    timeout=120,                        # Max time per model in seconds
    use_gpu=True                        # Enable GPU acceleration
)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

Parameters:

  • verbose (int): 0 for silent, 1 for progress display
  • ignore_warnings (bool): Suppress scikit-learn warnings
  • custom_metric (callable): Custom evaluation metric
  • predictions (bool): Return prediction DataFrame
  • regressors (str/list): 'all' or list of regressor names
  • categorical_encoder (str): Encoding strategy for categorical features
    • 'onehot': One-hot encoding (default)
    • 'ordinal': Ordinal encoding
    • 'target': Target encoding (requires category-encoders)
    • 'binary': Binary encoding (requires category-encoders)
  • timeout (int): Maximum seconds per model (None for no limit)
  • use_gpu (bool): Enable GPU acceleration for supported models (default False)
| Model | Adjusted R-Squared | R-Squared | RMSE | Time Taken |
|---|---|---|---|---|
| ExtraTreesRegressor | 0.378921 | 0.520076 | 54.2202 | 0.121466 |
| OrthogonalMatchingPursuitCV | 0.374947 | 0.517004 | 54.3934 | 0.0111742 |
| Lasso | 0.373483 | 0.515873 | 54.457 | 0.00620174 |
| LassoLars | 0.373474 | 0.515866 | 54.4575 | 0.0087235 |
| LarsCV | 0.3715 | 0.514341 | 54.5432 | 0.0160234 |
| LassoCV | 0.370413 | 0.513501 | 54.5903 | 0.0624897 |
| PassiveAggressiveRegressor | 0.366958 | 0.510831 | 54.7399 | 0.00689793 |
| LassoLarsIC | 0.364984 | 0.509306 | 54.8252 | 0.0108321 |
| SGDRegressor | 0.364307 | 0.508783 | 54.8544 | 0.0055306 |
| RidgeCV | 0.363002 | 0.507774 | 54.9107 | 0.00728202 |
| Ridge | 0.363002 | 0.507774 | 54.9107 | 0.00556874 |
| BayesianRidge | 0.362296 | 0.507229 | 54.9411 | 0.0122972 |
| LassoLarsCV | 0.361749 | 0.506806 | 54.9646 | 0.0175984 |
| TransformedTargetRegressor | 0.361749 | 0.506806 | 54.9646 | 0.00604773 |
| LinearRegression | 0.361749 | 0.506806 | 54.9646 | 0.00677514 |
| Lars | 0.358828 | 0.504549 | 55.0903 | 0.00935149 |
| ElasticNetCV | 0.356159 | 0.502486 | 55.2048 | 0.0478678 |
| HuberRegressor | 0.355251 | 0.501785 | 55.2437 | 0.0129263 |
| RandomForestRegressor | 0.349621 | 0.497434 | 55.4844 | 0.2331 |
| AdaBoostRegressor | 0.340416 | 0.490322 | 55.8757 | 0.0512381 |
| LGBMRegressor | 0.339239 | 0.489412 | 55.9255 | 0.0396187 |
| HistGradientBoostingRegressor | 0.335632 | 0.486625 | 56.0779 | 0.0897055 |
| PoissonRegressor | 0.323033 | 0.476889 | 56.6072 | 0.00953603 |
| ElasticNet | 0.301755 | 0.460447 | 57.4899 | 0.00604224 |
| KNeighborsRegressor | 0.299855 | 0.458979 | 57.5681 | 0.00757337 |
| OrthogonalMatchingPursuit | 0.292421 | 0.453235 | 57.8729 | 0.00709486 |
| BaggingRegressor | 0.291213 | 0.452301 | 57.9223 | 0.0302746 |
| GradientBoostingRegressor | 0.247009 | 0.418143 | 59.7011 | 0.136803 |
| TweedieRegressor | 0.244215 | 0.415984 | 59.8118 | 0.00633955 |
| XGBRegressor | 0.224263 | 0.400567 | 60.5961 | 0.339694 |
| GammaRegressor | 0.223895 | 0.400283 | 60.6105 | 0.0235181 |
| RANSACRegressor | 0.203535 | 0.38455 | 61.4004 | 0.0653253 |
| LinearSVR | 0.116707 | 0.317455 | 64.6607 | 0.0077076 |
| ExtraTreeRegressor | 0.00201902 | 0.228833 | 68.7304 | 0.00626636 |
| NuSVR | -0.0667043 | 0.175728 | 71.0575 | 0.0143399 |
| SVR | -0.0964128 | 0.152772 | 72.0402 | 0.0114729 |
| DummyRegressor | -0.297553 | -0.00265478 | 78.3701 | 0.00592971 |
| DecisionTreeRegressor | -0.470263 | -0.136112 | 83.4229 | 0.00749898 |
| GaussianProcessRegressor | -0.769174 | -0.367089 | 91.5109 | 0.0770502 |
| MLPRegressor | -1.86772 | -1.21597 | 116.508 | 0.235267 |
| KernelRidge | -5.03822 | -3.6659 | 169.061 | 0.0243919 |
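For reference, the Adjusted R-Squared column follows the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). With the diabetes split above (n = 45 test rows, p = 10 features), the top row checks out to rounding:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R²: penalizes R² for the number of predictors p given n samples."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# ExtraTreesRegressor row: R² = 0.520076 on 45 test rows with 10 features
print(round(adjusted_r2(0.520076, 45, 10), 6))  # → 0.378922
```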

Time Series Forecasting

LazyForecaster benchmarks 20+ forecasting models on your time series in a single call:

import numpy as np
from lazypredict.TimeSeriesForecasting import LazyForecaster

# Generate sample data (or use your own)
np.random.seed(42)
t = np.arange(200)
y = 10 + 0.05 * t + 3 * np.sin(2 * np.pi * t / 12) + np.random.normal(0, 1, 200)

y_train, y_test = y[:180], y[180:]

fcst = LazyForecaster(verbose=0, ignore_warnings=True)
scores, predictions = fcst.fit(y_train, y_test)
print(scores)
| Model | MAE | RMSE | MAPE | SMAPE | MASE | R-Squared | Time Taken |
|---|---|---|---|---|---|---|---|
| Holt | 0.8532 | 1.0285 | 6.3241 | 6.1758 | 0.6993 | 0.7218 | 0.03 |
| SARIMAX | 0.8791 | 1.0601 | 6.5012 | 6.3414 | 0.7205 | 0.7045 | 0.12 |
| Ridge_TS | 0.9124 | 1.0843 | 6.7523 | 6.5721 | 0.7478 | 0.6912 | 0.01 |
| ... | ... | ... | ... | ... | ... | ... | ... |

With Exogenous Variables

# Optional exogenous features
X_train = np.column_stack([np.sin(t[:180]), np.cos(t[:180])])
X_test = np.column_stack([np.sin(t[180:]), np.cos(t[180:])])

scores, predictions = fcst.fit(y_train, y_test, X_train, X_test)

Advanced Options

fcst = LazyForecaster(
    verbose=1,                          # Show progress
    ignore_warnings=True,               # Suppress model errors
    predictions=True,                   # Return forecast values
    seasonal_period=12,                 # Override auto-detection
    cv=3,                               # Time series cross-validation
    timeout=30,                         # Max seconds per model
    sort_by="RMSE",                     # Sort metric (MAE, MAPE, SMAPE, MASE, R-Squared)
    forecasters="all",                  # Or list: ["Holt", "AutoARIMA", "LSTM_TS"]
    max_models=10,                      # Limit number of models
    use_gpu=True,                       # GPU acceleration for supported models
    foundation_model_path="/path/to/timesfm-weights",  # Local model weights (offline)
)
scores, predictions = fcst.fit(y_train, y_test)

Parameters:

  • verbose (int): 0 for silent, 1 for progress display
  • ignore_warnings (bool): Suppress per-model exceptions
  • predictions (bool): Return a second DataFrame of forecasted values
  • seasonal_period (int/None): Seasonal cycle length; None auto-detects via ACF
  • cv (int/None): Number of TimeSeriesSplit folds for cross-validation
  • timeout (int/float/None): Maximum training seconds per model
  • sort_by (str): Metric to sort by ("RMSE", "MAE", "MAPE", "SMAPE", "MASE", "R-Squared")
  • forecasters (str/list): "all" or a list of model names
  • n_lags (int): Number of lag features for ML/DL models (default 10)
  • n_rolling (tuple): Rolling-window sizes for feature engineering (default (3, 7))
  • max_models (int/None): Limit total models to train
  • custom_metric (callable): Additional metric f(y_true, y_pred) -> float
  • use_gpu (bool): Enable GPU acceleration for supported models (default False)
  • foundation_model_path (str): Local path to pre-downloaded foundation model weights (e.g. TimesFM)
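The n_lags / n_rolling feature engineering that turns a series into a supervised table for the ML forecasters can be sketched roughly like this (illustrative only; the library's actual feature set may differ):

```python
import numpy as np
import pandas as pd

def make_lag_features(y, n_lags=10, n_rolling=(3, 7)):
    """Build lag and rolling-mean features; warm-up rows without full history are dropped."""
    s = pd.Series(np.asarray(y, dtype=float))
    feats = {f"lag_{k}": s.shift(k) for k in range(1, n_lags + 1)}
    for w in n_rolling:
        # shift(1) so each window only sees past values (no target leakage)
        feats[f"roll_mean_{w}"] = s.shift(1).rolling(w).mean()
    X = pd.DataFrame(feats)
    keep = X.notna().all(axis=1)
    return X[keep].to_numpy(), s[keep].to_numpy()

X_sup, y_sup = make_lag_features(np.arange(30.0))
print(X_sup.shape)  # → (20, 12): 10 lag columns + 2 rolling means
```

Any plain regressor can then be fit on `(X_sup, y_sup)` and rolled forward one step at a time to produce a forecast.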

Available model categories:

  • Baselines: Naive, SeasonalNaive
  • Statistical (statsmodels): SimpleExpSmoothing, Holt, HoltWinters_Add, HoltWinters_Mul, Theta, SARIMAX
  • Statistical (pmdarima): AutoARIMA
  • ML (sklearn): LinearRegression_TS, Ridge_TS, Lasso_TS, ElasticNet_TS, KNeighborsRegressor_TS, DecisionTreeRegressor_TS, RandomForestRegressor_TS, GradientBoostingRegressor_TS, AdaBoostRegressor_TS, ExtraTreesRegressor_TS, BaggingRegressor_TS, SVR_TS, XGBRegressor_TS, LGBMRegressor_TS, CatBoostRegressor_TS
  • Deep Learning (torch): LSTM_TS, GRU_TS
  • Foundation (timesfm): TimesFM

GPU Acceleration

Enable GPU acceleration for supported models with use_gpu=True:

from lazypredict.Supervised import LazyClassifier, LazyRegressor

# Classification with GPU
clf = LazyClassifier(use_gpu=True, verbose=0, ignore_warnings=True)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

# Regression with GPU
reg = LazyRegressor(use_gpu=True, verbose=0, ignore_warnings=True)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

# Time Series with GPU
from lazypredict.TimeSeriesForecasting import LazyForecaster
fcst = LazyForecaster(use_gpu=True, verbose=0, ignore_warnings=True)
scores, predictions = fcst.fit(y_train, y_test)

Supported GPU backends:

  • XGBoost — device="cuda"
  • LightGBM — device="gpu"
  • CatBoost — task_type="GPU"
  • cuML (RAPIDS) — GPU-native scikit-learn replacements (auto-discovered when installed)
  • LSTM / GRU — PyTorch CUDA
  • TimesFM — PyTorch CUDA

Falls back to CPU automatically if no CUDA GPU is available.
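To see up front what the PyTorch-based backends (LSTM/GRU, TimesFM) will detect, you can run the standard CUDA check yourself; the boosting libraries do their own device probing. The try/except just makes the snippet safe to run without torch installed:

```python
try:
    import torch
    cuda_ok = torch.cuda.is_available()
except ImportError:
    cuda_ok = False  # without torch the deep-learning models are skipped anyway

print("CUDA available to PyTorch:", cuda_ok)
```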

Categorical Encoding

Lazy Predict supports multiple categorical encoding strategies:

from lazypredict.Supervised import LazyClassifier
import pandas as pd
from sklearn.model_selection import train_test_split

# Example with categorical features
df = pd.read_csv('data_with_categories.csv')
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Try different encoders
for encoder in ['onehot', 'ordinal', 'target', 'binary']:
    clf = LazyClassifier(
        categorical_encoder=encoder,
        verbose=0,
        ignore_warnings=True
    )
    models, predictions = clf.fit(X_train, X_test, y_train, y_test)
    print(f"\n{encoder.upper()} Encoding Results:")
    print(models.head())

Note: Target and binary encoders require the category-encoders package:

pip install category-encoders

Intel Extension Acceleration

For improved performance on Intel CPUs, install Intel Extension for Scikit-learn:

pip install scikit-learn-intelex

Lazy Predict will automatically detect and use it for acceleration.
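The detection is automatic, but if you want to opt in explicitly in your own scripts, the extension's documented entry point is patch_sklearn() (call it before importing estimators from sklearn; the try/except keeps the snippet runnable without the package):

```python
try:
    from sklearnex import patch_sklearn
    patch_sklearn()  # re-routes supported sklearn estimators to the Intel versions
    accelerated = True
except ImportError:
    accelerated = False  # plain scikit-learn keeps working unchanged

print("Intel acceleration active:", accelerated)
```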

MLflow Integration

Lazy Predict includes built-in MLflow integration. Enable it by setting the MLflow tracking URI:

import os
os.environ['MLFLOW_TRACKING_URI'] = 'sqlite:///mlflow.db'

# MLflow tracking will be automatically enabled
reg = LazyRegressor(verbose=0, ignore_warnings=True)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

Automatically tracks:

  • Model metrics (R-squared, RMSE, etc.)
  • Training time
  • Model parameters
  • Model artifacts