AdaCubic 🚀

March 1, 2026

Implementation of the paper "AdaCubic: An Adaptive Cubic Regularization Optimizer for Deep Learning", published in TMLR 2026, by Ioannis Tsingalis, Constantine Kotropoulos, and Corentin Briat.

Paper Abstract

A novel regularization technique, AdaCubic, is proposed that adapts the weight of the cubic term. The heart of AdaCubic is an auxiliary optimization problem with cubic constraints that dynamically adjusts the weight of the cubic term in Newton’s cubic regularized method. We use Hutchinson’s method to approximate the Hessian matrix, thereby reducing computational cost. We demonstrate that AdaCubic inherits the cubically regularized Newton method’s local convergence guarantees. Our experiments in Computer Vision, Natural Language Processing, and Signal Processing tasks demonstrate that AdaCubic outperforms or competes with several widely used optimizers. Unlike other adaptive algorithms that require hyperparameter fine-tuning, AdaCubic is evaluated with a fixed set of hyperparameters, rendering it a highly attractive optimizer in settings where fine-tuning is infeasible. This makes AdaCubic an attractive option for researchers and practitioners alike. To our knowledge, AdaCubic is the first optimizer to leverage cubic regularization in scalable deep learning applications.
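The Hutchinson step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the repository's code: `hutchinson_diagonal`, `toy_hvp`, and the 3×3 matrix `H` are hypothetical names chosen for illustration. For random ±1 probe vectors z, the average of z ⊙ (Hz) approximates the Hessian diagonal, and its sum approximates the trace, using only Hessian-vector products. In PyTorch such products are typically obtained by differentiating the gradient a second time, which is presumably why `backward()` needs `create_graph=True`.

```python
import random


def hutchinson_diagonal(hvp, dim, n_iters=1, seed=0):
    """Estimate diag(H) as the average of z * (H z) over Rademacher probes z.

    hvp: callable returning the Hessian-vector product H @ v for a list v.
    """
    rng = random.Random(seed)
    estimate = [0.0] * dim
    for _ in range(n_iters):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)  # only a Hessian-vector product, never the full Hessian
        for i in range(dim):
            estimate[i] += z[i] * hz[i] / n_iters
    return estimate


# Toy symmetric "Hessian" (an assumption, used only to check the estimator).
H = [[2.0, 0.5, 0.0],
     [0.5, 3.0, 0.1],
     [0.0, 0.1, 1.0]]


def toy_hvp(v):
    return [sum(H[i][j] * v[j] for j in range(3)) for i in range(3)]


diag_est = hutchinson_diagonal(toy_hvp, dim=3, n_iters=2000)
trace_est = sum(diag_est)  # approximates trace(H) = 6.0
```

More probes reduce the variance of the estimate at the cost of extra Hessian-vector products; a single probe is exact only when the Hessian is diagonal.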

Usage

The AdaCubic optimizer is designed to integrate with PyTorch as a drop-in replacement for an existing optimizer. The only change needed is to set create_graph=True in the backward() call; the example below does this inside a closure that is passed to optimizer.step().

from AdaCubic import AdaCubic
...
optimizer = AdaCubic(model.parameters())
...
for i, (samples, labels) in enumerate(train_loader):
    ...
    def closure(backward=True):
        # Re-evaluate the model; optionally recompute gradients.
        if backward:
            optimizer.zero_grad()
        model_outputs = model(samples)
        cri_loss = criterion(model_outputs, labels)

        # AdaCubic needs the graph of the backward pass kept.
        create_graph = type(optimizer).__name__ == "AdaCubic"
        if backward:
            cri_loss.backward(create_graph=create_graph)
        return cri_loss
    ...
    optimizer.step(closure=closure)
    ...

Documentation

`AdaCubic.__init__`

Argument	Description
`params` (iterable)	A collection of parameters to optimize, or dictionaries defining parameter groups.
`eta1` (float, optional)	Lower threshold related to the acceptance or rejection of the trial point (default: 0.05)
`eta2` (float, optional)	Upper threshold related to the acceptance or rejection of the trial point (default: 0.75)
`alpha1` (float, optional)	Constant that defines the factor by which the trust radius is increased (default: 2.5)
`alpha2` (float, optional)	Constant that defines the factor by which the trust radius is decreased (default: 0.25)
`kappa_easy` (float, optional)	Accuracy of the root estimate in Algorithm 2 (default: 0.01)
`grad_tol` (float, optional)	Gradient tolerance at which Algorithm 1 stops (default: 1e-4)
`xi0` (float, optional)	The size of the initial trust radius (default: 0.05)
`gamma1` (float, optional)	The gamma1 parameter of the algorithm in Chapter 17 of Conn's book (default: 0.9)
`hutchinson_iters` (int, optional)	Number of iterations used to approximate the Hessian trace (default: 1)
`average_conv_kernel` (bool, optional)	Whether to average the Hessian traces over convolutional kernels (default: False)
`solver` (str, optional)	The solver to use (default: exact)
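To give a feel for how the thresholds and scaling constants above interact, here is a sketch of the generic textbook trust-region radius update, using the table's defaults with the two acceptance thresholds named `eta1` (0.05) and `eta2` (0.75). This is an assumption for illustration only: the function `update_trust_radius` and the ratio `rho` are hypothetical, and AdaCubic's actual update rule is the one defined in the paper and the repository code, which may differ.

```python
def update_trust_radius(radius, rho, eta1=0.05, eta2=0.75, alpha1=2.5, alpha2=0.25):
    """Generic trust-region radius update (not AdaCubic's exact rule).

    rho is the ratio of the actual loss reduction to the reduction
    predicted by the local model. Returns (new_radius, step_accepted).
    """
    if rho >= eta2:
        # Model predicted the loss well: accept the step and enlarge the radius.
        return alpha1 * radius, True
    if rho >= eta1:
        # Acceptable agreement: accept the step, keep the radius.
        return radius, True
    # Poor agreement: reject the trial point and shrink the radius.
    return alpha2 * radius, False


xi0 = 0.05  # initial trust radius, matching the table's default
grown, accepted_good = update_trust_radius(xi0, rho=0.9)
kept, accepted_ok = update_trust_radius(xi0, rho=0.3)
shrunk, accepted_bad = update_trust_radius(xi0, rho=0.01)
```

Under this generic rule, a larger `alpha1` grows the radius more aggressively after good steps, while a smaller `alpha2` backs off faster after rejected ones.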

`AdaCubic.step`

Performs a single optimization step.

Argument	Description
`closure` (callable, optional)	A closure that reevaluates the model and provides the loss as its output. (default: None)

Train and Test

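Image classification (ResNet with depth 20 on CIFAR-10):

python runResNet.py --task cifar10 --optimizer AdaCubic --depth 20 --seed 45 --n_epochs 200

Masked language modeling (bert-base-uncased on WikiText-2); replace /your_root with your own root directory:

python run_mlm_no_trainer.py --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 --model_name_or_path bert-base-uncased --optimizer AdaCubic --num_train_epochs 10 --seed 45 --output_dir /your_root/AdaCubic/Code/MLAlgorithms/LM/mlm/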

Citation

@article{
    tsingalis2026adacubic,
    title={AdaCubic: An Adaptive Cubic Regularization Optimizer for Deep Learning},
    author={Ioannis Tsingalis and Constantine Kotropoulos and Corentin Briat},
    journal={Transactions on Machine Learning Research},
    issn={2835-8856},
    year={2026},
    url={https://openreview.net/forum?id=pZBQ7J37lk},
    note={J2C Certification}
}