BCOS
November 15, 2025 ยท View on GitHub
A stochastic approximation method (optimizer) with Block-Coordinate Optimal Stepsizes (paper on arXiv).
Installation
Download the file bcos.py and import BCOS from bcos.
Usage
Follow the PyTorch optimizer instructions by first constructing a BCOS optimizer as follows:
optimizer = BCOS(params, lr=0.001, beta=0.9, eps=1e-6, weight_decay=0.1,
mode='c', decouple_wd=True, simple_cond=False)
Parameters
- params (iterable): iterable of model parameters or iterable of dicts defining parameter groups.
- lr (float): the learning rate.
- beta (float, optional): smoothing factor in computing the momentum and EMA estimators (default: 0.9).
- eps (float, optional): small constant added to the denominator to improve numerical stability (default: 1e-6).
- weight_decay (float, optional): weight decay regularization strength (default: 0.1).
- mode (string, optional): algorithmic mode of BCOS, must be one of the three choices (default: 'c'):
- 'g': use gradient as search direction and EMA estimator for its 2nd moment (equivalent to RMSprop).
- 'm': use momentum as search direction and EMA estimator for its 2nd moment (using same beta).
- 'c': use momentum as search direction and conditional estimator for its 2nd moment.
- decouple_wd (bool, optional): whether or not use decoupled weight decay regularization (default: True).
- simple_cond (bool, optional): whether or not use simple alternative in BCOS-c variant (default: False)/
License
BCOS is MIT licensed, as found in the LICENSE file.