Ano Optimizer

July 27, 2025 · View on GitHub

Ano is the official implementation of the optimizer introduced in the paper
"Ano: Faster is Better in Noisy Landscapes"

This optimizer is designed for efficient and stable training in high-variance regimes, and includes both a standard and a logarithmic-scheduled variant (Anolog). Native support is provided for both PyTorch and TensorFlow.

Key Features

Sign–Magnitude Decoupling
Ano separates update direction and magnitude, using the sign of the momentum and the norm of the raw gradient respectively. This improves stability and performance in high-variance settings.
Additive Second-Moment Estimation
Ano employs an additive second-moment update, inspired by Yogi, to ensure smoother convergence and mitigate issues with gradient sparsity.
Logarithmic Momentum Schedule (Anolog)
A variant of Ano using a time-dependent momentum parameter, enabled via logarithmic_schedule=True. This extension improves noise attenuation in stationary training regimes.
Dual Framework Support
Compatible with both PyTorch and TensorFlow. Import the appropriate implementation via ano_optimizer.Ano or ano_optimizer.tensorflow.AnoTF.

Installation

Install the PyTorch version (default):

pip install ano-optimizer

Install with TensorFlow support:

pip install 'ano-optimizer[tensorflow]'

Usage

PyTorch

from ano_optimizer import Ano  
import torch

model = MyModel()  
optimizer = Ano(model.parameters(), lr=1e-4)

for input, target in data_loader:  
    optimizer.zero_grad()  
    output = model(input)  
    loss = loss_fn(output, target)  
    loss.backward()  
    optimizer.step()

To enable Anolog:

optimizer = Ano(model.parameters(), lr=1e-4, logarithmic_schedule=True)

TensorFlow

from ano_optimizer.tensorflow import AnoTF  
import tensorflow as tf

model = MyModel()  
optimizer = AnoTF(learning_rate=1e-4)

with tf.GradientTape() as tape:  
    predictions = model(inputs)  
    loss = loss_fn(targets, predictions)

gradients = tape.gradient(loss, model.trainable_variables)  
optimizer.apply_gradients(zip(gradients, model.trainable_variables))

To enable Anolog:

optimizer = AnoTF(learning_rate=1e-4, logarithmic_schedule=True)

Citation

If you use this work in your research, please cite the following paper:

@misc{kegreisz2025ano,
  author       = {Kegreisz, Adrien},
  title        = {Ano: Faster is Better in Noisy Landscapes},
  year         = {2025},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.16422081},
  url          = {https://doi.org/10.5281/zenodo.16422081}
}

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Contributions, issues, and suggestions are welcome. Please open an issue or submit a pull request.