ArbAlign (v2)
June 19, 2026
Optimal alignment of arbitrarily ordered molecular isomers using the Kuhn-Munkres / Hungarian algorithm and Kabsch RMSD.
Background
When two molecular structures are isomers of each other, their atoms may be listed in any order. A naïve atom-by-atom RMSD is meaningless unless the atoms are first matched optimally. ArbAlign:
- Groups atoms by element (or, optionally, by SYBYL type or MNA connectivity).
- For each group, finds the optimal one-to-one atom assignment via the Kuhn-Munkres (Hungarian) algorithm (scipy.optimize.linear_sum_assignment).
- Tries all 48 combinations of axis permutations and sign flips (3! = 6 permutations × 2³ = 8 sign patterns) to escape local minima that arise from symmetric point groups.
- Applies the full Kabsch rotation to superpose the aligned structure onto the reference.
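The matching and superposition steps can be sketched in a few lines of NumPy and SciPy. This is a minimal illustration, not ArbAlign's actual implementation: the function names `kabsch_rmsd` and `match_atoms`, the squared-distance cost, and the sample water coordinates are assumptions made for the example. The only call taken from the description is `scipy.optimize.linear_sum_assignment`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def kabsch_rmsd(P, Q):
    """RMSD of P against Q (both N x 3) after centring and optimal rotation."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # SVD of the 3x3 covariance matrix gives the optimal rotation.
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    diff = P @ R - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def match_atoms(labels_a, coords_a, labels_b, coords_b):
    """Return perm such that atom perm[i] of B is assigned to atom i of A.

    Assumes both molecules contain the same number of atoms of each label.
    """
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    perm = np.empty(len(labels_a), dtype=int)
    for element in np.unique(labels_a):
        idx_a = np.flatnonzero(labels_a == element)
        idx_b = np.flatnonzero(labels_b == element)
        # cost[i, j]: squared distance between the i-th A atom and j-th B atom
        # of this element (the choice of cost is an assumption of this sketch).
        diff = coords_a[idx_a][:, None, :] - coords_b[idx_b][None, :, :]
        cost = (diff ** 2).sum(axis=2)
        rows, cols = linear_sum_assignment(cost)
        perm[idx_a[rows]] = idx_b[cols]
    return perm


# Toy example: B is the same water molecule with its atoms listed in another order.
labels_a = ["O", "H", "H"]
coords_a = np.array([[0.0, 0.0, 0.117],
                     [0.0, 0.757, -0.471],
                     [0.0, -0.757, -0.471]])
order = [2, 0, 1]
labels_b = [labels_a[i] for i in order]
coords_b = coords_a[order]

perm = match_atoms(labels_a, coords_a, labels_b, coords_b)
unmatched = kabsch_rmsd(coords_a, coords_b)        # atoms compared in file order
matched = kabsch_rmsd(coords_a, coords_b[perm])    # atoms compared after assignment
print(f"unmatched: {unmatched:.3f}  matched: {matched:.3f}")
```

Because the assignment is computed from raw distances, its result depends on the relative orientation of the two structures, which is why the axis permutation and sign-flip search is layered on top of it.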
If you use ArbAlign in published research, please cite:
Berhane Temelso, Joel M. Mabey, Toshiro Kubota, Nana Appiah-Padi, George C. Shields. J. Chem. Inf. Model. 2017, 57(5), 1045–1054. https://doi.org/10.1021/acs.jcim.6b00546
Installation
# Core (no optional dependencies)
pip install -e .
# With SYBYL-type and MNA connectivity labelling
pip install -e ".[openbabel]"
# Development (adds pytest)
pip install -e ".[dev]"
Requires Python 3.6+, NumPy ≥ 1.17, and SciPy ≥ 1.3.
Quick start
# Full search (48 axis combos, element-label matching — default)
arbalign Mol-A.xyz Mol-B.xyz
# Fast mode (skip axis search)
arbalign -s Mol-A.xyz Mol-B.xyz
# Ignore hydrogens
arbalign -n Mol-A.xyz Mol-B.xyz
# Match by SYBYL atom type (requires openbabel-wheel)
arbalign -b t Mol-A.xyz Mol-B.xyz
# Match by MNA connectivity (requires openbabel-wheel)
arbalign -b c Mol-A.xyz Mol-B.xyz
# Verbose output (print every candidate RMSD)
arbalign -v Mol-A.xyz Mol-B.xyz
You can also invoke via python -m arbalign.
Python API
from arbalign import Molecule, align
mol_a = Molecule.from_xyz("Mol-A.xyz")
mol_b = Molecule.from_xyz("Mol-B.xyz")
result = align(mol_a, mol_b)
print(f"Best RMSD: {result.best_rmsd:.3f} Å")
print(f"Swap: {result.swap}")
print(f"Reflect: {result.reflection}")
# result.aligned is a Molecule in the original B atom order,
# with coordinates superposed onto A.
result.aligned.to_xyz("Mol-B-aligned.xyz")
Molecule
| Method / attribute | Description |
|---|---|
| Molecule.from_xyz(path, no_hydrogens=False) | Read XYZ file |
| mol.to_xyz(path, title=None) | Write XYZ file |
| mol.labels | list[str] of element/type labels |
| mol.coords | ndarray shape (N, 3) |
| mol.element_counts() | {element: count} |
| mol.unique_elements() | sorted unique labels |
| mol.indices_of(element) | indices matching label |
| mol.centroid() | geometric centroid |
| mol.centered() | copy translated to origin |
| mol.with_labels(new_labels) | copy with different labels |
| mol.sorted_copy() | (sorted_mol, orig_indices) |
| mol.validate_compatible(other) | raises ValueError on mismatch |
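
For readers unfamiliar with the XYZ format, the sketch below shows roughly what reading such a file involves: an atom count, a comment line, then one `label x y z` record per atom. The function `parse_xyz` is a hypothetical standalone helper written for this illustration; it is not part of the ArbAlign API, and `Molecule.from_xyz` may handle edge cases differently.

```python
def parse_xyz(text, no_hydrogens=False):
    """Parse XYZ-format text into (labels, coords).

    Hypothetical helper for illustration only; not ArbAlign's reader.
    """
    lines = text.splitlines()
    n_atoms = int(lines[0])          # line 1: number of atoms
    records = lines[2:2 + n_atoms]   # line 2 is a free-form comment
    if len(records) != n_atoms:
        raise ValueError("file is shorter than its declared atom count")

    labels, coords = [], []
    for line in records:
        fields = line.split()
        label = fields[0]
        if no_hydrogens and label.upper() == "H":
            continue  # mimic an "ignore hydrogens" option
        labels.append(label)
        coords.append(tuple(float(value) for value in fields[1:4]))
    return labels, coords


SAMPLE = """3
water
O  0.000  0.000  0.117
H  0.000  0.757 -0.471
H  0.000 -0.757 -0.471
"""

labels, coords = parse_xyz(SAMPLE)
heavy_labels, heavy_coords = parse_xyz(SAMPLE, no_hydrogens=True)
print(labels, heavy_labels)
```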
align
align(mol_a, mol_b, simple=False, verbose=False) -> AlignResult
| AlignResult attribute | Description |
|---|---|
| initial_rmsd | Kabsch RMSD before any reordering |
| sorted_rmsd | Kabsch RMSD after sorting both by element |
| best_rmsd | Kabsch RMSD after optimal reordering + axes search |
| swap | Best axis permutation tuple e.g. (0, 2, 1) |
| reflection | Best sign-flip tuple e.g. (-1, 1, 1) |
| aligned | Molecule in original B order, superposed onto A |
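
The swap and reflection tuples can be read as an axis permutation combined with per-axis sign flips. The sketch below enumerates all 6 × 8 = 48 combinations using only the standard library. The helper names `axis_isometries` and `apply_isometry`, and the convention of permuting the axes before flipping signs, are assumptions for illustration; ArbAlign's internal convention may differ.

```python
from itertools import permutations, product


def axis_isometries():
    """Yield every (swap, reflection) pair: 3! permutations x 2^3 sign flips."""
    for swap in permutations(range(3)):
        for reflection in product((1, -1), repeat=3):
            yield swap, reflection


def apply_isometry(coords, swap, reflection):
    """Reorder the x/y/z components by `swap`, then scale each by `reflection`."""
    return [
        tuple(sign * point[axis] for axis, sign in zip(swap, reflection))
        for point in coords
    ]


candidates = list(axis_isometries())
print(len(candidates))

# The example tuples from the table: swap y and z, then flip the sign of x.
moved = apply_isometry([(1.0, 2.0, 3.0)], swap=(0, 2, 1), reflection=(-1, 1, 1))
print(moved)
```

In a full search, each candidate would be applied to one structure before the atom assignment and Kabsch fit are recomputed, and the candidate with the lowest RMSD would be kept.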
Running tests
pytest tests/ -v
Repository layout
Arbalign-improved/
├── arbalign/
│ ├── __init__.py # public API exports
│ ├── __main__.py # python -m arbalign entry point
│ ├── cli.py # argparse CLI (replaces ArbAlign-driver.py + ArbAlign.py)
│ ├── molecule.py # Molecule class and XYZ I/O
│ ├── core.py # Kabsch RMSD / superpose / rotation
│ ├── align.py # Kuhn-Munkres + 48-isometry search
│ └── labeling.py # SYBYL / MNA relabelling via OpenBabel
├── tests/
│ ├── conftest.py
│ ├── test_core.py
│ ├── test_molecule.py
│ ├── test_align.py
│ └── fixtures/ # Mol-A.xyz, Mol-B.xyz, 10-1.xyz, 10-2.xyz
└── pyproject.toml