AltumAge
May 27, 2025
AltumAge: A Pan-Tissue DNA Methylation Epigenetic Clock Based on Deep Learning
AltumAge is a state-of-the-art epigenetic clock that predicts biological age from DNA methylation data across multiple tissue types. Built using deep learning, AltumAge demonstrates superior performance compared to traditional epigenetic clocks.
Key Features
- Pan-tissue compatibility: Works across multiple tissue types
- Deep learning architecture: Leverages neural networks for improved accuracy
- Multi-platform support: Compatible with Illumina 27k, 450k, and EPIC arrays
- PyTorch compatibility: Available in both TensorFlow and PyTorch formats
- Easy integration: Now available through the pyaging package
Performance Highlights
- Trained, validated, and tested on 142 datasets
- Outperforms Horvath's 2013 model across multiple metrics
- Robust performance across diverse tissue types and age ranges
Quick Start
Option 1: Using pyaging (Recommended)
The easiest way to use AltumAge is through pyaging:
pip install pyaging
Then follow the DNA methylation age prediction tutorial.
Option 2: Standalone Usage
Prerequisites
pip install tensorflow==2.5.0 numpy pandas scikit-learn
Basic Usage
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn import linear_model, preprocessing
# Load CpG sites
cpgs = np.array(pd.read_pickle('example_dependencies/multi_platform_cpgs.pkl'))
# Load your methylation data
data = pd.read_pickle('example_dependencies/example_data.pkl')
methylation_data = data[cpgs]
# Load scaler and model
scaler = pd.read_pickle('example_dependencies/scaler.pkl')
AltumAge = tf.keras.models.load_model('example_dependencies/AltumAge.h5')
# Scale and predict
methylation_data_scaled = scaler.transform(methylation_data)
predicted_ages = AltumAge.predict(methylation_data_scaled).flatten()
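If chronological ages are known for your samples, the predictions can be sanity-checked against them. The sketch below uses NumPy only. The helper name `evaluate_predictions` is hypothetical, the ages are made up, and the choice of metrics (median absolute error and Pearson correlation) is an assumption for illustration, not necessarily the set of metrics reported for AltumAge.

```python
import numpy as np

def evaluate_predictions(true_ages, predicted_ages):
    """Compare predicted ages with known ages (hypothetical helper)."""
    true_ages = np.asarray(true_ages, dtype=float)
    predicted_ages = np.asarray(predicted_ages, dtype=float)
    errors = np.abs(predicted_ages - true_ages)
    return {
        # Median absolute error, in years
        "median_abs_error": float(np.median(errors)),
        # Pearson correlation between known and predicted ages
        "pearson_r": float(np.corrcoef(true_ages, predicted_ages)[0, 1]),
    }

# Made-up ages for illustration only
true_ages = [25.0, 40.0, 63.0, 71.0]
predicted = [27.0, 38.5, 60.0, 75.0]
metrics = evaluate_predictions(true_ages, predicted)
print(metrics)
```

In practice, `predicted` would be the `predicted_ages` array produced above.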
Detailed Instructions
1. Data Preparation
AltumAge requires:
- DNA methylation beta values from Illumina arrays (27k, 450k, or EPIC)
- Selection of 20,318 specific CpG sites (provided in CpGsites.csv)
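A minimal sketch of the selection step, assuming the methylation data is a pandas DataFrame with samples as rows and CpG IDs as columns, as in the example data. The helper name `select_cpgs` is hypothetical and not part of the repository, and the CpG IDs and beta values are made up. `reindex` puts the columns in the requested order and leaves any absent site as NaN, so it can be imputed later.

```python
import pandas as pd

def select_cpgs(data: pd.DataFrame, cpgs):
    """Return `data` restricted to `cpgs` in that order, plus the absent sites."""
    cpgs = list(cpgs)
    present = set(data.columns)
    missing = [c for c in cpgs if c not in present]
    # reindex keeps the requested order and creates all-NaN columns for absent sites
    selected = data.reindex(columns=cpgs)
    return selected, missing

# Toy data with made-up beta values (not real measurements)
toy = pd.DataFrame(
    {"cg00000002": [0.10, 0.20], "cg00000001": [0.80, 0.90]},
    index=["sample_a", "sample_b"],
)
selected, missing = select_cpgs(toy, ["cg00000001", "cg00000002", "cg00000003"])
print(list(selected.columns))
print(missing)
```

Reporting the missing sites is useful because a sample with many absent CpGs is effectively being imputed rather than measured.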
2. Model Loading
# For TensorFlow users
AltumAge = tf.keras.models.load_model('example_dependencies/AltumAge.h5')
# For PyTorch users
import torch
AltumAge_pytorch = torch.load('dependencies/AltumAge.pt')
3. Preprocessing Pipeline
- Select the required CpG sites in the correct order
- Scale using the provided RobustScaler
- Fill missing values with 0 after scaling
- Pass the result to the model for age prediction
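The scaling and filling steps above can be sketched with NumPy alone. This assumes the provided scaler behaves like scikit-learn's `RobustScaler` with default settings, which subtracts a per-feature median and divides by the interquartile range. The `center` and `scale` arrays are made-up stand-ins for the fitted values stored in `scaler.pkl`, and `scale_and_fill` is a hypothetical helper.

```python
import numpy as np

def scale_and_fill(X, center, scale):
    """Robust-scale each column, then replace missing values with 0."""
    scaled = (X - center) / scale  # NaN inputs stay NaN after scaling
    return np.where(np.isnan(scaled), 0.0, scaled)

# Made-up beta values for two samples and three CpG sites (not real data)
X = np.array([[0.50, np.nan, 0.90],
              [0.70, 0.20, np.nan]])
center = np.array([0.50, 0.25, 0.75])  # stand-in for per-site medians
scale = np.array([0.20, 0.10, 0.25])   # stand-in for per-site interquartile ranges

X_ready = scale_and_fill(X, center, scale)
print(X_ready)
```

Under this assumption, a scaled value of 0 corresponds to the site's median in the data the scaler was fitted on, so filling with 0 after scaling amounts to median imputation.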
Repository Structure
AltumAge/
├── example.ipynb # Complete usage example
├── example_dependencies/ # Required files for running AltumAge
│   ├── AltumAge.h5 # TensorFlow model
│   ├── multi_platform_cpgs.pkl # List of CpG sites
│   ├── scaler.pkl # Preprocessing scaler
│   └── example_data.pkl # Example dataset
├── dependencies/
│   └── AltumAge.pt # PyTorch model
├── CpGsites.csv # Required CpG sites
└── supplementary_results/ # Detailed performance metrics
Data Availability
Access our comprehensive dataset collection:
- Raw data from ArrayExpress and GEO
- Organized methylation data (non-normalized)
- Google Drive Repository
Citation
If you use AltumAge in your research, please cite:
@article{de_Lima_Camillo_AltumAge,
author = {de Lima Camillo, Lucas Paulo and Lapierre, Louis R and Singh, Ritambhara},
title = {A pan-tissue DNA-methylation epigenetic clock based on deep learning},
journal = {npj Aging},
volume = {8},
pages = {4},
year = {2022},
doi = {10.1038/s41514-022-00085-y},
publisher = {Springer Nature},
URL = {https://doi.org/10.1038/s41514-022-00085-y}
}
Contact
For questions or collaborations, please contact:
- Lucas Paulo de Lima Camillo: lucas_camillo@alumni.brown.edu
License
This project is licensed under the MIT License - see the LICENSE file for details.