HigginsChenLab/methylCIPHER

May 2, 2026

The goal of methylCIPHER is to let users easily calculate their choice of CpG clocks with simple commands, from a single source. CpG epigenetic clocks are currently scattered across many places, and some require users to send data to external portals, which is not advisable when working with protected or restricted data. This package lets you calculate the reported epigenetic clocks or, where a clock was not precisely disclosed, our best estimate of it, all locally on your own machine in R (for example, from RStudio). We would like to acknowledge the authors of the original clocks and their valuable contributions to aging research; the requisite citations for their clocks can be found in the table returned by getClockInfo(). Please do not forget to cite them in your work!

Note: the Higgins-Chen lab also maintains methylCIPHERplus, an internal package of prototype clocks that have not yet been published and proprietary clocks that cannot be publicly released. If you are interested in collaborating on these, please reach out to us. Most of these clocks have been benchmarked on key biomarker properties on our TranslAGE platform (translage.io), and you can download summary statistics there to help you select clocks for your study.

Installation

You can install the released version of methylCIPHER and its imported packages from GitHub with:

devtools::install_github("danbelsky/DunedinPoAm38")
devtools::install_github("danbelsky/DunedinPACE")
devtools::install_github("HigginsChenLab/methylCIPHER")

Calculating Epigenetic Clocks and Predictors

Running single “clock” calculations

The package contains a large number of currently available CpG clocks and CpG-based predictors. We have tried to include every published CpG-based epigenetic clock we know of; if you find that one is missing, please contact us and we will do our best to add it promptly, if possible. You can do so by raising an issue on this repo or emailing us directly at a.higginschen@yale.edu.

In order to calculate a CpG clock, you simply need to use the appropriate function, typically named “calc[ClockNameHere]”. For example:

library(methylCIPHER)
calcPhenoAge(exampleBetas, examplePheno, imputation = FALSE)
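| name | geo_accession | gender | age | group | sample | PhenoAge |
|---|---|---|---|---|---|---|
| 7786915023_R02C02 | GSM1343050 | M | 57.9 | 1 | 1 | 52.29315 |
| 7786915135_R04C02 | GSM1343051 | M | 42.0 | 1 | 2 | 41.05867 |
| 7471147149_R06C01 | GSM1343052 | M | 47.4 | 1 | 3 | 43.54460 |
| 7786915035_R05C01 | GSM1343053 | M | 49.3 | 1 | 4 | 43.96697 |
| 7786923035_R01C01 | GSM1343054 | M | 52.5 | 1 | 5 | 40.35242 |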

Alternatively, if you would just like a vector of clock values, rather than having them appended to an existing phenotype/demographic data frame, leave out the phenotype argument:

calcPhenoAge(exampleBetas, imputation = FALSE)
#> 7786915023_R02C02 7786915135_R04C02 7471147149_R06C01 7786915035_R05C01 
#>          52.29315          41.05867          43.54460          43.96697 
#> 7786923035_R01C01 
#>          40.35242
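
The returned vector is named by sample ID, so it can be joined to a phenotype table later by name rather than by position. The following is a minimal sketch using toy stand-ins: `clock` and `pheno` are hypothetical objects that mimic the shape of the output above, not objects shipped with the package.

```r
# Toy stand-ins: `clock` mimics the named vector shown above, `pheno` a
# phenotype table whose rows are in a different order (both hypothetical).
clock <- c("7786915135_R04C02" = 41.05867,
           "7786915023_R02C02" = 52.29315,
           "7471147149_R06C01" = 43.54460)
pheno <- data.frame(
  name = c("7786915023_R02C02", "7786915135_R04C02", "7471147149_R06C01"),
  age  = c(57.9, 42.0, 47.4),
  stringsAsFactors = FALSE
)

# Match by sample ID, so a different row order is harmless.
pheno$PhenoAge <- unname(clock[pheno$name])

# Fail early if any sample in the phenotype table has no clock value.
stopifnot(!anyNA(pheno$PhenoAge))
```

Matching on the ID column avoids silently misaligned values when the beta matrix and the phenotype table were sorted differently.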

Categories of Epigenetic Clocks

Because there are so many epigenetic clocks, often with overlapping uses, it is important to keep track of which clocks are most closely related to each other. This will help you steer clear of multiple-testing and collinearity problems if you use all available clocks in your analysis.
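
One rough way to see which clocks are closely related in your own data is to look at their pairwise correlations, and to adjust p-values when many clocks are tested. The sketch below uses simulated values only; the clock names, the 0.8 cutoff, and the p-values are illustrative assumptions, not recommendations from the package.

```r
set.seed(1)
n <- 50
age <- runif(n, 30, 80)

# Simulated clock outputs: two track age closely, one is unrelated noise.
clocks <- data.frame(
  ClockA = age + rnorm(n, sd = 3),
  ClockB = age + rnorm(n, sd = 3),
  ClockC = rnorm(n)
)

# Pairwise Pearson correlations between clocks.
corMat <- cor(clocks)

# List each pair once (upper triangle) whose |r| exceeds an arbitrary cutoff.
high <- which(abs(corMat) > 0.8 & upper.tri(corMat), arr.ind = TRUE)
related <- data.frame(
  clock1 = rownames(corMat)[high[, "row"]],
  clock2 = colnames(corMat)[high[, "col"]],
  r      = corMat[high]
)

# Benjamini-Hochberg adjustment for hypothetical per-clock p-values.
pRaw <- c(ClockA = 0.010, ClockB = 0.020, ClockC = 0.400)
pAdj <- p.adjust(pRaw, method = "BH")
```

Highly correlated clocks carry much of the same information, so you may prefer to report one representative per group or to model them separately.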

Another important note is that, with a few exceptions, the clocks listed below were trained and validated almost exclusively in blood. When they are applied to other tissues, this can lead to a number of observed effects, including unreasonable shifts in the age intercept. As bespoke clocks are developed for use in additional human tissues, we will include these in their own section below.

getClockInfo()
| Clock Name | 1st Author | Year | PMID | Trained Phenotype | # of CpGs | Cohort Trained | Tissues Derived | Age Range Trained | Array Type Trained |
|---|---|---|---|---|---|---|---|---|---|
| AdaptAge | Ying | 2024 | 38243142 | Chronological Age | 1,000 | London Life Sciences Prospective Population (LOLIPOP) | Blood | 23.7-75 | 450K |
| Age_prediction | Sehgal | 2023 | 37503069 | Chronological Age | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| Alcohol | McCartney | 2018 | 30257690 | Clinical Phenotype | 450 | Generation Scotland: The Scottish Family Health Study [GS] | Blood | 18–98 | 450K |
| Blood | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| BMI | McCartney | 2018 | 30257690 | Clinical Phenotype | 1,109 | Generation Scotland: The Scottish Family Health Study [GS] | Blood | 18-98 | 450K |
| Bocklandt | Bocklandt | 2011 | 21731603 | Chronological Age | 1 | See Misc | Saliva | 21-55 | 27K |
| Bohlin | Bohlin | 2016 | 27717397 | Gestational Age | 251 | MoBa1 | | | 450K |
| Brain | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| CausAge | Ying | 2024 | 38243142 | Chronological Age | 586 | London Life Sciences Prospective Population (LOLIPOP) | Blood | 23.7-75 | 450K |
| CellDRIFT | Minteer | 2023 | 37467337 | Mitotic Divisions | 2,322 | | Immortalized astrocytes | | EPICv1 |
| CellPopAge | Lujan | 2024 | 38956711 | Mitotic Divisions | 42 | | Fibroblasts | | EPICv1 |
| DamAge | Ying | 2024 | 38243142 | Chronological Age | 1,090 | London Life Sciences Prospective Population (LOLIPOP) | Blood | 23.7-75 | 450K |
| DNAmADM | Lu | 2019 | 6366976 | Protein | 186 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmB2M | Lu | 2019 | 6366976 | Protein | 91 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmClockCortical | Shireby | 2020 | 33300551 | Chronological Age | 347 | Multiple cohorts | Brain Tissue | 1-108 | 450K |
| DNAmCRP | Arpawong | 2026 | 40889076 | Clinical Biomarker | 185 | HRS | Blood | 51-100 | EPICv1 |
| DNAmCystatinC_PhysAge | Lu | 2019 | 6366976 | Protein | 87 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmCystatinC | Arpawong | 2026 | 40889076 | Protein | 238 | HRS | Blood | 51-100 | EPICv1 |
| DNAmDHEAS | Arpawong | 2026 | 40889076 | Clinical Biomarker | 199 | HRS | Blood | 51-100 | EPICv1 |
| DNAmFEV1_wAge | McGreevy | 2023 | 36812475 | Clinical Phenotype | 77 | FHS, BLSA, Budapest | Blood | | 450K |
| DNAmFI_Li | Li | 2022 | 36071044 | Frailty | 20 | ESTHER | Blood | 50 - 75 | EPICv1 |
| DNAmFitAge | McGreevy | 2023 | 36812475 | Clinical Phenotype | 627 | FHS, BLSA, Budapest | Blood | | 450K |
| DNAmGait_noAge | McGreevy | 2023 | 36812475 | Clinical Phenotype | 59 | FHS, BLSA, Budapest | Blood | | 450K |
| DNAmGait_wAge | McGreevy | 2023 | 36812475 | Clinical Phenotype | 42 | FHS, BLSA, Budapest | Blood | | 450K |
| DNAmGDF15 | Lu | 2019 | 6366976 | Protein | 137 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmGripStrength_noAge | McGreevy | 2023 | 36812475 | Clinical Phenotype | 93 | FHS, BLSA, Budapest | Blood | | 450K |
| DNAmGripStrength_wAge | McGreevy | 2023 | 36812475 | Clinical Phenotype | 64 | FHS, BLSA, Budapest | Blood | | 450K |
| DNAmHbA1c | Arpawong | 2026 | 40889076 | Clinical Biomarker | 233 | HRS | Blood | 51-100 | EPICv1 |
| DNAmHDL | Arpawong | 2026 | 40889076 | Clinical Biomarker | 516 | HRS | Blood | 51-100 | EPICv1 |
| DNAmIC | Fuentealba | 2025 | 40467932 | Intrinsic Capacity | 91 | Inspire-T | Blood | 20-102 | EPICv1 |
| DNAmLeptin | Lu | 2019 | 6366976 | Protein | 187 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmlogA1C | Lu | 2022 | 36516495 | Clinical Biomarker | 86 | FHS- Framingham heart study Offspring Cohort | Blood | 40 (min), 59 (25th), 66.1(mean), 73(75th), 92 (max) | 450K |
| DNAmlogCRP | Lu | 2022 | 36516495 | Clinical Biomarker | 132 | FHS- Framingham heart study Offspring Cohort | Blood | 40 (min), 59 (25th), 66.1(mean), 73(75th), 92 (max) | 450K |
| DNAmPACKYRS | Lu | 2019 | 6366976 | Clinical Phenotype | 172 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmPAI1 | Lu | 2019 | 6366976 | Protein | 211 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmPeakflow | Arpawong | 2026 | 40889076 | Clinical Biomarker | 155 | HRS | Blood | 51-100 | EPICv1 |
| DNAmPulsePr | Arpawong | 2026 | 40889076 | Clinical Biomarker | 60 | HRS | Blood | 51-100 | EPICv1 |
| DNAmStress | Jung | 2023 | 36182531 | Clinical Phenotype | 211 | NIAAA Discovery Stress Cohort | Blood | | EPICv1 |
| DNAmTIMP1 | Lu | 2019 | 6366976 | Protein | 42 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| DNAmTL | Lu | 2019 | 3142238 | Telomere Length | 140 | WHI+ JHS- Women’s Health Initiative & Jackson Heart Study | Blood | 50.2, 66.5, 80.2 (WHI min, median, max) - 22.2, 56.6, 93.1 (JHS min, median, max) | 450K and EPICv1 |
| DNAmVO2max | McGreevy | 2023 | 36812475 | Clinical Phenotype | 40 | FHS, BLSA, Budapest | Blood | | 450K |
| DNAmWHR | Arpawong | 2026 | 40889076 | Clinical Biomarker | 191 | HRS | Blood | 51-100 | EPICv1 |
| DunedinPACE | Belsky | 2022 | 35029144 | Pace of Aging | 173 | Dunedin Study | Blood | 26-45 | 450K and EPICv1 |
| DunedinPoAm38 | Belsky | 2020 | 32367804 | Pace of Aging | 47 | Dunedin Study | Blood | 26-38 | 450K and EPICv1 |
| EpiTOC1 | Yang | 2016 | 27716309 | Mitotic Divisions | 385 | UCSD and WCH (GSE40279) | Blood | See Misc | 450K |
| EpiTOC2 | Teschendorff | 2020 | 32580750 | Mitotic Divisions | 163 | UCSD and WCH (GSE40279) | Blood | 19-101 | 450K |
| Garagnani | Garagnani | 2012 | 23061750 | Chronological Age | 1 | See Misc | Blood | 42–83 and 9–52 | 450K |
| GrimAgeV1 | Lu | 2019 | 30669119 | Mortality | 1,030 | FHS- Framingham heart study Offspring Cohort | Blood | ~40(min) to ~90 (max) | 450K |
| GrimAgeV2 | Lu | 2022 | 36516495 | Mortality | 1,030 | FHS- Framingham heart study Offspring Cohort | Blood | 40 (min), 59 (25th), 66.1(mean), 73(75th), 92 (max) | 450K |
| Hannum | Hannum | 2013 | 23177740 | Chronological Age | 71 | UCSD and WCH (GSE40279) | Blood | 19-101 | 450K |
| Heart | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| Hormone | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| Horvath1 | Horvath | 2013 | 24138928 | Chronological Age | 353 | Multiple cohorts | 51 healthy tissues and cell types | ~3 (min) to 100 (max) | 450K and 27K |
| Horvath2 | Horvath | 2018 | 30048243 | Chronological Age | 391 | Multiple cohorts | Human fibroblasts, keratinocytes, buccal cells, endothelial cells, lymphoblastoid cells, skin, blood, and saliva | -0.3(min) to 94 (max) | 450K and EPICv1 |
| HRSInCHPhenoAge | Higgins-Chen | 2022 | 36277076 | Mortality via proxy | 959 | HRS and InCHIANTI | Blood | 21-100 | 450K |
| HypoClock | Teschendorff | 2020 | 32580750 | Mitotic Divisions | 678 | See Misc | See Misc | See Misc | 450K |
| Immune | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| Inflammation | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| IntrinClock | Tomusiak | 2024 | 39095531 | Chronological Age | 381 | Multiple cohorts | 15 Tissues, majority blood | 0-100 (approx) | 450K and EPICv1 |
| Kidney | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| Knight | Knight | 2016 | 27717399 | Gestational Age | 148 | Multiple cohorts | umbilical cord blood or blood spots | neonates | 450K and 27K |
| LeeControl | Lee | 2019 | 31235674 | Gestational Age | 546 | Multiple cohorts | Placenta | 5 to 42 weeks gestation | 450K and EPICv1 |
| LeeRefinedRobust | Lee | 2019 | 31235674 | Gestational Age | 395 | Multiple cohorts | Placenta | 5 to 42 weeks gestation | 450K and EPICv1 |
| LeeRobust | Lee | 2019 | 31235674 | Gestational Age | 558 | Multiple cohorts | Placenta | 5 to 42 weeks gestation | 450K and EPICv1 |
| Lin | Weidner | 2014 | 24490752 | Chronological Age | 99 | HNR study | Blood | 0-78 | 27K |
| Liver | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| Lung | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| Mayne | Mayne | 2017 | 27894195 | Gestational Age | 62 | Multiple cohorts | Placenta | 8-42 weeks gestation | 450K and 27K |
| Metabolic | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| MiAge | Youn | 2018 | 29160179 | Mitotic Divisions | 268 | | 8 cancer types and adjacent tissues | | 450K and EPICv1 |
| MusculoSkeletal | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| PCADM | Higgins-Chen | 2022 | 36277076 | Protein | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCB2M | Higgins-Chen | 2022 | 36277076 | Protein | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCBrainAge | Thrush | 2022 | 35907208 | Chronological Age | 357,852 | NIMH Brain Tissue Collection | Dorsolateral prefrontal cortex | 20-98 | 450K |
| PCCystatinC | Higgins-Chen | 2022 | 36277076 | Protein | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCDNAmTL | Higgins-Chen | 2022 | 36277076 | Telomere Length | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCGDF15 | Higgins-Chen | 2022 | 36277076 | Protein | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCGrimAge | Higgins-Chen | 2022 | 36277076 | Mortality | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCHannum | Higgins-Chen | 2022 | 36277076 | Chronological Age | 78,464 | UCSD and WCH (GSE40279) | Blood | 19 - 101 | 450K |
| PCHorvath1 | Higgins-Chen | 2022 | 36277076 | Chronological Age | 78,464 | Multiple cohorts | Multiple | -105.5 | 450K |
| PCHorvath2 | Higgins-Chen | 2022 | 36277076 | Chronological Age | 78,464 | Multiple cohorts | Skin and Blood | -101.3 | 450K |
| PCLeptin | Higgins-Chen | 2022 | 36277076 | Protein | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCPACKYRS | Higgins-Chen | 2022 | 36277076 | Clinical Phenotype | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCPAI1 | Higgins-Chen | 2022 | 36277076 | Protein | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PCPhenoAge | Higgins-Chen | 2022 | 36277076 | Mortality | 78,464 | HRS and InCHIANTI | Blood | 21 - 101 | 450K |
| PCTIMP1 | Higgins-Chen | 2022 | 36277076 | Protein | 78,464 | FHS- Framingham heart study | Blood | 24 - 92 | 450K |
| PedBE | McEwan | 2019 | 31611402 | Chronological Age | 94 | Multiple cohorts | Buccal | 0-20 | 450K and EPICv1 |
| PhenoAge | Levine | 2018 | 29676998 | Mortality | 513 | InCHIANTI | Blood | 21-100 | 450K |
| PhysAge | Arpawong | 2026 | 40889076 | Clinical Biomarker | 1,711 | HRS | Blood | 51-100 | EPICv1 |
| RepliTali | Endicott | 2022 | 36347867 | Mitotic Divisions | 87 | NIA Aging Cell Culture Repository | Primary cells (fibroblasts, endothelial, smooth muscle, keratinocyte) | | EPICv1 |
| RepliTali_Norm | Endicott | 2022 | 36347867 | Mitotic Divisions | 218 | NIA Aging Cell Culture Repository | Primary cells (fibroblasts, endothelial, smooth muscle, keratinocyte) | | EPICv1 |
| RetroAge450K | Ndhlovu | 2024 | 38106164 | Chronological Age | 1,317 | TruDiagnostic BioBank | Blood | 12-100 | EPICv1 |
| RetroAgeEPICv2 | Ndhlovu | 2024 | 38106164 | Chronological Age | 1,378 | TruDiagnostic BioBank | Blood | 12-100 | EPICv1 |
| SenChronoAge | Kasamoto | 2026 | 41746138 | Chronological Age | 188 | UCSD and WCH (GSE40279) | Blood | 19-101 | 450K and EPICv1 |
| SenCultureAge | Kasamoto | 2026 | 41746138 | Senescence | 141 | GSE197723 and GSE227160 | Fibroblasts, mesenchymal stem cells | n/a | EPICv1 |
| SenMortalityAge | Kasamoto | 2026 | 41746138 | Mortality | 89 | FHS- Framingham heart study | Blood | 24-92 | 450K and EPICv1 |
| Smoking | McCartney | 2018 | 30257690 | Clinical Phenotype | 233 | Generation Scotland: The Scottish Family Health Study [GS] | Blood | 18–98 | 450K |
| StocH | Tong | 2024 | 38724732 | Chronological Age | 353 | Simulated dataset | Blood | 45-83 | 450K and EPICv1 |
| StocP | Tong | 2024 | 38724732 | Mortality | 513 | Simulated dataset | Blood | 45-83 | 450K and EPICv1 |
| StocZ | Tong | 2024 | 38724732 | Chronological Age | 514 | Simulated dataset | Blood | 45-83 | 450K and EPICv1 |
| SystemsAge | Sehgal | 2023 | 37503069 | Mortality | 125,175 | HRS and FHS | Blood | 24 - 100 | 450K |
| VidalBralo | Vidal-Bralo | 2016 | 27471517 | Chronological Age | 8 | Multiple cohorts | Blood | 20-78 | 27K |
| Weidner | Weidner | 2014 | 24490752 | Chronological Age | 3 | Multiple cohorts | Blood | 0-78 | 450K |
| Zhang | Zhang | 2017 | 28303888 | Mortality | 10 | ESTHER | Blood | 50-75 | 450K |
| Zhang2019 | Zhang | 2019 | 31443728 | Chronological Age | 514 | Multiple cohorts | Blood and saliva | 2-104 | 450K |

Running A User-Defined List of Epigenetic Clocks

The user is welcome to specify a vector of clocks that they would like to calculate, rather than running each individual clock calculation. In this case, you will need to choose from the following options:

clockOptions()
#>  [1] "calcAlcoholMcCartney"            "calcBMIMcCartney"               
#>  [3] "calcBocklandt"                   "calcBohlin"                     
#>  [5] "calcClockCategory"               "calcDNAmClockCortical"          
#>  [7] "calcDNAmTL"                      "calcDunedinPoAm38"              
#>  [9] "calcEpiTOC"                      "calcEpiTOC2"                    
#> [11] "calcGaragnani"                   "calcGrimAgeV1"                  
#> [13] "calcGrimAgeV2"                   "calcHannum"                     
#> [15] "calcHorvath1"                    "calcHorvath2"                   
#> [17] "calcHRSInChPhenoAge"             "calcHypoClock"                  
#> [19] "calcKnight"                      "calcLeeControl"                 
#> [21] "calcLeeRefinedRobust"            "calcLeeRobust"                  
#> [23] "calcLin"                         "calcMayne"                      
#> [25] "calcMiAge"                       "calcPCClocks"                   
#> [27] "calcPEDBE"                       "calcPhenoAge"                   
#> [29] "calcSmokingMcCartney"            "calcSystemsAge"                 
#> [31] "calcVidalBralo"                  "calcWeidner"                    
#> [33] "calcZhang"                       "calcZhang2019"                  
#> [35] "prcPhenoAge::calcPRCPhenoAge"    "prcPhenoAge::calcnonPRCPhenoAge"
#> [37] "DunedinPoAm38::PoAmProjector"

To do so, here is an example:

userClocks <- c("calcSmokingMcCartney","calcPhenoAge","calcEpiTOC2")
calcUserClocks(userClocks, exampleBetas, examplePheno, imputation = FALSE)
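| name | geo_accession | gender | age | group | sample | Smoking_McCartney | PhenoAge | epiTOC2 |
|---|---|---|---|---|---|---|---|---|
| 7786915023_R02C02 | GSM1343050 | M | 57.9 | 1 | 1 | 3.993508 | 52.29315 | 5012.412 |
| 7786915135_R04C02 | GSM1343051 | M | 42.0 | 1 | 2 | 4.501657 | 41.05867 | 4622.625 |
| 7471147149_R06C01 | GSM1343052 | M | 47.4 | 1 | 3 | 3.173744 | 43.54460 | 2956.300 |
| 7786915035_R05C01 | GSM1343053 | M | 49.3 | 1 | 4 | 3.216788 | 43.96697 | 3446.410 |
| 7786923035_R01C01 | GSM1343054 | M | 52.5 | 1 | 5 | 4.414541 | 40.35242 | 3245.157 |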
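
To give a feel for what running a list of clock names involves, here is a minimal base-R sketch of the general idea: look up each function by its name and let it append one column to the phenotype table. This is not the package's implementation; `calcToyMean`, `calcToyMax`, and `runToyClocks` are hypothetical stand-ins defined only for this illustration.

```r
# Hypothetical stand-ins for clock functions: each takes a beta matrix and a
# phenotype data frame and returns the data frame with one new column.
calcToyMean <- function(betas, pheno) {
  pheno$ToyMean <- rowMeans(betas)
  pheno
}
calcToyMax <- function(betas, pheno) {
  pheno$ToyMax <- apply(betas, 1, max)
  pheno
}

# Look up each function by name and apply it in turn, accumulating columns.
runToyClocks <- function(clockNames, betas, pheno) {
  for (fn in clockNames) {
    clockFun <- get(fn, mode = "function")
    pheno <- clockFun(betas, pheno)
  }
  pheno
}

# Toy data: samples in rows, CpGs in columns (matrix is filled column by column).
toyBetas <- matrix(c(0.1, 0.3, 0.5, 0.7), nrow = 2,
                   dimnames = list(c("s1", "s2"), c("cg0001", "cg0002")))
toyPheno <- data.frame(name = c("s1", "s2"))

result <- runToyClocks(c("calcToyMean", "calcToyMax"), toyBetas, toyPheno)
```

Each requested name contributes one column, in the order the names were given, which matches the shape of the table above.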

Missing beta values

Of course, all of the CpG clocks work best when you have all of the necessary probes’ beta values for each sample. However, sometimes after preprocessing, beta values will be removed for a variety of reasons. For each CpG clock, you have the option to impute missing values for CpGs that were removed across all samples. In this case, you will need to impute using a vector of your choice (e.g. mean methylation values across CpGs from an independent tissue-matched dataset). However, by default, imputation will not be performed and the portion of the clock that is reliant upon those CpGs will not be considered. To check quickly whether this is the case for your data and clock(s) of interest, we have created the following helper function:
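
To make the difference between the two behaviours concrete, here is a toy sketch of a linear clock (an intercept plus a weighted sum of beta values). It is an illustration under stated assumptions, not the package's code: the weights, the reference means, and the `toyClock` function are all made up for this example.

```r
# Toy linear "clock": intercept plus weighted sum of betas (weights are made up).
weights   <- c(cg0001 = 2.0, cg0002 = -1.5, cg0003 = 0.5)
intercept <- 10

# Samples in rows, CpGs in columns; cg0003 was dropped during preprocessing.
# The matrix is filled column by column.
betas <- matrix(c(0.2, 0.4, 0.6, 0.8), nrow = 2,
                dimnames = list(c("s1", "s2"), c("cg0001", "cg0002")))

# Reference means, e.g. from an independent tissue-matched dataset (made up here).
refMeans <- c(cg0001 = 0.3, cg0002 = 0.7, cg0003 = 0.9)

toyClock <- function(betas, weights, intercept, refMeans = NULL) {
  missingCpGs <- setdiff(names(weights), colnames(betas))
  if (length(missingCpGs) > 0 && !is.null(refMeans)) {
    # Fill every sample with the same reference value for each missing CpG.
    fill <- matrix(refMeans[missingCpGs], nrow = nrow(betas),
                   ncol = length(missingCpGs), byrow = TRUE,
                   dimnames = list(rownames(betas), missingCpGs))
    betas <- cbind(betas, fill)
  }
  # Without imputation, missing CpGs simply contribute nothing to the sum.
  used <- intersect(names(weights), colnames(betas))
  drop(betas[, used, drop = FALSE] %*% weights[used]) + intercept
}

noImpute   <- toyClock(betas, weights, intercept)
withImpute <- toyClock(betas, weights, intercept, refMeans)
```

In this toy case the imputed scores differ from the non-imputed ones by a constant (the weight of the missing CpG times its reference mean), which shows why missing probes with large weights can shift a clock noticeably.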

getClockProbes(exampleBetas)
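| Clock | Total.Probes | Present.Probes | Percent.Present |
|---|---|---|---|
| Alcohol | 450 | 450 | 100% |
| BMI | 1109 | 1109 | 100% |
| Bocklandt | 1 | 1 | 100% |
| Bohlin | 251 | 8 | 3% |
| DNAmClockCortical | 347 | 33 | 10% |
| DNAmTL | 140 | 11 | 8% |
| EpiToc2 | 163 | 163 | 100% |
| EpiToc | 385 | 385 | 100% |
| Garagnani | 1 | 1 | 100% |
| GrimAge | 1113 | 98 | 37% |
| GrimAge2 | 1362 | 111 | 8% |
| HRSInCHPhenoAge | 959 | 959 | 100% |
| Hannum | 71 | 71 | 100% |
| Horvath1 | 353 | 353 | 100% |
| Horvath2 | 391 | 390 | 100% |
| Knight | 148 | 16 | 11% |
| LeeControl | 546 | 13 | 2% |
| LeeRefinedRobust | 395 | 9 | 2% |
| LeeRobust | 558 | 9 | 2% |
| Lin | 99 | 39 | 39% |
| Mayne | 62 | 5 | 8% |
| MiAge | 268 | 4 | 1% |
| PCClocks | 78464 | 2976 | 4% |
| PEDBE | 94 | 14 | 15% |
| PhenoAge | 513 | 513 | 100% |
| Smoking | 233 | 233 | 100% |
| SystemsAge | 125175 | 4024 | 3% |
| VidalBralo | 8 | 5 | 62% |
| Weidner | 3 | 3 | 100% |
| Zhang2019 | 514 | 131 | 25% |
| Zhang | 10 | 10 | 100% |
| hypoClock | 678 | 678 | 100% |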

Please note that this will not count columns of NAs for named CpGs as missing! If you want to check for this you can run the following line of code to find the column numbers that are all NAs. If you get “named integer(0)” then you don’t have any. We recommend that you remove any identified columns from your beta matrix entirely to avoid errors, and then rerun the code producing the table above.

which(apply(exampleBetas, 2, function(x)all(is.na(x))))
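
If that check does return columns, they can be dropped as in the sketch below, which uses a small made-up matrix in place of your own data. Note the guard on the length of the result: indexing with a negated empty vector would remove every column rather than none.

```r
# Toy beta matrix (samples in rows, CpGs in columns); cg0002 is NA for everyone.
# The matrix is filled column by column.
betas <- matrix(c(0.1, 0.2, NA, NA, 0.5, NA), nrow = 2,
                dimnames = list(c("s1", "s2"),
                                c("cg0001", "cg0002", "cg0003")))

# Column positions where every sample is NA.
allNA <- which(apply(betas, 2, function(x) all(is.na(x))))

# Only subset when something was found: betas[, -integer(0)] drops ALL columns.
if (length(allNA) > 0) {
  betas <- betas[, -allNA, drop = FALSE]
}
```

Columns with NAs in only some samples, such as cg0003 here, are kept by this step.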

If you have CpGs missing from only some samples, we encourage you to find this out early on. Run the following line on your own beta matrix (called betaMatrix here) and check that the result is 0.

sum(is.na(betaMatrix))

If the result is not 0, you might consider running mean imputation within your data, so that NA values in one or a few samples are at least replaced with mean values rather than being ignored.
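
A minimal base-R sketch of such within-data mean imputation follows, on a made-up matrix: each NA is replaced by the mean of the observed values for that CpG across the other samples. This is one simple option among several, shown here only as an illustration.

```r
# Toy beta matrix (samples in rows, CpGs in columns) with scattered NAs.
# The matrix is filled column by column.
betaMatrix <- matrix(c(0.2, NA, 0.4, 0.9, 0.7, NA), nrow = 3,
                     dimnames = list(c("s1", "s2", "s3"),
                                     c("cg0001", "cg0002")))

# Per-CpG means over the samples that do have a value.
cpgMeans <- colMeans(betaMatrix, na.rm = TRUE)

# Row/column positions of every NA, then fill each with its column's mean.
naIdx <- which(is.na(betaMatrix), arr.ind = TRUE)
betaMatrix[naIdx] <- cpgMeans[naIdx[, "col"]]

# The earlier check should now return 0.
sum(is.na(betaMatrix))
```

A CpG that is NA in every sample has no observed mean and would stay missing under this approach, which is why all-NA columns are best removed beforehand.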