Alpha-Earth Land Cover Classifier
March 7, 2026
This repository contains the experimental pipeline developed for the study:
What on Earth is AlphaEarth? Hierarchical Structure and Functional Interpretability of Embeddings for Global Land Cover
Benavides et al. (2025), preprint
The code implements a large-scale automated binary classification framework that evaluates how well Google AlphaEarth Foundations (AEF) embeddings discriminate among the 11 ESA WorldCover 2020 land cover classes. More than 130,000 independent experiments were executed globally using this pipeline.
What This Code Does
Each experiment is structured as a binary classification task in which a target land cover class is evaluated against all remaining classes. This isolates the embedding dimensions most informative for distinguishing each individual land cover type.
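The one-vs-rest setup described above can be illustrated with a minimal sketch. The function name `to_binary_labels` is hypothetical and is not taken from alpha_earth_experiments.py; it only shows how ESA WorldCover codes could be collapsed into a target/non-target label.

```python
from typing import Iterable, List


def to_binary_labels(esa_codes: Iterable[int], target_code: int) -> List[int]:
    """Map ESA WorldCover codes to 1 (target class) or 0 (any other class)."""
    return [1 if code == target_code else 0 for code in esa_codes]


# Toy pixels labelled tree cover (10), moss/lichen (100), water (80), moss/lichen (100)
pixel_codes = [10, 100, 80, 100]
labels = to_binary_labels(pixel_codes, target_code=100)
print(labels)  # [0, 1, 0, 1]
```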
For each experiment, the pipeline:
- Selects a random continent and generates a smart ROI centered on a location where the target class is present
- Extracts AEF embedding values and ESA WorldCover labels from Google Earth Engine
- Trains a classifier using all 64 embedding dimensions
- Applies progressive ablation, retraining models on the top 1 through top 30 most important dimensions
- Records all performance metrics and feature importances to Google Sheets
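The progressive ablation step in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: `rank_dimensions`, `progressive_ablation`, and the `train_and_score` callback are hypothetical names, and the toy scorer stands in for a real model that would be retrained on each subset.

```python
from typing import Callable, Dict, List, Sequence


def rank_dimensions(importances: Dict[str, float]) -> List[str]:
    """Return dimension names sorted from most to least important."""
    return sorted(importances, key=importances.get, reverse=True)


def progressive_ablation(
    importances: Dict[str, float],
    train_and_score: Callable[[Sequence[str]], float],
    max_k: int = 30,
) -> Dict[int, float]:
    """Score models retrained on the top-1, top-2, ..., top-max_k dimensions."""
    ranked = rank_dimensions(importances)
    results: Dict[int, float] = {}
    for k in range(1, min(max_k, len(ranked)) + 1):
        # A real pipeline would retrain a classifier on ranked[:k] here.
        results[k] = train_and_score(ranked[:k])
    return results


# Toy importances for A01..A64 (made up for the example, not real results).
importances = {f"A{i:02d}": 1.0 / i for i in range(1, 65)}
total = sum(importances.values())


def toy_score(dims: Sequence[str]) -> float:
    """Stand-in for a model metric: share of total importance retained."""
    return sum(importances[d] for d in dims) / total


results = progressive_ablation(importances, toy_score)
print(len(results))  # 30
```

Passing the training routine as a callback keeps the ranking and subset logic independent of whichever classifier is used.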
Results from these experiments were used to characterize the functional organization of the AEF embedding space, identifying specialist, low-generalist, mid-generalist, and high-generalist embedding dimensions.
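As a rough illustration of how dimensions might be grouped into these four roles, the sketch below counts how many classes each dimension is important for and bins the count. The thresholds and the "unassigned" fallback are placeholders chosen only for this example; they are assumptions, not the criteria used in the paper, and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def count_class_memberships(top_dims_by_class: Dict[int, Iterable[str]]) -> Counter:
    """Count, for each dimension, the number of classes where it ranks as important."""
    counts: Counter = Counter()
    for dims in top_dims_by_class.values():
        counts.update(set(dims))  # set() so a dimension counts once per class
    return counts


def label_role(n_classes: int) -> str:
    """Bin a class count into a role. Thresholds are placeholders, NOT from the paper."""
    if n_classes <= 0:
        return "unassigned"
    if n_classes == 1:
        return "specialist"
    if n_classes <= 3:
        return "low-generalist"
    if n_classes <= 6:
        return "mid-generalist"
    return "high-generalist"


# Made-up top dimensions per ESA class code, for illustration only.
top_dims_by_class = {
    10: ["A01", "A02"],
    20: ["A02", "A03"],
    95: ["A02", "A07"],
    100: ["A02", "A03"],
}
counts = count_class_memberships(top_dims_by_class)
roles = {dim: label_role(counts[dim]) for dim in sorted(counts)}
print(roles)
```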
ESA WorldCover 2020 Land Cover Classes
| Code | Class |
|---|---|
| 10 | Tree cover |
| 20 | Shrubland |
| 30 | Grassland |
| 40 | Cropland |
| 50 | Built-up |
| 60 | Bare/sparse vegetation |
| 70 | Snow/ice |
| 80 | Permanent water bodies |
| 90 | Herbaceous wetland |
| 95 | Mangroves |
| 100 | Moss/lichen |
Requirements
- Python 3.8+
- Google Earth Engine account (free for research use)
- Google Cloud project with Sheets API and Drive API enabled
- A Google service account JSON key file (see setup below)
Install dependencies:
pip install -r requirements.txt
Google Cloud Setup
To log results to Google Sheets, you need a service account key:
- Go to https://console.cloud.google.com
- Create or select a project
- Enable the Google Sheets API and Google Drive API
- Go to IAM & Admin > Service Accounts > Create Service Account
- Download the JSON key file and save it locally
- Share your destination Google Sheet with the service account email
- Set SERVICE_ACCOUNT_KEY_PATH in the script to the path of your JSON key file
Important: Never commit your service account key to GitHub. It is already blocked by the included .gitignore.
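When sharing the destination Sheet, the address to use is the `client_email` field stored in the downloaded key file. The helper below is a small sketch for reading it; `service_account_email` is a hypothetical name and is not part of the repository.

```python
import json
from pathlib import Path


def service_account_email(key_path: str) -> str:
    """Return the client_email stored in a service account JSON key file."""
    key = json.loads(Path(key_path).read_text(encoding="utf-8"))
    if key.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service account key")
    return key["client_email"]
```

Calling `service_account_email("path/to/key.json")` with your own key path returns the address to add under the Sheet's sharing settings.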
Configuration
Edit the configuration panel at the top of alpha_earth_experiments.py:
CLASS_A = 100 # Target ESA class code (e.g. 100 = Moss/lichen)
CLASS_B = 999 # All other classes
USER_COUNTRY = 'USA' # Your country (for logging)
N_ANALYSES = 3000 # Number of experiments to run
GCP_PROJECT_ID = "..." # Your Google Cloud project ID
GOOGLE_SHEET_NAME = "..." # Name of your destination Google Sheet
SERVICE_ACCOUNT_KEY_PATH = "path/to/key.json"
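Because a mistyped class code only surfaces after a run has started, it may help to check the settings up front. The following is a hedged sketch: `ExperimentConfig` and its checks are assumptions made for illustration and do not exist in alpha_earth_experiments.py.

```python
from dataclasses import dataclass

# ESA WorldCover 2020 class codes, as listed in the table above.
VALID_ESA_CODES = {10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 100}


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container mirroring CLASS_A and N_ANALYSES."""

    class_a: int
    n_analyses: int = 3000

    def __post_init__(self) -> None:
        if self.class_a not in VALID_ESA_CODES:
            raise ValueError(f"CLASS_A={self.class_a} is not an ESA WorldCover code")
        if self.n_analyses < 1:
            raise ValueError("N_ANALYSES must be at least 1")


config = ExperimentConfig(class_a=100, n_analyses=3000)
print(config.class_a, config.n_analyses)  # 100 3000
```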
Running the Experiments
python alpha_earth_experiments.py
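Or launch the interactive notebook via Binder (no installation required): https://mybinder.org/v2/gh/FelipeBenavidesMz/Alpha-Earth-Land-Cover-Classifier/main?labpath=alpha_earth_app.ipynb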
Output
Each experiment logs 410 columns to Google Sheets, including:
- Experiment metadata (timestamp, location, algorithm, class)
- Full model performance (accuracy, AUC, precision, recall, F1)
- Feature importance for all 64 embedding dimensions (A01-A64)
- Sequential retraining results for top 1 through top 30 embeddings across 8 metrics
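To give a sense of how the column count builds up, the sketch below generates the 64 importance labels and one column per (top-k, metric) pair. The `topKK_metricM` naming is a placeholder: the actual header names and the eight metrics used by the script are not listed here, so treat those as assumptions.

```python
N_DIMENSIONS = 64        # embedding dimensions A01-A64
MAX_TOP_K = 30           # sequential retraining from top 1 to top 30
N_ABLATION_METRICS = 8   # metrics recorded per retraining step

importance_columns = [f"A{i:02d}" for i in range(1, N_DIMENSIONS + 1)]

# Placeholder names; the real sheet headers may differ.
ablation_columns = [
    f"top{k:02d}_metric{m}"
    for k in range(1, MAX_TOP_K + 1)
    for m in range(1, N_ABLATION_METRICS + 1)
]

print(importance_columns[0], importance_columns[-1])    # A01 A64
print(len(ablation_columns))                            # 240
print(len(importance_columns) + len(ablation_columns))  # 304
```

These two groups account for 304 of the 410 columns; the rest cover the experiment metadata, the full-model metrics, and any other fields the script logs.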
Repository Structure
Alpha-Earth-Land-Cover-Classifier/
├── README.md
├── .gitignore
├── requirements.txt
├── alpha_earth_app.ipynb # Interactive Binder notebook
├── alpha_earth_experiments.py # Main experiment pipeline
└── docs/
    └── paper_reference.md # Link to preprint and citation info
Interactive Dashboard
"What on Earth is AlphaEarth?" is an interactive dashboard that visualizes the functional organization of Google AlphaEarth Foundations (AEF) embedding dimensions in relation to global land cover classes. Built as a companion to the more than 130,000 binary classification experiments conducted across ESA WorldCover 2020 categories, it integrates classification outputs, embedding importance metrics, and geographic context into four views:
- Overview: summarizes global experiment performance across machine learning algorithms
- Class Analysis: houses the "Embedding Universe" visualization alongside class performance matrices and embedding importance charts
- Geographic: maps experiment locations onto satellite imagery with spatially aggregated performance heatmaps
- Chat: guides users through configuring new experiments
The Embedding Universe, the dashboard's primary visualization, arranges land cover classes as nodes in a circular layout, with embedding dimensions positioned according to their functional role: specialist dimensions cluster near individual classes, while shared dimensions occupy intermediate positions that reflect their contribution across multiple classes. The dashboard is designed to make the structure of the 64-dimensional latent space explorable and interpretable, bridging the gap between abstract embeddings and the geographic categories they encode.
If you use this dashboard in scholarly outputs, please cite according to the specifications here.
Citation
If you use this code in your research, please cite:
@misc{Benavides2025,
author = {I. F. Benavides},
title = {Alpha-Earth-Land-Cover-Classifier: AlphaEarth Foundation Model App v1.0},
year = {2025},
publisher = {Zenodo},
doi = {10.5281/zenodo.16911104},
url = {https://mybinder.org/v2/gh/FelipeBenavidesMz/Alpha-Earth-Land-Cover-Classifier/main?labpath=alpha_earth_app.ipynb}
}
And the companion paper:
@article{Benavides2025paper,
author = {Ivan Felipe Benavides and Justin Guthrie and John Edwin Arias and
Yeison Alberto Garces-Gomez and Angela Ines Guzman-Alvis and
Cristiam Victoriano Portilla-Cabrera and Somnath Mondal and
Andrew J. Allyn and Auroop R. Ganguly},
title = {What on Earth is AlphaEarth? Hierarchical Structure and Functional
Interpretability of Embeddings for Global Land Cover},
year = {2025},
note = {Preprint}
}
License
This project is licensed under the MIT License. See LICENSE for details.
Acknowledgments
This work uses:
- Google AlphaEarth Foundations embeddings via Google Earth Engine
- ESA WorldCover 2020 land cover product
- The interactive dashboard developed by Guthrie & Benavides (2025)