snakebite
March 24, 2026 · View on GitHub
snakebite
PyPI supply chain attack detector. Scans Python packages for malicious patterns (credential theft, code obfuscation, persistence mechanisms) and uses an LLM to filter out false positives.
___ _ _ _ _
/ __|_ _ ___| |_____| |__(_) |_ ___
\__ \ ' \/ _ \ / / -_) '_ \ | _/ -_)
|___/_||_\__,_|_\_\___|_.__/_|\__\___|

Why
On March 24 2026, litellm versions 1.82.7 and 1.82.8 were published to PyPI with a credential-stealing payload. A malicious .pth file executed automatically on every Python process startup — no import needed — and exfiltrated SSH keys, cloud credentials, crypto wallets, and Kubernetes secrets to an attacker-controlled domain. The package had 97 million monthly downloads.
Pure heuristic scanners flag patterns like os.environ or subprocess but drown you in false positives — legitimate packages use these all the time. snakebite solves this with a two-stage approach:
- 14 heuristic rules tuned to real attack patterns (not generic code smell)
- LLM-powered analysis that reads the code in context and filters out legitimate usage
The LLM knows that os.environ.get("AIOHTTP_NO_EXTENSIONS") is a build toggle, not credential theft. That subprocess.call([editor]) in an editor package is normal. That base64.b64decode in a test file is test data. You get signal, not noise.
Install
git clone https://github.com/pinperepette/snakebite.git
cd snakebite
Zero external dependencies. Standard library only. Python 3.8+.
Two modes
local — scan what's installed on your machine
# Scan everything
python3 snakebite.py local
# Scan specific packages
python3 snakebite.py local flask requests litellm
Downloads each package from PyPI, extracts sdist + wheel, scans every .py, .pth, setup.py, __init__.py. If an LLM backend is configured, suspicious findings get analyzed before reporting.
feed — monitor PyPI in real time
# Single scan of latest packages
python3 snakebite.py feed
# Continuous monitoring every 60 seconds
python3 snakebite.py feed --loop 60
Watches PyPI's RSS feeds for new and updated packages, downloads and scans each one. Leave it running to catch malicious packages as they're published — the same window attackers exploit before takedown.
LLM backends
When you run snakebite without -m, it asks what you want to use:
Select LLM backend for false positive filtering:
1) claude-code Claude Code CLI (subscription)
2) claude Anthropic API (ANTHROPIC_API_KEY)
3) chatgpt OpenAI API (OPENAI_API_KEY)
4) ollama Ollama local model
5) none Heuristics only, no LLM
Or specify directly:
python3 snakebite.py local -m claude-code # Claude Code CLI (subscription)
python3 snakebite.py local -m claude # Anthropic API
python3 snakebite.py local -m chatgpt # OpenAI gpt-4o
python3 snakebite.py local -m chatgpt:gpt-4o-mini # OpenAI specific model
python3 snakebite.py local -m ollama:qwen2.5:32b # Ollama local
python3 snakebite.py local --no-llm # No LLM, heuristics only
API keys
# Anthropic (for -m claude)
export ANTHROPIC_API_KEY="sk-ant-..."
# OpenAI (for -m chatgpt)
export OPENAI_API_KEY="sk-..."
Add the export lines to ~/.zshrc or ~/.bashrc to persist them.
claude-code uses your Claude Code subscription via the claude CLI. No API key needed.
ollama runs entirely local. Install Ollama, pull a model (ollama pull qwen2.5:32b), done.
Architecture
┌─────────────┐ ┌──────────────┐ ┌────────────────┐ ┌─────────────┐
│ PyPI API │────>│ Download │────>│ Heuristic │────>│ LLM filter │
│ / RSS feed │ │ & extract │ │ engine │ │ (optional) │
└─────────────┘ │ sdist+wheel │ │ 14 rules │ │ │
└──────────────┘ └───────┬────────┘ └──────┬──────┘
│ │
no hits? ─> CLEAN verdict:
│ TRUE/FALSE POSITIVE
hits found │
└──────────────────>│
v
┌──────────┐
│ Output │
└──────────┘
- Fetch — download sdist and/or wheel from PyPI (or get new packages via RSS)
- Extract — unpack safely (path traversal protection, symlink filtering)
- Scan — 14 regex-based heuristic rules against
.py,.pth,setup.py, shell scripts - Filter — only if hits found: send code snippets + context to LLM for verdict
- Report — CLEAN / LOW / MEDIUM / HIGH / CRITICAL with explanations
What it detects
| Rule | Severity | Pattern |
|---|---|---|
PTH_EXEC | CRITICAL | .pth files with executable code (the litellm vector) |
BASE64_NESTED | CRITICAL | Nested base64 decoding (payload obfuscation) |
EXEC_ENCODED | CRITICAL | exec()/eval() with encoded payloads |
SETUP_NETWORK | CRITICAL | Network calls in setup.py / __init__.py / .pth |
CRED_HARVEST | CRITICAL | Accessing SSH keys, AWS/GCP/Azure creds, kubeconfig |
CRYPTO_WALLET | CRITICAL | Accessing Bitcoin/Ethereum/Solana wallet files |
K8S_SECRETS | CRITICAL | Reading Kubernetes secrets or service account tokens |
PERSISTENCE | CRITICAL | systemd/cron/launchd/shell rc persistence |
SETUP_SUBPROCESS | HIGH | subprocess/os.system in setup/init files |
ENV_DUMP | HIGH | Bulk environment variable collection in install context |
DNS_EXFIL | HIGH | Cloud metadata endpoints (AWS IMDS, GCP, Alibaba) |
OBFUSCATION | HIGH | chr() chains, reversed exec, dynamic base64 imports |
ARCHIVE_EXFIL | HIGH | Archive creation + HTTP POST (data exfiltration) |
OPENSSL_ENCRYPT | HIGH | OpenSSL encryption (exfil preparation) |
Real-world detection: litellm 1.82.7
snakebite would catch the litellm supply chain attack with three CRITICAL findings:
======================================================================
[CRITICAL] litellm 1.82.7
LLM: Malicious .pth file executing obfuscated credential stealer at Python startup
======================================================================
[CRITICAL] PTH_EXEC in litellm_init.pth:1
import os, subprocess, sys; subprocess.Popen([sys.executable, "-c", "import base64; exec(...)"])
[CRITICAL] CRED_HARVEST in litellm_init.pth:1
.ssh/id_rsa, .aws/credentials, .kube/config
[CRITICAL] BASE64_NESTED in litellm_init.pth:1
exec(base64.b64decode(base64.b64decode(...)))
No LLM needed — the heuristics alone flag this as CRITICAL. The LLM confirms it's a true positive and adds context about the exfiltration mechanism.
Threat model
snakebite detects:
- Supply chain attacks in Python packages published to PyPI
- Credential exfiltration (SSH, cloud, database, crypto) at install or import time
- Code obfuscation used to hide malicious payloads
- Persistence mechanisms embedded in packages (systemd, cron, launchd)
.pthfile abuse for pre-import code execution- Kubernetes lateral movement from compromised packages
snakebite does not detect:
- Malicious compiled extensions (
.so,.pyd,.dll) — binary analysis is out of scope - Runtime-only attacks triggered by specific input or conditions
- Logic bombs without static indicators
- Typosquatting or dependency confusion (use pip-audit for that)
- Attacks outside the Python/PyPI ecosystem
- Vulnerabilities in legitimate code (use bandit or safety)
LLM usage and privacy
When a package triggers heuristic rules, snakebite sends only the suspicious code snippets (a few lines of context around each hit) to the selected LLM backend. It does not send:
- Full package source code
- Your system information
- Your credentials or environment variables
- Package contents that passed heuristic checks
If this is a concern:
- Use
ollama— everything stays on your machine - Use
--no-llm— no external calls at all, pure heuristic analysis - Review what gets sent: run with
--verboseto see the exact code excerpts
Performance
| Mode | Speed | Notes |
|---|---|---|
Heuristics only (--no-llm) | ~1-2s per package | Download + extract + regex scan |
| With LLM | ~5-15s per package | Depends on backend and model |
RSS feed (--loop 60) | ~40 packages/cycle | PyPI publishes ~40 packages per RSS fetch |
The LLM is the bottleneck. claude-code and API backends (claude, chatgpt) are faster than local models. Ollama speed depends on your hardware and model size.
Heuristic-only mode is fast enough for full local scans (hundreds of packages in minutes).
Comparison
| Tool | Approach | LLM filtering | Supply chain focus | False positives |
|---|---|---|---|---|
| snakebite | Heuristic + LLM | yes | yes | low (LLM filters) |
| bandit | AST analysis | no | no (general code quality) | high |
| pip-audit | Vulnerability DB | no | partial (known CVEs only) | low |
| safety | Vulnerability DB | no | partial (known CVEs only) | low |
| packj | Heuristic | no | yes | medium-high |
pip-audit and safety catch known vulnerabilities. snakebite catches unknown malicious code — the zero-day supply chain attack that hasn't been reported yet.
CI integration
Scan dependencies before deploy:
# In your CI pipeline
pip install -r requirements.txt
python3 snakebite.py local --no-llm
With LLM (set the API key in CI secrets):
export ANTHROPIC_API_KEY="${{ secrets.ANTHROPIC_API_KEY }}"
python3 snakebite.py local -m claude
Scan a requirements file without installing:
cat requirements.txt | cut -d'=' -f1 | xargs python3 snakebite.py local
GitHub Actions example:
- name: Scan dependencies for supply chain attacks
run: |
pip install -r requirements.txt
python3 snakebite.py local --no-llm
Options
-m, --model LLM backend (claude-code, claude, chatgpt, ollama:<model>)
--no-llm Heuristics only, skip LLM analysis
--log FILE Save suspicious findings to a JSON file
-v, --verbose Show clean packages and false positive details
--version Show version
Feed mode:
--loop N Repeat scan every N seconds (default: single run)
Alert log
Use --log to save all suspicious findings to a JSON file:
# Monitor feed and log alerts
python3 snakebite.py feed --loop 60 --log alerts.json -m claude-code
# Scan local packages and log alerts
python3 snakebite.py local --log alerts.json -m claude-code
Each alert is appended to the file with full details:
{
"timestamp": "2026-03-24T18:29:33.450991+00:00",
"package": "stats-helpers",
"version": "1.0.0",
"threat_level": "CRITICAL",
"summary": "Litecoin private key stealer",
"pypi_url": "https://pypi.org/project/stats-helpers/",
"hits": [
{
"rule": "SETUP_NETWORK",
"severity": "CRITICAL",
"file": "stats_helpers/__init__.py",
"line_no": 36,
"line": "response = requests.post("
}
],
"llm_findings": [...],
"reviewed": false
}
The "reviewed": false field lets you track which alerts you've already analyzed. Leave it running with --loop and check the file periodically for new findings.
Example output
Clean (false positive filtered by LLM):
18:30:54 OK astroid 3.3.10 - CLEAN (LLM: Benign setuptools namespace package .pth files)
18:31:22 OK babel 2.16.0 - CLEAN (LLM: Legitimate CLDR data import in setup.py, not executed during install)
18:31:39 OK banks 2.2.0 - CLEAN (LLM: Benign test code verifying base64 encoding of image data URLs)
Suspicious:
======================================================================
[CRITICAL] evil-package 0.1.0
LLM: Credential stealer targeting SSH keys and cloud provider tokens
======================================================================
! CRED_HARVEST: Reads SSH private keys and AWS credentials, concatenates into single payload
! SETUP_NETWORK: POSTs collected credentials to external domain during pip install
! ENV_DUMP: Captures all environment variables including API tokens
License
MIT