🧠 Self-Evolving Agent
March 25, 2026
Your AI agent reviews its own logs and proposes behavior improvements, weekly and automatically.
Stop making the same mistakes. Let your agent learn from them.
If this improved your agent's behavior, a ⭐ helps others find it.
⚡ Quick Start · What It Detects · Real Results · 🤖 Claude Code
Quick Install
curl -fsSL https://raw.githubusercontent.com/Ramsbaby/openclaw-self-evolving/main/install.sh | bash
Then point it at your logs:
# edit config.yaml: set agents_md and logs_dir
nano ~/.local/share/openclaw-self-evolving/config.yaml
What Is This?
Self-Evolving is a weekly agent improvement pipeline: no LLM, no API calls, no cloud.
It reads your agent's session logs, finds patterns of bad behavior (retry loops, broken rules, user frustration), and surfaces exact AGENTS.md or CLAUDE.md rule changes you can approve or reject in under a minute.
The agent doesn't change itself. You approve every change. That's the point.
The Problem
AI agents make the same mistakes repeatedly. Nobody has time to manually review thousands of conversation logs. The mistakes keep accumulating, silently.
Week 1: Agent calls git directly → you correct it
Week 2: Same mistake again
Week 3: Same mistake
Week 4: Still happening → 3 weeks wasted
Self-Evolving automates the review and brings you a short list of what to fix, every week.
Week 1: Agent calls git directly 4 times despite CLAUDE.md rule
Week 2: Same mistake, 3 more times
Week 3: Self-Evolving flags it → you approve the stronger rule → never happens again
How It Works
Your agent runs → Self-Evolving runs (weekly)
Inputs:
  • Session logs (~/.openclaw/logs/, ~/.claude/logs/)
  • AGENTS.md / CLAUDE.md (current rules)
1. analyze-behavior.sh
  • Scans JSONL session logs
  • Finds retry loops, errors, rule violations, frustration
  • Zero API calls: pure bash + python3
2. generate-proposal.sh
  • Builds Before/After diff proposals
  • Filters previously-rejected IDs
  • Posts to Discord / creates GitHub Issue
3. You review
  • React ✅ = apply + git commit
  • React ❌ = reject (stored, won't resurface)
  • React 1️⃣–5️⃣ = approve specific proposals
One pipeline, three scripts, zero ongoing cost.
Before & After
Before Self-Evolving
Agent behavior log (4 weeks):
[Week 1] exec: git push origin main ← CLAUDE.md says use git-sync.sh
[Week 2] exec: git push origin main ← same violation
[Week 3] exec: git pull ← same violation
[Week 4] exec: git commit -m "fix" ← still happening
User message: "why are you calling git directly AGAIN?"
Agent: "I apologize, I'll use git-sync.sh going forward"
[Week 5] exec: git push origin main ← happens again anyway
After One Self-Evolving Cycle
Proposal generated automatically:
## Git Rules
- Direct git commands prohibited.
+ Direct git commands prohibited. (includes git add / commit / push / pull / fetch)
+ ⚠️ CRITICAL: Violated 4× in 3 weeks. Use bash ~/openclaw/scripts/git-sync.sh for ALL git ops.
You react ✅ → rule applied → never happens again.
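The Before/After proposal above is essentially a unified diff of one section of the rules file. Here is a minimal sketch of how such a diff could be produced with Python's standard `difflib`. The function name `build_rule_diff` and the sample strings are illustrative assumptions, not the project's actual code.

```python
import difflib


def build_rule_diff(before: str, after: str, filename: str = "CLAUDE.md") -> str:
    """Return a unified diff between the current and the proposed rule text."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{filename} (current)",
        tofile=f"{filename} (proposed)",
        lineterm="",  # inputs have no trailing newlines, so headers should not either
    )
    return "\n".join(diff)


if __name__ == "__main__":
    current = "## Git Rules\n- Direct git commands prohibited."
    proposed = (
        "## Git Rules\n"
        "- Direct git commands prohibited. "
        "(includes git add / commit / push / pull / fetch)"
    )
    print(build_rule_diff(current, proposed))
```

Identical inputs produce an empty string, which is a convenient signal that there is nothing to propose.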
Real Example Output
🧬 Self-Evolving Agent Weekly Report v3.2
Analysis period: 2026-03-17 ~ 2026-03-24
Sessions analyzed: 23
Tool retry events: 47 ← newly detected
Improvement proposals: 3
---
### Proposal #1: `exec` tool consecutive retry pattern (8 sessions affected)
Severity: 🔴 HIGH | Type: AGENTS.md addition | Score: 92
> Evidence:
> Last 7 days: `exec` called 5+ times consecutively in 8 sessions
> Worst streak: 23 consecutive calls (no interruption)
> Total retry events: 47
> → 5+ consecutive identical tool calls = failure/retry loop signal
Before (current):
No rule for consecutive exec retries
After (proposed):
## ⚡ exec Consecutive Retry Prevention
Before retrying the same exec 3+ times:
1. First failure: report error to user
2. Second attempt: change approach (different flags/path)
3. Third failure: stop and ask for manual confirmation
---
### ✅ Approval
React to approve/reject:
| Reaction | Action |
|----------|--------|
| ✅ | Approve all → auto-apply + git commit |
| 1️⃣ – 5️⃣ | Approve only that proposal |
| ❌ | Reject (add comment → fed into next analysis) |
| ๐ | Request revision |
Works With
| Platform | Support | Notes |
|---|---|---|
| Claude Code | ✅ Full | CLAUDE.md / AGENTS.md rules |
| OpenClaw | ✅ Full | Native log format support |
| Any JSONL agent logs | Partial | Session logger compatible |
OpenClaw is one supported platform, not a requirement. Claude Code works out of the box.
⚡ Quick Start
Prerequisites
- python3 (built-in on macOS/Linux)
- bash 3.2+
- Agent logs in JSONL format (Claude Code: ~/.claude/logs/, OpenClaw: ~/.openclaw/agents/)
- 5 minutes
Option A: One-line install (recommended)
curl -fsSL https://raw.githubusercontent.com/Ramsbaby/openclaw-self-evolving/main/install.sh | bash
Edit ~/.local/share/openclaw-self-evolving/config.yaml to set your agents_md and logs_dir, then:
# Dry run first: see what would be detected, no changes made
bash scripts/generate-proposal.sh --dry-run
# Register weekly cron
bash scripts/setup-wizard.sh
Option B: Claude Code (git clone)
git clone https://github.com/Ramsbaby/openclaw-self-evolving.git
cd openclaw-self-evolving
cp config.yaml.example config.yaml
Edit config.yaml:
agents_md: ~/your-project/CLAUDE.md # path to your CLAUDE.md
logs_dir: ~/.claude/logs # Claude Code log path
# Dry run first: see what would be detected, no changes made
bash scripts/generate-proposal.sh --dry-run
# If it finds things worth fixing, run the full setup
bash scripts/setup-wizard.sh # registers weekly cron
Option C: OpenClaw (clawhub)
clawhub install openclaw-self-evolving
bash scripts/setup-wizard.sh
First Run Output
[09:00:01] === Self-Evolving Agent behavior analysis v3.2 ===
[09:00:01] Analysis period: last 7 days / max 50 sessions
[09:00:03] Sessions found: 23
[09:00:04] Analysis complete: 23 sessions, 2 complaints, 1 violation, 47 retry events
[09:00:04] === generate-proposal.sh v3.2 started ===
[09:00:04] Generating proposals...
[09:00:05] Proposal saved: data/proposals/proposal_20260324_090005.json
## 🧬 Self-Evolving Agent Weekly Report v3.2
...3 proposals ready for your review
🤖 Claude Code / AGENTS.md
Works directly with Claude Code's CLAUDE.md or AGENTS.md behavior rules.
Setup:
# config.yaml
agents_md: ~/your-project/CLAUDE.md # or AGENTS.md
logs_dir: ~/.claude/logs # or your log path
What it does:
- Scans your Claude Code session logs
- Detects patterns: rule violations, repeated mistakes, user frustration
- Proposes exact diffs to your CLAUDE.md
- You approve → it applies the change + git commits
Example โ detected violation in Claude Code logs:
[Session #312] User: "why are you calling git directly again?"
[Session #318] User: "you did it again"
[Session #325] exec: git commit -m "fix" ← CLAUDE.md violation flagged
Proposed fix:
## Git Rules
+ ⚠️ CRITICAL: Never run git directly. Violated 4× in 3 weeks.
- Direct git commands prohibited.
+ Direct git commands prohibited. (includes git add / commit / push)
Conflicts: report to user.
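One way to picture how a violation like the one above could be flagged: match each `exec` command in the session log against a pattern derived from the rule. The sketch below is an assumption-heavy illustration. The event fields (`tool`, `command`), the `PROHIBITED` table, and `find_violations` are hypothetical names, and the real log schema may differ.

```python
import json
import re

# Hypothetical mapping from a rule label to a pattern that signals a violation.
PROHIBITED = {
    "Direct git commands prohibited": re.compile(r"^\s*git\s+\w+"),
}


def find_violations(jsonl_lines):
    """Yield (rule, command) for every exec event that matches a prohibited pattern."""
    for line in jsonl_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed log lines instead of aborting the scan
        if not isinstance(event, dict) or event.get("tool") != "exec":
            continue
        command = str(event.get("command", ""))
        for rule, pattern in PROHIBITED.items():
            if pattern.search(command):
                yield rule, command
```

A wrapper call such as `bash ~/openclaw/scripts/git-sync.sh` does not match, because the pattern only fires when the command itself starts with `git`.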
What It Detects (6 Pattern Types)
1. Tool retry loops: Same tool called 5+ times consecutively. Agent confusion signal.
2. Repeating errors: Same error 5+ times across sessions. Unfixed bug, not a fluke.
3. User frustration: Keywords like "you said this already", "why again", "다시", "또" (Korean for "again"), with context filtering.
4. AGENTS.md / CLAUDE.md violations: Rules broken in actual exec tool calls, cross-referenced against your rules file.
5. Heavy sessions: Sessions hitting >85% context window. Tasks that should be sub-agents.
6. Unresolved learnings: High-priority items in .learnings/ not yet promoted to rules.
No LLM calls during analysis. No API fees. Pure local log processing.
See docs/DETECTION-PATTERNS.md for full details.
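To make the first pattern concrete, here is a minimal sketch of retry-loop detection: collapse the sequence of tool calls into runs and report every run that reaches the threshold. It assumes each JSONL line is an object with a `tool` field, which is a guess at the log schema. `find_retry_loops` is a hypothetical name, and the actual logic lives in `analyze-behavior.sh`.

```python
import json
from itertools import groupby


def find_retry_loops(jsonl_lines, threshold=5):
    """Return (tool, streak_length) for each run of `threshold`+ consecutive calls to one tool."""
    tools = []
    for line in jsonl_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if isinstance(event, dict) and "tool" in event:
            tools.append(event["tool"])

    loops = []
    # groupby yields one group per run of identical consecutive values
    for tool, run in groupby(tools):
        length = sum(1 for _ in run)
        if length >= threshold:
            loops.append((tool, length))
    return loops
```

The default of 5 mirrors the `tool_retry` threshold from the configuration. Lowering it catches shorter streaks at the cost of more false positives.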
Real Results
Single-user production instance (macOS, 4 weeks):
| Metric | Result |
|---|---|
| Patterns detected | 85 across 30 sessions |
| Proposals per week | 4 on average |
| Rule violations caught | 13 |
| False positive rate | ~8% (v5.0) |
| API cost | $0 |
Your results will vary; these are from one instance.
Approval Workflow
After analysis, a report is posted to your configured channel (Discord/Telegram). React to approve or reject:
| Reaction | Action |
|---|---|
| ✅ | Approve all proposals → auto-apply + git commit |
| 1️⃣–5️⃣ | Approve only that numbered proposal |
| ❌ | Reject (add comment with reason → fed into next analysis) |
| ๐ | Request revision |
Rejected proposal IDs are stored in data/rejected-proposals.json and are never proposed again.
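The rejection memory can be sketched as a simple filter: load the stored IDs, then drop any new proposal whose ID is already in that set. The file layout assumed here (a JSON array of ID strings) and the `id` key on each proposal are illustrative guesses. The real format of `data/rejected-proposals.json` may differ.

```python
import json
from pathlib import Path


def load_rejected_ids(path):
    """Read rejected proposal IDs; a missing or unreadable file means nothing is rejected."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return set()
    # Assumed layout: a flat JSON array of ID strings.
    return set(data) if isinstance(data, list) else set()


def filter_proposals(proposals, rejected_ids):
    """Drop proposals whose "id" was rejected in an earlier cycle."""
    return [p for p in proposals if p.get("id") not in rejected_ids]
```

Treating a missing file as an empty set keeps the first run from failing before any proposal has been rejected.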
Options
# Dry run (no changes)
bash scripts/generate-proposal.sh --dry-run
# Scan more history
ANALYSIS_DAYS=14 bash scripts/generate-proposal.sh
# Auto-create a GitHub Issue with the proposal report
bash scripts/generate-proposal.sh --create-issue
# Requires: gh CLI + gh auth login
# Specify repo explicitly
EVOLVING_GITHUB_REPO="owner/repo" bash scripts/generate-proposal.sh --create-issue
# Output a clean weekly digest (Markdown, top 3 proposals by score)
bash scripts/generate-proposal.sh --weekly-digest
--weekly-digest
Outputs a structured Markdown report of the top 3 proposals ranked by score (frequency × severity × impact). Designed for use in weekly summaries, Notion pages, or piped into a Discord message.
# Preview in terminal
bash scripts/generate-proposal.sh --weekly-digest
# Save to file
bash scripts/generate-proposal.sh --weekly-digest > weekly-report.md
# Post to Discord via webhook (jq wraps the digest in a JSON payload; curl reads it from stdin)
bash scripts/generate-proposal.sh --weekly-digest \
  | jq -Rs '{content: .}' \
  | curl -s -X POST "$DISCORD_WEBHOOK" \
      -H "Content-Type: application/json" \
      -d @-
Example digest output:
# Weekly Self-Evolution Report - 2026-03-24
## Top 3 Proposals
### 1. exec tool consecutive retry pattern
**Severity:** 🔴 HIGH | **Score:** 92 | **Estimated Impact:** medium-high
**Pattern detected:** exec called 5+ times consecutively in 8 sessions
**Proposed change:** Add retry-prevention rule to AGENTS.md
<details><summary>View diff</summary>
\`\`\`diff
- No rule for consecutive exec retries
+ ## exec Retry Prevention
+ Stop after 3 consecutive identical tool calls; report to user.
\`\`\`
</details>
---
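The digest ranks proposals by score (frequency × severity × impact) and keeps the top 3. The sketch below shows that ranking under stated assumptions. The `Proposal` fields and the 1 to 3 scales for severity and impact are hypothetical, and the source does not say how the real score (such as the 92 above) is scaled.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    title: str
    frequency: int  # how often the pattern occurred in the analysis window
    severity: int   # hypothetical scale: 1 (low) to 3 (high)
    impact: int     # hypothetical scale: 1 (low) to 3 (high)

    @property
    def score(self) -> int:
        # Plain product of the three factors, as described for --weekly-digest.
        return self.frequency * self.severity * self.impact


def top_proposals(proposals, limit=3):
    """Return the `limit` highest-scoring proposals, best first."""
    return sorted(proposals, key=lambda p: p.score, reverse=True)[:limit]
```

Because the score is a product, a single frequent but low-severity pattern can still outrank a rare high-severity one.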
Configuration
# config.yaml
analysis_days: 7 # Days of logs to scan
max_sessions: 50 # Max session files
# Paths (auto-detected for standard OpenClaw layout)
agents_dir: ~/.openclaw/agents
logs_dir: ~/.openclaw/logs
agents_md: ~/openclaw/AGENTS.md # ← change to your CLAUDE.md path
# Notifications
notify:
discord_channel: ""
telegram_chat_id: ""
# Detection thresholds
thresholds:
tool_retry: 5
error_repeat: 5
heavy_session: 85
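As an illustration of how `analysis_days` and `max_sessions` might bound a run, the sketch below picks the newest session files inside the analysis window. It assumes session logs are `*.jsonl` files directly under `logs_dir` and that file modification time is a fair proxy for session date. Both are assumptions, and `select_sessions` is a hypothetical helper.

```python
import time
from pathlib import Path


def select_sessions(logs_dir, analysis_days=7, max_sessions=50, now=None):
    """Return the newest session files modified within the analysis window."""
    now = time.time() if now is None else now
    cutoff = now - analysis_days * 86400  # window start, in seconds since the epoch
    recent = [
        p for p in Path(logs_dir).expanduser().glob("*.jsonl")
        if p.stat().st_mtime >= cutoff
    ]
    recent.sort(key=lambda p: p.stat().st_mtime, reverse=True)  # newest first
    return recent[:max_sessions]
```

The `now` parameter exists only so the window can be pinned in tests; normal callers can omit it.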
vs. Alternatives
| Feature | Capability Evolver | Self-Evolving |
|---|---|---|
| Silent modification | ⚠️ Yes (on by default) | Never |
| Human approval | Optional (off by default) | Required. Always. |
| API calls per run | Multiple LLM calls | Zero |
| False positive rate | ~22% (self-reported) | ~8% (measured) |
| Rejection memory | None | Stored + fed back |
Pairs Well With
openclaw-self-healing: Crash recovery + auto-repair. Self-healing fires on crash. Self-Evolving runs weekly to fix what causes the crashes, promoting error patterns directly into AGENTS.md rules.
openclaw-memorybox: Memory hygiene CLI. Keeps MEMORY.md lean so your agent doesn't crash from context overflow.
File Structure
openclaw-self-evolving/
├── scripts/
│   ├── analyze-behavior.sh      # Log analysis engine (JSONL-aware)
│   ├── session-logger.sh        # Structured JSONL event logger
│   ├── generate-proposal.sh     # Pipeline orchestrator
│   ├── setup-wizard.sh          # Interactive setup + cron registration
│   └── lib/config-loader.sh
├── .github/
│   └── workflows/
│       └── ci.yml               # ShellCheck + flake8 lint on push/PR
├── docs/
│   ├── assets/
│   │   ├── hero.svg             # Hero banner
│   │   └── loop.svg             # Self-improvement loop diagram
│   ├── DETECTION-PATTERNS.md
│   └── QUICKSTART.md
├── test/fixtures/               # Sample JSONL for contributor testing
├── data/
│   ├── proposals/
│   └── rejected-proposals.json
└── config.yaml.example
OpenClaw Ecosystem
| Project | Role |
|---|---|
| openclaw-self-evolving ← you are here | Weekly log review → propose AGENTS.md/CLAUDE.md improvements |
| openclaw-self-healing | 4-tier autonomous crash recovery |
| openclaw-memorybox | Memory hygiene CLI: prevents bloat crashes |
| jarvis | 24/7 AI ops system using Claude Max |
Contributing
PRs welcome, especially:
- New detection patterns for analyze-behavior.sh
- Better false-positive filtering
- Support for other log formats (currently OpenClaw + Claude Code)
- Test fixtures in test/fixtures/
License
MIT. Do whatever you want, just don't remove the "human approval required" part. That part matters.