๐Ÿง  Self-Evolving Agent

March 25, 2026 ยท View on GitHub

๐Ÿง  Self-Evolving Agent

Your AI agent reviews its own logs and proposes behavior improvements โ€” weekly, automatically.

Stop making the same mistakes. Let your agent learn from them.

GitHub stars CI Lint License: MIT Platform: macOS/Linux No Silent Modification Zero API Cost False Positive Rate

If this improved your agent's behavior, a โญ helps others find it.

โšก Quick Start ยท ๐Ÿ” What It Detects ยท ๐Ÿ“‹ Real Results ยท ๐Ÿค– Claude Code

openclaw-self-evolving


Quick Install

curl -fsSL https://raw.githubusercontent.com/Ramsbaby/openclaw-self-evolving/main/install.sh | bash

Then point it at your logs:

# edit config.yaml โ€” set agents_md and logs_dir
nano ~/.local/share/openclaw-self-evolving/config.yaml

What Is This?

Self-Evolving is a weekly agent improvement pipeline โ€” no LLM, no API calls, no cloud.

It reads your agent's session logs, finds patterns of bad behavior (retry loops, broken rules, user frustration), and surfaces exact AGENTS.md or CLAUDE.md rule changes you can approve or reject in under a minute.

The agent doesn't change itself. You approve every change. That's the point.


The Problem

AI agents make the same mistakes repeatedly. Nobody has time to manually review thousands of conversation logs. The mistakes keep accumulating, silently.

Week 1: Agent calls git directly โ€” you correct it
Week 2: Same mistake again
Week 3: Same mistake
Week 4: Still happening โ€” 3 weeks wasted

Self-Evolving automates the review โ€” and brings you a short list of what to fix, every week.

Week 1: Agent calls git directly 4 times despite CLAUDE.md rule
Week 2: Same mistake, 3 more times
Week 3: Self-Evolving flags it โ†’ you approve the stronger rule โ†’ never happens again

How It Works

Your agent runs                    Self-Evolving runs (weekly)
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€                    โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

Session logs         โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–บ   1. analyze-behavior.sh
~/.openclaw/logs/                    โ€ข Scans JSONL session logs
~/.claude/logs/                      โ€ข Finds retry loops, errors,
                                       rule violations, frustration
AGENTS.md / CLAUDE.md โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–บ    โ€ข Zero API calls โ€” pure bash + python3
  (current rules)
                                  2. generate-proposal.sh
                                     โ€ข Builds Before/After diff proposals
                                     โ€ข Filters previously-rejected IDs
                                     โ€ข Posts to Discord / creates GitHub Issue

                                  3. You review
                     โ—„โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€     โ€ข React โœ… = apply + git commit
                                     โ€ข React โŒ = reject (stored, won't resurface)
                                     โ€ข React 1๏ธโƒฃโ€“5๏ธโƒฃ = approve specific proposals

One pipeline, three scripts, zero ongoing cost.

Self-improvement loop


Before & After

Before Self-Evolving

Agent behavior log (4 weeks):
  [Week 1] exec: git push origin main      โ† CLAUDE.md says use git-sync.sh
  [Week 2] exec: git push origin main      โ† same violation
  [Week 3] exec: git pull                  โ† same violation
  [Week 4] exec: git commit -m "fix"       โ† still happening

User message: "why are you calling git directly AGAIN?"
Agent: "I apologize, I'll use git-sync.sh going forward"
[Week 5] exec: git push origin main        โ† happens again anyway

After One Self-Evolving Cycle

Proposal generated automatically:

## Git Rules
- Direct git commands prohibited.
+ Direct git commands prohibited. (includes git add / commit / push / pull / fetch)
+ โš ๏ธ CRITICAL โ€” Violated 4ร— in 3 weeks. Use bash ~/openclaw/scripts/git-sync.sh for ALL git ops.

You react โœ… โ†’ rule applied โ†’ never happens again.


Real Example Output

๐Ÿงฌ Self-Evolving Agent Weekly Report v3.2

Analysis period: 2026-03-17 ~ 2026-03-24
Sessions analyzed: 23
Tool retry events: 47 โ† newly detected
Improvement proposals: 3

---

### ๐Ÿ” Proposal #1: `exec` tool consecutive retry pattern (8 sessions affected)

Severity: ๐Ÿ”ด HIGH  |  Type: ๐Ÿ“ AGENTS.md addition  |  Score: 92

> Evidence:
> Last 7 days: `exec` called 5+ times consecutively in 8 sessions
> Worst streak: 23 consecutive calls (no interruption)
> Total retry events: 47
> โ†’ 5+ consecutive identical tool calls = failure/retry loop signal

Before (current):
  No rule for consecutive exec retries

After (proposed):
  ## โšก exec Consecutive Retry Prevention
  Before retrying the same exec 3+ times:
  1. First failure: report error to user
  2. Second attempt: change approach (different flags/path)
  3. Third failure: stop and ask for manual confirmation

---

### โœ… Approval

React to approve/reject:
| Reaction | Action |
|----------|--------|
| โœ… | Approve all โ†’ auto-apply + git commit |
| 1๏ธโƒฃ โ€“ 5๏ธโƒฃ | Approve only that proposal |
| โŒ | Reject (add comment โ†’ fed into next analysis) |
| ๐Ÿ”„ | Request revision |

Works With

PlatformSupportNotes
Claude Codeโœ… FullCLAUDE.md / AGENTS.md rules
OpenClawโœ… FullNative log format support
Any JSONL agent logsโœ… PartialSession logger compatible

OpenClaw is one supported platform โ€” not a requirement. Claude Code works out of the box.


โšก Quick Start

Prerequisites

  • python3 (built-in on macOS/Linux)
  • bash 3.2+
  • Agent logs in JSONL format (Claude Code: ~/.claude/logs/, OpenClaw: ~/.openclaw/agents/)
  • 5 minutes
curl -fsSL https://raw.githubusercontent.com/Ramsbaby/openclaw-self-evolving/main/install.sh | bash

Edit ~/.local/share/openclaw-self-evolving/config.yaml to set your agents_md and logs_dir, then:

# Dry run first โ€” see what would be detected, no changes made
bash scripts/generate-proposal.sh --dry-run

# Register weekly cron
bash scripts/setup-wizard.sh

Option B: Claude Code (git clone)

git clone https://github.com/Ramsbaby/openclaw-self-evolving.git
cd openclaw-self-evolving
cp config.yaml.example config.yaml

Edit config.yaml:

agents_md: ~/your-project/CLAUDE.md    # path to your CLAUDE.md
logs_dir: ~/.claude/logs               # Claude Code log path
# Dry run first โ€” see what would be detected, no changes made
bash scripts/generate-proposal.sh --dry-run

# If it finds things worth fixing, run the full setup
bash scripts/setup-wizard.sh   # registers weekly cron

Option C: OpenClaw (clawhub)

clawhub install openclaw-self-evolving
bash scripts/setup-wizard.sh

First Run Output

[09:00:01] === Self-Evolving Agent behavior analysis v3.2 ===
[09:00:01] Analysis period: last 7 days / max 50 sessions
[09:00:03] Sessions found: 23
[09:00:04] Analysis complete: 23 sessions, 2 complaints, 1 violation, 47 retry events
[09:00:04] === generate-proposal.sh v3.2 started ===
[09:00:04] Generating proposals...
[09:00:05] Proposal saved: data/proposals/proposal_20260324_090005.json

## ๐Ÿงฌ Self-Evolving Agent Weekly Report v3.2
...3 proposals ready for your review

๐Ÿค– Claude Code / AGENTS.md

Works directly with Claude Code's CLAUDE.md or AGENTS.md behavior rules.

Setup:

# config.yaml
agents_md: ~/your-project/CLAUDE.md   # or AGENTS.md
logs_dir: ~/.claude/logs               # or your log path

What it does:

  1. Scans your Claude Code session logs
  2. Detects patterns: rule violations, repeated mistakes, user frustration
  3. Proposes exact diffs to your CLAUDE.md
  4. You approve โ†’ it applies the change + git commits

Example โ€” detected violation in Claude Code logs:

[Session #312] User: "why are you calling git directly again?"
[Session #318] User: "you did it again"
[Session #325] exec: git commit -m "fix"  โ† CLAUDE.md violation flagged

Proposed fix:

## Git Rules
+ โš ๏ธ CRITICAL โ€” Never run git directly. Violated 4ร— in 3 weeks.
- Direct git commands prohibited.
+ Direct git commands prohibited. (includes git add / commit / push)
  Conflicts: report to user.

๐Ÿ” What It Detects (6 Pattern Types)

1. Tool retry loops โ€” Same tool called 5+ times consecutively. Agent confusion signal.

2. Repeating errors โ€” Same error 5+ times across sessions. Unfixed bug, not a fluke.

3. User frustration โ€” Keywords like "you said this already", "why again", "๋‹ค์‹œ", "๋˜" โ€” with context filtering.

4. AGENTS.md / CLAUDE.md violations โ€” Rules broken in actual exec tool calls, cross-referenced against your rules file.

5. Heavy sessions โ€” Sessions hitting >85% context window. Tasks that should be sub-agents.

6. Unresolved learnings โ€” High-priority items in .learnings/ not yet promoted to rules.

No LLM calls during analysis. No API fees. Pure local log processing.

See docs/DETECTION-PATTERNS.md for full details.


๐Ÿ“‹ Real Results

Single-user production instance (macOS, 4 weeks):

MetricResult
Patterns detected85 across 30 sessions
Proposals per week4 on average
Rule violations caught13
False positive rate~8% (v5.0)
API cost$0

Your results will vary โ€” these are from one instance.


Approval Workflow

After analysis, a report is posted to your configured channel (Discord/Telegram). React to approve or reject:

ReactionAction
โœ…Approve all proposals โ†’ auto-apply + git commit
1๏ธโƒฃโ€“5๏ธโƒฃApprove only that numbered proposal
โŒReject (add comment with reason โ†’ fed into next analysis)
๐Ÿ”„Request revision

Rejected proposal IDs stored in data/rejected-proposals.json โ€” never proposed again.


Options

# Dry run (no changes)
bash scripts/generate-proposal.sh --dry-run

# Scan more history
ANALYSIS_DAYS=14 bash scripts/generate-proposal.sh

# Auto-create a GitHub Issue with the proposal report
bash scripts/generate-proposal.sh --create-issue
# Requires: gh CLI + gh auth login

# Specify repo explicitly
EVOLVING_GITHUB_REPO="owner/repo" bash scripts/generate-proposal.sh --create-issue

# Output a clean weekly digest (Markdown, top 3 proposals by score)
bash scripts/generate-proposal.sh --weekly-digest

--weekly-digest

Outputs a structured Markdown report of the top 3 proposals ranked by score (frequency ร— severity ร— impact). Designed for use in weekly summaries, Notion pages, or piped into a Discord message.

# Preview in terminal
bash scripts/generate-proposal.sh --weekly-digest

# Save to file
bash scripts/generate-proposal.sh --weekly-digest > weekly-report.md

# Post to Discord via webhook
bash scripts/generate-proposal.sh --weekly-digest \
  | curl -s -X POST "$DISCORD_WEBHOOK" \
    -H "Content-Type: application/json" \
    -d "{\"content\": $(cat weekly-report.md | jq -Rs .)}"

Example digest output:

# ๐Ÿ”„ Weekly Self-Evolution Report โ€” 2026-03-24

## Top 3 Proposals

### 1. exec tool consecutive retry pattern
**Severity:** ๐Ÿ”ด HIGH | **Score:** 92 | **Estimated Impact:** medium-high

**Pattern detected:** exec called 5+ times consecutively in 8 sessions
**Proposed change:** Add retry-prevention rule to AGENTS.md

<details><summary>View diff</summary>

\`\`\`diff
- No rule for consecutive exec retries
+ ## exec Retry Prevention
+ Stop after 3 consecutive identical tool calls; report to user.
\`\`\`

</details>

---

Configuration

# config.yaml
analysis_days: 7          # Days of logs to scan
max_sessions: 50          # Max session files

# Paths (auto-detected for standard OpenClaw layout)
agents_dir: ~/.openclaw/agents
logs_dir: ~/.openclaw/logs
agents_md: ~/openclaw/AGENTS.md   # โ† change to your CLAUDE.md path

# Notifications
notify:
  discord_channel: ""
  telegram_chat_id: ""

# Detection thresholds
thresholds:
  tool_retry: 5
  error_repeat: 5
  heavy_session: 85

vs. Alternatives

FeatureCapability EvolverSelf-Evolving
Silent modificationโš ๏ธ Yes (on by default)โŒ Never
Human approvalOptional (off by default)Required. Always.
API calls per runMultiple LLM callsZero
False positive rate~22% (self-reported)~8% (measured)
Rejection memoryNoneStored + fed back

Pairs Well With

openclaw-self-healing โ€” Crash recovery + auto-repair. Self-healing fires on crash. Self-Evolving runs weekly to fix what causes the crashes โ€” promoting error patterns directly into AGENTS.md rules.

openclaw-memorybox โ€” Memory hygiene CLI. Keeps MEMORY.md lean so your agent doesn't crash from context overflow.


File Structure

openclaw-self-evolving/
โ”œโ”€โ”€ scripts/
โ”‚   โ”œโ”€โ”€ analyze-behavior.sh      # Log analysis engine (JSONL-aware)
โ”‚   โ”œโ”€โ”€ session-logger.sh        # Structured JSONL event logger
โ”‚   โ”œโ”€โ”€ generate-proposal.sh     # Pipeline orchestrator
โ”‚   โ”œโ”€โ”€ setup-wizard.sh          # Interactive setup + cron registration
โ”‚   โ””โ”€โ”€ lib/config-loader.sh
โ”œโ”€โ”€ .github/
โ”‚   โ””โ”€โ”€ workflows/
โ”‚       โ””โ”€โ”€ ci.yml               # ShellCheck + flake8 lint on push/PR
โ”œโ”€โ”€ docs/
โ”‚   โ”œโ”€โ”€ assets/
โ”‚   โ”‚   โ”œโ”€โ”€ hero.svg             # Hero banner
โ”‚   โ”‚   โ””โ”€โ”€ loop.svg             # Self-improvement loop diagram
โ”‚   โ”œโ”€โ”€ DETECTION-PATTERNS.md
โ”‚   โ””โ”€โ”€ QUICKSTART.md
โ”œโ”€โ”€ test/fixtures/               # Sample JSONL for contributor testing
โ”œโ”€โ”€ data/
โ”‚   โ”œโ”€โ”€ proposals/
โ”‚   โ””โ”€โ”€ rejected-proposals.json
โ””โ”€โ”€ config.yaml.example

๐ŸŒ OpenClaw Ecosystem

ProjectRole
openclaw-self-evolving โ† you are hereWeekly log review โ†’ propose AGENTS.md/CLAUDE.md improvements
openclaw-self-healing4-tier autonomous crash recovery
openclaw-memoryboxMemory hygiene CLI โ€” prevents bloat crashes
jarvis24/7 AI ops system using Claude Max

Contributing

PRs welcome โ€” especially:

  • New detection patterns for analyze-behavior.sh
  • Better false-positive filtering
  • Support for other log formats (currently OpenClaw + Claude Code)
  • Test fixtures in test/fixtures/

License

MIT โ€” do whatever you want, just don't remove the "human approval required" part. That part matters.


Made with ๐Ÿง  by @ramsbaby

"The best agent is one that learns from its mistakes."

Star History Chart