🦅 Eagle Eye

June 1, 2026

Narrow 100+ skills down to the right 5 using deterministic triggers, fuzzy matching, semantic search, and rank fusion. Zero core modification.

Chinese documentation (README_CN.md)


The Problem

Hermes Agent loads every installed skill into the system prompt as a flat list. When you have 50+ skills:

  • The LLM picks wrong: overlapping descriptions confuse selection
  • You burn tokens: 5,000–10,000 tokens per turn just for the skill list
  • Rarely-used skills become invisible: they are buried at the bottom of a long list

The Solution

Eagle Eye is a non-invasive plugin that acts as a pre-filter. Before each API call, it narrows the skill list to the most relevant candidates (5 by default) and injects them as a lightweight hint.

User Query
    │
    ▼
┌─────────────────────────────────────────────┐
│  L1: Hard Triggers                          │
│  Deterministic keyword matching (3-tier)    │
│  Hit → Inject full SKILL.md instantly       │
│  Miss ↓                                     │
├─────────────────────────────────────────────┤
│  L2: FTS5 BM25       (text similarity)      │
│  L3: Synonym Dict    (domain knowledge)     │
│  L4: Dense Embedding (semantic similarity)  │
│  L5: RRF Fusion      (rank combination)     │
│                                             │
│  Score ≥ threshold → Inject skill hints     │
│  Score < threshold → Silent (LLM decides)   │
└─────────────────────────────────────────────┘
    │
    ▼
LLM Final Decision
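
The rank-combination step (L5) can be illustrated with standard Reciprocal Rank Fusion. This is a minimal sketch, not the code in src/skill_retriever.py: the function name `rrf_fuse`, the constant `k=60` (a value commonly used for RRF), and the sample skill names are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=5):
    """Fuse several ranked lists of skill names with Reciprocal Rank Fusion.

    Each skill scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, skill in enumerate(ranking, start=1):
            scores[skill] += 1.0 / (k + rank)
    # Sort by fused score (descending), then by name for a stable order.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_k]


# Hypothetical per-layer results for a single query.
bm25_hits = ["git-helper", "debugger", "code-review"]
synonym_hits = ["debugger", "log-analyzer"]
embedding_hits = ["debugger", "code-review", "git-helper"]

fused = rrf_fuse([bm25_hits, synonym_hits, embedding_hits])
```

Because RRF uses only rank positions, the layers do not need comparable score scales: a BM25 score and a cosine similarity never have to be normalized against each other.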

Key Design Decisions

1. "Not matching" is a valid result

Not every query needs a skill. "What should I eat for dinner?" is best answered from the LLM's general knowledge, not by loading a restaurant-finder skill. Eagle Eye's confidence gate prevents forced matches.
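
A confidence gate of this kind can be sketched as follows. The function name `confidence_gate`, the threshold value, and the choice to gate on the best candidate's score are all assumptions; the actual gate may be defined differently.

```python
def confidence_gate(candidates, threshold, top_k=5):
    """Return up to top_k (skill, score) pairs, or [] to stay silent.

    Assumption: the gate looks only at the best score. If even the best
    candidate falls below the threshold, nothing is injected.
    """
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    if not ranked or ranked[0][1] < threshold:
        return []
    return ranked[:top_k]


# Hypothetical scores: a weak match is dropped, a strong one passes.
weak = confidence_gate([("restaurant-finder", 0.012)], threshold=0.03)
strong = confidence_gate([("git-helper", 0.032), ("debugger", 0.049)], threshold=0.03)
```

An empty result is a normal outcome here: the caller injects nothing and the LLM answers on its own.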

2. Deterministic first, probabilistic second

L1 (hard triggers) is deterministic: if the user types "debug", the debugging skill loads immediately, with no probability involved. L2–L5 handle the long tail, where fuzzy and semantic matching add value.
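
As a rough illustration of first-match keyword triggering, the sketch below walks an ordered list of (keyword, skill-name) tuples. The entries, the helper name `match_hard_trigger`, and the case-insensitive substring test are assumptions; the 3-tier scheme used by the real L1 is not reproduced here.

```python
# Hypothetical entries; the real list lives in src/skill_retriever.py.
_HARD_TRIGGERS = [
    ("debug memory leak", "memory-profiler"),  # more specific phrase first
    ("debug", "debugger"),
]


def match_hard_trigger(query, triggers=_HARD_TRIGGERS):
    """Return the skill for the first trigger keyword found in the query, else None."""
    text = query.lower()
    for keyword, skill in triggers:
        if keyword.lower() in text:
            return skill
    return None


specific = match_hard_trigger("Help me debug memory leak in the worker")
general = match_hard_trigger("Please DEBUG this function")
miss = match_hard_trigger("What should I eat for dinner?")
```

Ordering matters because the first hit wins: if the general "debug" entry came first, the more specific phrase would never be reached.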

3. Hints, not decisions

L2–L5 return candidates, not conclusions. The LLM retains final authority to load a skill, combine multiple skills, or ignore the hint entirely. The retrieval system doesn't override the LLM's judgment.
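
One way to phrase such a hint so that it reads as a suggestion rather than an instruction is sketched below. The function name `build_hint` and the wording of the hint are hypothetical and are not taken from the plugin.

```python
def build_hint(skill_names):
    """Format candidate skills as an optional hint; return "" when there are none."""
    if not skill_names:
        return ""
    lines = ["Possibly relevant skills (optional; use, combine, or ignore):"]
    lines += [f"- {name}" for name in skill_names]
    return "\n".join(lines)


hint = build_hint(["debugger", "git-helper"])
```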

4. Each layer fails independently

If sentence-transformers isn't installed, L4 degrades gracefully and L1+L2+L3 still work. If jieba is missing, L1+L4 still work. A missing dependency never crashes the system; it always falls back to a working subset.
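
The fallback rules above can be expressed as a small availability check. This is a sketch under assumptions: the helper names are hypothetical, and treating numpy as a prerequisite for L4 is inferred from the Dependencies table rather than confirmed by the code.

```python
import importlib.util


def is_installed(module_name):
    """True if the module can be found, without actually importing it."""
    return importlib.util.find_spec(module_name) is not None


def available_layers(installed=is_installed):
    """List the retrieval layers that can run with the packages present."""
    layers = ["L1"]  # hard triggers need no optional packages
    if installed("jieba"):
        layers += ["L2", "L3"]  # tokenization-dependent layers
    if installed("sentence_transformers") and installed("numpy"):
        layers.append("L4")  # dense embedding layer
    return layers


# Simulate an environment where only jieba is installed.
only_jieba = available_layers(lambda name: name == "jieba")
```

Passing the `installed` check in as a parameter keeps the sketch easy to exercise without uninstalling anything.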

Quick Start

# 1. Clone
git clone https://github.com/willingning-coder/eagle-eye.git
cd eagle-eye

# 2. Generate config from your local skill library
python scripts/generate_config.py

# 3. Review and customize
#    - Edit _HARD_TRIGGERS in src/skill_retriever.py
#    - Edit src/skill_synonyms.yaml
#    (See PROMPTS_EN.md or PROMPTS_CN.md for LLM-assisted generation)

# 4. Install
bash scripts/install.sh

# 5. Restart Hermes
hermes gateway restart

Customization

Eagle Eye ships with minimal example data. The real power comes from generating your own configuration based on your installed skills.

# Scan your skills and generate template configs
python scripts/generate_config.py

# Or just list what was found
python scripts/generate_config.py --scan-only

Manual Customization

| Component | File | What to do |
|---|---|---|
| Hard Triggers | src/skill_retriever.py → _HARD_TRIGGERS | Add (keyword, skill-name) tuples. More specific first. |
| Synonym Dictionary | src/skill_synonyms.yaml | Map natural language terms to skills. 5–15 per skill. |
| Embedding Model | HERMES_EMBEDDING_MODEL env var | Swap to a different sentence-transformers model. |
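
To make the synonym mapping concrete, here is a minimal Python sketch of a term lookup. The data shape (skill name mapped to a list of terms), the sample entries, and the function name `synonym_candidates` are assumptions; the actual layout of skill_synonyms.yaml may differ.

```python
# Hypothetical shape: skill name -> natural-language terms (5-15 per skill).
SYNONYMS = {
    "debugger": ["stack trace", "crash", "exception", "breakpoint", "traceback"],
    "git-helper": ["commit", "rebase", "merge conflict", "branch", "pull request"],
}


def synonym_candidates(query, synonyms=SYNONYMS):
    """Rank skills by how many of their synonym terms appear in the query."""
    text = query.lower()
    hits = []
    for skill, terms in synonyms.items():
        count = sum(term in text for term in terms)
        if count > 0:
            hits.append((skill, count))
    # Most matched terms first; ties broken alphabetically.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


ranked = synonym_candidates("I hit a crash with a long stack trace after a rebase")
```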

LLM-Assisted Generation

Use the prompts in PROMPTS_EN.md or PROMPTS_CN.md with any LLM to generate high-quality triggers and synonyms from your skill list.

Environment Variables

| Variable | Default | Description |
|---|---|---|
| HERMES_DISABLE_SKILL_RETRIEVAL | (unset) | Set to 1 to disable retrieval entirely |
| HERMES_SKILL_RETRIEVAL_TOP_K | 5 | Number of skills to return |
| HERMES_EMBEDDING_MODEL | shibing624/text2vec-base-chinese-paraphrase | Embedding model for L4 |
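
For reference, these variables could be read as shown below. The variable names and defaults come from the table; the function name `load_settings` and the returned dictionary are illustrative assumptions, not the plugin's actual code.

```python
import os

DEFAULT_MODEL = "shibing624/text2vec-base-chinese-paraphrase"


def load_settings(env=None):
    """Read Eagle Eye settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Retrieval is disabled only when the variable is set to "1".
        "disabled": env.get("HERMES_DISABLE_SKILL_RETRIEVAL") == "1",
        "top_k": int(env.get("HERMES_SKILL_RETRIEVAL_TOP_K", "5")),
        "embedding_model": env.get("HERMES_EMBEDDING_MODEL", DEFAULT_MODEL),
    }


defaults = load_settings({})
custom = load_settings({"HERMES_DISABLE_SKILL_RETRIEVAL": "1", "HERMES_SKILL_RETRIEVAL_TOP_K": "3"})
```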

Performance

| Metric | Value |
|---|---|
| L1 real-world accuracy | ~90% |
| Functional test accuracy | 100% |
| Query latency (cached) | ~20 ms |
| First-call latency | ~11 s (model loading) |
| Memory footprint | ~403 MB (with embedding) |

Architecture

See ARCHITECTURE.md for a deep technical dive covering:

  • Layer-by-layer algorithm analysis with code
  • RRF fusion math and why it beats score normalization
  • Confidence gate design philosophy
  • Failure mode matrix and degradation hierarchy
  • Latency and memory profiling

File Structure

eagle-eye/
├── src/
│   ├── skill_retriever.py      # Core 5-layer retrieval engine
│   ├── skill_synonyms.yaml     # Synonym dictionary (template)
│   ├── plugin.py               # Hermes plugin (pre_llm_call hook)
│   └── plugin.yaml             # Plugin manifest
├── scripts/
│   ├── generate_config.py      # Auto-generate config from your skills
│   └── install.sh              # One-command installation
├── templates/
│   └── hard_triggers.example.py  # Trigger format reference
├── README.md                   # This file (English)
├── README_CN.md                # Chinese documentation
├── ARCHITECTURE.md             # Technical deep dive
├── PROMPTS_EN.md               # LLM prompts for config generation (English)
├── PROMPTS_CN.md               # LLM prompts for config generation (Chinese)
├── CHANGELOG.md                # Version history
└── LICENSE                     # MIT

Dependencies

| Package | Required? | Purpose |
|---|---|---|
| jieba | Yes | Chinese tokenization for L2–L3 |
| sentence-transformers | Optional | Dense embedding for L4 (graceful fallback if missing) |
| numpy | Optional | Numerical operations for L4 |

Contributing

Contributions are welcome! Areas where help is especially valuable:

  • Trigger/synonym quality: Share your _HARD_TRIGGERS and skill_synonyms.yaml configurations
  • Embedding model benchmarks: Test alternative models and report accuracy
  • Multi-language support: Extend triggers and synonyms beyond Chinese/English
  • Bug reports: Edge cases in fuzzy matching, false positives/negatives

License

MIT