Similarity Pipeline
February 28, 2026
How floop detects and merges duplicate behaviors.
Overview
When floop deduplicate (or the floop_deduplicate MCP tool) runs, it compares every pair of behaviors in the store to find duplicates. Similarity is computed using a 3-tier fallback chain — the first method that produces a result wins:
- Embedding similarity — cosine similarity between vector embeddings
- LLM comparison — structured semantic comparison via a language model
- Jaccard word overlap — rule-based word overlap (always available)
Pairs that meet the similarity threshold are flagged as duplicates and optionally merged.
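The fallback chain can be pictured as an ordered list of scorers, tried until one answers. The following is a minimal Python sketch, not floop's actual code: the function names, and the convention that a tier returns `None` when it cannot produce a score, are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# A scorer returns a similarity score, or None when that tier is unavailable.
# The None convention is an assumption of this sketch.
Scorer = Callable[[str, str], Optional[float]]


def first_available_score(
    a: str, b: str, tiers: Sequence[Tuple[str, Scorer]]
) -> Tuple[str, float]:
    """Try each tier in order; return (tier_name, score) from the first that answers."""
    for name, scorer in tiers:
        score = scorer(a, b)
        if score is not None:
            return name, score
    raise RuntimeError("no similarity tier produced a result")


def no_embeddings(a: str, b: str) -> Optional[float]:
    return None  # stands in for a provider without embedding support


def no_llm(a: str, b: str) -> Optional[float]:
    return None  # stands in for a missing or failing LLM client


def word_overlap(a: str, b: str) -> Optional[float]:
    # Plain Jaccard over lowercase words; always able to answer.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


tiers = [("embedding", no_embeddings), ("llm", no_llm), ("jaccard", word_overlap)]
method, score = first_available_score("use tabs in Go", "use tabs in go files", tiers)
print(method, score)
```

With both model-based tiers unavailable, the call falls through to the rule-based scorer, which is why that tier has to be able to answer every time.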
Embedding Similarity
When an LLM client implements the EmbeddingComparer interface, floop generates embeddings for each behavior's content and computes cosine similarity between the resulting vectors.
- Range: -1.0 to 1.0 (in practice, 0.0 to 1.0 for normalized text embeddings)
- Vectors are L2-normalized before comparison
- Returns 0.0 for zero-magnitude or mismatched-length vectors
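The three properties listed above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the project's code; the function name `cosine_similarity` is chosen here for the example.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity with the guards described in the list above."""
    if len(u) != len(v):
        return 0.0  # mismatched-length vectors
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # zero-magnitude vectors
    # L2-normalize each component, then take the dot product.
    return sum((x / norm_u) * (y / norm_v) for x, y in zip(u, v))


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Because both vectors are normalized first, only direction affects the score; vector length does not.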
The local provider (llm.provider = local) runs offline embedding via GGUF models loaded through yzma (purego bindings to llama.cpp). It uses nomic-embed-text-v1.5 (Q4_K_M) to generate 768-dimension embeddings locally with no API keys or network access required. This is typically the fastest tier in the chain.
Providers that support embeddings: openai, ollama, local.
LLM Comparison
When embeddings are unavailable (or when the provider does not implement EmbeddingComparer), floop falls back to LLM-based comparison via CompareBehaviors. The model receives both behaviors and returns a structured result:
- Semantic similarity — 0.0 to 1.0 score
- Intent match — whether the behaviors target the same underlying intent
- Merge recommendation — whether the model recommends merging
The comparison uses llm.comparison_model (configurable). This is slower than embedding similarity but can capture nuanced semantic relationships.
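To make the shape of the structured result concrete, here is a hedged Python sketch. The `ComparisonResult` type, the JSON key names, and the clamping step are all hypothetical; the text above only names the three pieces of information the model returns.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonResult:
    semantic_similarity: float  # documented range: 0.0 to 1.0
    intent_match: bool          # same underlying intent?
    merge_recommended: bool     # does the model suggest merging?


def parse_comparison(raw: str) -> ComparisonResult:
    """Parse a JSON reply (hypothetical format) into a ComparisonResult."""
    data = json.loads(raw)
    # Clamping into 0.0-1.0 is a defensive choice of this sketch.
    score = min(1.0, max(0.0, float(data["semantic_similarity"])))
    return ComparisonResult(
        semantic_similarity=score,
        intent_match=bool(data["intent_match"]),
        merge_recommended=bool(data["merge_recommended"]),
    )


reply = '{"semantic_similarity": 0.93, "intent_match": true, "merge_recommended": true}'
result = parse_comparison(reply)
print(result)
```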
Jaccard Fallback
When no LLM is configured (or when LLM comparison fails and llm.fallback_to_rules is enabled), floop uses a weighted Jaccard word overlap:
| Component | Weight | Method |
|---|---|---|
| When-condition overlap | 40% | Exact value matching across when condition sets |
| Content word overlap | 60% | Case-insensitive word tokenization, set intersection / set union |
When-condition overlap compares the activation conditions (file patterns, task types, etc.) of both behaviors using exact value matching, with double weighting for matches.
Content word overlap tokenizes behavior content into lowercase words and computes the Jaccard index (intersection / union).
The final score is: 0.4 * when_overlap + 0.6 * content_overlap
This method requires no external services and is always available as a fallback.
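The weighted score can be sketched as follows. This is a simplified illustration under stated assumptions: when conditions are modeled as a flat key/value mapping, the when-condition overlap is a plain Jaccard index over exact (key, value) pairs, and the double weighting for matches mentioned above is not modeled because its exact form is not specified here. Returning 0.0 for two empty sets is also a choice of the sketch.

```python
import re
from typing import Mapping, Set


def jaccard(a: Set, b: Set) -> float:
    """Jaccard index: size of intersection over size of union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tokenize(text: str) -> Set[str]:
    """Case-insensitive word tokenization into a set of lowercase words."""
    return set(re.findall(r"\w+", text.lower()))


def rule_based_similarity(
    content_a: str, when_a: Mapping[str, str],
    content_b: str, when_b: Mapping[str, str],
) -> float:
    # Exact value matching: a condition counts only if key and value both match.
    when_overlap = jaccard(set(when_a.items()), set(when_b.items()))
    content_overlap = jaccard(tokenize(content_a), tokenize(content_b))
    return 0.4 * when_overlap + 0.6 * content_overlap


score = rule_based_similarity(
    "Prefer table-driven tests", {"language": "go", "task": "testing"},
    "prefer table driven tests in Go", {"language": "go"},
)
print(round(score, 3))
```

In this example the when conditions share one of two distinct pairs (0.5) and the content shares four of six distinct words (about 0.667), giving 0.4 × 0.5 + 0.6 × 0.667 ≈ 0.6, well under the default threshold.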
Thresholds
The similarity threshold determines when two behaviors are considered duplicates:
- Default: 0.9 (configurable)
- Auto-merge during learn: 0.9 (when --auto-merge is enabled)
- Range: 0.0 (everything matches) to 1.0 (exact match only)
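A short Python sketch shows how a threshold turns pairwise scores into flagged duplicates. The function name `find_duplicates`, the toy `word_overlap` scorer, and the use of `>=` for "meets the threshold" are assumptions of this example rather than floop's implementation.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def find_duplicates(
    behaviors: Dict[str, str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.9,
) -> List[Tuple[str, str, float]]:
    """Compare every pair once and keep those whose score meets the threshold."""
    flagged = []
    for (id_a, a), (id_b, b) in combinations(behaviors.items(), 2):
        score = similarity(a, b)
        if score >= threshold:
            flagged.append((id_a, id_b, score))
    return flagged


def word_overlap(a: str, b: str) -> float:
    # Toy scorer for the example: Jaccard over lowercase words.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


behaviors = {
    "b1": "Use tabs in Go",
    "b2": "use tabs in go",
    "b3": "write docstrings",
}
strict = find_duplicates(behaviors, word_overlap, threshold=0.9)
loose = find_duplicates(behaviors, word_overlap, threshold=0.0)
print(strict)
print(len(loose))
```

At 0.0 every pair is flagged, which matches the "everything matches" end of the range. Since n behaviors produce n(n-1)/2 pairs, the number of comparisons grows quadratically with store size.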
Configure via:
# CLI flag
floop deduplicate --threshold 0.85
# Config file
floop config set deduplication.similarity_threshold 0.85
# Environment variable
export FLOOP_SIMILARITY_THRESHOLD=0.85
Cross-Store Deduplication
Behaviors live in two stores:
- Local: project-scoped (.floop/)
- Global: user-scoped (~/.floop/)
By default, deduplication runs within a single store. Use --scope both to compare behaviors across stores:
# Deduplicate within local store only
floop deduplicate
# Deduplicate within global store only
floop deduplicate --scope global
# Cross-store deduplication (local + global)
floop deduplicate --scope both
Cross-store dedup uses the same fallback chain and threshold. When duplicates span stores, the merge target is chosen based on the surviving behavior's store.
Note: When using floop_learn via MCP, behaviors are automatically classified into the correct store (local for project-specific, global for universal) based on their activation conditions. This reduces cross-store duplicates at the source. See the MCP server integration guide for details.
Configuration
Provider Setup
# ~/.floop/config.yaml
llm:
  provider: anthropic # anthropic, openai, ollama, local, subagent
  enabled: true
  api_key: ${ANTHROPIC_API_KEY}
  comparison_model: claude-sonnet-4-5-20250929
  fallback_to_rules: true # Fall back to Jaccard when LLM fails
  # Local provider (offline embeddings via llama.cpp)
  local_lib_path: /path/to/yzma/libs
  local_model_path: /path/to/model.gguf
  local_embedding_model_path: /path/to/embedding-model.gguf
  local_gpu_layers: 0 # 0 = CPU only
  local_context_size: 512
deduplication:
  auto_merge: false
  similarity_threshold: 0.9
Environment Variables
| Variable | Config Key |
|---|---|
| FLOOP_LLM_PROVIDER | llm.provider |
| FLOOP_LLM_ENABLED | llm.enabled |
| FLOOP_SIMILARITY_THRESHOLD | deduplication.similarity_threshold |
| FLOOP_AUTO_MERGE | deduplication.auto_merge |
| FLOOP_LOCAL_LIB_PATH | llm.local_lib_path |
| FLOOP_LOCAL_MODEL_PATH | llm.local_model_path |
| FLOOP_LOCAL_EMBEDDING_MODEL_PATH | llm.local_embedding_model_path |
| FLOOP_LOCAL_GPU_LAYERS | llm.local_gpu_layers |
| FLOOP_LOCAL_CONTEXT_SIZE | llm.local_context_size |
See CLI Reference — Environment Variables for the complete list.
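The relationship between a variable and its dotted config key can be illustrated with a small Python sketch. It assumes environment values override the config file, which this page does not state, and it leaves values as strings rather than guessing at floop's type coercion. The `apply_env` helper is hypothetical, and only a subset of the mapping is shown.

```python
import copy
from typing import Any, Dict, Mapping

# Subset of the table above: environment variable -> dotted config key.
ENV_TO_KEY = {
    "FLOOP_LLM_PROVIDER": "llm.provider",
    "FLOOP_SIMILARITY_THRESHOLD": "deduplication.similarity_threshold",
    "FLOOP_AUTO_MERGE": "deduplication.auto_merge",
}


def apply_env(config: Dict[str, Any], env: Mapping[str, str]) -> Dict[str, Any]:
    """Return a copy of config with mapped environment values set at their dotted keys."""
    merged = copy.deepcopy(config)
    for var, dotted in ENV_TO_KEY.items():
        if var not in env:
            continue
        *parents, leaf = dotted.split(".")
        node = merged
        for part in parents:
            node = node.setdefault(part, {})  # create missing sections on the way down
        node[leaf] = env[var]  # kept as a string; coercion is omitted in this sketch
    return merged


file_config = {"deduplication": {"similarity_threshold": 0.9, "auto_merge": False}}
effective = apply_env(file_config, {"FLOOP_SIMILARITY_THRESHOLD": "0.85"})
print(effective)
```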
Decision Logging
All similarity comparisons are logged through the DecisionLogger for audit and debugging. Set the log level to see similarity decisions:
# See which comparison method was used and the resulting scores
export FLOOP_LOG_LEVEL=debug
# See full detail including individual component scores
export FLOOP_LOG_LEVEL=trace
Or via config:
floop config set logging.level debug