🧬 EvoControl

January 25, 2026

🏆 Controlled Self-Evolution for Algorithmic Code Optimization

Achieving more efficient code through diversified planning initialization, genetic evolution, and hierarchical experience memory


🎯 What is CSE?

Controlled Self-Evolution (CSE) is a novel framework that dramatically improves exploration efficiency in code optimization. Unlike existing self-evolution methods that suffer from initialization bias, uncontrolled stochastic operations, and insufficient experience utilization, CSE addresses all three bottlenecks through:

[Figure: CSE Framework]

🔑 Three Key Innovations

| Component | Problem Addressed | Solution |
| --- | --- | --- |
| 🎨 Diversified Planning Initialization | Initialization bias trapping evolution in poor regions | Generate structurally distinct algorithmic strategies |
| 🧬 Genetic Evolution | Uncontrolled stochastic operations lacking feedback | Feedback-guided mutation & compositional crossover |
| 🧠 Hierarchical Experience Memory | Insufficient experience utilization across tasks | Local + Global memory for experience reuse |

🔬 Method Overview

1. 🎨 Diversified Planning Initialization

Generates multiple structurally distinct algorithmic strategies before evolution begins, ensuring broad coverage of the solution space:

  • Multi-paradigm exploration: DP, Greedy, Two Pointers, Bit Manipulation, etc.
  • Sketch instantiation: Transform abstract strategies into concrete implementations
  • Initial population: Create diverse starting points to avoid local optima
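As a minimal sketch of the idea above, the snippet below fills an initial population by taking one plan per paradigm before allowing repeats. The names `CANDIDATES` and `diversified_population` are hypothetical, and the hard-coded plans stand in for what would presumably be LLM-generated strategies; this is not the project's implementation.

```python
from collections import OrderedDict

# Hypothetical candidate sketches: (paradigm, one-line plan).
# Hard-coded here for illustration only.
CANDIDATES = [
    ("dp", "tabulate best profit per day"),
    ("dp", "memoized recursion over (day, holding)"),
    ("greedy", "sum every positive day-to-day gain"),
    ("two_pointers", "slide buy/sell indices over prices"),
    ("greedy", "track running minimum price"),
]

def diversified_population(candidates, size):
    """Pick up to `size` sketches, one per paradigm before any repeats."""
    by_paradigm = OrderedDict()
    for paradigm, plan in candidates:
        by_paradigm.setdefault(paradigm, []).append(plan)
    population = []
    # Round-robin across paradigms so structurally distinct plans come first.
    while len(population) < size and any(by_paradigm.values()):
        for paradigm, plans in by_paradigm.items():
            if plans and len(population) < size:
                population.append((paradigm, plans.pop(0)))
    return population

population = diversified_population(CANDIDATES, 3)
print(population)
```

With a budget of three, each starting point comes from a different paradigm, which is the property the initialization step is after.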

2. 🧬 Genetic Evolution

Replaces stochastic operations with fine-grained feedback-guided mechanisms:

🔧 Controlled Mutation

  • Slot-based decomposition: Decompose solutions into functional components
  • Targeted refinement: Fix faulty components while preserving high-performing parts
  • Priority-guided: Optimize bottlenecks, inherit good parts, inspect risky areas
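The following sketch illustrates slot-based, targeted refinement under simplifying assumptions: a solution is a dict of named slots, each slot has a feedback score, and only the weakest slot is rewritten. `controlled_mutation`, the slot names, and `fake_refine` (a stand-in for an LLM call) are all hypothetical.

```python
def controlled_mutation(slots, scores, refine):
    """Rewrite only the weakest slot; copy every other slot unchanged.

    slots:  dict of slot name -> code fragment (hypothetical decomposition)
    scores: dict of slot name -> feedback score, higher is better
    refine: callable(name, code) -> new code; stands in for an LLM call
    """
    bottleneck = min(scores, key=scores.get)
    child = dict(slots)  # inherit all parts from the parent
    child[bottleneck] = refine(bottleneck, slots[bottleneck])
    return bottleneck, child

parent = {
    "parse_input": "nums = list(map(int, input().split()))",
    "core_loop": "for i in range(n):\n    for j in range(n): ...",
    "output": "print(answer)",
}
feedback = {"parse_input": 0.9, "core_loop": 0.2, "output": 0.95}

def fake_refine(name, code):
    # Placeholder refinement: tag the fragment instead of rewriting it.
    return f"# refined {name}\n" + code

target, child = controlled_mutation(parent, feedback, fake_refine)
print(target)
```

The parent is left untouched, so a failed refinement can be discarded without losing the high-performing parts.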

🤝 Compositional Crossover

  • Complementary combination: Merge strengths from different solution trajectories
  • Structural integration: Create cohesive hybrid implementations
  • Synergistic synthesis: Achieve 1+1>2 effects through intelligent merging
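A rough sketch of complementary combination, assuming both parents share the same slot names and each slot carries a score. Picking the higher-scoring fragment per slot is a deliberate simplification; in CSE the merge is presumably performed by the LLM so that the hybrid stays cohesive. `compositional_crossover` and the fragment names are hypothetical.

```python
def compositional_crossover(parent_a, parent_b):
    """For each slot, keep the fragment from the higher-scoring parent.

    Each parent is a dict: slot name -> (code fragment, score).
    Assumes both parents expose the same slot names.
    """
    child = {}
    for slot in parent_a:
        code_a, score_a = parent_a[slot]
        code_b, score_b = parent_b[slot]
        child[slot] = code_a if score_a >= score_b else code_b
    return child

parent_a = {"io": ("fast_read()", 0.9), "core": ("nested_loops()", 0.3)}
parent_b = {"io": ("input()", 0.4), "core": ("prefix_sums()", 0.8)}

child = compositional_crossover(parent_a, parent_b)
print(child)
```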

3. 🧠 Hierarchical Evolution Memory

Captures and reuses evolutionary insights at two levels:

| Memory Type | Scope | Function |
| --- | --- | --- |
| Local Memory | Intra-task | Accumulates task-specific lessons to avoid repeating failures |
| Global Memory | Inter-task | Distills cross-task optimization patterns into reusable templates |
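The two levels can be pictured with the sketch below. The class names, fields, and methods are assumptions made for illustration, and word-overlap (Jaccard) similarity stands in for the embedding-based retrieval that the configuration section suggests the real global memory uses.

```python
class LocalMemory:
    """Intra-task memory: lessons gathered while optimizing one problem."""

    def __init__(self):
        self.lessons = []

    def add(self, iteration, lesson, improved):
        self.lessons.append(
            {"iteration": iteration, "lesson": lesson, "improved": improved}
        )

    def failures(self):
        # Lessons from attempts that did not help, to avoid repeating them.
        return [e["lesson"] for e in self.lessons if not e["improved"]]


class GlobalMemory:
    """Inter-task memory: reusable templates retrieved by similarity."""

    def __init__(self):
        self.patterns = []  # list of (keyword set, template text)

    def add(self, description, template):
        self.patterns.append((set(description.lower().split()), template))

    def retrieve(self, query, top_k=1):
        words = set(query.lower().split())

        def score(entry):
            union = entry[0] | words
            return len(entry[0] & words) / len(union) if union else 0.0

        ranked = sorted(self.patterns, key=score, reverse=True)
        return [template for _, template in ranked[:top_k]]


local = LocalMemory()
local.add(1, "brute force over all pairs timed out", improved=False)
local.add(2, "precomputing prefix sums cut runtime", improved=True)

shared = GlobalMemory()
shared.add("count divisors of an integer", "iterate only up to sqrt(n)")
shared.add("longest increasing subsequence", "use patience sorting with bisect")
print(shared.retrieve("count the divisors of n"))
```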

📊 Performance Results

🏆 Main Results on EffiBench-X

CSE consistently outperforms all baselines across diverse LLM backbones (Qwen3-235B-A22B, DeepSeek-v3-0324, Claude-4.5-Sonnet, GPT-5):

[Figure: Main Results]

Metrics: ET (Execution Time efficiency), MP (Memory Peak efficiency), MI (Memory-time Integral, our primary metric, which balances runtime and memory)
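To make the memory-time integral concrete, here is a minimal sketch that integrates a sampled memory profile with the trapezoidal rule. The sample format (seconds, MB) and the function name `memory_time_integral` are assumptions; how EffiBench-X actually samples, normalizes, and reports MI may differ.

```python
def memory_time_integral(samples):
    """Trapezoidal area under a memory-usage curve.

    samples: list of (time_seconds, memory_mb) pairs sorted by time.
    Returns the integral in MB*s (units assumed for illustration).
    """
    total = 0.0
    for (t0, m0), (t1, m1) in zip(samples, samples[1:]):
        total += (t1 - t0) * (m0 + m1) / 2.0
    return total

# Synthetic profile: memory ramps up, holds, then ramps down.
profile = [(0.0, 10.0), (1.0, 30.0), (2.0, 30.0), (3.0, 10.0)]
print(memory_time_integral(profile))  # 70.0
```

Because the integral grows with both how long a program runs and how much memory it holds, a single number captures the runtime/memory trade-off.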

📈 Evolution Progress Analysis

CSE achieves higher efficiency from early generations and maintains continuous improvement throughout evolution:

[Figure: Evolution Progress]

Key Observations:

  • 🚀 Fast Start: CSE outperforms baselines from the first generation
  • 📈 Sustained Growth: Continuous improvement without plateauing
  • 🎯 Efficiency: Achieves superior results with limited exploration budget

🔍 Case Study

Detailed evolution trajectory on a real optimization task, showing how CSE progressively discovers more efficient algorithms:

[Figure: Case Study]

Evolution Highlights:

  • Iter 1: Initial solution with basic approach (886.13 MI)
  • Iter 5: Strategy switch to square-root factorization (197.49 MI)
  • Iter 8: Controlled mutation improves factor-checking (176.89 MI)
  • Iter 22: Trial division with early-stop optimization
  • Iter 25: Crossover combines best features → Final: 93.93 MI (7× improvement!)
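The task behind this case study is not stated here, so the snippet below is only a generic illustration of the kind of strategy switch the trajectory describes: replacing a full scan with trial division up to the square root, stopping at the first factor. Both function names are hypothetical.

```python
import math

def has_factor_naive(n):
    """Scan candidates 2..n-1; a prime input costs O(n) divisions."""
    return any(n % d == 0 for d in range(2, n))

def has_factor_sqrt(n):
    """Trial division up to sqrt(n), returning at the first factor found."""
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return True  # early stop
    return False

# Both versions agree; the second needs far fewer divisions on primes.
print(has_factor_naive(97), has_factor_sqrt(97))
print(has_factor_naive(91), has_factor_sqrt(91))
```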

⚡ Quick Start

Get CSE running in 3 steps:

# 1. Clone and install
git clone https://github.com/your-repo/EvoControl.git
cd EvoControl
conda create -n cse python=3.12
conda activate cse
pip install -e .

# 2. Configure API credentials in configs/Plan-Weighted-Local-Global-30.yaml
# Set model.api_key, model.api_base, etc.

# 3. Run your first experiment
python SE_Perf/instance_runner.py \
    --config configs/Plan-Weighted-Local-Global-30.yaml \
    --max-parallel 10 \
    --mode execute

💡 Prerequisites: Ensure the EffiBench-X backend is running for code evaluation.


📦 Installation & Configuration

Installation

# Create virtual environment
conda create -n cse python=3.12
conda activate cse

# Install dependencies
pip install -e .

Configuration

CSE uses a two-layer configuration system:

| Config Type | File | Purpose |
| --- | --- | --- |
| Base Config | configs/perf_configs/config_integral.yaml | Model parameters, runtime limits, prompts |
| Strategy Config | configs/Plan-Weighted-Local-Global-30.yaml | Evolution strategy orchestration |

Required Settings (in strategy config):

model:
  name: "deepseek-chat" # LLM model name
  api_base: "https://api.deepseek.com/v1"
  api_key: "your-api-key" # 🔑 Required!

global_memory_bank:
  enabled: true
  embedding_model:
    api_base: "your-embedding-api-base"
    model: "embedding-model-name" # Embedding Model Name
    api_key: "your-embedding-key" # 🔑 Required!
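Before launching a long run, it can help to check that the required settings are filled in. The sketch below works on an already-parsed config (a plain dict, for example the result of loading the YAML above) and reports keys that are missing, empty, or still set to a `your-...` placeholder. `missing_settings`, `lookup`, and the placeholder rule are assumptions for illustration and are not part of EvoControl.

```python
PLACEHOLDER_PREFIX = "your-"  # matches the placeholder values shown above

REQUIRED_KEYS = [
    ("model", "name"),
    ("model", "api_base"),
    ("model", "api_key"),
]
EMBEDDING_KEYS = [
    ("global_memory_bank", "embedding_model", "api_base"),
    ("global_memory_bank", "embedding_model", "model"),
    ("global_memory_bank", "embedding_model", "api_key"),
]

def lookup(cfg, path):
    """Follow a tuple of keys through nested dicts; None if any is absent."""
    for key in path:
        if not isinstance(cfg, dict) or key not in cfg:
            return None
        cfg = cfg[key]
    return cfg

def missing_settings(cfg):
    """Return dotted paths that are absent, empty, or still placeholders."""
    paths = list(REQUIRED_KEYS)
    # Embedding settings only matter when the global memory bank is enabled.
    if lookup(cfg, ("global_memory_bank", "enabled")):
        paths += EMBEDDING_KEYS
    missing = []
    for path in paths:
        value = lookup(cfg, path)
        if not value or str(value).startswith(PLACEHOLDER_PREFIX):
            missing.append(".".join(path))
    return missing

example = {
    "model": {
        "name": "deepseek-chat",
        "api_base": "https://api.deepseek.com/v1",
        "api_key": "your-api-key",
    },
    "global_memory_bank": {
        "enabled": True,
        "embedding_model": {
            "api_base": "your-embedding-api-base",
            "model": "embedding-model-name",
            "api_key": "your-embedding-key",
        },
    },
}
print(missing_settings(example))
```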

💻 Usage Examples

Basic Experiment

python SE_Perf/instance_runner.py \
    --config configs/Plan-Weighted-Local-Global-30.yaml \
    --max-parallel 10 \
    --mode execute

Quick Test (First 5 Instances)

python SE_Perf/instance_runner.py \
    --config configs/Plan-Weighted-Local-Global-30.yaml \
    --max-parallel 1 \
    --limit 5 \
    --mode execute

📂 Output Structure

trajectories_perf/experiment_{timestamp}/
├── {instance_name}/
│   ├── iteration_{n}/          # Per-iteration results
│   │   ├── result.json         # Evaluation metrics
│   │   └── *.traj              # Solution trajectories
│   ├── final.json              # Best optimized solution
│   ├── traj.pool               # All attempted solutions
│   └── se_framework.log        # Execution logs
├── all_hist.json               # Aggregated history
├── final.json                  # All final solutions
└── total_token_usage.json      # API usage statistics
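Given the layout above, per-iteration results can be gathered with a few lines of Python. This sketch assumes that `{n}` in `iteration_{n}` is an integer and makes no assumption about the fields inside `result.json`; `collect_results` is a hypothetical helper, not a script shipped with the project.

```python
import json
from pathlib import Path

def collect_results(experiment_dir):
    """Map instance name -> list of (iteration number, parsed result.json)."""
    results = {}
    for result_file in Path(experiment_dir).glob("*/iteration_*/result.json"):
        iteration_dir = result_file.parent
        instance = iteration_dir.parent.name
        # Assumes directory names of the form iteration_<integer>.
        number = int(iteration_dir.name.split("_", 1)[1])
        with result_file.open() as fh:
            results.setdefault(instance, []).append((number, json.load(fh)))
    for runs in results.values():
        runs.sort(key=lambda item: item[0])  # order by iteration number
    return results

# Example call (path is a placeholder):
# collect_results("trajectories_perf/experiment_{timestamp}")
```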

📥 Download Full Dataset

To run experiments on the complete EffiBench-X dataset, you need to download the full instances from the official repository:

Download from EffiBench-X Repository

# Clone the EffiBench-X repository
git clone https://github.com/EffiBench/EffiBench-X.git
cd EffiBench-X

# Install dependencies
pip install -r requirements.txt

# Download dataset from Hugging Face Hub
python hf_dataset.py download

Then copy the downloaded instances to your EvoControl project.

Dataset Structure

After downloading, the instances/ directory should contain JSON files:

instances/
├── aizu_1444_yokohama-phenomena.json
├── aizu_1459_e-circuit-is-now-on-sale.json
├── leetcode_123_best-time-to-buy-and-sell-stock.json
├── codeforces_1234_some-problem.json
└── ... (600+ problem instances from LeetCode, AtCoder, CodeChef, Codeforces, AOJ)
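A quick sanity check after copying the dataset is to count instances per source. The sketch below relies only on the filename pattern visible above (`<source>_<id>_<slug>.json`); `count_by_source` is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path

def count_by_source(instances_dir):
    """Count instance files per source, using the filename prefix before '_'."""
    counts = Counter()
    for path in Path(instances_dir).glob("*.json"):
        source = path.name.split("_", 1)[0]
        counts[source] += 1
    return counts

# Example call (assumes the directory exists):
# print(count_by_source("./instances"))
```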

Run Full Experiment

# Run on all instances with high parallelism
python SE_Perf/instance_runner.py \
    --config configs/Plan-Weighted-Local-Global-30.yaml \
    --instances-dir ./instances \
    --max-parallel 20 \
    --mode execute

# Or run on a subset (first 100 instances)
python SE_Perf/instance_runner.py \
    --config configs/Plan-Weighted-Local-Global-30.yaml \
    --instances-dir ./instances \
    --max-parallel 10 \
    --limit 100 \
    --mode execute

📊 Visualization Tool

CSE provides an interactive web-based visualization tool for analyzing experiment results, including trajectory graphs, performance curves, and detailed LLM interactions.

Launch Visualization Server

# Set the root directory for your experiments
export VIZ_ROOT="trajectories_perf/your_experiment_dir"

# Start the visualization server
cd viz_tool
python app.py

Then open your browser and navigate to: http://localhost:5000

Features

| Feature | Description |
| --- | --- |
| 🔀 Trajectory Graph | Interactive DAG visualization of solution evolution (Cytoscape.js) |
| 📈 Performance Chart | Best performance curve across generations with turning points |
| 🔍 Node Details | Click any node to view detailed metrics, approach summary |
| 💬 LLM IO | View complete LLM input/output for each iteration |
| 📝 Code View | Inspect generated code for any solution |
| 🔄 Comparison Mode | Compare two experiments side-by-side |
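The "best performance curve with turning points" can be reproduced offline from per-generation metric values. The sketch below is one plausible reading: the curve is the running best, and a turning point is any generation where the best value improves. `best_so_far` and the `lower_is_better` flag are assumptions; the visualization tool may compute this differently.

```python
def best_so_far(values, lower_is_better=True):
    """Return (curve, turning_points) for a per-generation metric.

    curve[i] is the best value seen in generations 0..i; turning_points
    lists the generation indices at which the best value improved.
    """
    curve, turning_points = [], []
    best = None
    for i, value in enumerate(values):
        if best is None:
            improved = True
        elif lower_is_better:
            improved = value < best
        else:
            improved = value > best
        if improved:
            best = value
            turning_points.append(i)
        curve.append(best)
    return curve, turning_points

# Synthetic metric values, not taken from any experiment.
curve, turns = best_so_far([9.0, 9.5, 4.0, 4.0, 5.0, 2.0])
print(curve, turns)
```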

Interface Overview

  • Select Experiment: Choose from available experiment directories
  • Select Instance: Pick a specific problem instance to visualize
  • Compare with: Optionally select another experiment for comparison
  • Tabs:
    • Problem: View the original problem description
    • Details: Node metrics and summary information
    • LLM IO: Full LLM conversation history per iteration
    • Code: Generated solution code
    • Full Data: Raw JSON data for debugging

Chart Controls

| Control | Function |
| --- | --- |
| Y-Axis Range | Set custom min/max for performance axis |
| Show first K generations | Filter to display only first K iterations |
| Export PNG | Download chart as PNG image |
| Export HD | Download high-resolution (2x) PNG |
| Click legend | Edit legend labels for publication |
| Shift+Click point | Edit data point labels |

Tips

# View results from a specific experiment
export VIZ_ROOT="trajectories_perf/Plan-Weighted-Local-Global-30its_20260109_205036"
python viz_tool/app.py

# Change port if 5000 is occupied
python viz_tool/app.py --port 8080

📖 Citation

If you find EvoControl useful in your research, please cite our paper:

@article{hu2026controlled,
  title={Controlled Self-Evolution for Algorithmic Code Optimization},
  author={Tu Hu and Ronghao Chen and Shuo Zhang and Jianghao Yin and Mou Xiao Feng and Jingping Liu and Shaolei Zhang and Wenqi Jiang and Yuqi Fang and Sen Hu and Yi Xu and Huacan Wang},
  journal={arXiv preprint arXiv:2601.07348},
  year={2026}
}

📄 Paper: arXiv:2601.07348


🙏 Acknowledgments

We thank the following projects:

  • EffiBench-X: Code efficiency evaluation benchmark
  • SE-Agent: Trajectory-level self-evolution
  • OpenEvolve: Open-source implementation of AlphaEvolve