LLM Investor Behavior Benchmark (LIBB)
March 22, 2026 · View on GitHub
What Is LIBB?
LIBB is an open-source, opinionated research library designed to automatically manage portfolio state and compute key metrics, while still giving users flexibility over the system.
Why LIBB Exists
This project originally began as a generic benchmark for LLM-based trading in U.S. equities. While surveying existing LLM trading projects (including my own), I noticed a consistent lack of rigorous sentiment, behavioral, and performance metrics; most projects reported little more than an equity curve.
This raised a fundamental question: "Why isn't LLM trading held to the same analytical standards as the rest of finance?"
So I developed a library designed to support rigorous evaluation of LLM-driven trading systems. The hope is that it provides a useful foundation for others doing similar work.
Features
-
Persistent Portfolio State
All portfolio data is explicitly stored on disk, enabling inspection, reproducibility, and post-hoc analysis across runs.
-
Built-In Behavioral, Performance, and Sentiment Analysis
Quantitative behavioral metrics (HHI concentration, loss aversion, turnover, cash allocation, order quality), key performance metrics (Sharpe, Sortino, drawdown, CAPM), and sentiment analysis via the Loughran-McDonald financial lexicon. All results are persisted as first-class research artifacts.
-
Atomic Portfolio Processing with Rollback
All portfolio processing is transactional. If execution fails mid-run, disk state is automatically restored to a snapshot taken at startup, preventing partial writes and corrupt portfolio state.
-
Reproducible Run Structure
Each model run follows a consistent on-disk directory layout, making experiments easy to reproduce, compare, and archive.
-
Flexible Execution Workflows
Execution logic remains fully user-controlled, allowing researchers to integrate custom strategies, models, or data sources.
How It Works
LIBB operates as a file-backed execution loop where portfolio state, analytics, and research artifacts are explicitly persisted to disk.
For each run, the engine:
- Loads and processes existing portfolio state
- Receives inputs (e.g., via an LLM)
- Computes and stores analytical signals (such as sentiment) via explicit user calls
- Saves execution instructions (orders) by passing a JSON block
- Persists all outputs for inspection and reuse
Execution scheduling (e.g., daily vs. weekly runs) and model orchestration are intentionally left to the user, preserving flexibility while maintaining a consistent on-disk state.
Documentation
New to LIBB? Start here → Documentation Guide
This guide explains the system philosophy, execution workflow, and how to read the codebase effectively.
Example Workflow
from libb import LIBBmodel
from libb.other.parse import parse_json
# See user_side/prompt_orchestration/prompt_models.py for a full prompting example
MODELS = ["deepseek", "gpt-4.1"]
def daily_flow():
for model in MODELS:
libb = LIBBmodel(f"user_side/runs/run_v1/{model}")
libb.process_portfolio()
daily_report = prompt_daily_report(libb)
libb.save_daily_update(daily_report)
orders_json = parse_json(daily_report, "ORDERS_JSON")
libb.save_orders(orders_json)
libb.analyze_sentiment(daily_report, report_type="daily")
return
Created File Tree
After running for the first time, LIBB generates a fixed directory structure at the user-specified output path.
<output_dir>/
├── config.json # run configuration and parameters
|
├── metrics/ # evaluation outputs
│ ├── behavior.json
│ ├── performance.json
│ └── sentiment.json
│
├── portfolio/ # live trading state & history
│ ├── cash.json # authoritative current cash balance
│ ├── pending_trades.json
│ ├── portfolio.csv # current positions only
│ ├── portfolio_history.csv # daily equity & cash snapshots
│ ├── position_history.csv # per-position daily history
│ └── trade_log.csv
│
├── logging/ # per-run execution logs (JSON)
│
└── research/ # generated analysis & reports
├── daily_reports/
└── deep_research/
No manual file setup is required. LIBB will use this file tree to save artifacts for all future runs in the output directory.
Getting Started
LIBB can be used in two ways:
- As a library — install via pip and import
LIBBmodelinto your own project - As a template — clone the repo and build directly on top of
user_side/
Using LIBB as a Library
Add to your requirements.txt:
git+https://github.com/LuckyOne7777/LLM-Investor-Behavior-Benchmark.git
Then install:
pip install -r requirements.txt
Or install directly:
pip install git+https://github.com/LuckyOne7777/LLM-Investor-Behavior-Benchmark.git
Using LIBB as a Template
Clone the repo and build on top of user_side/. This gives you the full
example workflow, prompt templates, and orchestration scaffolding to work from.
This guide shows two supported setup paths:
- Option A (Recommended): Virtual Environment
- Option B: Global / No Virtual Environment
Choose the option that best fits your workflow.
Note: Installation requires internet access. Dependencies including
yfinanceandpysentiment2download data and lexicon files on first use.
Option A: Virtual Environment (Recommended)
This option isolates dependencies and avoids conflicts with other Python projects.
1. Clone the Repository
git clone https://github.com/LuckyOne7777/LLM-Investor-Behavior-Benchmark.git
cd LLM-Investor-Behavior-Benchmark
Verify contents:
ls
You should see folders like libb/, user_side/, and requirements.txt.
2. Create a Virtual Environment
Windows:
python -m venv .venv
macOS / Linux:
python3 -m venv .venv
3. Activate the Virtual Environment
Windows (PowerShell) If activation fails due to script execution policy, run once:
Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned
Then activate:
.venv\Scripts\activate
Windows (Command Prompt alternative)
.venv\Scripts\activate.bat
macOS / Linux
source .venv/bin/activate
Verify activation:
python --version
You should see (.venv) in your shell prompt.
4. Install Dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
5. Verify Installation
python -c "import libb; print(libb.__file__)"
Expected output should point to libb/__init__.py.
6. Set Environment Variables
macOS / Linux:
export OPENAI_API_KEY="your_key_here"
export DEEPSEEK_API_KEY="your_key_here"
Windows (PowerShell):
setx OPENAI_API_KEY "your_key_here"
setx DEEPSEEK_API_KEY "your_key_here"
Restart the terminal after using setx.
7. Run an Example Workflow
python -m user_side.workflow
8. Exit the Virtual Environment
To remove the virtual environment entirely:
Linux / macOS:
rm -rf .venv
Windows:
Remove-Item -Recurse -Force .venv
Option B: Global Setup (No Virtual Environment)
This option installs dependencies into the active Python environment. Recommended only for users comfortable managing global Python packages.
1. Clone the Repo
git clone https://github.com/LuckyOne7777/LLM-Investor-Behavior-Benchmark.git
cd LLM-Investor-Behavior-Benchmark
2. Verify Python Version
LIBB requires Python 3.10 or newer.
python --version
3. Upgrade pip
python -m pip install --upgrade pip
4. Install Dependencies Globally
pip install -r requirements.txt
pip install -e .
Verify installation:
python -c "import libb; print(libb.__file__)"
5. Set Environment Variables
Same as Option A.
6. Run an Example Workflow
python -m user_side.workflow
Optional: Uninstall
pip uninstall libb
Notes
Dependencies may remain installed if they were already present.
Windows users may encounter PowerShell execution policy restrictions.
Command Prompt can be used instead of PowerShell if preferred.
Execution scheduling and orchestration are intentionally left to the user.
Research Directions
LIBB is an exploratory research library, and its development is driven by ongoing areas of improvement rather than a fixed roadmap.
Areas of current interest include:
- Deeper integration of performance analytics into the core workflow
- Expansion of sentiment analytics across multiple data sources
- Improved tooling for comparing runs and strategies over time
- Expansion of config system to cover data source preferences and metric toggles
- General design improvements for efficiency and code quality
To see the current roadmap for major features, check out: roadmap.md
These directions reflect current research interests and may evolve, change, or be abandoned as the project develops.