Consilium
January 26, 2026
A quiver of methods for seeking truth through AI councils.
v1.2.0
Query multiple AI models simultaneously. Compare, debate, verify, refine, and synthesize their responses.
⚡ Skip to Install · Docker
More Screenshots
Analytics Dashboard
Track model performance, costs, and usage patterns.
Evaluation Panel
See detailed rankings, scores by criteria, and AI-generated analysis.
Philosophy • The Quiver • Quick Start • Docker • Configuration
🎯 Philosophy
The Core Insight
There is no single method that reliably produces truth.
But there are appropriate methods for different kinds of questions.
Consilium (Latin for "council" or "deliberation") is an epistemological framework instantiated in software. Different questions require different methods of inquiry. Each mode is an "arrow" in your quiver, designed for a specific epistemic target.
Why Multiple Models?
No single AI has all the answers. Each LLM has different training data, reasoning approaches, and blind spots. What one model gets wrong, another might get right.
| Challenge | How Consilium Helps |
|---|---|
| AI hallucinations | Cross-check answers across multiple models (Veritas mode) |
| Model bias | Anonymous deliberation removes reputation bias (Consensus, Arbitrium) |
| Finding the right model | Blind preference voting reveals true preferences (Arbitrium) |
| Complex decisions | Structured debate surfaces all arguments (Debate, Elenchus) |
| Quality output | Sequential refinement polishes content (Limatura) |
| Comprehensive answers | Synthesize insights from multiple sources (Synthesis) |
Methodological Pluralism
This is methodological pluralism: the philosophical position that different domains of inquiry require different approaches.
| Question Type | Method | Consilium Mode |
|---|---|---|
| Factual claims | Verification | Veritas |
| Complex trade-offs | Dialectic | Debate |
| Bias reduction | Deliberation | Consensus, Arbitrium |
| Quality assessment | Cross-examination | Analysis, Elenchus |
| Capability testing | Empiricism | Peira |
| Comprehensive coverage | Integration | Synthesis |
| Quality improvement | Iteration | Limatura |
🏹 The Quiver (12 Modes)
Consilium provides 12 distinct modes, each an arrow designed for a different target:
Mode Overview
| Mode | Shortcut | Purpose | When to Use |
|---|---|---|---|
| Forum | Ctrl+1 | Compare & Judge | General questions, find best answer |
| Debate | Ctrl+2 | Round-Robin Discussion | Complex topics with trade-offs |
| Consensus | Ctrl+3 | Anonymous Deliberation | Bias-reduced conclusions |
| Analysis | Ctrl+4 | Multi-Judge Critique | Deep evaluation of one answer |
| Synthesis | Ctrl+5 | Combine into One | Comprehensive coverage needed |
| Analytics | Ctrl+6 | Performance Stats | Review usage and costs |
| Peira | Ctrl+7 | Capability Testing | Benchmark model abilities |
| Elenchus | Ctrl+8 | Adversarial Red Team | Stress-test code/ideas |
| Versus | Ctrl+9 | Local vs Commercial | Compare local to cloud models |
| Arbitrium | Ctrl+0 | Blind Preference Vote | Discover true preferences |
| Veritas | Ctrl+- | Fact Check & Verify | Detect hallucinations |
| Limatura | Ctrl+= | Iterative Polish | Refine through multiple passes |
| Prompting | Ctrl+G | Prompting Guide | Learn effective prompting techniques |
Detailed Mode Guide
1. Forum Mode (Ctrl+1)
Latin: "forum" (public place of discussion)
All selected models answer your question simultaneously, then an AI judge ranks them.
- Your question is sent to every selected model (Model A, Model B, Model C) in parallel
- The responses are shown in a side-by-side comparison
- Blind evaluation follows: judges see "Response A", not model names
Best for: General questions, comparing writing styles, finding the best model for your use case
Features:
- Real-time streaming responses
- Blind evaluation (prevents model reputation bias)
- Follow-up questions with context
- Auto-evaluation ranks responses when complete
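As a rough illustration of blind evaluation, the sketch below relabels responses as "Response A", "Response B", and so on before a judge sees them, and keeps the label-to-model key separately for the later reveal. The type names, the `blind` function, and the shuffling step are assumptions made for this example, not Consilium's actual implementation.

```typescript
// Hypothetical shapes for illustration; not Consilium's actual types.
interface ModelResponse {
  model: string;
  text: string;
}

interface BlindedResponse {
  label: string; // e.g. "Response A"
  text: string;
}

// Fisher-Yates shuffle on a copy, so label order does not leak input order.
function shuffle<T>(items: T[], random: () => number = Math.random): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i]!;
    out[i] = out[j]!;
    out[j] = tmp;
  }
  return out;
}

// Returns the anonymized responses for the judge plus a private key
// (label -> model) that is only consulted after judging.
function blind(responses: ModelResponse[], random: () => number = Math.random) {
  const key = new Map<string, string>();
  const blinded: BlindedResponse[] = shuffle(responses, random).map((r, i) => {
    const label = `Response ${String.fromCharCode(65 + i)}`;
    key.set(label, r.model);
    return { label, text: r.text };
  });
  return { blinded, key };
}
```

Only `blinded` would be passed to the judge; `key` maps the winning label back to a model name afterwards.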
2. Debate Mode (Ctrl+2)
Latin: "debattuere" (to fight, contend)
A structured multi-round discussion where models build on each other's ideas.
- Your topic
- Round 1: Model A → Model B → Model C (each sees all previous responses)
- Round 2: Model A → Model B → Model C (builds on Round 1)
- Automatic consensus summary
Best for: Complex topics with trade-offs, controversial questions, exploring all sides
How to use:
- Select 2+ Participants
- Set number of Rounds (1-5)
- Models discuss round-robin, building on previous responses
- Automatic Consensus Summary generated at the end
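The round-robin order described above can be sketched as a nested loop in which every turn receives the full transcript so far. The `Ask` callback stands in for a model call and is hypothetical; it is kept synchronous here only for brevity, whereas a real model call would be asynchronous.

```typescript
// Hypothetical types for illustration; not Consilium's actual API.
interface Turn {
  round: number;
  model: string;
  text: string;
}

// Stand-in for a model call: receives the topic and everything said so far.
type Ask = (model: string, topic: string, transcript: readonly Turn[]) => string;

function runDebate(topic: string, participants: string[], rounds: number, ask: Ask): Turn[] {
  if (participants.length < 2) throw new Error("Select 2+ participants");
  if (!Number.isInteger(rounds) || rounds < 1 || rounds > 5) {
    throw new Error("Rounds must be between 1 and 5");
  }
  const transcript: Turn[] = [];
  for (let round = 1; round <= rounds; round++) {
    // Within a round, models speak in a fixed order and see all earlier turns.
    for (const model of participants) {
      const text = ask(model, topic, transcript);
      transcript.push({ round, model, text });
    }
  }
  return transcript;
}
```

The final consensus summary would be one more model call over the returned transcript; it is left out of this sketch.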
3. Consensus Mode (Ctrl+3)
Latin: "consensus" (agreement, harmony)
Models deliberate anonymously over multiple rounds to find where they agree.
- Your question
- Round 0, initial positions: each model answers independently, and responses are anonymized as Position A, B, C, D...
- Rounds 1-3, deliberation: each model sees all anonymized positions (but not who said what, which prevents bias). Its task is to consider the others, identify agreements and disputes, refine its position, and move toward consensus
- Final, arbiter synthesis: a consensus answer if agreement was reached, or a summary of what the models agree on and what remains disputed
Best for: Reducing model bias, finding fundamental agreements, cross-validated answers
Key difference from Debate: Models don't know who said what during deliberation, preventing "I agree with GPT because it's GPT" bias.
4. Analysis Mode (Ctrl+4)
Greek: "analusis" (breaking up, investigation)
One model answers; multiple analysts then evaluate the response from different perspectives.
- Your question
- The answerer model responds
- Analyst 1, Analyst 2, and Analyst 3 each evaluate the response
- Multi-perspective evaluation and scoring
Best for: Deep critique, understanding strengths/weaknesses, academic review
5. Synthesis Mode (Ctrl+5)
Greek: "sunthesis" (putting together)
Multiple models answer, then one synthesizer combines the best parts into a unified response.
- Your question
- Source 1, Source 2, and Source 3 each answer
- The synthesizer model combines all responses into a unified answer
Best for: Research requiring comprehensive coverage, combining expertise, unified summaries
6. Peira Mode (Ctrl+7)
Greek: "πεῖρα" (peira): trial, experiment, test
Systematically test what models can and cannot do with structured benchmarks.
- Select a test category: [Coding] [Math] [Reasoning] [Knowledge] [Creative]
- Select the models to test, e.g. Claude Sonnet 4.5, GPT-5.2, Gemini 3 Pro
- Structured test battery: each model receives identical test prompts for fair comparison
- Capability report, for example:

| Model | Score | Speed | Style |
|---|---|---|---|
| Claude Sonnet | 92% | 45 t/s | Detailed |
| GPT-5.2 | 89% | 52 t/s | Concise |
| Gemini 3 Pro | 87% | 61 t/s | Structured |
Test Categories:
- Coding: Algorithm implementation, debugging, code review
- Math/Logic: Arithmetic, word problems, proofs, puzzles
- Reasoning: Syllogisms, analogies, causal reasoning
- Knowledge: Trivia, history, science, current events
- Creativity: Storytelling, poetry, brainstorming
Unique Value: This is the only mode where the question is fundamentally about the models themselves, not the world.
7. Elenchus Mode (Ctrl+8)
Greek: "ἔλεγχος" (elenchus): cross-examination, refutation (Socrates' method)
Stress-test ideas, code, or plans by having models attack them.
- Your content to be challenged (code, argument, plan, proposal, idea)
- Defender (1 model) defends the content; Challengers (1+ models) attack it and look for flaws
- Round 1: Challengers attack
- Round 2: Defender responds
- Round 3: Challengers counter, and so on
- Arbiter verdict (optional): vulnerabilities found, defenses successful, final assessment
Use Cases:
- Security Review: "Find vulnerabilities in this code"
- Argument Testing: "What's wrong with this reasoning?"
- Business Plans: "What could go wrong with this strategy?"
- Risk Assessment: "Why shouldn't I do this?"
Unique Value: Systematic adversarial testing. Truth survives challenge.
8. Versus Mode (Ctrl+9)
Latin: "versus" (against, turned toward)
Compare your local models against commercial frontier models with blind evaluation.
- Your prompt goes to both councils
- Local council, e.g. llama3.3:70b, qwen2.5:32b, deepseek-r1:14b
- Commercial council, e.g. Claude Sonnet 4.5, GPT-5.2, Gemini 3 Pro
- Each council synthesizes its responses into one council answer
- Blind judge evaluation compares the two councils, and the local council against each individual commercial model
- Results and insights:
  - Winner: [Local Council/Commercial Council]
  - Cost: Local $0 vs Commercial $X.XX
  - Savings if local wins: $X.XX saved
  - Local council vs individual models (Claude, GPT-5, Gemini): each is marked as either "local council beats model" or "model beats local council"
How It Works:
- Both councils answer your question (models run serially for quality)
- Each council synthesizes individual responses into one unified answer
- Judge compares the synthesized answers (Council A vs B) in a blind evaluation
- Judge also compares local synthesis vs each individual commercial model
- Results show: winner, cost saved, and whether your council beats frontier models individually
Best for: Testing if local models can replace paid APIs, finding which tasks locals handle well
Unique Value: Two levels of insight:
- "Is my local council as good as commercial?" (synthesis vs synthesis)
- "Can my local council beat individual frontier models?" (teamwork vs individuals)
9. Arbitrium Mode (Ctrl+0)
Latin: "arbitrium" (judgment, decision, free will)
Discover your true preferences without model reputation bias.
- Your question
- Response A and Response B are displayed in full, side by side, with the model names hidden
- Which do you prefer? [Vote A] [Vote B]
- After voting, the reveal: "You chose Claude Sonnet 4.5!" Your preference data feeds into personal analytics
Features:
- Blind by default: no peeking at model names
- Reveal after voting: see which model you actually preferred
- Preference tracking: builds personal model rankings over time
- Arena-style data: similar to LMSYS Chatbot Arena, but personal
Unique Value: Removes reputation bias. You might discover you prefer different models than you thought!
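One simple way to turn blind votes into personal rankings is a win-rate tally, sketched below. The `Vote` shape and the choice of win rate as the ranking metric are assumptions for illustration; the scoring Consilium actually uses is not described here.

```typescript
// Hypothetical vote record: which model's response was preferred in one comparison.
interface Vote {
  winner: string;
  loser: string;
}

interface Standing {
  model: string;
  wins: number;
  comparisons: number;
  winRate: number;
}

function rankPreferences(votes: Vote[]): Standing[] {
  const tally = new Map<string, { wins: number; comparisons: number }>();
  const bump = (model: string, won: boolean): void => {
    const entry = tally.get(model) ?? { wins: 0, comparisons: 0 };
    entry.comparisons += 1;
    if (won) entry.wins += 1;
    tally.set(model, entry);
  };
  for (const vote of votes) {
    bump(vote.winner, true);
    bump(vote.loser, false);
  }
  const standings: Standing[] = [];
  tally.forEach((t, model) => {
    standings.push({ model, wins: t.wins, comparisons: t.comparisons, winRate: t.wins / t.comparisons });
  });
  // Highest win rate first; ties go to the model with more comparisons.
  return standings.sort((a, b) => b.winRate - a.winRate || b.comparisons - a.comparisons);
}
```

With only a handful of votes the win rates are noisy, so rankings like this become meaningful only over time.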
10. Veritas Mode (Ctrl+-)
Latin: "veritas" (truth)
Detect hallucinations and verify factual claims through cross-model consensus.
- Claim or question to verify, e.g. "The Great Wall of China is visible from space"
- Each verifier assesses the claim independently:

| Verifier | Model | Verdict | Confidence | Citations |
|---|---|---|---|---|
| 1 | Claude Sonnet | FALSE | 95% | Yes |
| 2 | GPT-5.2 | FALSE | 92% | Yes |
| 3 | Gemini Pro | FALSE | 88% | Yes |

- Analyzer synthesis:
  - Overall verdict: FALSE
  - Confidence: 92%
  - Consensus facts: all models agree the claim is false; cited NASA astronaut testimonies; referenced physics of human vision
  - Key evidence: the Wall is ~30ft wide and not visible at orbital altitude; the myth has been debunked by multiple astronauts
  - Disputed: None
  - Unsupported: None
Three Verification Methods:
| Method | Description | Best For |
|---|---|---|
| Memory Only | Uses model training data only. "If unknown, say so." | Testing model knowledge without external sources |
| Shared Research | One search, all models get same results | Fair comparison with consistent evidence |
| Independent Research | Each model searches independently | Seeing how models approach verification differently |
Independent Research - Source Comparison: When using Independent Research mode, Veritas compares sources found by different models:
- Common Sources: URLs found by multiple models (high confidence)
- Unique Sources: URLs only one model found (may reveal blind spots)
- Search Queries: See what each model searched for
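The split into common and unique sources amounts to grouping URLs by which models found them. The sketch below assumes exact string matching on URLs and a hypothetical `SourcesByModel` shape; a real comparison would likely normalize URLs first.

```typescript
// Hypothetical input: model name -> URLs that model cited.
type SourcesByModel = Record<string, string[]>;

interface SourceComparison {
  common: string[]; // found by more than one model
  unique: { url: string; model: string }[]; // found by exactly one model
}

function compareSources(sources: SourcesByModel): SourceComparison {
  // URL -> distinct models that found it, in first-seen order.
  const finders = new Map<string, string[]>();
  for (const model of Object.keys(sources)) {
    for (const url of sources[model] ?? []) {
      const models = finders.get(url) ?? [];
      if (models.indexOf(model) === -1) models.push(model);
      finders.set(url, models);
    }
  }
  const result: SourceComparison = { common: [], unique: [] };
  finders.forEach((models, url) => {
    const first = models[0];
    if (models.length > 1) result.common.push(url);
    else if (first !== undefined) result.unique.push({ url, model: first });
  });
  return result;
}
```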
Verification Flow:
- Select verification method (Memory Only / Shared Research / Independent Research)
- Enter claim or question to verify
- Multiple verifier models independently assess truthfulness with citations
- Analyzer model synthesizes final report
- Report shows: consensus facts, disputed claims, confidence levels
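In Consilium the final report is written by an analyzer model, so there is no fixed formula to quote. Purely as an illustration of how several verdicts could be combined, the sketch below takes a strict-majority verdict and the rounded mean confidence; both rules, and all names, are assumptions for this example.

```typescript
type Verdict = "TRUE" | "FALSE" | "UNCERTAIN";

// Hypothetical shape of one verifier's assessment.
interface Assessment {
  verifier: string;
  verdict: Verdict;
  confidence: number; // percent, 0-100
}

interface Summary {
  overall: Verdict;
  confidence: number;
  unanimous: boolean;
}

function summarize(assessments: Assessment[]): Summary {
  if (assessments.length === 0) throw new Error("Need at least one assessment");
  const verdicts: Verdict[] = ["TRUE", "FALSE", "UNCERTAIN"];
  const counts: Record<Verdict, number> = { TRUE: 0, FALSE: 0, UNCERTAIN: 0 };
  let total = 0;
  for (const a of assessments) {
    counts[a.verdict] += 1;
    total += a.confidence;
  }
  // A verdict wins only with a strict majority; otherwise report UNCERTAIN.
  const majority = verdicts.find((v) => counts[v] * 2 > assessments.length);
  const overall: Verdict = majority ?? "UNCERTAIN";
  return {
    overall,
    confidence: Math.round(total / assessments.length),
    unanimous: counts[overall] === assessments.length,
  };
}
```

Applied to the Great Wall example (FALSE at 95%, 92%, and 88%), this yields FALSE with a mean confidence of 92%, which matches the example report, although the analyzer's actual method may differ.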
Best for: Fact-checking before publishing, detecting hallucinations, verifying information
Unique Value: Structured hallucination detection with flexible research options. Trust but verify.
11. Limatura Mode (Ctrl+=)
Latin: "limatura" (filing, polishing, refinement)
Polish and improve output through sequential model passes.
- Content to polish (code, text, email, document)
- V0, original: Model A creates the initial response ("Here is my first draft of the email...")
- V1, first refinement: Model B improves V0 ("Here is the improved version with clearer...")
- V2, second refinement: Model C improves V1 ("Here is the polished final version...")
- Version comparison: view V0 (Original), V1 (Refined), and V2 (Polished); the current version is V2 by Model C, with [Copy Final] and [Continue Refining] actions
Refinement Types:
- General improvement: "Make this better"
- Style refinement: "Make this more concise/formal/casual"
- Code refinement: "Optimize and clean this code"
- Custom instruction: User-defined refinement criteria
Best for: Code optimization, document drafting, email refinement, creative writing polish
Unique Value: Sequential improvement, not just comparison. Each model builds on the last.
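The chain of versions can be sketched as a loop in which each model receives the previous version together with the refinement instruction. The `Generate` callback is a hypothetical, synchronous stand-in for a model call, and the prompt layout is an assumption for illustration.

```typescript
// Hypothetical types for illustration; not Consilium's actual API.
interface Version {
  label: string; // "V0", "V1", ...
  model: string;
  text: string;
}

type Generate = (model: string, prompt: string) => string;

function refineChain(request: string, models: string[], instruction: string, generate: Generate): Version[] {
  if (models.length === 0) throw new Error("Need at least one model");
  const versions: Version[] = [];
  models.forEach((model, i) => {
    const previous = versions[i - 1];
    // The first model answers the request; each later model improves the prior version.
    const prompt = previous === undefined ? request : `${instruction}\n\n---\n${previous.text}`;
    versions.push({ label: `V${i}`, model, text: generate(model, prompt) });
  });
  return versions;
}
```

Keeping every version in the returned array is what makes a side-by-side version comparison possible afterwards.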
12. Prompting Guide (Ctrl+G)
Purpose: Learn and apply effective prompting techniques
A comprehensive guide to crafting effective AI prompts with 8 proven formulas.
Each of the 8 formulas includes:
- Component breakdown
- Real-world examples
- Best use cases
- One-click copy
Available Formulas:
| Formula | Components | Best For |
|---|---|---|
| RTCF | Role + Task + Context + Format | General structured prompts |
| CREATE | Character + Request + Examples + Adjustments + Type + Extras | Detailed specifications |
| RISEN | Role + Instructions + Steps + End Goal + Narrowing | Multi-step tasks |
| Chain-of-Thought | Step-by-step reasoning | Complex reasoning problems |
| Few-Shot | Input → Output examples | Pattern learning |
| STAR | Situation + Task + Action + Result | Problem-solving narratives |
| Code Generation | Language + Requirements + Standards + Edge Cases | Programming tasks |
| Self-Critique | Generate → Critique → Improve | Quality iteration |
Best for: Learning prompt engineering, improving query quality, teaching prompting techniques
Unique Value: Reference guide for effective prompting, always available with Ctrl+G.
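To make the formulas concrete, the sketch below assembles a prompt from the four RTCF components. The `RtcfParts` type, the `buildRtcfPrompt` helper, and the labeled-line layout are assumptions for illustration; the guide's own templates may be worded differently.

```typescript
// Hypothetical helper for the RTCF formula: Role, Task, Context, Format.
interface RtcfParts {
  role: string;
  task: string;
  context: string;
  format: string;
}

function buildRtcfPrompt(parts: RtcfParts): string {
  return [
    `Role: ${parts.role}`,
    `Task: ${parts.task}`,
    `Context: ${parts.context}`,
    `Format: ${parts.format}`,
  ].join("\n");
}

const examplePrompt = buildRtcfPrompt({
  role: "You are a senior security reviewer.",
  task: "Review the login handler for vulnerabilities.",
  context: "The service is a small Express.js API.",
  format: "A bulleted list of findings, most severe first.",
});
```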
Question Type Matrix
Use this table to choose the right mode for your question:
| Question Type | Recommended Mode | Why |
|---|---|---|
| "What is X?" (Factual) | Forum or Veritas | Forum for comparison; Veritas for accuracy |
| "What's the best X?" (Opinion) | Consensus or Arbitrium | Consensus reduces bias; Arbitrium reveals preference |
| Creative writing | Forum or Limatura | Forum for variety; Limatura for polish |
| Coding/Technical | Forum or Elenchus | Forum for solutions; Elenchus for security review |
| Controversial/Ethical | Debate | Models engage with counterarguments |
| "Should I do X?" (Decision) | Consensus or Elenchus | Consensus for recommendation; Elenchus for risks |
| Research/Comprehensive | Synthesis + Veritas | Synthesis for coverage; Veritas for accuracy |
| Security Review | Elenchus | Adversarial testing finds vulnerabilities |
| Model Benchmarking | Peira | Structured capability testing |
| Quick Comparison | Arbitrium | Fast blind preference voting |
| Quality Polish | Limatura | Iterative improvement chain |
| Hallucination Check | Veritas | Cross-model fact verification |
| Local vs Cloud | Versus | Data-driven cost/quality comparison |
✨ Features
🎯 Core Capabilities
- Multi-Model Comparison: Query multiple LLMs simultaneously
- Streaming Responses: Real-time output from all models
- Blind Evaluation: Anonymized judging prevents bias
- URL Content Fetching: Include webpage content in prompts
- Session Management: Save, tag, search, reload sessions
- Export Options: JSON, Markdown, CSV
🔧 Advanced Features
- Knowledge Base (RAG): Upload documents for context-aware responses
- Vision Support: Upload images for multi-model analysis
- Research Mode (SearXNG): Web search before querying models
- Conversation Continuity: Follow-up questions with context
- Prompt Templates: Reusable prompts with variables
- Cost Tracking: Estimated API costs per response
- Model Analytics: Track which models win evaluations
- Pin/Favorite Responses: Star great responses
- Keyboard Shortcuts: Full keyboard navigation
- Local Model Support: Ollama and LM Studio integration
- Dark/Light Themes: Beautiful UI in both modes
- Model Sync: Fetch latest models from OpenRouter API
- Benchmark Sync: Update benchmark scores from HuggingFace Leaderboard
- Prompting Guide: Learn effective prompting techniques
Quick Start
Prerequisites
- Node.js 18+
- npm or yarn
- API key from OpenRouter
💡 Why OpenRouter? One API key = access to 25+ models (OpenAI, Anthropic, Google, xAI, Mistral, and more). Pay-as-you-go pricing.
Installation
# Clone the repository
git clone https://github.com/lafintiger/Consilium.git
cd Consilium
# Install backend dependencies
cd backend
npm install
# Configure environment
cp ../env.example.txt .env
# Edit .env and add your OPENROUTER_API_KEY
# Start backend
npm run dev
# In a new terminal, from the repository root, install and start frontend
cd frontend
npm install
npm run dev
Access the App
- Frontend: http://localhost:3800
- Backend API: http://localhost:3801
🐳 Docker
Using Docker Compose (Recommended)
# Copy and configure environment
cp env.example.txt .env
# Edit .env with your API keys
# Build and start
docker compose up -d
# View logs
docker compose logs -f
# Stop
docker compose down
Updating to Latest Version
git pull
docker compose down
docker compose build --no-cache
docker compose up -d
⌨️ Keyboard Shortcuts
| Shortcut | Action |
|---|---|
| Ctrl+1 | Forum mode |
| Ctrl+2 | Debate mode |
| Ctrl+3 | Consensus mode |
| Ctrl+4 | Analysis mode |
| Ctrl+5 | Synthesis mode |
| Ctrl+6 | Analytics dashboard |
| Ctrl+7 | Peira (capability testing) |
| Ctrl+8 | Elenchus (red team) |
| Ctrl+9 | Versus (local vs commercial) |
| Ctrl+0 | Arbitrium (blind voting) |
| Ctrl+- | Veritas (fact check) |
| Ctrl+= | Limatura (iterative polish) |
| Ctrl+G | Prompting Guide |
| Ctrl+R | Toggle Research mode |
⚙️ Configuration
Environment Variables
Copy env.example.txt to .env in the backend folder:
# Required
OPENROUTER_API_KEY=sk-or-v1-your-key-here
# Optional - Local Models
OLLAMA_URL=http://localhost:11434
LMSTUDIO_URL=http://localhost:1234
# Optional - Research Mode
SEARXNG_URL=http://localhost:4000
# Performance
LOCAL_MODELS_SEQUENTIAL=true # Run local models one at a time
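As a minimal sketch of how a backend might read these variables, the function below validates the required key and falls back to defaults for the rest. The `ConsiliumConfig` shape and `loadConfig` name are hypothetical, and the fallback values simply reuse the example values above; Consilium's real defaults are not confirmed here.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ConsiliumConfig {
  openRouterApiKey: string;
  ollamaUrl: string;
  lmStudioUrl: string;
  searxngUrl: string;
  localModelsSequential: boolean;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ConsiliumConfig {
  const apiKey = env["OPENROUTER_API_KEY"];
  if (!apiKey) throw new Error("OPENROUTER_API_KEY is required");
  return {
    openRouterApiKey: apiKey,
    // Assumed fallbacks: the example values from the .env listing above.
    ollamaUrl: env["OLLAMA_URL"] ?? "http://localhost:11434",
    lmStudioUrl: env["LMSTUDIO_URL"] ?? "http://localhost:1234",
    searxngUrl: env["SEARXNG_URL"] ?? "http://localhost:4000",
    localModelsSequential: (env["LOCAL_MODELS_SEQUENTIAL"] ?? "true").toLowerCase() === "true",
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` after the `.env` file has been loaded.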
Available Models
Consilium supports 25+ models via OpenRouter:
| Provider | Models |
|---|---|
| Anthropic | Claude Sonnet 4.5, Opus 4.5, Haiku 4.5 |
| OpenAI | GPT-5.2, GPT-5.2 Pro, GPT-5.1, o3 |
| Google | Gemini 3 Pro, Gemini 2.5 Pro/Flash |
| xAI | Grok 4, Grok 4 Fast, Grok 3 |
| DeepSeek | DeepSeek V3.2, V3.2 Speciale |
| Mistral | Mistral Large 3, Devstral 2 |
| Local | Any Ollama/LM Studio model |
Local Models
Ollama Setup
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull models
ollama pull llama3.3
ollama pull qwen2.5:32b
ollama pull deepseek-r1:14b
# Start Ollama server
ollama serve
Consilium automatically detects running Ollama models.
Docker + Local Models
| Scenario | OLLAMA_URL |
|---|---|
| Both native | http://localhost:11434 |
| Consilium in Docker, Ollama native | http://host.docker.internal:11434 |
Project Structure
- Consilium/
  - backend/ (Express.js API server)
    - src/
      - index.js (server entry point)
      - config/ (model configs, benchmarks)
      - db/ (SQLite database)
      - routes/ (API endpoints)
    - package.json
  - frontend/ (React + Vite + Tailwind)
    - src/
      - components/ (React components)
      - constants/ (mode definitions)
      - stores/ (Zustand state)
      - types/ (TypeScript definitions)
    - package.json
  - docker-compose.yml
  - DEVELOPER_GUIDE.md (developer guide)
  - README.md (this file)
Knowledge Base (RAG)
Consilium includes a built-in Retrieval Augmented Generation (RAG) system that lets you upload documents and have AI models answer questions using your own content.
How It Works
1. Upload documents
   - Click the Database icon in the header
   - Create collections: "Tech Docs", "Research", etc.
   - Upload PDFs, Word docs, text files, Markdown
2. Automatic processing (background)
   - Parse document → extract text
   - Chunk text → smart segmentation (~500 tokens each)
   - Generate embeddings → Ollama qwen3-embedding:8b
   - Store in SQLite database
3. Query with knowledge
   - Toggle the "Knowledge" button in the prompt input
   - Select a specific collection or "All Collections"
   - Ask your question
4. Semantic search and augmentation
   - Your question → embedded → compared to chunks
   - Top 5 most relevant chunks retrieved
   - Chunks added as context to your prompt
   - All models receive the augmented prompt
5. View sources
   - The "Knowledge Base Sources" panel shows retrieved chunks
   - Document name, collection, similarity score
   - Preview of the chunk content
Supported Document Types
| Type | Extension | Notes |
|---|---|---|
| PDF | .pdf | Text extraction via pdf-parse |
| Word | .docx | Modern Word format via mammoth |
| Text | .txt | Plain text files |
| Markdown | .md | Markdown files |
Knowledge Collections
Organize your documents into themed collections:
| Collection | Documents | Chunks |
|---|---|---|
| Medical Research | 12 | 342 |
| Tech Documentation | 8 | 215 |
| Company Policies | 5 | 89 |
| General | 3 | 47 |
- Create collections with custom names, colors, and descriptions
- Filter searches to specific collections or search all
- Move documents between collections as needed
- Delete collections without losing documents (they go to "uncategorized")
Requirements for Knowledge Base
- Ollama must be running with an embedding model:
# Install the embedding model
ollama pull qwen3-embedding:8b
# Start Ollama server
ollama serve
- Status Check: The Knowledge Panel shows embedding model status
  - Green = Ready to process documents
  - Red = Embedding model not available
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| EMBEDDING_MODEL | qwen3-embedding:8b | Ollama embedding model to use |
| KNOWLEDGE_TOP_K | 5 | Max chunks to retrieve per query |
| KNOWLEDGE_MIN_SIMILARITY | 0.3 | Minimum similarity threshold |
| KNOWLEDGE_MAX_TOKENS | 8000 | Max tokens for context |
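The retrieval step governed by `KNOWLEDGE_TOP_K` and `KNOWLEDGE_MIN_SIMILARITY` can be sketched as a filter-and-sort over chunk embeddings. Cosine similarity is assumed here as the similarity measure, and the `Chunk` shape and function names are hypothetical; the token budget (`KNOWLEDGE_MAX_TOKENS`) is left out for brevity.

```typescript
// Hypothetical chunk record; the real schema lives in the SQLite database.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

interface Hit {
  chunk: Chunk;
  similarity: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i]!;
    const y = b[i]!;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / Math.sqrt(normA * normB);
}

// Defaults mirror the table above: top 5 chunks, minimum similarity 0.3.
function retrieve(query: number[], chunks: Chunk[], topK = 5, minSimilarity = 0.3): Hit[] {
  return chunks
    .map((chunk) => ({ chunk, similarity: cosineSimilarity(query, chunk.embedding) }))
    .filter((hit) => hit.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, topK);
}
```

A linear scan like this is fine for a few hundred chunks; larger collections would usually call for a vector index.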
Use Cases
| Scenario | How to Use |
|---|---|
| Company Q&A Bot | Upload policy docs → Ask questions about procedures |
| Research Assistant | Upload papers → Ask for summaries and connections |
| Documentation Search | Upload tech docs → Query specific APIs or features |
| Study Helper | Upload course materials → Ask practice questions |
| Legal Research | Upload contracts → Query for specific clauses |
Combined with Other Features
Knowledge Base works alongside other Consilium features:
| Combination | Result |
|---|---|
| Knowledge + Forum | Multiple models answer using your documents |
| Knowledge + Veritas | Fact-check claims against your own sources |
| Knowledge + Synthesis | Combine document insights from multiple models |
| Knowledge + Research | Use both your docs AND web search |
License
Polyform Noncommercial 1.0.0. See LICENSE for details.
| Use Case | Allowed |
|---|---|
| Educators & Students | Free |
| Personal/Hobby Use | Free |
| Non-profit Organizations | Free |
| Research | Free |
| Commercial Use | Contact for license |
Acknowledgments
- OpenRouter for unified LLM API access
- Ollama for local model inference
- Vite + React + Tailwind CSS
🏹 A quiver of methods for seeking truth
Built with 🧠 · Seeking Truth Through AI Councils