ΩmegaWiki

May 11, 2026

ΩmegaWiki Logo

Karpathy's LLM-Wiki Vision, Fully Realized

Your AI Research Platform — From Papers to Publications, Powered by Claude Code

From paper ingestion to publication — your research knowledge compounds, never decays.

License: MIT · Python 3.9+ · Skills · Claude Code · Bilingual

English | 中文


Team

ΩmegaWiki is built by DAIR Lab at Peking University — a fully agentic platform that automates the complete research pipeline, from knowledge ingestion to paper submission.

  • Weitong Qian (PKU, Undergraduate · 2023)
  • Beicheng Xu (PKU, Ph.D. · 2023)
  • Zhongao Xie (PKU, Undergraduate · 2025)
  • Bowen Fan (PKU, Undergraduate · 2024)
  • Guozheng Tang (PKU, Undergraduate · 2024)
  • Xinzhe Wu (PKU, Undergraduate · 2024)
  • Jiale Chen (PKU, Undergraduate · 2024)
  • Mingtian Yang (PKU, Undergraduate · 2024)

🆕 What's New

📰 2026-05-09 · Daily arXiv — fresh-paper recommendations, on demand or scheduled

Run /daily-arxiv for a one-off pass, or /daily-arxiv setup to schedule the same pipeline in GitHub Actions. The skill builds an evidence packet from arXiv + Semantic Scholar + DeepXiv, lets the LLM rank candidates against your wiki interests, and delivers a digest by e-mail. Passing --mode auto-ingest explicitly calls /ingest on high-confidence picks; inform mode only sends the digest.

🌐 2026-05-06 · Knowledge Graph Visualization — browser + Obsidian

Your research graph now has two ways to explore:

  • Web UI — run python3 tools/serve.py, open http://localhost:8765/#/graph. Click any node to highlight its neighborhood via BFS, filter by entity type or edge category, double-click to open the full page in the Reader.
  • Obsidian — run /visualize --obsidian to generate a color-coded graph config, or /visualize --canvas to produce a force-layout Canvas with labeled semantic edges.
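Under the hood, the neighborhood highlight reduces to a plain breadth-first search over the edge list. A minimal sketch, assuming (src, dst, type) tuples that mirror the rows of graph/edges.jsonl; the function name and tuple layout are ours for illustration, not the actual tools/serve.py API:

```python
from collections import deque

def neighborhood(edges, start, depth=2):
    """Nodes reachable from `start` within `depth` hops.

    Treats the edge list as undirected, since a highlight should show
    neighbors regardless of edge direction.
    """
    adj = {}
    for src, dst, _ in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set()).add(src)

    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:          # don't expand past the hop limit
            continue
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

edges = [
    ("paper-a", "concept-x", "uses_concept"),
    ("paper-b", "concept-x", "introduces_concept"),
    ("paper-b", "paper-c", "builds_on"),
]
print(sorted(neighborhood(edges, "paper-a", depth=2)))
# → ['concept-x', 'paper-a', 'paper-b']
```

Filtering by entity type or edge category is then just a matter of dropping tuples from `edges` before the search.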

🔬 2026-05-06 · Methods — Reusable Techniques are Now First-Class

Architectures, training recipes, evaluation protocols, and other reusable techniques now live in wiki/methods/ as proper wiki entities — with their own pages, source-paper links, and parent/child method chains.


What is ΩmegaWiki?

Andrej Karpathy proposed LLM-Wiki: an LLM that builds and maintains a persistent, structured wiki from your sources — not a throwaway RAG answer, but compounding knowledge that grows smarter with every paper you feed it.

ΩmegaWiki takes that idea the full distance. It's not just a wiki builder — it's a complete research-lifecycle platform: paper ingestion → knowledge graph → gap detection → idea generation → experiment design → paper writing → peer-review response. All driven by 24 Claude Code skills, all centered on one wiki as the single source of truth.

Drop your .tex / .pdf files in a folder. Run one command. Get a fully cross-referenced knowledge base — and then use it to generate novel research ideas, design experiments, write papers, and respond to reviewers.

Why Wiki-Centric, Not RAG?

|  | RAG | ΩmegaWiki |
| --- | --- | --- |
| Knowledge persistence | Rediscovered on every query | Compiled once, maintained forever |
| Structure | Flat chunk store | 9 typed entities with relationships |
| Cross-references | None — chunks are isolated | Bidirectional wikilinks + typed graph |
| Knowledge gaps | Invisible | Explicitly tracked, drive research |
| Failed experiments | Lost | First-class anti-repetition memory |
| Output | Chat answers | Papers, surveys, experiment plans, rebuttals |
| Compounding | No — same cost every query | Yes — each paper enriches the whole graph |

Architecture

ΩmegaWiki Architecture

Every skill reads from and writes back to the wiki. Knowledge compounds — each new paper enriches the whole graph. Failed experiments aren't discarded; they become anti-repetition memory that prevents re-exploring dead ends.

Quick Start

Prerequisites: Python 3.9+, Node.js 18+

# 1. Clone
git clone https://github.com/skyllwt/OmegaWiki.git
cd OmegaWiki

# 2. Install Claude Code
npm install -g @anthropic-ai/claude-code
claude login

# 3. One-click setup
chmod +x setup.sh && ./setup.sh        # Linux / macOS
# Windows (PowerShell):
#   powershell -ExecutionPolicy Bypass -File .\setup.ps1
# setup creates .venv for OmegaWiki
# the script does not keep your shell activated, but /init will use .venv automatically

# 4. Put your own papers in raw/papers/ (.tex or .pdf)
#    Optional: add intent notes to raw/notes/ and saved pages to raw/web/
#    /init and direct local /ingest will manage generated inputs under raw/discovered/ and raw/tmp/

# 5. Build your wiki
claude
# Then type: /init [your-research-topic]
Manual setup (Linux / macOS)
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env                 # Edit to add API keys
cp config/settings.local.json.example .claude/settings.local.json
Manual setup (Windows / PowerShell)
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Copy-Item .env.example .env          # Edit to add API keys
Copy-Item config\settings.local.json.example .claude\settings.local.json

Note: native Windows is supported for the local pipeline. Remote-GPU experiments via /exp-run --env remote rely on ssh/rsync/screen and are best run from WSL2 or Linux/macOS.

API Keys

| Key | Required? | How to get | What it enables |
| --- | --- | --- | --- |
| ANTHROPIC_API_KEY | Yes | claude login (automatic) | Powers all Claude Code skills |
| CLAUDE_CODE_OAUTH_TOKEN | Optional | claude setup-token | GitHub Actions Claude Code auth for Pro/Max users |
| SEMANTIC_SCHOLAR_API_KEY | Optional | semanticscholar.org/product/api (free) | Citation graph, paper search |
| DEEPXIV_TOKEN | Optional | setup.sh auto-registers | Semantic search, TLDR, trending |
| LLM_API_KEY + LLM_BASE_URL + LLM_MODEL | Optional | Any OpenAI-compatible API | Cross-model review; /daily-arxiv inform recommendations |

Cross-model review: ΩmegaWiki uses a second LLM as an independent reviewer for ideas, experiments, and paper drafts. Works with any OpenAI-compatible API — DeepSeek, OpenAI, Qwen, OpenRouter, SiliconFlow, etc. If not configured, skills still work in Claude-only mode.
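Any OpenAI-compatible provider works because the review traffic is ordinary Chat Completions requests against LLM_BASE_URL. A minimal sketch of such a call using the same LLM_* variables; the prompt wording and helper names are illustrative, not the skills' actual internals:

```python
import json
import os
import urllib.request

def build_review_request(artifact_text):
    """Assemble a Chat Completions payload for the configured review model."""
    return {
        "model": os.environ.get("LLM_MODEL", "<your-model>"),
        "messages": [
            {"role": "system",
             "content": "You are an adversarial reviewer for research artifacts."},
            {"role": "user", "content": artifact_text},
        ],
    }

def send_review(payload):
    """POST the payload to the provider's /chat/completions endpoint."""
    req = urllib.request.Request(
        os.environ["LLM_BASE_URL"].rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + os.environ["LLM_API_KEY"],
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Swapping providers means changing only the three environment variables, which is why DeepSeek, Qwen, OpenRouter, and the rest are interchangeable here.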

Daily arXiv Recommendations

/daily-arxiv runs a one-off fresh-paper recommendation pass even before automation is configured. To schedule the same pipeline in GitHub Actions, copy config/daily-arxiv.yml.example to config/daily-arxiv.yml, then run /daily-arxiv setup. The config stores non-secret preferences such as mode, categories, caps, and schedule; SMTP/API credentials stay in .env or GitHub Actions secrets. In CI inform mode, recommendations can use Claude Code auth (ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN) or the OpenAI-compatible LLM_* review model; auto-ingest still requires Claude Code.
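For orientation, a hypothetical sketch of what config/daily-arxiv.yml might hold; every key name below is illustrative, so copy the real ones from config/daily-arxiv.yml.example:

```yaml
# Illustrative only — see config/daily-arxiv.yml.example for the real keys.
mode: inform               # inform | auto-ingest
categories: [cs.CL, cs.LG]
max_recommendations: 10    # cap on digest size
schedule: "17 0 * * *"     # cron, UTC (08:17 Beijing)
```

Secrets (SMTP credentials, API keys) never belong in this file; they stay in .env locally or in GitHub Actions secrets.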

See docs/daily-arxiv-deployment.md for the GitHub Actions setup checklist and symptom-keyed troubleshooting.

Sample digest
Sample /daily-arxiv digest

A real /daily-arxiv run: ranked recommendations with scores, rationales, wiki connections, and an auto-ingest section.

Skills

24 slash commands spanning the full research lifecycle:

Phase 0: Setup

| Command | What it does |
| --- | --- |
| /setup | First-time configuration (API keys, language, dependencies) |
| /reset <scope> | Destructive cleanup: wiki \| raw \| log \| checkpoints \| all |

Phase 1: Knowledge Foundation

| Command | What it does |
| --- | --- |
| /prefill <domain> | Optionally seed foundations/ with background knowledge |
| /init [topic] | Bootstrap a full wiki from user raw sources plus optional discovery |
| /ingest <source> | Parse a paper → wiki pages + cross-references |
| /discover | Recommend ranked next-read papers from anchors, a topic, or the current wiki |
| /edit <request> | Add/remove sources or update wiki content |
| /ask <question> | Query the wiki, crystallize answers back |
| /check | Health scan: broken links, missing cross-refs, consistency |

Phase 2: Research Pipeline

| Command | What it does |
| --- | --- |
| /daily-arxiv | Run/manage a daily arXiv recommendation feed (+ optional GitHub Actions scheduler) |
| /ideate | Multi-phase idea generation from cross-topic connections |
| /novelty <idea> | Multi-source novelty verification (web + S2 + wiki + review LLM) |
| /review <artifact> | Cross-model adversarial review for any research artifact |
| /exp-design <idea> | Idea-driven experiment + ablation design |
| /exp-run <experiment> | Implement + deploy + monitor (local or remote GPU) |
| /exp-status | Dashboard for running experiments; auto-collect results |
| /exp-eval <experiment> | Verdict gate → auto-update the linked idea + graph |
| /refine <artifact> | Multi-round: produce → review → fix → re-review |

Phase 3: Writing & Submission

| Command | What it does |
| --- | --- |
| /survey | Generate Related Work from wiki knowledge |
| /paper-plan <ideas> | Outline from validated-idea graph + evidence matrix |
| /paper-draft <plan> | Draft LaTeX + figures, section by section |
| /paper-compile <dir> | Compile → PDF, auto-fix, verify page/anonymity |
| /research <direction> | End-to-end orchestrator with human gates |
| /rebuttal <reviews> | Parse reviewer comments → draft point-by-point responses |

Wiki Structure

9 Entity Types

| Type | Directory | Purpose |
| --- | --- | --- |
| Paper | papers/ | Structured summary: problem / key idea / method / experiments + results / limitations, plus tldr / contribution_type / datasets |
| Concept | concepts/ | Cross-paper technical concept with variants, comparisons, definition, linked ideas |
| Topic | topics/ | Research direction map with SOTA tracker, key benchmarks, and open problems (split into known + methodological gaps) |
| Person | people/ | Researcher profile with research areas, recent work, and a researcher/team/organization type |
| Idea | ideas/ | Research idea with lifecycle, novelty argument & score, target venue |
| Experiment | experiments/ | Full record: hypothesis → setup → results → updates to the linked idea |
| Method | methods/ | Reusable, citable technique entity (cross-paper); links to source papers and parent/child methods |
| Summary | Summary/ | Domain-wide survey across topics |
| Foundation | foundations/ | Background knowledge (terminal: receives inward links, writes none) |

Knowledge Graph

Semantic relationships are stored in graph/edges.jsonl; bibliographic paper citations are stored separately in graph/citations.jsonl.

Paper-paper semantic edges include same_problem_as, similar_method_to, complementary_to, builds_on, compares_against, improves_on, challenges, and surveys. Paper-concept edges use introduces_concept, uses_concept, extends_concept, and critiques_concept. Workflow edges (supports, contradicts, tested_by, invalidates, addresses_gap, inspired_by, derived_from) span experiments, ideas, methods, and concepts.
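Because edges.jsonl is one JSON object per line, filtering the graph by edge category is a few lines of streaming code. A minimal sketch; the field names (src, dst, type) are an assumption for illustration, so check graph/edges.jsonl for the real schema:

```python
import json
import os
import tempfile

def load_edges(path, types=None):
    """Stream a .jsonl edge file, optionally keeping only some edge types."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                edge = json.loads(line)
                if types is None or edge.get("type") in types:
                    yield edge

# Demo on a throwaway file shaped like graph/edges.jsonl
sample = [
    {"src": "paper-a", "dst": "paper-b", "type": "builds_on"},
    {"src": "paper-a", "dst": "concept-x", "type": "uses_concept"},
]
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "edges.jsonl")
    with open(path, "w", encoding="utf-8") as fh:
        fh.writelines(json.dumps(e) + "\n" for e in sample)
    builds = list(load_edges(path, types={"builds_on"}))
print(builds)
# → [{'src': 'paper-a', 'dst': 'paper-b', 'type': 'builds_on'}]
```

The same pattern applies to graph/citations.jsonl, which keeps bibliographic citations separate from semantic edges.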

All pages use Obsidian [[wikilink]] format — open wiki/ in Obsidian for visual graph exploration.
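Extracting link targets from a page is a small regex exercise. A sketch that handles the common Obsidian variants (optional #heading anchors and |display aliases); this is our illustration, not the project's parser:

```python
import re

# [[target]], [[target#heading]], [[target|alias]] — capture only the target
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def wikilinks(text):
    """Return the link targets of all [[wikilinks]] in `text`."""
    return [m.group(1).strip() for m in WIKILINK.finditer(text)]

print(wikilinks("See [[concepts/kv-cache]] and [[papers/attention|Attention]]."))
# → ['concepts/kv-cache', 'papers/attention']
```

Running this over every page and emitting (page, target) pairs is enough to rebuild the bidirectional link graph that Obsidian visualizes.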

Automation

GitHub Actions runs the /daily-arxiv recommendation pipeline daily at 00:17 UTC (08:17 Beijing time):

  1. Add SMTP secrets to repo Settings → Secrets when e-mail delivery is enabled: SMTP_HOST, SMTP_PORT, SMTP_USER, SMTP_PASSWORD, SMTP_FROM, DAILY_ARXIV_EMAIL_TO
  2. Optional inform-mode LLM recommendation: add ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN for Claude Code, or LLM_API_KEY, LLM_BASE_URL, and LLM_MODEL for any OpenAI-compatible provider
  3. .github/workflows/daily-arxiv.yml fetches arXiv, deduplicates against the wiki, builds a recommendation context, uploads artifacts, and sends the digest by SMTP

Auto-ingest mode is explicit and requires Claude Code in CI, because plain API LLMs cannot invoke slash skills such as /ingest. Use manual dispatch with send_email=false for a dry run without SMTP secrets.

Project Structure

OmegaWiki/
├── CLAUDE.md                    # Runtime schema & rules
├── wiki/                        # Knowledge base (LLM-maintained)
│   ├── papers/                  #   Structured paper summaries
│   ├── concepts/                #   Cross-paper technical concepts
│   ├── topics/                  #   Research direction maps
│   ├── people/                  #   Researcher profiles
│   ├── ideas/                   #   Research ideas (with lifecycle)
│   ├── experiments/             #   Experiment records
│   ├── methods/                 #   Reusable cross-paper method entities
│   ├── Summary/                 #   Domain-wide surveys
│   ├── foundations/             #   Background knowledge (terminal pages)
│   ├── outputs/                 #   Generated artifacts
│   ├── graph/                   #   Auto-generated: edges, context, gaps
│   ├── index.md                 #   Content catalog
│   └── log.md                   #   Chronological log
├── raw/                         # Source materials
│   ├── papers/                  #   User-owned .tex / .pdf files
│   ├── discovered/              #   External papers fetched by /init and explicit /daily-arxiv auto-ingest
│   ├── tmp/                     #   Generated local sidecars for /init and direct local /ingest
│   ├── notes/                   #   User-owned .md notes
│   └── web/                     #   User-owned HTML / Markdown
├── tools/                       # Deterministic Python helpers
│   ├── research_wiki.py         #   Wiki engine (20 CLI commands)
│   ├── init_discovery.py        #   /init prepare + plan + fetch helper
│   ├── discover.py              #   /discover candidate gathering, dedup, ranking
│   ├── lint.py                  #   Structural validation (10 checks)
│   ├── reset_wiki.py            #   Scoped destructive cleanup helper
│   ├── fetch_arxiv.py           #   arXiv RSS fetcher
│   ├── fetch_s2.py              #   Semantic Scholar API
│   ├── fetch_deepxiv.py         #   DeepXiv semantic search
│   ├── fetch_wikipedia.py       #   Wikipedia fetcher (used by /prefill)
│   └── remote.py                #   SSH ops for remote experiments
├── .claude/skills/              # 24 Claude Code skill definitions
├── i18n/                        # Bilingual: en/ (canonical) + zh/
├── config/                      # Configuration templates
├── mcp-servers/                 # Cross-model review server
└── .github/workflows/           # Daily arXiv cron

Bilingual Support

ΩmegaWiki ships in English and Chinese:

./setup.sh --lang en   # English (default)
./setup.sh --lang zh   # 中文

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

LLM API Configuration / 大模型 API 配置

ΩmegaWiki runs on Claude Code, which speaks the Anthropic API protocol. You can use Claude directly, or route Claude Code to any third-party provider that exposes an Anthropic-compatible endpoint by overriding a few environment variables.

ΩmegaWiki 基于 Claude Code,Claude Code 使用 Anthropic API 协议通信。你既可以直接使用 Claude,也可以通过覆盖几个环境变量,把 Claude Code 指向任意支持 Anthropic 协议的第三方供应商。

Option A — Native Claude / 原生 Claude

claude login   # OAuth, no manual config / OAuth 登录,无需手动配置

Option B — Third-party Anthropic-compatible API / 第三方 Anthropic 兼容 API

Pick a provider below, paste the snippet into ~/.claude/settings.json (or the project's .claude/settings.json), and replace the <...> placeholder with your own API key. Model names and extra options are taken from each provider's official Claude Code docs — if anything stops working (e.g. a model is renamed), check the provider's website.

从下方任选一个供应商,把对应配置粘贴到 ~/.claude/settings.json(或项目的 .claude/settings.json),并把 <...> 占位符替换为你自己的 API key。模型名与额外选项均来自各供应商官方 Claude Code 文档;若出现问题(例如模型改名),请查询对应官网。

MiMo (小米)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.xiaomimimo.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-mimo-key>",
    "ANTHROPIC_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "mimo-v2.5-pro",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "mimo-v2.5"
  }
}

DeepSeek

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-deepseek-key>",
    "ANTHROPIC_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_EFFORT_LEVEL": "max"
  }
}

Kimi (Moonshot)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.moonshot.ai/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-moonshot-key>",
    "ANTHROPIC_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "kimi-k2.5",
    "CLAUDE_CODE_SUBAGENT_MODEL": "kimi-k2.5",
    "ENABLE_TOOL_SEARCH": "false"
  }
}

GLM (Z.AI)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-zai-key>",
    "API_TIMEOUT_MS": "3000000"
  }
}

Z.AI applies a default server-side model mapping, so no explicit ANTHROPIC_MODEL is needed. / Z.AI 默认在服务端做模型映射,无需显式设置 ANTHROPIC_MODEL。

Skip the Claude Code onboarding / 跳过 Claude Code 初始引导

When using a third-party key (instead of claude login), Claude Code's first-run onboarding won't complete automatically. Create or edit .claude.json and mark it done:

使用第三方 key 时不会走 claude login,Claude Code 首次启动的引导不会自动完成。创建或编辑 .claude.json,手动标记引导已完成:

  • macOS / Linux: ~/.claude.json
  • Windows: <user-home>\.claude.json
{
  "hasCompletedOnboarding": true
}

Then run claude as usual. / 保存后正常运行 claude 即可。


Community / 交流群

WeChat Group QR Code

Scan to join the ΩmegaWiki WeChat group / 扫码加入微信交流群

Acknowledgments

  • Andrej Karpathy — for the LLM-Wiki concept that inspired this project
  • Claude Code — the AI agent runtime that powers ΩmegaWiki

Star History

Star History Chart

License

MIT — use it, fork it, build on it.


中文

ΩmegaWiki 是什么?

Andrej Karpathy 提出了 LLM-Wiki 概念:让 LLM 构建并维护一个持久的、结构化的 wiki,而不是一次性的 RAG 回答。知识持续积累,每一篇新论文都让整个知识图谱更强。

ΩmegaWiki 将这个理念完整实现。 它不仅是 wiki 构建器,更是完整的研究全流程平台:从论文摄入 → 知识图谱 → 缺口检测 → 想法生成 → 实验设计 → 论文写作 → 同行评审回复。24 个 Claude Code Skills 驱动,一个 wiki 作为唯一的知识中枢。

为什么选择 Wiki 而不是 RAG?

|  | RAG | ΩmegaWiki |
| --- | --- | --- |
| 知识持久性 | 每次查询都重新发现 | 编译一次,持续维护 |
| 结构 | 扁平的 chunk 存储 | 9 种实体类型 + 关系图 |
| 交叉引用 | 无 — chunk 彼此孤立 | 双向 wikilink + 类型化边 |
| 知识缺口 | 不可见 | 显式追踪,驱动研究方向 |
| 失败实验 | 丢失 | 一等公民,防止重复探索 |
| 输出 | 聊天回答 | 论文、综述、实验方案、审稿回复 |
| 复利效应 | 无 — 每次查询成本相同 | 有 — 每篇论文丰富整个图谱 |

快速开始

前置条件: Python 3.9+, Node.js 18+

git clone https://github.com/skyllwt/OmegaWiki.git && cd OmegaWiki

# 安装 Claude Code
npm install -g @anthropic-ai/claude-code
claude login

# 一键配置
chmod +x setup.sh && ./setup.sh --lang zh        # Linux / macOS
# Windows (PowerShell):
#   powershell -ExecutionPolicy Bypass -File .\setup.ps1 -Lang zh
# setup 会为 OmegaWiki 创建 .venv
# 脚本不会把你当前 shell 永久激活,但 /init 会自动使用 .venv

# 把你自己的论文放入 raw/papers/(.tex 或 .pdf)
# 可选:把意图笔记放入 raw/notes/,网页存档放入 raw/web/
# /init 与直接本地 /ingest 会自动管理 raw/discovered/ 与 raw/tmp/ 下的生成内容
# 启动 Claude Code
claude
# 输入:/init [你的研究方向]

Windows 用户:本地 pipeline 已原生支持。/exp-run --env remote 远程 GPU 实验依赖 ssh/rsync/screen,建议在 WSL2 或 Linux/macOS 下运行。

API Key 说明

| Key | 必须? | 获取方式 | 用途 |
| --- | --- | --- | --- |
| ANTHROPIC_API_KEY | 是 | claude login | 驱动所有 Skill |
| CLAUDE_CODE_OAUTH_TOKEN | 可选 | claude setup-token | Pro/Max 用户的 GitHub Actions Claude Code auth |
| SEMANTIC_SCHOLAR_API_KEY | 可选 | semanticscholar.org(免费) | 引用图谱、论文搜索 |
| DEEPXIV_TOKEN | 可选 | setup.sh 自动注册 | 语义搜索、热门趋势 |
| LLM_API_KEY + LLM_BASE_URL + LLM_MODEL | 可选 | 任意 OpenAI 兼容 API | 跨模型评审;/daily-arxiv inform 推荐 |

自动化

GitHub Actions 每天 UTC 00:17(北京时间 08:17)运行 /daily-arxiv 推荐 pipeline:拉取 arXiv、按 wiki 去重、构建 recommendation context、上传 artifacts,并可通过 SMTP 发送 digest 邮件。

启用邮件时,在 repo Settings → Secrets 添加:SMTP_HOST、SMTP_PORT、SMTP_USER、SMTP_PASSWORD、SMTP_FROM、DAILY_ARXIV_EMAIL_TO

CI inform mode 可使用 ANTHROPIC_API_KEY 或 CLAUDE_CODE_OAUTH_TOKEN 启动 Claude Code,也可使用 LLM_API_KEY、LLM_BASE_URL、LLM_MODEL 接入任意 OpenAI-compatible provider。auto-ingest 是显式模式,并且需要 Claude Code,因为普通 API LLM 不能调用 /ingest 这类 slash skill。手动触发时可设置 send_email=false,用于无 SMTP secrets 的 dry run。

Digest 示例 / Sample digest
/daily-arxiv digest 示例

一次真实的 /daily-arxiv 运行结果:带分数、理由、wiki 关联以及 auto-ingest 区块的推荐 digest。

24 个 Skill 命令

| 命令 | 功能 |
| --- | --- |
| /setup | 首次配置(API key、语言、依赖) |
| /reset | 按范围销毁性清理:wiki \| raw \| log \| checkpoints \| all |
| /prefill | 可选地预填 foundations/ 背景知识 |
| /init | 基于用户 raw 素材并按需做外部发现来搭建 wiki |
| /ingest | 消化论文,创建页面 + 交叉引用 |
| /discover | 从 anchor、topic 或当前 wiki 推荐排序后的下一批待读论文 |
| /edit | 增删 raw 或更新 wiki |
| /ask | 对 wiki 提问 |
| /check | wiki 健康检查 |
| /daily-arxiv | 运行/管理每日 arXiv 推荐 feed(可选 CI 定时) |
| /ideate | 跨方向构思研究 idea |
| /novelty | 多源新颖性验证 |
| /review | 跨模型评审 |
| /exp-design | idea 驱动的实验设计 |
| /exp-run | 部署 + 监控实验 |
| /exp-status | 实验状态看板 |
| /exp-eval | 裁决 → 自动更新关联 idea |
| /refine | 多轮迭代改进 |
| /survey | 生成 Related Work |
| /paper-plan | idea 图谱 + 实验证据 → 论文提纲 |
| /paper-draft | 提纲 + wiki → LaTeX 草稿 |
| /paper-compile | 编译 → PDF,自动修复 |
| /research | 端到端研究编排器 |
| /rebuttal | 解析评审意见 → 逐条回复 |

Built with Claude Code

If this project helps your research, give it a ⭐