Quickstart
June 5, 2026 · View on GitHub
Five minutes from zero to "I added a conversation, queried it back, and can read it as plain Markdown."
EverOS runs as a service — start the server, then call the HTTP API.
There is no in-process library mode; an everos server is always in
front of your agent.
Prerequisites
- Python 3.12+
- An OpenRouter API key — covers the chat LLM (memory extraction) and the multimodal LLM (parsing image / pdf / audio content items) with a single key.
- A DeepInfra API key — for the embedding + rerank models that OpenRouter doesn't ship.
Two keys total. Any OpenAI-compatible endpoint plugs in via the
matching *__BASE_URL env var if you'd rather use OpenAI directly,
self-host vLLM, route to Ollama, etc.
1. Install
pip install everos
# or: uv pip install everos
2. Configure
Generate a starter .env and drop in your two keys:
everos init # writes ./.env (use --xdg for ~/.config/everos/.env)
# Edit .env and fill four API key slots (only two distinct keys needed):
# EVEROS_LLM__API_KEY (OpenRouter — chat LLM)
# EVEROS_MULTIMODAL__API_KEY (OpenRouter — same key works)
# EVEROS_EMBEDDING__API_KEY (DeepInfra)
# EVEROS_RERANK__API_KEY (DeepInfra — same key works)
everos init reads the template bundled inside the wheel and writes it
with 0600 permissions (only your user can read the API keys).
The shipped template already points LLM + multimodal → OpenRouter
(openai/gpt-4.1-mini and google/gemini-3-flash-preview) and
embedding + rerank → DeepInfra (Qwen/Qwen3-Embedding-4B and
Qwen/Qwen3-Reranker-4B). To use a different OpenAI-compatible
endpoint, override the matching *__BASE_URL env var.
Where to store
.env—everos server startsearches in order:--env-file <path>→./.env(cwd) →${XDG_CONFIG_HOME:-~/.config}/everos/.env→~/.everos/.env. The first existing file wins. Useeveros init --xdgto write the XDG location so the same config works from any cwd.
3. Start the server
everos server start
You should see (port and host are configurable):
starting everos on 127.0.0.1:8000
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
- Default bind is
127.0.0.1(loopback only). To expose the API elsewhere, put your own auth/gateway in front first (see SECURITY.md). - The cascade index daemon runs in the same process as a FastAPI lifespan coroutine — you don't need a separate worker.
- The server runs in the foreground; open a second terminal for the
steps below, and use
Ctrl+Cto stop the server when you're done.
In the second terminal, verify the server is up:
$ curl http://127.0.0.1:8000/health
{"status":"ok"}
4. Add a conversation
EverOS ingests memory at the conversation level, not as standalone
sentences: you POST a batch of messages tied to a session_id, and
the server accumulates them until the boundary detector trips (you can
also force a flush — see step 5).
TS=$(($(date +%s)*1000)) # Unix epoch in **milliseconds** (v1 contract)
curl -X POST http://127.0.0.1:8000/api/v1/memory/add \
-H 'Content-Type: application/json' \
-d "{
\"session_id\": \"demo-001\",
\"app_id\": \"default\",
\"project_id\": \"default\",
\"messages\": [
{\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $TS, \"content\": \"I love climbing in Yosemite every spring.\"},
{\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $((TS+10000)), \"content\": \"My favorite coffee shop is Blue Bottle in SOMA.\"},
{\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $((TS+20000)), \"content\": \"I bike to work most days.\"}
]
}"
Response:
{
"request_id": "bf86e4e857834eba804841f8bff29106",
"data": {
"message_count": 3,
"status": "accumulated"
}
}
status: "accumulated" means the three messages are in the session
buffer, but the boundary detector hasn't decided to extract a memory
cell yet. For a quick demo we'll force it.
5. Force boundary extraction
curl -X POST http://127.0.0.1:8000/api/v1/memory/flush \
-H 'Content-Type: application/json' \
-d '{"session_id":"demo-001","app_id":"default","project_id":"default"}'
Response (this takes a few seconds — one LLM call for extraction):
{
"request_id": "ec0e7a00c3bd4b00bb21212a411b7763",
"data": {
"status": "extracted"
}
}
status: "extracted" means at least one memory cell was carved out and
written to disk + indexed.
/flushis OSS-only. The cloud edition decides boundary timing server-side and does not expose this endpoint.
6. Search the memory you just added
curl -X POST http://127.0.0.1:8000/api/v1/memory/search \
-H 'Content-Type: application/json' \
-d '{
"user_id": "alice",
"app_id": "default",
"project_id": "default",
"query": "Where do I like to climb?",
"top_k": 5
}'
Response (trimmed):
{
"request_id": "b53a3a94a080472d97692c503c88afdf",
"data": {
"episodes": [
{
"id": "alice_ep_20260528_00000002",
"user_id": "alice",
"session_id": "demo-001",
"summary": "On May 28, 2026 ... Alice shared that she loves climbing in Yosemite every spring ...",
"score": 0.6284722685813904,
"atomic_facts": [
{
"id": "alice_af_20260528_00000016",
"content": "Alice said she loves climbing in Yosemite every spring.",
"score": 0.6284722685813904
}
]
}
],
"profiles": [],
"agent_cases": [],
"agent_skills": []
}
}
The hybrid retrieval (BM25 + vector + scalar) returns the episode
that contains the climbing fact, with the matching atomic fact nested
under it. Other response arrays (profiles / agent_cases /
agent_skills) are always present for client-side symmetry, populated
only when the requested kind matches.
7. Your memory is just Markdown
This is what makes EverOS different — your memory persists as plain Markdown files on disk:
$ tree ~/.everos -L 5 -a
~/.everos
├── default_app/ ← app_id ("default" → "default_app")
│ └── default_project/ ← project_id ("default" → "default_project")
│ └── users/
│ └── alice/ ← user_id (mirror dir: agents/<agent_id>/)
│ ├── episodes/
│ │ └── episode-2026-05-28.md
│ ├── .atomic_facts/ ← hidden (dot-prefix)
│ │ └── atomic_fact-2026-05-28.md
│ ├── .foresights/
│ │ └── foresight-2026-05-28.md
│ └── user.md ← profile
├── .index/ ← derived indexes (rebuildable from md)
│ ├── sqlite/system.db
│ └── lancedb/*.lance/
└── .tmp/
The default scope id materialises as default_app / default_project
on disk (with the _app / _project suffix) so the default space is
visually distinct from any user-named space. Any other id maps to itself
(e.g. app_id: "my-app" → my-app/).
Top-level .index/ holds SQLite + LanceDB derived indexes — wipe it
and the cascade daemon rebuilds everything from the Markdown alone.
Read the episode we just created:
$ cat ~/.everos/default_app/default_project/users/alice/episodes/episode-2026-05-28.md
---
id: episode_log_alice_2026-05-28
type: episode_daily
file_type: episode_daily
schema_version: 1
user_id: alice
track: user
date: '2026-05-28'
entry_count: 1
last_appended_at: '2026-05-28T08:32:24.966944+00:00'
---
<!-- entry:ep_20260528_00000002 -->
## ep_20260528_00000002
**owner_id**: alice
**session_id**: demo-001
**timestamp**: 2026-05-28T08:32:13+00:00
**parent_type**: memcell
**parent_id**: mc_3779c20f1c53
**sender_ids**: [alice]
### Subject
Alice's Outdoor Activities and Daily Routine on May 28, 2026 Morning
### Content
On May 28, 2026 at 8:32 AM UTC, Alice shared that she loves climbing in
Yosemite every spring, highlighting a recurring seasonal outdoor activity.
She also mentioned that her favorite coffee shop is Blue Bottle located in
SOMA, indicating a preferred local spot. Additionally, Alice stated that
she bikes to work most days, revealing a habitual commuting practice.
<!-- /entry:ep_20260528_00000002 -->
Every memory entry is a plain Markdown file you can:
cat/grep/vimdirectly — no driver, no service to query- Version with Git (or rsync to backup)
- Open the
~/.everos/default_app/default_project/users/alice/folder in Obsidian (the dotfile directories stay hidden by default)
Stopping the server
Ctrl+C in the server terminal. Uvicorn catches SIGINT and shuts each
lifespan provider down in reverse order (cascade → LanceDB → SQLite →
LLM → metrics) before exiting.
Next steps
- Integrate into your agent — wrap the three endpoints (
/add,/flush,/search) in a thin Python client (httpx.AsyncClient) and call them from your agent loop. - App + project scope — set
app_id/project_idto anything other than"default"to partition memory spaces inside one server. - Multi-modal messages —
messages[].contentaccepts a list of typedContentItems (text/image/audio/doc/pdf/html/email) for non-text input. Install the optional extra to enable parsing:uv pip install 'everos[multimodal]'. Office documents (doc/docx/xls/ppt/…) additionally need LibreOffice on the host (brew install --cask libreoffice/apt-get install libreoffice) — without it those uploads return HTTP 415; PDF / image / audio / HTML still work. - Filter DSL and search modes —
/searchsupports a filter DSL (AND/OR/ scalar predicates) and four methods (HYBRID/KEYWORD/VECTOR/AGENTIC). See the OpenAPI schema served at/docs. - Architecture — see docs/architecture.md for the DDD layering and cascade design, and docs/storage_layout.md for the on-disk layout.
- Found a bug? — open an issue (see CONTRIBUTING.md; external pull requests are not merged).