Quickstart

June 5, 2026 · View on GitHub

Five minutes from zero to "I added a conversation, queried it back, and can read it as plain Markdown."

EverOS runs as a service — start the server, then call the HTTP API. There is no in-process library mode; an everos server is always in front of your agent.

Prerequisites

Python 3.12+
An OpenRouter API key — covers the chat LLM (memory extraction) and the multimodal LLM (parsing image / pdf / audio content items) with a single key.
A DeepInfra API key — for the embedding + rerank models that OpenRouter doesn't ship.

Two keys total. Any OpenAI-compatible endpoint plugs in via the matching *__BASE_URL env var if you'd rather use OpenAI directly, self-host vLLM, route to Ollama, etc.

1. Install

pip install everos
# or:  uv pip install everos

2. Configure

Generate a starter .env and drop in your two keys:

everos init                    # writes ./.env (use --xdg for ~/.config/everos/.env)
# Edit .env and fill four API key slots (only two distinct keys needed):
#   EVEROS_LLM__API_KEY         (OpenRouter — chat LLM)
#   EVEROS_MULTIMODAL__API_KEY  (OpenRouter — same key works)
#   EVEROS_EMBEDDING__API_KEY   (DeepInfra)
#   EVEROS_RERANK__API_KEY      (DeepInfra — same key works)

everos init reads the template bundled inside the wheel and writes it with 0600 permissions (only your user can read the API keys).

The shipped template already points LLM + multimodal → OpenRouter (openai/gpt-4.1-mini and google/gemini-3-flash-preview) and embedding + rerank → DeepInfra (Qwen/Qwen3-Embedding-4B and Qwen/Qwen3-Reranker-4B). To use a different OpenAI-compatible endpoint, override the matching *__BASE_URL env var.

Where to store .env — everos server start searches in order: --env-file <path> → ./.env (cwd) → ${XDG_CONFIG_HOME:-~/.config}/everos/.env → ~/.everos/.env. The first existing file wins. Use everos init --xdg to write the XDG location so the same config works from any cwd.

3. Start the server

everos server start

You should see (port and host are configurable):

starting everos on 127.0.0.1:8000
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)

Default bind is 127.0.0.1 (loopback only). To expose the API elsewhere, put your own auth/gateway in front first (see SECURITY.md).
The cascade index daemon runs in the same process as a FastAPI lifespan coroutine — you don't need a separate worker.
The server runs in the foreground; open a second terminal for the steps below, and use Ctrl+C to stop the server when you're done.

In the second terminal, verify the server is up:

$ curl http://127.0.0.1:8000/health
{"status":"ok"}

4. Add a conversation

EverOS ingests memory at the conversation level, not as standalone sentences: you POST a batch of messages tied to a session_id, and the server accumulates them until the boundary detector trips (you can also force a flush — see step 5).

TS=$(($(date +%s)*1000))    # Unix epoch in **milliseconds** (v1 contract)
curl -X POST http://127.0.0.1:8000/api/v1/memory/add \
  -H 'Content-Type: application/json' \
  -d "{
    \"session_id\": \"demo-001\",
    \"app_id\": \"default\",
    \"project_id\": \"default\",
    \"messages\": [
      {\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $TS, \"content\": \"I love climbing in Yosemite every spring.\"},
      {\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $((TS+10000)), \"content\": \"My favorite coffee shop is Blue Bottle in SOMA.\"},
      {\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $((TS+20000)), \"content\": \"I bike to work most days.\"}
    ]
  }"

Response:

{
    "request_id": "bf86e4e857834eba804841f8bff29106",
    "data": {
        "message_count": 3,
        "status": "accumulated"
    }
}

status: "accumulated" means the three messages are in the session buffer, but the boundary detector hasn't decided to extract a memory cell yet. For a quick demo we'll force it.

5. Force boundary extraction

curl -X POST http://127.0.0.1:8000/api/v1/memory/flush \
  -H 'Content-Type: application/json' \
  -d '{"session_id":"demo-001","app_id":"default","project_id":"default"}'

Response (this takes a few seconds — one LLM call for extraction):

{
    "request_id": "ec0e7a00c3bd4b00bb21212a411b7763",
    "data": {
        "status": "extracted"
    }
}

status: "extracted" means at least one memory cell was carved out and written to disk + indexed.

/flush is OSS-only. The cloud edition decides boundary timing server-side and does not expose this endpoint.

6. Search the memory you just added

curl -X POST http://127.0.0.1:8000/api/v1/memory/search \
  -H 'Content-Type: application/json' \
  -d '{
    "user_id": "alice",
    "app_id": "default",
    "project_id": "default",
    "query": "Where do I like to climb?",
    "top_k": 5
  }'

Response (trimmed):

{
    "request_id": "b53a3a94a080472d97692c503c88afdf",
    "data": {
        "episodes": [
            {
                "id": "alice_ep_20260528_00000002",
                "user_id": "alice",
                "session_id": "demo-001",
                "summary": "On May 28, 2026 ... Alice shared that she loves climbing in Yosemite every spring ...",
                "score": 0.6284722685813904,
                "atomic_facts": [
                    {
                        "id": "alice_af_20260528_00000016",
                        "content": "Alice said she loves climbing in Yosemite every spring.",
                        "score": 0.6284722685813904
                    }
                ]
            }
        ],
        "profiles": [],
        "agent_cases": [],
        "agent_skills": []
    }
}

The hybrid retrieval (BM25 + vector + scalar) returns the episode that contains the climbing fact, with the matching atomic fact nested under it. Other response arrays (profiles / agent_cases / agent_skills) are always present for client-side symmetry, populated only when the requested kind matches.

7. Your memory is just Markdown

This is what makes EverOS different — your memory persists as plain Markdown files on disk:

$ tree ~/.everos -L 5 -a
~/.everos
├── default_app/                       ← app_id  ("default" → "default_app")
│   └── default_project/               ← project_id ("default" → "default_project")
│       └── users/
│           └── alice/                  ← user_id (mirror dir: agents/<agent_id>/)
│               ├── episodes/
│               │   └── episode-2026-05-28.md
│               ├── .atomic_facts/      ← hidden (dot-prefix)
│               │   └── atomic_fact-2026-05-28.md
│               ├── .foresights/
│               │   └── foresight-2026-05-28.md
│               └── user.md             ← profile
├── .index/                             ← derived indexes (rebuildable from md)
│   ├── sqlite/system.db
│   └── lancedb/*.lance/
└── .tmp/

The default scope id materialises as default_app / default_project on disk (with the _app / _project suffix) so the default space is visually distinct from any user-named space. Any other id maps to itself (e.g. app_id: "my-app" → my-app/).

Top-level .index/ holds SQLite + LanceDB derived indexes — wipe it and the cascade daemon rebuilds everything from the Markdown alone.

Read the episode we just created:

$ cat ~/.everos/default_app/default_project/users/alice/episodes/episode-2026-05-28.md
---
id: episode_log_alice_2026-05-28
type: episode_daily
file_type: episode_daily
schema_version: 1
user_id: alice
track: user
date: '2026-05-28'
entry_count: 1
last_appended_at: '2026-05-28T08:32:24.966944+00:00'
---
<!-- entry:ep_20260528_00000002 -->
## ep_20260528_00000002

**owner_id**: alice
**session_id**: demo-001
**timestamp**: 2026-05-28T08:32:13+00:00
**parent_type**: memcell
**parent_id**: mc_3779c20f1c53
**sender_ids**: [alice]

### Subject
Alice's Outdoor Activities and Daily Routine on May 28, 2026 Morning

### Content
On May 28, 2026 at 8:32 AM UTC, Alice shared that she loves climbing in
Yosemite every spring, highlighting a recurring seasonal outdoor activity.
She also mentioned that her favorite coffee shop is Blue Bottle located in
SOMA, indicating a preferred local spot. Additionally, Alice stated that
she bikes to work most days, revealing a habitual commuting practice.
<!-- /entry:ep_20260528_00000002 -->

Every memory entry is a plain Markdown file you can:

cat / grep / vim directly — no driver, no service to query
Version with Git (or rsync to backup)
Open the ~/.everos/default_app/default_project/users/alice/ folder in Obsidian (the dotfile directories stay hidden by default)

Stopping the server

Ctrl+C in the server terminal. Uvicorn catches SIGINT and shuts each lifespan provider down in reverse order (cascade → LanceDB → SQLite → LLM → metrics) before exiting.

Next steps

Integrate into your agent — wrap the three endpoints (/add, /flush, /search) in a thin Python client (httpx.AsyncClient) and call them from your agent loop.
App + project scope — set app_id / project_id to anything other than "default" to partition memory spaces inside one server.
Multi-modal messages — messages[].content accepts a list of typed ContentItems (text / image / audio / doc / pdf / html / email) for non-text input. Install the optional extra to enable parsing: uv pip install 'everos[multimodal]'. Office documents (doc / docx / xls / ppt / …) additionally need LibreOffice on the host (brew install --cask libreoffice / apt-get install libreoffice) — without it those uploads return HTTP 415; PDF / image / audio / HTML still work.
Filter DSL and search modes — /search supports a filter DSL (AND / OR / scalar predicates) and four methods (HYBRID / KEYWORD / VECTOR / AGENTIC). See the OpenAPI schema served at /docs.
Architecture — see docs/architecture.md for the DDD layering and cascade design, and docs/storage_layout.md for the on-disk layout.
Found a bug? — open an issue (see CONTRIBUTING.md; external pull requests are not merged).