Chapter 1: Getting Started and First Server
April 13, 2026
Welcome to Chapter 1: Getting Started and First Server. In this part of Tabby Tutorial: Self-Hosted AI Coding Assistant Architecture and Operations, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.
This chapter gets Tabby running with a clean local baseline so every later chapter can focus on architecture and operations instead of setup drift.
Learning Goals
- choose an installation path that matches your environment
- run a first Tabby server and create an account
- connect an editor extension and verify completions
- capture baseline checks for repeatable setup
Prerequisites
| Requirement | Why It Matters |
|---|---|
| Docker, or a host runtime for the Tabby binary | quickest path to a stable server |
| GPU optional, CPU acceptable for initial validation | avoid blocking first-time setup |
| modern editor (VS Code, JetBrains, Vim/Neovim) | validate client integration early |
Fastest Bootstrap: Docker
docker run -d \
  --name tabby \
  --gpus all \
  -p 8080:8080 \
  -v "$HOME/.tabby:/data" \
  registry.tabbyml.com/tabbyml/tabby \
  serve \
  --model StarCoder-1B \
  --chat-model Qwen2-1.5B-Instruct \
  --device cuda
Then open http://localhost:8080 and complete account registration.
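The first start can take a while because the container has to fetch its models before the web UI answers. A minimal sketch of a readiness poll, assuming `curl` is installed and the server is published on `http://localhost:8080` as in the command above; the function name `wait_for_tabby` and its defaults are illustrative, not part of Tabby:

```shell
#!/bin/sh
# Poll the Tabby web UI until it answers or the attempt budget runs out.
# Usage: wait_for_tabby [url] [max_attempts] [delay_seconds]
# (hypothetical helper; name and defaults are assumptions for illustration)
wait_for_tabby() {
    wft_url="${1:-http://localhost:8080}"
    wft_max="${2:-30}"
    wft_delay="${3:-5}"
    wft_try=1
    while [ "$wft_try" -le "$wft_max" ]; do
        # -f: treat HTTP errors (>= 400) as failure, -s: quiet, -o: discard body
        if curl -fs -o /dev/null "$wft_url"; then
            echo "ready after $wft_try attempt(s)"
            return 0
        fi
        wft_try=$((wft_try + 1))
        sleep "$wft_delay"
    done
    echo "not ready after $wft_max attempt(s)" >&2
    return 1
}
```

Calling `wait_for_tabby` with no arguments polls the default URL up to 30 times, five seconds apart. Raise the attempt count on slow connections, since larger models take longer to download.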
Setup Validation Checklist
- server process is running and reachable on port 8080
- account registration completes in the web UI
- personal token is generated on the homepage
- editor extension connects using endpoint + token
- inline completions appear in a real repository
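The first checks in the list above can be recorded as key=value lines, which makes it easy to diff one setup run against another. This is a sketch under assumptions: the `/v1/health` path and bearer-token header are assumed here and should be confirmed against your server's API documentation, and `capture_baseline` is a hypothetical helper name. The editor-side checks still need to be verified by hand.

```shell
#!/bin/sh
# Print one key=value line per baseline check so runs can be diffed later.
# Usage: capture_baseline [endpoint] [token]
# (hypothetical helper; the /v1/health route and bearer auth are assumptions)
capture_baseline() {
    cb_endpoint="${1:-http://localhost:8080}"
    cb_token="${2:-}"
    # HTTP status of the web UI; curl prints 000 when the connection fails.
    cb_ui=$(curl -s -o /dev/null -w '%{http_code}' "$cb_endpoint" || true)
    # HTTP status of the health route with the personal token attached.
    cb_health=$(curl -s -o /dev/null -w '%{http_code}' \
        -H "Authorization: Bearer $cb_token" "$cb_endpoint/v1/health" || true)
    echo "endpoint=$cb_endpoint"
    echo "ui_http=$cb_ui"
    echo "health_http=$cb_health"
    # Succeed only when both requests returned HTTP 200.
    [ "$cb_ui" = "200" ] && [ "$cb_health" = "200" ]
}
```

A typical use is `capture_baseline http://localhost:8080 "$TABBY_TOKEN" > baseline.txt`, where `TABBY_TOKEN` is a variable you set yourself from the token shown in the web UI.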
Early Failure Triage
| Symptom | Likely Cause | First Fix |
|---|---|---|
| container exits quickly | model/device mismatch | switch to a smaller model and recheck runtime flags |
| extension cannot authenticate | missing/invalid token | regenerate token and update extension settings |
| slow or empty completions | model backend not healthy | verify server logs and reduce model size for baseline |
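For the first and third rows of the table, the container state and its recent logs are usually the quickest evidence. A minimal sketch, assuming the container was started with `--name tabby` as shown earlier; `triage_container` is a hypothetical helper built only on `docker inspect` and `docker logs`:

```shell
#!/bin/sh
# Summarize the state of the Tabby container and show recent log lines.
# Usage: triage_container [container_name] [log_lines]
# (hypothetical helper; the container name "tabby" comes from --name above)
triage_container() {
    tc_name="${1:-tabby}"
    tc_lines="${2:-50}"
    # docker inspect fails if the container does not exist.
    if ! tc_status=$(docker inspect -f '{{.State.Status}}' "$tc_name" 2>/dev/null); then
        echo "container '$tc_name' not found; was docker run executed?"
        return 2
    fi
    echo "status=$tc_status"
    if [ "$tc_status" != "running" ]; then
        # A stopped container keeps its exit code and logs until it is removed.
        echo "exit_code=$(docker inspect -f '{{.State.ExitCode}}' "$tc_name")"
    fi
    # Merge stderr into stdout so the container's error output is not lost.
    docker logs --tail "$tc_lines" "$tc_name" 2>&1
    # Succeed only when the container is currently running.
    [ "$tc_status" = "running" ]
}
```

A non-running status with a non-zero exit code points toward the model/device row; a running container with errors in the log tail points toward the backend-health row.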
Summary
You now have a working Tabby deployment with at least one connected editor client.
Next: Chapter 2: Architecture and Runtime Components
Source Code Walkthrough
Use the following upstream sources to verify the startup and first-server details covered in this chapter:
- `crates/tabby/src/main.rs`: the Rust binary entry point for the Tabby server, handling CLI argument parsing, configuration loading, and the HTTP server startup sequence.
- `crates/tabby/src/serve.rs`: the primary server initialization module that wires together the completion API, chat API, health endpoint, and model backend into a running Axum HTTP service.
Suggested trace strategy:
- trace `main.rs` to understand the startup flags (model path, port, device) and how they map to server behavior
- review `serve.rs` to see how the completion and chat routes are registered and which middleware is applied
- check `ee/tabby-webserver/` for the enterprise web server layer that adds authentication and UI
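Repository layouts move between releases, so it helps to confirm that the paths named above exist in your checkout before tracing them. A small sketch, assuming you have already cloned the upstream repository locally; `check_sources` is a hypothetical helper:

```shell
#!/bin/sh
# Report which of the chapter's upstream paths exist in a local checkout.
# Usage: check_sources [path-to-tabby-checkout]
# (hypothetical helper; the paths are the ones listed in this chapter)
check_sources() {
    cs_root="${1:-.}"
    cs_missing=0
    for cs_path in crates/tabby/src/main.rs crates/tabby/src/serve.rs ee/tabby-webserver; do
        if [ -e "$cs_root/$cs_path" ]; then
            echo "found   $cs_path"
        else
            echo "missing $cs_path"
            cs_missing=$((cs_missing + 1))
        fi
    done
    # Succeed only when every listed path was found.
    [ "$cs_missing" -eq 0 ]
}
```

If a path is reported missing, search the checkout for the file name before assuming the component was removed; it may simply have been relocated.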
How These Components Connect
flowchart LR
A[tabby serve command] --> B[main.rs CLI parsing]
B --> C[serve.rs server init]
C --> D[Completion and chat routes registered]
D --> E[Server accepting requests on configured port]