Chapter 1: Getting Started and First Server

April 13, 2026

Welcome to Chapter 1 of Tabby Tutorial: Self-Hosted AI Coding Assistant Architecture and Operations. The goal here is a working server and a connected editor; architecture details and production tradeoffs follow in later chapters.

This chapter gets Tabby running with a clean local baseline so every later chapter can focus on architecture and operations instead of setup drift.

Learning Goals

choose an installation path that matches your environment
run a first Tabby server and create an account
connect an editor extension and verify completions
capture baseline checks for repeatable setup

Prerequisites

Requirement	Why It Matters
Docker, or a host that can run the Tabby binary directly	quickest path to a stable server
GPU optional, CPU acceptable for initial validation	avoid blocking first-time setup
modern editor (VS Code, JetBrains, Vim/Neovim)	validate client integration early
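
The prerequisites above can be checked from a terminal before pulling any images. This is a minimal sketch, not part of Tabby itself: the function names have and check_prereqs are hypothetical, and nvidia-smi only matters if you plan to run on an NVIDIA GPU.

```shell
# Return success if the named command is available on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool this chapter relies on.
# nvidia-smi is optional: it is only relevant for GPU-backed runs.
check_prereqs() {
  for tool in docker nvidia-smi; do
    if have "$tool"; then
      echo "ok       $tool"
    else
      echo "missing  $tool"
    fi
  done
}
```

Call check_prereqs after sourcing the snippet. A missing nvidia-smi is acceptable for a CPU-only first run; a missing docker means the bootstrap command in the next section will not work.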

Fastest Bootstrap: Docker

docker run -d \
  --name tabby \
  --gpus all \
  -p 8080:8080 \
  -v "$HOME/.tabby:/data" \
  registry.tabbyml.com/tabbyml/tabby \
    serve \
    --model StarCoder-1B \
    --chat-model Qwen2-1.5B-Instruct \
    --device cuda

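Then open http://localhost:8080 and complete account registration. Note that this command assumes an NVIDIA GPU with the NVIDIA Container Toolkit installed; --gpus all and --device cuda will fail on a CPU-only host. The first start may also take a while, because the server has to download the models.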
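
Because the container starts in the background, the web UI may not answer immediately. The following is a minimal polling sketch, assuming curl is installed; wait_for_tabby and its defaults (30 attempts, 2 seconds apart) are illustrative choices, not Tabby tooling.

```shell
# Poll a URL until the server returns any HTTP response, or give up.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_tabby() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Without --fail, curl exits 0 for any HTTP status,
    # so this tests reachability only, not authentication.
    if curl --silent --output /dev/null --max-time 5 "$url"; then
      echo "reachable after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not reachable after $attempts attempt(s)" >&2
  return 1
}

# Example: wait_for_tabby "http://localhost:8080"
```

If the function gives up, the container has probably exited or is still downloading models; the triage table later in this chapter covers the first case.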

Setup Validation Checklist

server process is running and reachable on port 8080
account registration completes in the web UI
personal token is shown on the web UI homepage
editor extension connects using endpoint + token
inline completions appear in a real repository
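
The token and completion items in this checklist can also be exercised from the command line, which helps separate server problems from editor problems. The sketch below assumes the POST /v1/completions endpoint and the language/segments request shape described in Tabby's published API reference; confirm both against your server version. TABBY_URL, TABBY_TOKEN, and the function names are placeholders for this example.

```shell
# Placeholders: point these at your own server and personal token.
TABBY_URL="${TABBY_URL:-http://localhost:8080}"
TABBY_TOKEN="${TABBY_TOKEN:-}"

# Build a JSON request body. $1 = language id, $2 = code prefix.
# The prefix is inserted as-is, so it must already be JSON-escaped
# (for example, a literal backslash-n for a newline).
completion_body() {
  printf '{"language":"%s","segments":{"prefix":"%s"}}\n' "$1" "$2"
}

# Send one completion request and print the raw JSON response.
# --fail makes curl exit non-zero on HTTP errors such as a rejected token.
request_completion() {
  curl --silent --show-error --fail \
    --header "Authorization: Bearer $TABBY_TOKEN" \
    --header "Content-Type: application/json" \
    --data "$(completion_body "$1" "$2")" \
    "$TABBY_URL/v1/completions"
}

# Example: TABBY_TOKEN=<your token> request_completion python 'def fib(n):\n    '
```

A non-empty response here, combined with no completions in the editor, points at the extension's endpoint or token settings rather than at the server.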

Early Failure Triage

Symptom	Likely Cause	First Fix
container exits quickly	model/device mismatch	switch to a smaller model and recheck runtime flags
extension cannot authenticate	missing/invalid token	regenerate token and update extension settings
slow or empty completions	model backend not healthy	verify server logs and reduce model size for baseline
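
Most of the first fixes above start with the same observations: the container state, the recent logs, and the HTTP status the server returns. The sketch below uses docker inspect and docker logs, which are standard Docker CLI commands, and assumes the container name tabby from the bootstrap command. The function triage_http_status simply restates the table as code; treating 401 and 403 as a token problem is an assumption about how the server rejects bad tokens.

```shell
# Container name from the bootstrap command; override if yours differs.
CONTAINER="${CONTAINER:-tabby}"

# Print the container status and exit code, e.g. after it "exits quickly".
container_state() {
  docker inspect --format '{{.State.Status}} (exit code {{.State.ExitCode}})' "$1"
}

# Print the last N log lines (default 50), including stderr.
recent_logs() {
  docker logs --tail "${2:-50}" "$1" 2>&1
}

# Map an HTTP status code to the matching first step from the table.
# curl reports 000 via --write-out '%{http_code}' when it gets no response.
triage_http_status() {
  case "$1" in
    000)     echo "server unreachable: check container state and logs" ;;
    401|403) echo "token rejected: regenerate token and update extension settings" ;;
    2??)     echo "request accepted: if completions are slow or empty, check model logs" ;;
    *)       echo "unexpected status $1: inspect server logs" ;;
  esac
}

# Example: container_state "$CONTAINER" && recent_logs "$CONTAINER" 100
```

Reading the exit code and the last log lines first usually shows whether a model/device mismatch is the cause before you change any flags.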

Summary

You now have a working Tabby deployment with at least one connected editor client.

Next: Chapter 2: Architecture and Runtime Components

Source Code Walkthrough

Use the following upstream sources to verify the setup and first-server details covered in this chapter:

crates/tabby/src/main.rs — the Rust binary entry point for the Tabby server, handling CLI argument parsing, configuration loading, and the HTTP server startup sequence.
crates/tabby/src/serve.rs — the primary server initialization module that wires together the completion API, chat API, health endpoint, and model backend into a running Axum HTTP service.

Suggested trace strategy:

trace main.rs to understand the startup flags (model path, port, device) and how they map to server behavior
review serve.rs to see how the completion and chat routes are registered and which middleware is applied
check ee/tabby-webserver/ for the enterprise web server layer that adds authentication and UI
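
When following this trace strategy, a recursive search is often quicker than reading files top to bottom. A minimal sketch: find_flag is a hypothetical helper that searches for both the dashed CLI spelling of a flag and the underscored spelling a Rust field would use, since which one appears in the upstream source is not confirmed here.

```shell
# Search a directory tree for a CLI flag name.
# $1 = directory, $2 = flag name without leading dashes (e.g. chat-model)
# Matches both the dashed form and the underscored form (chat_model).
find_flag() {
  local dir="$1"
  local dashed="$2"
  local snake
  snake=$(printf '%s' "$dashed" | tr '-' '_')
  # -r recurse, -n show line numbers, -E extended regex for the alternation
  grep -rnE -- "$dashed|$snake" "$dir"
}

# Example (assumes a local clone of the upstream repository in ./tabby):
#   find_flag tabby/crates/tabby/src chat-model
```

The output lists file, line number, and matching line, which gives a starting point in main.rs or serve.rs for each flag used in the bootstrap command.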

How These Components Connect

flowchart LR
    A[tabby serve command] --> B[main.rs CLI parsing]
    B --> C[serve.rs server init]
    C --> D[Completion and chat routes registered]
    D --> E[Server accepting requests on configured port]