LamaCLI 🦙✨

March 1, 2026 · View on GitHub

License: MIT

🚀 Your Local LLM Assistant, Right in Your Terminal!

LamaCLI is a powerful and intuitive command-line interface (CLI) tool that brings the magic of Large Language Models (LLMs) directly to your terminal, powered by Ollama.

LamaCLI Banner

Engage with your AI assistant in both interactive mode and command-line mode: perfect for quick queries or extended conversations, all without leaving your terminal.


✨ Features

🎯 Dual Operation Modes

  • Interactive Mode: Full-featured TUI with real-time chat, file browsing, and model switching
  • CLI Mode: Quick one-shot commands for ask, suggest, and explain operations

💬 Interactive Chat Features

  • Real-time Streaming: Beautiful chat experience with live response streaming
  • Markdown Support: Fully rendered markdown with syntax-highlighted code blocks
  • Chat Templates: Predefined templates for common tasks (code review, documentation, debugging)
  • Chat History: Persistent session storage with load/save functionality
  • Auto-save Sessions: Conversations automatically saved after each interaction
  • Code Block Management: Extract, navigate, and copy code snippets with ease
  • File Context Integration: Inject file content into prompts using @ command

๐Ÿ—‚๏ธ File Management

  • Built-in File Explorer: Browse project files with keyboard navigation
  • File Viewer: Preview file contents within the application
  • Context-aware Operations: Include directory contents in your queries
  • Pattern Matching: Filter files by patterns (e.g., *.md, *.go)

🤖 Model Management

  • Multiple Model Support: Switch between any Ollama models seamlessly
  • Model Override: Specify different models for different commands
  • Default Model Detection: Automatically uses your first available model
  • Model Information: View all available models and their status

โšก๏ธ Get Started

Prerequisites

Before you begin, ensure you have Ollama installed and running on your system. You can install Ollama using one of the following methods:

macOS

  • Download the app from Ollama's website.
  • Or install via Homebrew:
    brew install ollama
    

Windows

  • Download the installer from Ollama's website.

Linux

  • Run the official installation script:
    curl -fsSL https://ollama.com/install.sh | sh
    

After installation, make sure Ollama is running, then pull at least one model:

ollama pull llama3.2:3b
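Before launching LamaCLI, you can confirm the Ollama server is actually up by querying its local API (it listens on port 11434 by default) and listing your pulled models:

```shell
# Check that the Ollama server is responding (default port 11434)
curl -fsS http://localhost:11434/api/version

# List locally available models; at least one should appear
ollama list
```

If the curl call fails, start the server with `ollama serve` (or launch the Ollama app) and try again.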

Installation

LamaCLI is built with Go and available through multiple installation methods:

📦 Via npm

# Install globally
npm install -g lamacli

# Or run without installing
npx lamacli

๐Ÿ› ๏ธ Via Go

go install github.com/hariharen9/lamacli@latest
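Note that `go install` places the binary in `$(go env GOPATH)/bin` (typically `~/go/bin`). If `lamacli` is not found after installing, add that directory to your PATH:

```shell
# Make Go-installed binaries available in the current shell
export PATH="$PATH:$(go env GOPATH)/bin"

# Verify the install
lamacli version
```

Add the `export` line to your shell profile to make it permanent.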

📥 Download Binary

Download the latest binary for your platform from the releases page.

Usage

Simply run LamaCLI from your terminal:

lamacli

Key Bindings

  • Enter: Send message (in chat), open file/folder (in file explorer)
  • ↑/↓: Scroll history (in chat), navigate items (in file tree/model select)
  • @: Trigger file context selection (in chat input)
  • F: Open the file explorer
  • M: Switch AI model
  • R: Reset/clear chat history
  • S: Save the current session manually
  • C: Copy code blocks (when available in chat)
  • Ctrl+H: Show the detailed help screen
  • Backspace: Go to parent folder (in file explorer), back to explorer (in file viewer)
  • Esc: Cancel streaming / return to chat from any view
  • Ctrl+C: Exit the application (requires two presses for confirmation)
  • L: Load chat history (browse and restore previous sessions)
  • Alt+T: Cycle through chat templates (code review, documentation, debugging)
  • Ctrl+T: Cycle through themes

📺 Demo Videos

Chatting with LLM
File History & Code Copy
Model Switching
Themes & Help

๐Ÿ–ฅ๏ธ CLI Mode Examples

While the interactive mode is the main feature, LamaCLI also supports quick CLI commands for rapid queries:

Output Modes

LamaCLI supports two output modes when using the CLI commands:

  1. Markdown Rendering (Default) - Displays a nicely formatted response with proper Markdown rendering after the LLM completes its response. A spinner animation with "Thinking..." text is shown while waiting for the complete response.

  2. Streaming Mode - Displays the raw LLM response in real-time as it's generated, without Markdown rendering. Enable this mode with the --stream flag. The spinner stops after the first chunk of the response appears.

Ask Questions

# Basic question with Markdown rendering (default)
lamacli ask "How do I list files in Linux?"

# With streaming output (no Markdown rendering)
lamacli ask --stream "How do I list files in Linux?"

# With model override
lamacli a --model=qwen2.5-coder:1.5b "Explain async/await in JavaScript"

# With project context
lamacli ask --context=. --include="*.md" "Summarize this project"

Get Command Suggestions

# Get command suggestions
lamacli suggest "find large files over 100MB"

# With specific model
lamacli s --model=llama3.2:1b "git workflow for teams"

Explain Commands

# Explain a command
lamacli explain "find . -name '*.go' -exec grep -l 'func main' {} \;"

# With model override
lamacli e --model=qwen2.5-coder "docker compose up -d"

Other Commands

# Show available models
lamacli models

# Show version
lamacli version

# Show help
lamacli help

Note: All CLI commands support the following flags for customization:

  • --model: Override the default model
  • --context: Specify a directory for context
  • --include: Filter files for context
  • --theme: Set a specific theme
  • --stream: Enable real-time streaming output (disables Markdown rendering)
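These flags can be combined in a single invocation. As a sketch (the model name is an assumption; substitute any model you have pulled), the following asks for a project summary built from only the Go files in the current directory, streamed from a specific model:

```shell
lamacli ask --model=llama3.2:3b --context=. --include="*.go" --stream \
  "Summarize what this package does"
```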

๐Ÿค Contributing

We welcome contributions! If you have ideas for new features, bug fixes, or improvements, please feel free to open an issue or submit a pull request.

Please ensure your code adheres to the existing style and conventions.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

โค๏ธ Support LamaCLI

LamaCLI is an open-source project. Your support helps keep this project alive and thriving!

Buy Me a Coffee PayPal

๐Ÿ™ Credits

Built with Bubble Tea, Lipgloss, Glamour, and Huh, and powered by Ollama.

Special thanks to TLM for inspiration on the CLI command structure for ask, suggest, and explain operations.

Made with 💘 for all the terminal enthusiasts 🧑‍💻.