Foundry Local Workshop - Build AI Apps On-Device
April 21, 2026 · View on GitHub
A hands-on workshop for running language models on your own machine and building intelligent applications with Foundry Local and the Microsoft Agent Framework.
What is Foundry Local?
Foundry Local is a lightweight runtime that lets you download, manage, and serve language models entirely on your own hardware. It exposes an OpenAI-compatible API, so any tool or SDK that speaks OpenAI can connect - no cloud account required.
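Because the server is OpenAI-compatible, a request to it is just a standard chat-completions POST. The sketch below builds such a request with the standard library; the endpoint URL and model id are illustrative placeholders (Foundry Local assigns its port dynamically, so in practice you would read the endpoint from the SDK rather than hard-coding it).

```python
import json
import urllib.request

def build_chat_request(endpoint: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-style chat-completions request for a local server."""
    payload = {
        "model": model,  # a Foundry Local model id or alias
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{endpoint}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Endpoint and port are hypothetical here; discover the real ones via the SDK.
req = build_chat_request("http://localhost:5273/v1", "phi-3.5-mini", "Hello!")
```

Sending `req` with `urllib.request.urlopen` (while the service is running) returns the familiar OpenAI response shape.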
Multi-Language Support
Supported via GitHub Action (Automated & Always Up-to-Date)
Arabic | Bengali | Bulgarian | Burmese (Myanmar) | Chinese (Simplified) | Chinese (Traditional, Hong Kong) | Chinese (Traditional, Macau) | Chinese (Traditional, Taiwan) | Croatian | Czech | Danish | Dutch | Estonian | Finnish | French | German | Greek | Hebrew | Hindi | Hungarian | Indonesian | Italian | Japanese | Kannada | Khmer | Korean | Lithuanian | Malay | Malayalam | Marathi | Nepali | Nigerian Pidgin | Norwegian | Persian (Farsi) | Polish | Portuguese (Brazil) | Portuguese (Portugal) | Punjabi (Gurmukhi) | Romanian | Russian | Serbian (Cyrillic) | Slovak | Slovenian | Spanish | Swahili | Swedish | Tagalog (Filipino) | Tamil | Telugu | Thai | Turkish | Ukrainian | Urdu | Vietnamese
Prefer to Clone Locally?
This repository includes 50+ language translations, which significantly increases the download size. To clone without translations, use sparse checkout:
Bash / macOS / Linux:

git clone --filter=blob:none --sparse https://github.com/microsoft-foundry/Foundry-Local-Lab.git
cd Foundry-Local-Lab
git sparse-checkout set --no-cone '/*' '!translations' '!translated_images'

CMD (Windows):

git clone --filter=blob:none --sparse https://github.com/microsoft-foundry/Foundry-Local-Lab.git
cd Foundry-Local-Lab
git sparse-checkout set --no-cone "/*" "!translations" "!translated_images"

This gives you everything you need to complete the course with a much faster download.
Learning Objectives
By the end of this workshop you will be able to:
| # | Objective |
|---|---|
| 1 | Install Foundry Local and manage models with the CLI |
| 2 | Master the Foundry Local SDK API for programmatic model management |
| 3 | Connect to the local inference server using the Python, JavaScript, and C# SDKs |
| 4 | Build a Retrieval-Augmented Generation (RAG) pipeline that grounds answers in your own data |
| 5 | Create AI agents with persistent instructions and personas |
| 6 | Orchestrate multi-agent workflows with feedback loops |
| 7 | Explore a production capstone app - the Zava Creative Writer |
| 8 | Build evaluation frameworks with golden datasets and LLM-as-judge scoring |
| 9 | Transcribe audio with Whisper - speech-to-text on-device using the Foundry Local SDK |
| 10 | Compile and run custom or Hugging Face models with ONNX Runtime GenAI and Foundry Local |
| 11 | Enable local models to call external functions with the tool-calling pattern |
| 12 | Build a browser-based UI for the Zava Creative Writer with real-time streaming |
Prerequisites
| Requirement | Details |
|---|---|
| Hardware | 8 GB RAM minimum (16 GB recommended); AVX2-capable CPU or a supported GPU |
| OS | Windows 10/11 (x64/ARM), Windows Server 2025, or macOS 13+ |
| Foundry Local CLI | Install via `winget install Microsoft.FoundryLocal` (Windows) or `brew tap microsoft/foundrylocal && brew install foundrylocal` (macOS). See the getting started guide for details. |
| Language runtime | Python 3.9+ and/or .NET 9.0+ and/or Node.js 18+ |
| Git | For cloning this repository |
Getting Started
# 1. Clone the repository
git clone https://github.com/microsoft-foundry/foundry-local-lab.git
cd foundry-local-lab
# 2. Verify Foundry Local is installed
foundry model list # List available models
foundry model run phi-3.5-mini # Start an interactive chat
# 3. Choose your language track (see Part 2 lab for full setup)
| Language | Quick Start |
|---|---|
| Python | cd python && pip install -r requirements.txt && python foundry-local.py |
| C# | cd csharp && dotnet run |
| JavaScript | cd javascript && npm install && node foundry-local.mjs |
Workshop Parts
Part 1: Getting Started with Foundry Local
Lab guide: labs/part1-getting-started.md
- What is Foundry Local and how it works
- Installing the CLI on Windows and macOS
- Exploring models - listing, downloading, running
- Understanding model aliases and dynamic ports
Part 2: Foundry Local SDK Deep Dive
Lab guide: labs/part2-foundry-local-sdk.md
- Why use the SDK over the CLI for application development
- Full SDK API reference for Python, JavaScript, and C#
- Service management, catalog browsing, model lifecycle (download, load, unload)
- Quick-start patterns: Python constructor bootstrap, JavaScript `init()`, C# `CreateAsync()`
- `FoundryModelInfo` metadata, aliases, and hardware-optimal model selection
Part 3: SDKs and APIs
Lab guide: labs/part3-sdk-and-apis.md
- Connecting to Foundry Local from Python, JavaScript, and C#
- Using the Foundry Local SDK to manage the service programmatically
- Streaming chat completions via the OpenAI-compatible API
- SDK method reference for each language
Code samples:
| Language | File | Description |
|---|---|---|
| Python | python/foundry-local.py | Basic streaming chat |
| C# | csharp/BasicChat.cs | Streaming chat with .NET |
| JavaScript | javascript/foundry-local.mjs | Streaming chat with Node.js |
Part 4: Retrieval-Augmented Generation (RAG)
Lab guide: labs/part4-rag-fundamentals.md
- What is RAG and why it matters
- Building an in-memory knowledge base
- Keyword-overlap retrieval with scoring
- Composing grounded system prompts
- Running a complete RAG pipeline on-device
Code samples:
| Language | File |
|---|---|
| Python | python/foundry-local-rag.py |
| C# | csharp/RagPipeline.cs |
| JavaScript | javascript/foundry-local-rag.mjs |
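The keyword-overlap retrieval the lab describes can be sketched in a few lines: score each document by how many query words it shares, keep the best matches, and fold them into a grounded system prompt. This is a simplified illustration of the technique, not the lab's exact code.

```python
def score(query: str, doc: str) -> int:
    """Count query keywords that also appear in the document (overlap score)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents ranked by keyword overlap with the query."""
    ranked = sorted(knowledge_base, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]

def build_grounded_prompt(query: str, knowledge_base: list[str]) -> str:
    """Compose a system prompt that grounds the model in retrieved context."""
    context = "\n".join(retrieve(query, knowledge_base))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

# Tiny in-memory knowledge base for demonstration.
kb = [
    "Foundry Local runs models on-device.",
    "RAG grounds answers in your data.",
    "Whisper transcribes audio.",
]
top = retrieve("how does RAG ground answers", kb, top_k=1)
```

The grounded prompt then goes to the local model as the system message, so answers stay anchored to your own documents.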
Part 5: Building AI Agents
Lab guide: labs/part5-single-agents.md
- What is an AI agent (vs. a raw LLM call)
- The `ChatAgent` pattern and the Microsoft Agent Framework
- System instructions, personas, and multi-turn conversations
- Structured output (JSON) from agents
Code samples:
| Language | File | Description |
|---|---|---|
| Python | python/foundry-local-with-agf.py | Single agent with Agent Framework |
| C# | csharp/SingleAgent.cs | Single agent (ChatAgent pattern) |
| JavaScript | javascript/foundry-local-with-agent.mjs | Single agent (ChatAgent pattern) |
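Small local models often wrap their structured output in extra prose ("Sure! Here it is: {...}"), so a defensive JSON extractor is a common companion to the structured-output step above. This is a generic helper, not part of the Agent Framework itself:

```python
import json
import re

def extract_json(reply: str) -> dict:
    """Pull the first JSON object out of a model reply, tolerating surrounding prose.
    Simple heuristic: grab from the first '{' to the last '}' and parse it."""
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))
```

For example, `extract_json('Sure! {"title": "Zava", "score": 9}')` recovers the dict even though the reply starts with chatter.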
Part 6: Multi-Agent Workflows
Lab guide: labs/part6-multi-agent-workflows.md
- Multi-agent pipelines: Researcher → Writer → Editor
- Sequential orchestration and feedback loops
- Shared configuration and structured hand-offs
- Designing your own multi-agent workflow
Code samples:
| Language | File | Description |
|---|---|---|
| Python | python/foundry-local-multi-agent.py | Three-agent pipeline |
| C# | csharp/MultiAgent.cs | Three-agent pipeline |
| JavaScript | javascript/foundry-local-multi-agent.mjs | Three-agent pipeline |
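The sequential-orchestration-with-feedback idea above can be sketched agent-agnostically: each agent is a callable (in the lab, a wrapper around a chat completion), and the editor's verdict drives a revision loop. The stub agents and the verdict shape here are illustrative assumptions:

```python
def run_pipeline(topic, researcher, writer, editor, max_revisions=3):
    """Sequential Researcher -> Writer -> Editor pipeline with a feedback loop.
    `editor` is assumed to return {"approved": bool, "feedback": str}."""
    notes = researcher(topic)
    draft = writer(topic, notes)
    for _ in range(max_revisions):
        verdict = editor(draft)
        if verdict["approved"]:
            return draft
        # Hand the editor's feedback back to the writer for a revision.
        draft = writer(topic, notes + "\nEditor feedback: " + verdict["feedback"])
    return draft  # give up after max_revisions and return the latest draft

# Stub agents standing in for real LLM-backed agents.
calls = {"n": 0}
def editor(draft):
    calls["n"] += 1
    return {"approved": calls["n"] >= 2, "feedback": "shorter"}

researcher = lambda topic: f"notes on {topic}"
writer = lambda topic, notes: f"article about {topic} using {notes}"
result = run_pipeline("local AI", researcher, writer, editor)
```

Bounding the loop with `max_revisions` matters in practice: a small local model may never fully satisfy a strict editor, and the cap keeps the pipeline from spinning.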
Part 7: Zava Creative Writer - Capstone Application
Lab guide: labs/part7-zava-creative-writer.md
- A production-style multi-agent app with 4 specialised agents
- Sequential pipeline with evaluator-driven feedback loops
- Streaming output, product catalog search, structured JSON hand-offs
- Full implementation in Python (FastAPI), JavaScript (Node.js CLI), and C# (.NET console)
Code samples:
| Language | Directory | Description |
|---|---|---|
| Python | zava-creative-writer-local/src/api/ | FastAPI web service with orchestrator |
| JavaScript | zava-creative-writer-local/src/javascript/ | Node.js CLI application |
| C# | zava-creative-writer-local/src/csharp/ | .NET 9 console application |
Part 8: Evaluation-Led Development
Lab guide: labs/part8-evaluation-led-development.md
- Build a systematic evaluation framework for AI agents using golden datasets
- Rule-based checks (length, keyword coverage, forbidden terms) + LLM-as-judge scoring
- Side-by-side comparison of prompt variants with aggregate scorecards
- Extends the Zava Editor agent pattern from Part 7 into an offline test suite
- Python, JavaScript, and C# tracks
Code samples:
| Language | File | Description |
|---|---|---|
| Python | python/foundry-local-eval.py | Evaluation framework |
| C# | csharp/AgentEvaluation.cs | Evaluation framework |
| JavaScript | javascript/foundry-local-eval.mjs | Evaluation framework |
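The rule-based half of the evaluation framework (length, keyword coverage, forbidden terms) needs no model at all and can be sketched directly; the golden-case field names below are illustrative, and LLM-as-judge scoring would slot in as an extra check alongside these:

```python
def rule_checks(answer: str, case: dict) -> dict:
    """Rule-based checks against one golden case: length bounds,
    required keywords, and forbidden terms."""
    text = answer.lower()
    return {
        "length_ok": case["min_len"] <= len(answer) <= case["max_len"],
        "keywords_ok": all(kw.lower() in text for kw in case["must_include"]),
        "no_forbidden": not any(t.lower() in text for t in case["forbidden"]),
    }

def scorecard(results: list[dict]) -> dict:
    """Aggregate pass rate per check across the whole golden dataset."""
    return {check: sum(r[check] for r in results) / len(results) for check in results[0]}

# Two answers evaluated against one golden case.
case = {"min_len": 10, "max_len": 100, "must_include": ["Foundry"], "forbidden": ["cloud"]}
results = [
    rule_checks("Foundry Local runs models on-device.", case),
    rule_checks("Use the cloud API.", case),
]
card = scorecard(results)
```

Running two prompt variants through the same golden dataset and comparing their scorecards gives the side-by-side comparison the lab builds.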
Part 9: Voice Transcription with Whisper
Lab guide: labs/part9-whisper-voice-transcription.md
- Speech-to-text transcription using OpenAI Whisper running locally
- Privacy-first audio processing - audio never leaves your device
- Python, JavaScript, and C# tracks with `client.audio.transcriptions.create()` (Python/JS) and `AudioClient.TranscribeAudioAsync()` (C#)
- Includes Zava-themed sample audio files for hands-on practice
Code samples:
| Language | File | Description |
|---|---|---|
| Python | python/foundry-local-whisper.py | Whisper voice transcription |
| C# | csharp/WhisperTranscription.cs | Whisper voice transcription |
| JavaScript | javascript/foundry-local-whisper.mjs | Whisper voice transcription |
Note: This lab uses the Foundry Local SDK to programmatically download and load the Whisper model, then sends audio to the local OpenAI-compatible endpoint for transcription. The Whisper model (`whisper`) is listed in the Foundry Local catalog and runs entirely on-device - no cloud API keys or network access required.
Part 10: Using Custom or Hugging Face Models
Lab guide: labs/part10-custom-models.md
- Compiling Hugging Face models to optimised ONNX format using the ONNX Runtime GenAI model builder
- Hardware-specific compilation (CPU, NVIDIA GPU, DirectML, WebGPU) and quantisation (int4, fp16, bf16)
- Creating chat-template configuration files for Foundry Local
- Adding compiled models to the Foundry Local cache
- Running custom models via the CLI, REST API, and OpenAI SDK
- Reference example: compiling Qwen/Qwen3-0.6B end-to-end
Part 11: Tool Calling with Local Models
Lab guide: labs/part11-tool-calling.md
- Enable local models to call external functions (tool/function calling)
- Define tool schemas using the OpenAI function-calling format
- Handle the multi-turn tool-calling conversation flow
- Execute tool calls locally and return results to the model
- Choose the right model for tool-calling scenarios (Qwen 2.5, Phi-4-mini)
- Use the SDK's native `ChatClient` for tool calling (JavaScript)
Code samples:
| Language | File | Description |
|---|---|---|
| Python | python/foundry-local-tool-calling.py | Tool calling with weather/population tools |
| C# | csharp/ToolCalling.cs | Tool calling with .NET |
| JavaScript | javascript/foundry-local-tool-calling.mjs | Tool calling with ChatClient |
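The local half of the tool-calling flow - a schema in the OpenAI function-calling format, plus a dispatcher that executes the model's requested call and builds the `role="tool"` reply - can be sketched as below. The weather tool and its return value are stand-ins for the lab's actual tools:

```python
import json

# A tool schema in the OpenAI function-calling format (illustrative weather tool).
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def get_weather(city: str) -> str:
    # Stub implementation; a real tool would query a local data source.
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def execute_tool_call(tool_call: dict) -> dict:
    """Execute a model-requested tool call locally and build the message
    that goes back to the model as the next 'tool' turn."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])  # arguments arrive as a JSON string
    result = TOOLS[name](**args)
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": result}
```

In the full flow you send `tools=[WEATHER_TOOL]` with the chat request, run `execute_tool_call` on each `tool_calls` entry the model emits, append the resulting messages, and call the model again for its final answer.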
Part 12: Building a Web UI for the Zava Creative Writer
Lab guide: labs/part12-zava-ui.md
- Add a browser-based front end to the Zava Creative Writer
- Serve the shared UI from Python (FastAPI), JavaScript (Node.js HTTP), and C# (ASP.NET Core)
- Consume streaming NDJSON in the browser with the Fetch API and ReadableStream
- Live agent status badges and real-time article text streaming
Code (shared UI):
| File | Description |
|---|---|
| zava-creative-writer-local/ui/index.html | Page layout |
| zava-creative-writer-local/ui/style.css | Styling |
| zava-creative-writer-local/ui/app.js | Stream reader and DOM update logic |
Backend additions:
| Language | File | Description |
|---|---|---|
| Python | zava-creative-writer-local/src/api/main.py | Updated to serve static UI |
| JavaScript | zava-creative-writer-local/src/javascript/server.mjs | New HTTP server wrapping the orchestrator |
| C# | zava-creative-writer-local/src/csharp-web/Program.cs | New ASP.NET Core minimal API project |
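The browser consumes the NDJSON stream with the Fetch API and a ReadableStream, but the framing logic is the same in any language: network chunks can split an event mid-line, so you buffer until a newline completes one. A Python sketch of that reassembly (the event shapes in the usage example are hypothetical):

```python
import json

def parse_ndjson(chunks):
    """Reassemble NDJSON events from arbitrary network chunks.
    Chunks may split mid-line, so buffer until a newline completes an event."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)

# Two events arriving split across chunk boundaries.
chunks = ['{"agent": "writer", "sta', 'tus": "running"}\n{"text": "Once"}\n']
events = list(parse_ndjson(chunks))
```

The browser-side `app.js` does the equivalent with `TextDecoder` and string buffering before updating the status badges and article text.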
Part 13: Workshop Complete
Lab guide: labs/part13-workshop-complete.md
- Summary of everything you have built across all 12 parts
- Further ideas for extending your applications
- Links to resources and documentation
Project Structure
├── python/                              # Python examples
│   ├── foundry-local.py                 # Basic chat
│   ├── foundry-local-with-agf.py        # Single agent (AGF)
│   ├── foundry-local-rag.py             # RAG pipeline
│   ├── foundry-local-multi-agent.py     # Multi-agent workflow
│   ├── foundry-local-eval.py            # Agent evaluation framework
│   ├── foundry-local-whisper.py         # Whisper voice transcription
│   ├── foundry-local-tool-calling.py    # Tool/function calling
│   └── requirements.txt
├── csharp/                              # C# examples
│   ├── Program.cs                       # CLI router (chat|rag|agent|multi|eval|whisper|toolcall)
│   ├── BasicChat.cs                     # Basic chat
│   ├── RagPipeline.cs                   # RAG pipeline
│   ├── SingleAgent.cs                   # Single agent (ChatAgent pattern)
│   ├── MultiAgent.cs                    # Multi-agent workflow
│   ├── AgentEvaluation.cs               # Agent evaluation framework
│   ├── WhisperTranscription.cs          # Whisper voice transcription
│   ├── ToolCalling.cs                   # Tool/function calling
│   └── csharp.csproj
├── javascript/                          # JavaScript examples
│   ├── foundry-local.mjs                # Basic chat
│   ├── foundry-local-with-agent.mjs     # Single agent
│   ├── foundry-local-rag.mjs            # RAG pipeline
│   ├── foundry-local-multi-agent.mjs    # Multi-agent workflow
│   ├── foundry-local-eval.mjs           # Agent evaluation framework
│   ├── foundry-local-whisper.mjs        # Whisper voice transcription
│   ├── foundry-local-tool-calling.mjs   # Tool/function calling
│   └── package.json
├── zava-creative-writer-local/          # Production multi-agent app
│   ├── ui/                              # Shared browser UI (Part 12)
│   │   ├── index.html                   # Page layout
│   │   ├── style.css                    # Styling
│   │   └── app.js                       # Stream reader and DOM updates
│   └── src/
│       ├── api/                         # Python FastAPI service
│       │   ├── main.py                  # FastAPI server (serves UI)
│       │   ├── orchestrator.py          # Pipeline coordinator
│       │   ├── foundry_config.py        # Shared Foundry Local config
│       │   ├── requirements.txt
│       │   └── agents/                  # Researcher, Product, Writer, Editor
│       ├── javascript/                  # Node.js CLI and web server
│       │   ├── main.mjs                 # CLI entry point
│       │   ├── server.mjs               # HTTP server with UI (Part 12)
│       │   ├── foundryConfig.mjs
│       │   └── package.json
│       ├── csharp/                      # .NET 9 console app
│       │   ├── Program.cs
│       │   └── ZavaCreativeWriter.csproj
│       └── csharp-web/                  # .NET 9 web API (Part 12)
│           ├── Program.cs
│           └── ZavaCreativeWriterWeb.csproj
├── labs/                                # Step-by-step lab guides
│   ├── part1-getting-started.md
│   ├── part2-foundry-local-sdk.md
│   ├── part3-sdk-and-apis.md
│   ├── part4-rag-fundamentals.md
│   ├── part5-single-agents.md
│   ├── part6-multi-agent-workflows.md
│   ├── part7-zava-creative-writer.md
│   ├── part8-evaluation-led-development.md
│   ├── part9-whisper-voice-transcription.md
│   ├── part10-custom-models.md
│   ├── part11-tool-calling.md
│   ├── part12-zava-ui.md
│   └── part13-workshop-complete.md
├── samples/
│   ├── audio/                           # Zava-themed WAV files for Part 9
│   ├── generate_samples.py              # TTS script (pyttsx3) to create WAVs
│   └── README.md                        # Sample descriptions
├── AGENTS.md                            # Coding agent instructions
├── package.json                         # Root devDependency (mermaid-cli)
├── LICENSE                              # MIT licence
└── README.md
Resources
| Resource | Link |
|---|---|
| Foundry Local website | foundrylocal.ai |
| Model catalog | foundrylocal.ai/models |
| Foundry Local GitHub | github.com/microsoft/foundry-local |
| Getting started guide | Microsoft Learn - Foundry Local |
| Foundry Local SDK Reference | Microsoft Learn - SDK Reference |
| Microsoft Agent Framework | Microsoft Learn - Agent Framework |
| OpenAI Whisper | github.com/openai/whisper |
| ONNX Runtime GenAI | github.com/microsoft/onnxruntime-genai |
Licence
This workshop material is provided for educational purposes.
Happy building!