Foundry Local Workshop - Build AI Apps On-Device

April 21, 2026 ยท View on GitHub

Foundry Local

Foundry Local Workshop - Build AI Apps On-Device

A hands-on workshop for running language models on your own machine and building intelligent applications with Foundry Local and the Microsoft Agent Framework.

What is Foundry Local? Foundry Local is a lightweight runtime that lets you download, manage, and serve language models entirely on your hardware. It exposes an OpenAI-compatible API so any tool or SDK that speaks OpenAI can connect - no cloud account required.

๐ŸŒ Multi-Language Support

Supported via GitHub Action (Automated & Always Up-to-Date)

Arabic | Bengali | Bulgarian | Burmese (Myanmar) | Chinese (Simplified) | Chinese (Traditional, Hong Kong) | Chinese (Traditional, Macau) | Chinese (Traditional, Taiwan) | Croatian | Czech | Danish | Dutch | Estonian | Finnish | French | German | Greek | Hebrew | Hindi | Hungarian | Indonesian | Italian | Japanese | Kannada | Khmer | Korean | Lithuanian | Malay | Malayalam | Marathi | Nepali | Nigerian Pidgin | Norwegian | Persian (Farsi) | Polish | Portuguese (Brazil) | Portuguese (Portugal) | Punjabi (Gurmukhi) | Romanian | Russian | Serbian (Cyrillic) | Slovak | Slovenian | Spanish | Swahili | Swedish | Tagalog (Filipino) | Tamil | Telugu | Thai | Turkish | Ukrainian | Urdu | Vietnamese

Prefer to Clone Locally?

This repository includes 50+ language translations which significantly increases the download size. To clone without translations, use sparse checkout:

Bash / macOS / Linux:

git clone --filter=blob:none --sparse https://github.com/microsoft-foundry/Foundry-Local-Lab.git
cd Foundry-Local-Lab
git sparse-checkout set --no-cone '/*' '!translations' '!translated_images'

CMD (Windows):

git clone --filter=blob:none --sparse https://github.com/microsoft-foundry/Foundry-Local-Lab.git
cd Foundry-Local-Lab
git sparse-checkout set --no-cone "/*" "!translations" "!translated_images"

This gives you everything you need to complete the course with a much faster download.


Learning Objectives

By the end of this workshop you will be able to:

#Objective
1Install Foundry Local and manage models with the CLI
2Master the Foundry Local SDK API for programmatic model management
3Connect to the local inference server using the Python, JavaScript, and C# SDKs
4Build a Retrieval-Augmented Generation (RAG) pipeline that grounds answers in your own data
5Create AI agents with persistent instructions and personas
6Orchestrate multi-agent workflows with feedback loops
7Explore a production capstone app - the Zava Creative Writer
8Build evaluation frameworks with golden datasets and LLM-as-judge scoring
9Transcribe audio with Whisper - speech-to-text on-device using the Foundry Local SDK
10Compile and run custom or Hugging Face models with ONNX Runtime GenAI and Foundry Local
11Enable local models to call external functions with the tool-calling pattern
12Build a browser-based UI for the Zava Creative Writer with real-time streaming

Prerequisites

RequirementDetails
Hardware8 GB RAM minimum (16 GB recommended); AVX2-capable CPU or a supported GPU
OSWindows 10/11 (x64/ARM), Windows Server 2025, or macOS 13+
Foundry Local CLIInstall via winget install Microsoft.FoundryLocal (Windows) or brew tap microsoft/foundrylocal && brew install foundrylocal (macOS). See the getting started guide for details.
Language runtimePython 3.9+ and/or .NET 9.0+ and/or Node.js 18+
GitFor cloning this repository

Getting Started

# 1. Clone the repository
git clone https://github.com/microsoft-foundry/foundry-local-lab.git
cd foundry-local-lab

# 2. Verify Foundry Local is installed
foundry model list              # List available models
foundry model run phi-3.5-mini  # Start an interactive chat

# 3. Choose your language track (see Part 2 lab for full setup)
LanguageQuick Start
Pythoncd python && pip install -r requirements.txt && python foundry-local.py
C#cd csharp && dotnet run
JavaScriptcd javascript && npm install && node foundry-local.mjs

Workshop Parts

Part 1: Getting Started with Foundry Local

Lab guide: labs/part1-getting-started.md

  • What is Foundry Local and how it works
  • Installing the CLI on Windows and macOS
  • Exploring models - listing, downloading, running
  • Understanding model aliases and dynamic ports

Part 2: Foundry Local SDK Deep Dive

Lab guide: labs/part2-foundry-local-sdk.md

  • Why use the SDK over the CLI for application development
  • Full SDK API reference for Python, JavaScript, and C#
  • Service management, catalog browsing, model lifecycle (download, load, unload)
  • Quick-start patterns: Python constructor bootstrap, JavaScript init(), C# CreateAsync()
  • FoundryModelInfo metadata, aliases, and hardware-optimal model selection

Part 3: SDKs and APIs

Lab guide: labs/part3-sdk-and-apis.md

  • Connecting to Foundry Local from Python, JavaScript, and C#
  • Using the Foundry Local SDK to manage the service programmatically
  • Streaming chat completions via the OpenAI-compatible API
  • SDK method reference for each language

Code samples:

LanguageFileDescription
Pythonpython/foundry-local.pyBasic streaming chat
C#csharp/BasicChat.csStreaming chat with .NET
JavaScriptjavascript/foundry-local.mjsStreaming chat with Node.js

Part 4: Retrieval-Augmented Generation (RAG)

Lab guide: labs/part4-rag-fundamentals.md

  • What is RAG and why it matters
  • Building an in-memory knowledge base
  • Keyword-overlap retrieval with scoring
  • Composing grounded system prompts
  • Running a complete RAG pipeline on-device

Code samples:

LanguageFile
Pythonpython/foundry-local-rag.py
C#csharp/RagPipeline.cs
JavaScriptjavascript/foundry-local-rag.mjs

Part 5: Building AI Agents

Lab guide: labs/part5-single-agents.md

  • What is an AI agent (vs. a raw LLM call)
  • The ChatAgent pattern and the Microsoft Agent Framework
  • System instructions, personas, and multi-turn conversations
  • Structured output (JSON) from agents

Code samples:

LanguageFileDescription
Pythonpython/foundry-local-with-agf.pySingle agent with Agent Framework
C#csharp/SingleAgent.csSingle agent (ChatAgent pattern)
JavaScriptjavascript/foundry-local-with-agent.mjsSingle agent (ChatAgent pattern)

Part 6: Multi-Agent Workflows

Lab guide: labs/part6-multi-agent-workflows.md

  • Multi-agent pipelines: Researcher โ†’ Writer โ†’ Editor
  • Sequential orchestration and feedback loops
  • Shared configuration and structured hand-offs
  • Designing your own multi-agent workflow

Code samples:

LanguageFileDescription
Pythonpython/foundry-local-multi-agent.pyThree-agent pipeline
C#csharp/MultiAgent.csThree-agent pipeline
JavaScriptjavascript/foundry-local-multi-agent.mjsThree-agent pipeline

Part 7: Zava Creative Writer - Capstone Application

Lab guide: labs/part7-zava-creative-writer.md

  • A production-style multi-agent app with 4 specialised agents
  • Sequential pipeline with evaluator-driven feedback loops
  • Streaming output, product catalog search, structured JSON hand-offs
  • Full implementation in Python (FastAPI), JavaScript (Node.js CLI), and C# (.NET console)

Code samples:

LanguageDirectoryDescription
Pythonzava-creative-writer-local/src/api/FastAPI web service with orchestrator
JavaScriptzava-creative-writer-local/src/javascript/Node.js CLI application
C#zava-creative-writer-local/src/csharp/.NET 9 console application

Part 8: Evaluation-Led Development

Lab guide: labs/part8-evaluation-led-development.md

  • Build a systematic evaluation framework for AI agents using golden datasets
  • Rule-based checks (length, keyword coverage, forbidden terms) + LLM-as-judge scoring
  • Side-by-side comparison of prompt variants with aggregate scorecards
  • Extends the Zava Editor agent pattern from Part 7 into an offline test suite
  • Python, JavaScript, and C# tracks

Code samples:

LanguageFileDescription
Pythonpython/foundry-local-eval.pyEvaluation framework
C#csharp/AgentEvaluation.csEvaluation framework
JavaScriptjavascript/foundry-local-eval.mjsEvaluation framework

Part 9: Voice Transcription with Whisper

Lab guide: labs/part9-whisper-voice-transcription.md

  • Speech-to-text transcription using OpenAI Whisper running locally
  • Privacy-first audio processing - audio never leaves your device
  • Python, JavaScript, and C# tracks with client.audio.transcriptions.create() (Python/JS) and AudioClient.TranscribeAudioAsync() (C#)
  • Includes Zava-themed sample audio files for hands-on practice

Code samples:

LanguageFileDescription
Pythonpython/foundry-local-whisper.pyWhisper voice transcription
C#csharp/WhisperTranscription.csWhisper voice transcription
JavaScriptjavascript/foundry-local-whisper.mjsWhisper voice transcription

Note: This lab uses the Foundry Local SDK to programmatically download and load the Whisper model, then sends audio to the local OpenAI-compatible endpoint for transcription. The Whisper model (whisper) is listed in the Foundry Local catalog and runs entirely on-device - no cloud API keys or network access required.


Part 10: Using Custom or Hugging Face Models

Lab guide: labs/part10-custom-models.md

  • Compiling Hugging Face models to optimised ONNX format using the ONNX Runtime GenAI model builder
  • Hardware-specific compilation (CPU, NVIDIA GPU, DirectML, WebGPU) and quantisation (int4, fp16, bf16)
  • Creating chat-template configuration files for Foundry Local
  • Adding compiled models to the Foundry Local cache
  • Running custom models via the CLI, REST API, and OpenAI SDK
  • Reference example: compiling Qwen/Qwen3-0.6B end-to-end

Part 11: Tool Calling with Local Models

Lab guide: labs/part11-tool-calling.md

  • Enable local models to call external functions (tool/function calling)
  • Define tool schemas using the OpenAI function-calling format
  • Handle the multi-turn tool-calling conversation flow
  • Execute tool calls locally and return results to the model
  • Choose the right model for tool-calling scenarios (Qwen 2.5, Phi-4-mini)
  • Use the SDK's native ChatClient for tool calling (JavaScript)

Code samples:

LanguageFileDescription
Pythonpython/foundry-local-tool-calling.pyTool calling with weather/population tools
C#csharp/ToolCalling.csTool calling with .NET
JavaScriptjavascript/foundry-local-tool-calling.mjsTool calling with ChatClient

Part 12: Building a Web UI for the Zava Creative Writer

Lab guide: labs/part12-zava-ui.md

  • Add a browser-based front end to the Zava Creative Writer
  • Serve the shared UI from Python (FastAPI), JavaScript (Node.js HTTP), and C# (ASP.NET Core)
  • Consume streaming NDJSON in the browser with the Fetch API and ReadableStream
  • Live agent status badges and real-time article text streaming

Code (shared UI):

FileDescription
zava-creative-writer-local/ui/index.htmlPage layout
zava-creative-writer-local/ui/style.cssStyling
zava-creative-writer-local/ui/app.jsStream reader and DOM update logic

Backend additions:

LanguageFileDescription
Pythonzava-creative-writer-local/src/api/main.pyUpdated to serve static UI
JavaScriptzava-creative-writer-local/src/javascript/server.mjsNew HTTP server wrapping the orchestrator
C#zava-creative-writer-local/src/csharp-web/Program.csNew ASP.NET Core minimal API project

Part 13: Workshop Complete

Lab guide: labs/part13-workshop-complete.md

  • Summary of everything you have built across all 12 parts
  • Further ideas for extending your applications
  • Links to resources and documentation

Project Structure

โ”œโ”€โ”€ python/                        # Python examples
โ”‚   โ”œโ”€โ”€ foundry-local.py           # Basic chat
โ”‚   โ”œโ”€โ”€ foundry-local-with-agf.py  # Single agent (AGF)
โ”‚   โ”œโ”€โ”€ foundry-local-rag.py       # RAG pipeline
โ”‚   โ”œโ”€โ”€ foundry-local-multi-agent.py # Multi-agent workflow
โ”‚   โ”œโ”€โ”€ foundry-local-eval.py      # Agent evaluation framework
โ”‚   โ”œโ”€โ”€ foundry-local-whisper.py   # Whisper voice transcription
โ”‚   โ”œโ”€โ”€ foundry-local-tool-calling.py # Tool/function calling
โ”‚   โ””โ”€โ”€ requirements.txt
โ”œโ”€โ”€ csharp/                        # C# examples
โ”‚   โ”œโ”€โ”€ Program.cs                 # CLI router (chat|rag|agent|multi|eval|whisper|toolcall)
โ”‚   โ”œโ”€โ”€ BasicChat.cs               # Basic chat
โ”‚   โ”œโ”€โ”€ RagPipeline.cs             # RAG pipeline
โ”‚   โ”œโ”€โ”€ SingleAgent.cs             # Single agent (ChatAgent pattern)
โ”‚   โ”œโ”€โ”€ MultiAgent.cs              # Multi-agent workflow
โ”‚   โ”œโ”€โ”€ AgentEvaluation.cs         # Agent evaluation framework
โ”‚   โ”œโ”€โ”€ WhisperTranscription.cs    # Whisper voice transcription
โ”‚   โ”œโ”€โ”€ ToolCalling.cs             # Tool/function calling
โ”‚   โ””โ”€โ”€ csharp.csproj
โ”œโ”€โ”€ javascript/                    # JavaScript examples
โ”‚   โ”œโ”€โ”€ foundry-local.mjs          # Basic chat
โ”‚   โ”œโ”€โ”€ foundry-local-with-agent.mjs # Single agent
โ”‚   โ”œโ”€โ”€ foundry-local-rag.mjs     # RAG pipeline
โ”‚   โ”œโ”€โ”€ foundry-local-multi-agent.mjs # Multi-agent workflow
โ”‚   โ”œโ”€โ”€ foundry-local-eval.mjs     # Agent evaluation framework
โ”‚   โ”œโ”€โ”€ foundry-local-whisper.mjs  # Whisper voice transcription
โ”‚   โ”œโ”€โ”€ foundry-local-tool-calling.mjs # Tool/function calling
โ”‚   โ””โ”€โ”€ package.json
โ”œโ”€โ”€ zava-creative-writer-local/ # Production multi-agent app
โ”‚   โ”œโ”€โ”€ ui/                        # Shared browser UI (Part 12)
โ”‚   โ”‚   โ”œโ”€โ”€ index.html             # Page layout
โ”‚   โ”‚   โ”œโ”€โ”€ style.css              # Styling
โ”‚   โ”‚   โ””โ”€โ”€ app.js                 # Stream reader and DOM updates
โ”‚   โ””โ”€โ”€ src/
โ”‚       โ”œโ”€โ”€ api/                   # Python FastAPI service
โ”‚       โ”‚   โ”œโ”€โ”€ main.py            # FastAPI server (serves UI)
โ”‚       โ”‚   โ”œโ”€โ”€ orchestrator.py    # Pipeline coordinator
โ”‚       โ”‚   โ”œโ”€โ”€ foundry_config.py  # Shared Foundry Local config
โ”‚       โ”‚   โ”œโ”€โ”€ requirements.txt
โ”‚       โ”‚   โ””โ”€โ”€ agents/            # Researcher, Product, Writer, Editor
โ”‚       โ”œโ”€โ”€ javascript/            # Node.js CLI and web server
โ”‚       โ”‚   โ”œโ”€โ”€ main.mjs           # CLI entry point
โ”‚       โ”‚   โ”œโ”€โ”€ server.mjs         # HTTP server with UI (Part 12)
โ”‚       โ”‚   โ”œโ”€โ”€ foundryConfig.mjs
โ”‚       โ”‚   โ””โ”€โ”€ package.json
โ”‚       โ”œโ”€โ”€ csharp/                # .NET 9 console app
โ”‚       โ”‚   โ”œโ”€โ”€ Program.cs
โ”‚       โ”‚   โ””โ”€โ”€ ZavaCreativeWriter.csproj
โ”‚       โ””โ”€โ”€ csharp-web/            # .NET 9 web API (Part 12)
โ”‚           โ”œโ”€โ”€ Program.cs
โ”‚           โ””โ”€โ”€ ZavaCreativeWriterWeb.csproj
โ”œโ”€โ”€ labs/                          # Step-by-step lab guides
โ”‚   โ”œโ”€โ”€ part1-getting-started.md
โ”‚   โ”œโ”€โ”€ part2-foundry-local-sdk.md
โ”‚   โ”œโ”€โ”€ part3-sdk-and-apis.md
โ”‚   โ”œโ”€โ”€ part4-rag-fundamentals.md
โ”‚   โ”œโ”€โ”€ part5-single-agents.md
โ”‚   โ”œโ”€โ”€ part6-multi-agent-workflows.md
โ”‚   โ”œโ”€โ”€ part7-zava-creative-writer.md
โ”‚   โ”œโ”€โ”€ part8-evaluation-led-development.md
โ”‚   โ”œโ”€โ”€ part9-whisper-voice-transcription.md
โ”‚   โ”œโ”€โ”€ part10-custom-models.md
โ”‚   โ”œโ”€โ”€ part11-tool-calling.md
โ”‚   โ”œโ”€โ”€ part12-zava-ui.md
โ”‚   โ””โ”€โ”€ part13-workshop-complete.md
โ”œโ”€โ”€ samples/
โ”‚   โ””โ”€โ”€ audio/                     # Zava-themed WAV files for Part 9
โ”‚       โ”œโ”€โ”€ generate_samples.py    # TTS script (pyttsx3) to create WAVs
โ”‚       โ””โ”€โ”€ README.md              # Sample descriptions
โ”œโ”€โ”€ AGENTS.md                      # Coding agent instructions
โ”œโ”€โ”€ package.json                   # Root devDependency (mermaid-cli)
โ”œโ”€โ”€ LICENSE                        # MIT licence
โ””โ”€โ”€ README.md

Resources

ResourceLink
Foundry Local websitefoundrylocal.ai
Model catalogfoundrylocal.ai/models
Foundry Local GitHubgithub.com/microsoft/foundry-local
Getting started guideMicrosoft Learn - Foundry Local
Foundry Local SDK ReferenceMicrosoft Learn - SDK Reference
Microsoft Agent FrameworkMicrosoft Learn - Agent Framework
OpenAI Whispergithub.com/openai/whisper
ONNX Runtime GenAIgithub.com/microsoft/onnxruntime-genai

Licence

This workshop material is provided for educational purposes.


Happy building! ๐Ÿš€