Workshop Notebooks - Troubleshooting Guide
October 8, 2025 ยท View on GitHub
Table of Contents
- Common Issues and Solutions
- Session 04 Specific Issues
- Session 05 Specific Issues
- Session 06 Specific Issues
- Hardware-Specific Issues
- Diagnostic Commands
- Getting Help
Common Issues and Solutions
๐ด CUDA Out of Memory
Error Message:
CUDA failure 2: out of memory
Root Cause: GPU doesn't have enough VRAM to load the model.
Solutions:
Option 1: Use CPU Variants (Recommended)
# In your notebook, update the CATALOG to use CPU variants
CATALOG = {
'phi-4-mini-cpu': {'capabilities':['general','summarize'],'priority':2},
'qwen2.5-0.5b-cpu': {'capabilities':['classification','fast'],'priority':1},
'phi-3.5-mini-cpu': {'capabilities':['code','refactor'],'priority':3},
}
Option 2: Use Smaller Models on GPU
# Use only the smallest model
CATALOG = {
'qwen2.5-0.5b': {'capabilities':['general','code','summarize','classification'],'priority':1},
}
Option 3: Clear GPU Memory
# Close other GPU-intensive applications
# Check GPU memory usage
nvidia-smi
# Restart Foundry Local service
foundry service stop
foundry service start
Option 4: Increase GPU Memory (if possible)
- Close browser tabs, video editors, or other GPU apps
- Reduce Windows visual effects
- Use dedicated GPU if you have integrated + dedicated
๐ด APIConnectionError: Connection Error
Error Message:
APIConnectionError: Connection error
[Retry 1/2] Setup failed for 'phi-4-mini': APIConnectionError: Connection error.
Root Cause: Foundry Local service is not running or not accessible.
Solutions:
Step 1: Check Service Status
foundry service status
Step 2: Start Service if Stopped
foundry service start
Step 3: Verify Endpoint
# Check what port the service is using
foundry service status
# Test with curl (adjust port if different)
curl http://localhost:59959/health
curl http://localhost:55769/health
Step 4: Load Required Models
foundry model run phi-4-mini
foundry model run qwen2.5-3b
foundry model run phi-3.5-mini
Step 5: Restart Notebook Kernel
After starting the service and loading models, restart the notebook kernel and re-run all cells.
๐ด Model Not Found in Catalog
Error Message:
ValueError: Model phi-3.5-mini-cpu not found in the catalog.
[ERROR] Model 'phi-4-mini' not found in the catalog
Root Cause: Model isn't downloaded or loaded in Foundry Local.
Solutions:
Option 1: Download and Load Models
# List available models
foundry model ls
# Download the model if not present
foundry model download phi-4-mini
foundry model download qwen2.5-3b
foundry model download phi-3.5-mini
# Load the model into memory
foundry model run phi-4-mini
foundry model run qwen2.5-3b
foundry model run phi-3.5-mini
Option 2: Use Auto-Selection Mode
Update your CATALOG to use base model names (without -cpu suffix):
CATALOG = {
'phi-4-mini': {'capabilities':['general','summarize'],'priority':2},
'qwen2.5-0.5b': {'capabilities':['classification','fast'],'priority':1},
'phi-3.5-mini': {'capabilities':['code','refactor'],'priority':3},
}
Foundry Local SDK will automatically select the best variant (CPU, GPU, or NPU) for your hardware.
๐ด HttpResponseError: 500 - Failed Loading Model
Error Message:
HttpResponseError: 500 - Failed loading model
Root Cause: Model file is corrupted or incompatible with hardware.
Solutions:
Redownload Model
# Remove corrupted model
foundry model remove phi-3.5-mini
# Download fresh copy
foundry model download phi-3.5-mini
Use Different Variant
# Try CPU variant instead
foundry model download phi-3.5-mini-cpu
๐ด ImportError: No Module Found
Error Message:
ModuleNotFoundError: No module named 'foundry_local'
Root Cause: foundry-local-sdk package not installed.
Solutions:
# Install SDK
pip install foundry-local-sdk
# Verify installation
pip show foundry-local-sdk
python -c "from foundry_local import FoundryLocalManager; print('SDK OK')"
๏ฟฝ Slow First Request
Symptom: First completion takes 30+ seconds, subsequent requests are fast.
Root Cause: This is normal behavior - the service is loading the model into memory on first request.
Solutions:
Pre-load Models
# Download and load all models you'll use before running notebooks
foundry model download phi-4-mini
foundry model download qwen2.5-3b
foundry model download phi-3.5-mini
foundry model run phi-4-mini
foundry model run qwen2.5-3b
Keep Service Running
# Start service manually and leave it running
foundry service start
Session 04 Specific Issues
Wrong Port Configuration
Issue: Notebook connecting to wrong port (55769 vs 59959 vs 57127)
Solution:
- Check which port Foundry Local is using:
foundry service status
# Note the port number displayed
- Update Cell 10 in the notebook:
ENDPOINT = os.getenv('FOUNDRY_LOCAL_ENDPOINT', 'http://localhost:59959/v1')
# Replace 59959 with your actual port
- Restart kernel and re-run cells 6, 8, 10, 12, 16, 20, 22
Wrong Model Selection
Issue: Notebook showing gpt-oss-20b or qwen2.5-7b instead of qwen2.5-3b
Solution:
- Restart notebook kernel (clears old variable state)
- Re-run Cell 10 (Set Model Aliases)
- Re-run Cell 12 (Display Configuration)
- Verify: Cell 12 should show
LLM Model: qwen2.5-3b
Diagnostic Cell Fails
Issue: Cell 16 shows "โ Foundry Local service not found!"
Solution:
- Verify service is running:
foundry service status
- Test endpoint manually:
curl http://localhost:59959/v1/models
-
If curl works but notebook doesn't:
- Restart notebook kernel
- Re-run cells in order: 6, 8, 10, 12, 14, 16
-
If curl fails:
- Start service:
foundry service start - Load models:
foundry model run phi-4-miniandfoundry model run qwen2.5-3b
- Start service:
Pre-flight Check Fails
Issue: Cell 20 shows connection errors even though diagnostic passed
Solution:
- Check models are loaded:
foundry model ls
# Should show phi-4-mini and qwen2.5-3b
- If models missing:
foundry model run phi-4-mini
foundry model run qwen2.5-3b
- Re-run Cell 20 after models are loaded
Comparison Fails with None Values
Issue: Cell 22 completes but shows latency as None
Solution:
- Check that pre-flight passed first (Cell 20)
- Re-run Cell 22 - models may need to warm up on first request
- Verify both models are loaded:
foundry model ls
Session 05 Specific Issues
Agent Using Wrong Model
Issue: Agent not using expected model
Solution:
Verify configuration:
# Check which models are assigned
print(f"Researcher: {researcher.model_id}")
print(f"Editor: {editor.model_id}")
Restart kernel if models are incorrect.
Memory Context Overflow
Issue: Agent responses degrading over time
Solution: Already handled automatically - agents keep only last 6 messages in memory.
Session 06 Specific Issues
CPU vs GPU Model Confusion
Issue: Getting CUDA errors when using default configuration
Solution: The default configuration now uses CPU models. If you see CUDA errors:
- Verify you're using the default CPU catalog:
# Cell should show CPU variants
CATALOG = {
'phi-4-mini-cpu': {'capabilities':['general','summarize'],'priority':2},
'qwen2.5-0.5b-cpu': {'capabilities':['classification','fast'],'priority':1},
'phi-3.5-mini-cpu': {'capabilities':['code','refactor'],'priority':3},
}
- Restart kernel and re-run all cells
Intent Detection Not Working
Issue: Prompts being routed to wrong models
Solution:
Test intent detection:
# Run the intent detection test cell
for prompt in test_prompts:
intent = detect_intent(prompt)
print(f"{prompt[:50]}... โ {intent}")
Update RULES if patterns need adjustment.
Hardware-Specific Issues
NVIDIA GPU
Check GPU Status:
nvidia-smi
Common Issues:
- Driver outdated: Update NVIDIA drivers
- CUDA version mismatch: Reinstall Foundry Local
- GPU memory fragmented: Restart system
Qualcomm NPU
Check NPU Status:
foundry device info
Common Issues:
- NPU drivers not installed: Install Qualcomm NPU drivers
- NPU variant not available: Use CPU variant
- Windows version too old: Update to latest Windows 11
CPU-Only Systems
Recommended Models:
CATALOG = {
'qwen2.5-0.5b-cpu': {'capabilities':['classification','fast'],'priority':1},
'phi-3.5-mini-cpu': {'capabilities':['code','general'],'priority':2},
}
Performance Tips:
- Use smallest models
- Reduce max_tokens
- Increase patience for first request
Diagnostic Commands
Check Everything
# Service status
foundry service status
# List models
foundry model ls
# Device info
foundry device info
# Version info
foundry --version
# Health check
curl http://localhost:55769/health
In Python
# Check SDK import
try:
from foundry_local import FoundryLocalManager
print('โ SDK imported')
except ImportError as e:
print(f'โ SDK import failed: {e}')
# Check service connectivity
from openai import OpenAI
try:
client = OpenAI(base_url='http://localhost:59959/v1', api_key='test')
models = client.models.list()
print(f'โ Service reachable, {len(list(models.data))} models available')
except Exception as e:
print(f'โ Service not reachable: {e}')
Getting Help
1. Check Logs
# Windows
foundry service logs
# Or check Windows Event Viewer
# Application Logs -> Microsoft-FoundryLocal
2. GitHub Issues
- Search existing issues: https://github.com/microsoft/Foundry-Local/issues
- Create new issue with:
- Error message (full text)
- Output of
foundry service status - Output of
foundry --version - GPU/NPU info (nvidia-smi, etc.)
- Steps to reproduce
3. Documentation
- Main Repository: https://github.com/microsoft/Foundry-Local
- Python SDK: https://github.com/microsoft/Foundry-Local/tree/main/sdk/python
- CLI Reference: https://github.com/microsoft/Foundry-Local/blob/main/docs/reference/reference-cli.md
- Troubleshooting: https://github.com/microsoft/Foundry-Local/blob/main/docs/reference/reference-troubleshooting.md
Quick Fixes Checklist
When things go wrong, try these in order:
- Check service is running:
foundry service status - Restart service:
foundry service stop && foundry service start - Verify model exists:
foundry model ls | grep phi-4-mini - Load required models:
foundry model run phi-4-mini - Check GPU memory:
nvidia-smi(if NVIDIA) - Try CPU variant: Use
phi-4-mini-cpuinstead ofphi-4-mini - Restart notebook kernel
- Clear notebook outputs and re-run all cells
- Reinstall SDK:
pip install --upgrade --force-reinstall foundry-local-sdk - Reboot system (last resort)
Prevention Tips
Best Practices
-
Always check service first:
foundry service status -
Use CPU variants by default:
CATALOG = { 'phi-4-mini-cpu': {...}, 'qwen2.5-0.5b-cpu': {...}, } -
Pre-load models before running notebooks:
foundry model run phi-4-mini foundry model run qwen2.5-3b -
Keep service running:
- Don't stop/start service unnecessarily
- Let it run in background between sessions
-
Monitor resources:
- Check GPU memory before running
- Close unnecessary GPU applications
- Use Task Manager / nvidia-smi
-
Update regularly:
winget upgrade Microsoft.FoundryLocal pip install --upgrade foundry-local-sdk
Last Updated: October 8, 2025