06.ModelsAsTools.md

January 29, 2026

Overview

Treat AI models as modular, configurable tools that run directly on the device with Foundry Local. This session focuses on practical workflows for privacy-preserving, low-latency inference and shows how to integrate these tools through SDKs, APIs, or the CLI. You will also learn how to scale to Azure AI Foundry when needed.

🔄 Updated for the modern SDK: This module has been updated to follow the patterns in Microsoft's Foundry-Local repository and matches the intelligent routing implementation in samples/06/. The examples now use the modern foundry-local-sdk and advanced model selection strategies.

🏗️ Architecture highlights:

Intelligent model routing: keyword-based selection between general, reasoning, coding, and creative models
Modern SDK integration: uses FoundryLocalManager with automatic service discovery
Environment configuration: flexible model assignment through environment variables
Health monitoring: service validation and model availability checks
Production ready: comprehensive error handling and fallback mechanisms

📁 Local implementation:

samples/06/router.py - intelligent model router with keyword-based selection
samples/06/model_router.ipynb - interactive examples and benchmarks
samples/06/README.md - configuration and usage instructions

References:

Foundry Local documentation: https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/
Integrate with inference SDKs: https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/how-to/how-to-integrate-with-inference-sdks
Compile Hugging Face models: https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/how-to/how-to-compile-hugging-face-models

Learning objectives

Design on-device models-as-tools patterns
Integrate via the OpenAI-compatible REST API or SDKs
Customize models for domain-specific use cases
Plan for hybrid scaling to Azure AI Foundry
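
To make the REST integration objective concrete, here is a minimal sketch that builds an OpenAI-style chat completion request using only the standard library. The base URL http://localhost:8000/v1 and the "local-key" placeholder mirror values used later in this module; the helper name build_chat_request is hypothetical, and the port on your machine may differ.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str,
                       api_key: str = "local-key") -> urllib.request.Request:
    """Build (but do not send) a POST request for the /chat/completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 200,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Assumed endpoint and model alias; adjust both to your local setup.
request = build_chat_request("http://localhost:8000/v1", "phi-4-mini", "Say hello.")
print(request.full_url)

# Sending the request needs a running local service, so it is left commented out:
# with urllib.request.urlopen(request, timeout=60) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Building the request separately from sending it keeps the payload easy to inspect and unit test without a running model.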

Part 1: Intelligent Model Router (Modern Implementation)

Goal: Implement intelligent model selection that routes each request automatically based on the content of the query.

📋 Note: This implementation follows the patterns used in samples/06/router.py, including its keyword-based model selection.

Step 1) Define the modern model router with FoundryLocalManager

# router/intelligent_router.py
from foundry_local import FoundryLocalManager
from openai import OpenAI
from typing import Dict, Any, Optional
import os
import json

class ModelRouter:
    """Intelligent model router that selects appropriate models for different task types."""
    
    def __init__(self):
        self.client = None
        self.base_url = None
        self.tools = self._load_tool_registry()
        self._initialize_client()
    
    def _load_tool_registry(self) -> Dict[str, Dict[str, Any]]:
        """Load tool registry from environment or use defaults."""
        default_tools = {
            "general": {
                "model": os.environ.get("GENERAL_MODEL", "phi-4-mini"),
                "notes": "Fast general-purpose chat and Q&A",
                "temperature": 0.7
            },
            "reasoning": {
                "model": os.environ.get("REASONING_MODEL", "deepseek-r1-7b"),
                "notes": "Step-by-step analysis and logical reasoning",
                "temperature": 0.3
            },
            "code": {
                "model": os.environ.get("CODE_MODEL", "qwen2.5-7b"),
                "notes": "Code generation, debugging, and technical tasks",
                "temperature": 0.2
            },
            "creative": {
                "model": os.environ.get("CREATIVE_MODEL", "phi-4-mini"),
                "notes": "Creative writing and storytelling",
                "temperature": 0.9
            }
        }
        
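        # Check for environment override; merge it over the defaults so a
        # partial registry keeps the remaining tools and fields (e.g. "notes")
        tools_env = os.environ.get("TOOL_REGISTRY")
        if tools_env:
            try:
                overrides = json.loads(tools_env)
                merged = {key: dict(cfg) for key, cfg in default_tools.items()}
                for key, config in overrides.items():
                    merged[key] = {**merged.get(key, {}), **config}
                return merged
            except (json.JSONDecodeError, AttributeError, TypeError):
                print("Warning: Invalid TOOL_REGISTRY JSON, using defaults")
        
        return default_tools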
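
Because the registry can come from an environment variable, it may help to check it before routing any requests. The sketch below is one possible validation pass, not part of samples/06/router.py: the helper name validate_registry is hypothetical, and the 0 to 2 temperature range is an assumption borrowed from the OpenAI chat API convention.

```python
from typing import Any, Dict, List


def validate_registry(tools: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means the registry looks usable."""
    problems: List[str] = []
    for name, config in tools.items():
        if not isinstance(config, dict):
            problems.append(f"{name}: entry must be an object")
            continue
        model = config.get("model")
        if not isinstance(model, str) or not model:
            problems.append(f"{name}: missing 'model'")
        temperature = config.get("temperature", 0.7)
        is_number = isinstance(temperature, (int, float)) and not isinstance(temperature, bool)
        if not is_number or not 0.0 <= temperature <= 2.0:
            problems.append(f"{name}: temperature must be a number between 0 and 2")
    # The router falls back to "general", so that entry should always exist.
    if "general" not in tools:
        problems.append("general: the default tool is not configured")
    return problems


registry = {"general": {"model": "phi-4-mini"}, "code": {"temperature": 5}}
issues = validate_registry(registry)
for issue in issues:
    print(issue)
```

Returning a list of problems instead of raising on the first one lets a startup script report every misconfigured tool at once.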

Step 2) Initialize the client with the modern SDK and service discovery

    def _initialize_client(self):
        """Initialize OpenAI client with Foundry Local or fallback configuration."""
        try:
            from foundry_local import FoundryLocalManager
            # Try to use any available model for client initialization
            first_model = next(iter(self.tools.values()))["model"]
            manager = FoundryLocalManager(first_model)
            
            self.client = OpenAI(
                base_url=manager.endpoint,
                api_key=manager.api_key
            )
            self.base_url = manager.endpoint
            print(f"✅ Foundry Local SDK initialized")
        except Exception as e:
            print(f"Warning: Could not use Foundry SDK ({e}), falling back to manual configuration")
            # Fallback to manual configuration
            self.base_url = os.environ.get("BASE_URL", "http://localhost:8000")
            api_key = os.environ.get("API_KEY", "")
            
            self.client = OpenAI(
                base_url=f"{self.base_url}/v1",
                api_key=api_key
            )
            print(f"Initialized manual configuration at {self.base_url}")
    
    def select_tool(self, user_query: str) -> str:
        """Select the most appropriate tool based on the user query."""
        query_lower = user_query.lower()
        
        # Code-related keywords
        code_keywords = ["code", "python", "function", "class", "method", "bug", "debug", 
                        "programming", "script", "algorithm", "implementation", "refactor"]
        if any(keyword in query_lower for keyword in code_keywords):
            return "code"
        
        # Reasoning keywords
        reasoning_keywords = ["why", "how", "explain", "step-by-step", "reason", "analyze", 
                             "think", "logic", "because", "cause", "compare", "evaluate"]
        if any(keyword in query_lower for keyword in reasoning_keywords):
            return "reasoning"
        
        # Creative keywords
        creative_keywords = ["story", "poem", "creative", "imagine", "write", "tale", 
                           "narrative", "fiction", "character", "plot"]
        if any(keyword in query_lower for keyword in creative_keywords):
            return "creative"
        
        # Default to general
        return "general"
    
    def chat(self, model: str, content: str, max_tokens: int = 300, temperature: Optional[float] = None) -> str:
        """Send chat completion request to the specified model."""
        try:
            params = {
                "model": model,
                "messages": [{"role": "user", "content": content}],
                "max_tokens": max_tokens
            }
            
            if temperature is not None:
                params["temperature"] = temperature
            
            response = self.client.chat.completions.create(**params)
            return response.choices[0].message.content
        except Exception as e:
            return f"Error generating response with model {model}: {str(e)}"
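
The select_tool method above uses substring checks, so a keyword such as "how" also matches inside "show". The following sketch shows one alternative, whole-word matching with regular expressions. It uses a shortened keyword list for brevity, and the function name select_tool_whole_words is hypothetical rather than taken from the sample code.

```python
import re
from typing import Dict, List

# Shortened keyword lists for illustration; order defines priority (code first).
KEYWORDS: Dict[str, List[str]] = {
    "code": ["code", "python", "function", "bug", "debug", "refactor"],
    "reasoning": ["why", "how", "explain", "step-by-step", "analyze", "compare"],
    "creative": ["story", "poem", "creative", "imagine", "fiction"],
}

# One compiled pattern per tool; \b anchors each keyword at word boundaries.
PATTERNS = {
    tool: re.compile(r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b", re.IGNORECASE)
    for tool, words in KEYWORDS.items()
}


def select_tool_whole_words(user_query: str) -> str:
    """Return the first tool whose keywords appear as whole words in the query."""
    for tool, pattern in PATTERNS.items():
        if pattern.search(user_query):
            return tool
    return "general"


print(select_tool_whole_words("Show me today's schedule"))       # prints "general"
print(select_tool_whole_words("Explain how bubble sort works"))  # prints "reasoning"
```

The trade-off is that whole-word matching misses inflected forms such as "functions" or "bugs" unless they are listed explicitly, so the keyword lists need a little more care.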

Step 3) Implement intelligent routing and execution (see samples/06/router.py)

    def route_and_run(self, prompt: str) -> Dict[str, Any]:
        """Route the prompt to the appropriate model and generate response."""
        tool_key = self.select_tool(prompt)
        tool_config = self.tools[tool_key]
        model = tool_config["model"]
        temperature = tool_config.get("temperature", 0.7)
        
        print(f"🎯 Selected tool: {tool_key} (model: {model})")
        
        answer = self.chat(
            model=model, 
            content=prompt, 
            max_tokens=400, 
            temperature=temperature
        )
        
        return {
            "tool": tool_key,
            "model": model,
            # .get() avoids a KeyError when a TOOL_REGISTRY override omits "notes"
            "tool_description": tool_config.get("notes", ""),
            "temperature": temperature,
            "answer": answer
        }
    
    def check_service_health(self) -> Dict[str, Any]:
        """Check Foundry Local service health and available models."""
        try:
            models_response = self.client.models.list()
            available_models = [model.id for model in models_response.data]
            
            return {
                "status": "healthy",
                "base_url": self.base_url,
                "available_models": available_models,
                "tools_configured": list(self.tools.keys())
            }
        except Exception as e:
            return {
                "status": "error",
                "base_url": self.base_url,
                "error": str(e)
            }

if __name__ == "__main__":
    # Ensure: foundry model run phi-4-mini
    router = ModelRouter()
    
    # Check health
    health = router.check_service_health()
    print(f"Service Health: {json.dumps(health, indent=2)}")
    
    # Test different query types
    queries = [
        "Write a Python function to calculate fibonacci numbers",  # -> code
        "Explain step-by-step why the sky is blue",  # -> reasoning
        "Tell me a creative story about AI",  # -> creative
        "What's the weather like today?"  # -> general
    ]
    
    for query in queries:
        result = router.route_and_run(query)
        print(f"\nQuery: {query}")
        print(f"Selected: {result['tool']} -> {result['model']}")
        print(f"Answer: {result['answer'][:100]}...")
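
The health check returns the models the service reports, which can be combined with the registry to provide a simple fallback. The sketch below replaces any unavailable model with the "general" tool's model. The helper resolve_models is hypothetical, and it assumes the identifiers in available_models can be compared directly with the configured names; if your service reports full model IDs rather than aliases, the comparison would need adjusting.

```python
from typing import Any, Dict, List


def resolve_models(tools: Dict[str, Dict[str, Any]], available: List[str],
                   fallback_tool: str = "general") -> Dict[str, Dict[str, Any]]:
    """Return a copy of the registry where unavailable models use the fallback tool's model."""
    fallback_model = tools[fallback_tool]["model"]
    resolved: Dict[str, Dict[str, Any]] = {}
    for name, config in tools.items():
        entry = dict(config)  # copy so the original registry is left untouched
        if entry["model"] not in available:
            entry["model"] = fallback_model
            entry["fallback"] = True  # marker so callers can log the substitution
        resolved[name] = entry
    return resolved


tools = {
    "general": {"model": "phi-4-mini"},
    "code": {"model": "qwen2.5-7b"},
}
resolved = resolve_models(tools, available=["phi-4-mini"])
print(resolved["code"])  # prints {'model': 'phi-4-mini', 'fallback': True}
```

A router could apply this once after check_service_health() so that a missing specialist model degrades to the general model instead of failing at request time.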

Part 2: Modern SDK Integration (Step by Step)

Goal: Use the Foundry Local SDK together with the OpenAI Python SDK for seamless integration.

Step 1) Install dependencies

cd Module08
.\.venv\Scripts\activate
pip install foundry-local-sdk openai

Step 2) Configure the environment (optional - see samples/06/README.md)

REM Override default models per tool
set GENERAL_MODEL=phi-4-mini
set REASONING_MODEL=deepseek-r1-7b
set CODE_MODEL=qwen2.5-7b
REM Or provide a full JSON registry
set TOOL_REGISTRY={"general":{"model":"phi-4-mini"},"reasoning":{"model":"deepseek-r1-7b"}}

Step 3) Integrate the modern SDK

# modern_sdk_demo.py
from foundry_local import FoundryLocalManager
from openai import OpenAI
import sys

def main():
    """Demonstrate modern SDK integration."""
    try:
        # Initialize with FoundryLocalManager
        alias = "phi-4-mini"
        manager = FoundryLocalManager(alias)
        
        # Create OpenAI client using Foundry Local endpoint
        client = OpenAI(
            base_url=manager.endpoint,
            api_key=manager.api_key
        )
        
        # Get model info
        model_info = manager.get_model_info(alias)
        print(f"Using model: {model_info.id}")
        
        # Make request with streaming
        stream = client.chat.completions.create(
            model=model_info.id,
            messages=[{"role": "user", "content": "Explain edge AI benefits in one paragraph."}],
            stream=True,
            max_tokens=200
        )
        
        print("Response: ", end="")
        for chunk in stream:
            if chunk.choices[0].delta.content:
                print(chunk.choices[0].delta.content, end="", flush=True)
        print()
        
    except Exception as e:
        print(f"Error: {e}")
        print("Ensure Foundry Local is running with: foundry model run phi-4-mini")
        sys.exit(1)

if __name__ == "__main__":
    main()
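
When the streamed text is needed as a single string instead of being printed piece by piece, the loop can be wrapped in a small helper. This is a sketch under the assumption that chunks expose choices[0].delta.content as in the demo above; it also skips chunks with an empty choices list, which some OpenAI-compatible servers emit. The fake_chunk helper exists only so the example runs without a model.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Optional


def collect_stream(chunks: Iterable[Any]) -> str:
    """Join the text deltas of a chat completion stream into one string."""
    parts: List[str] = []
    for chunk in chunks:
        if not chunk.choices:  # tolerate chunks that carry no choices
            continue
        text = getattr(chunk.choices[0].delta, "content", None)
        if text:
            parts.append(text)
    return "".join(parts)


def fake_chunk(text: Optional[str] = None, empty: bool = False) -> SimpleNamespace:
    """Stand-in for a streamed chunk, used only for this offline demo."""
    choices = [] if empty else [SimpleNamespace(delta=SimpleNamespace(content=text))]
    return SimpleNamespace(choices=choices)


demo = [fake_chunk("Edge "), fake_chunk(empty=True), fake_chunk(None), fake_chunk("AI")]
print(collect_stream(demo))  # prints "Edge AI"
```

With a real client, the same function could be called as collect_stream(stream) on the object returned by client.chat.completions.create(..., stream=True).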

Part 3: Domain Customization (Step by Step)

Goal: Tailor outputs to a domain using prompt templates and a JSON schema.

Step 1) Create a domain prompt template

# domain/templates.py
BUSINESS_ANALYST_SYSTEM = """
You are a senior business analyst. Provide:
1) Key insights
2) Risks
3) Next steps
Respond in valid JSON with fields: insights, risks, next_steps.
"""

Step 2) Enforce JSON output

# domain/analyst.py
import requests, os, json

BASE_URL = os.getenv("OPENAI_BASE_URL", "http://localhost:8000/v1")
API_KEY = os.getenv("OPENAI_API_KEY", "local-key")
HEADERS = {"Content-Type":"application/json","Authorization":f"Bearer {API_KEY}"}

from domain.templates import BUSINESS_ANALYST_SYSTEM

def analyze(text: str) -> dict:
    messages = [
        {"role":"system","content": BUSINESS_ANALYST_SYSTEM},
        {"role":"user","content": f"Analyze this business text:\n{text}"}
    ]
    r = requests.post(f"{BASE_URL}/chat/completions", json={
        "model": "phi-4-mini",
        "messages": messages,
        "response_format": {"type": "json_object"},
        "temperature": 0.3
    }, headers=HEADERS, timeout=60)
    r.raise_for_status()
    # Parse JSON content
    content = r.json()["choices"][0]["message"]["content"]
    return json.loads(content)

if __name__ == "__main__":
    print(analyze("Sales dipped 12% in Q3 due to supply constraints and marketing cuts."))
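
Even with a JSON response format requested, small local models sometimes wrap the object in a Markdown fence or omit a field. The sketch below is one way to parse the content more defensively before using it. The helper name parse_analysis is hypothetical; the required fields come from the system prompt above.

```python
import json

REQUIRED_FIELDS = ("insights", "risks", "next_steps")


def parse_analysis(content: str) -> dict:
    """Parse model output into a dict and verify the fields the prompt asked for."""
    fence = "`" * 3  # Markdown code fence marker
    lines = content.strip().splitlines()
    # Drop a leading fence line (with or without a language tag) and its closing fence.
    if lines and lines[0].startswith(fence):
        lines = lines[1:]
        if lines and lines[-1].strip() == fence:
            lines = lines[:-1]
    data = json.loads("\n".join(lines))  # raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return data


fenced = "`" * 3 + 'json\n{"insights": ["a"], "risks": [], "next_steps": []}\n' + "`" * 3
print(parse_analysis(fenced)["insights"])  # prints ['a']
```

Raising ValueError for both malformed JSON and missing fields gives callers a single exception type to catch when deciding whether to retry the request.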

Part 4: Offline Mode and Security (Step by Step)

Goal: Ensure privacy and resilience when running models as tools locally.

Step 1) Warm up and validate the local endpoint

foundry model run phi-4-mini
curl http://localhost:8000/v1/models

Step 2) Sanitize inputs

# security/sanitize.py
import re
EMAIL_RE = re.compile(r"[\w\.-]+@[\w\.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s\-]{7,}\d")

def sanitize(text: str) -> str:
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    text = PHONE_RE.sub("[REDACTED_PHONE]", text)
    return text
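
A short usage example makes the behavior of these patterns easier to see. The block below repeats the two regular expressions from above so it runs on its own; the sample strings are invented for illustration.

```python
import re

# Same patterns as in security/sanitize.py above.
EMAIL_RE = re.compile(r"[\w\.-]+@[\w\.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s\-]{7,}\d")


def sanitize(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    text = PHONE_RE.sub("[REDACTED_PHONE]", text)
    return text


print(sanitize("Contact jane.doe@example.com or +1 555-123-4567"))
# prints "Contact [REDACTED_EMAIL] or [REDACTED_PHONE]"

print(sanitize("Order 12345 shipped"))
# prints "Order 12345 shipped" (short digit runs are left alone)
```

These patterns are deliberately simple: the phone pattern matches any run of nine or more digits, spaces, and hyphens, so long order numbers or timestamps may be redacted too. Treat it as a first filter, not a complete PII detector.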

Step 3) Local-only flag and logging

# security/local_only.py
import os, json, time
LOG = os.getenv("MODELS_AS_TOOLS_LOG", "./tools_logs.jsonl")

def record(event: dict):
    with open(LOG, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

# Usage before each call
def before_call(tool_name, payload):
    record({"ts": time.time(), "tool": tool_name, "event": "before_call"})

# After each call
def after_call(tool_name, result):
    record({"ts": time.time(), "tool": tool_name, "event": "after_call"})
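
The logging helpers above record calls but do not themselves enforce a local-only policy. One possible guard is sketched here: it refuses any base URL whose host is not a loopback address. The environment variable name LOCAL_ONLY and both function names are assumptions made for this example, not settings that Foundry Local defines.

```python
import os
from urllib.parse import urlparse

# Hostnames treated as "on this machine".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_endpoint(base_url: str) -> bool:
    """Return True when the URL's host is a loopback name or address."""
    return urlparse(base_url).hostname in LOCAL_HOSTS


def enforce_local_only(base_url: str) -> None:
    """Raise if the hypothetical LOCAL_ONLY flag is on and the endpoint is remote."""
    if os.getenv("LOCAL_ONLY", "1") == "1" and not is_local_endpoint(base_url):
        raise RuntimeError(f"Refusing to call non-local endpoint: {base_url}")


print(is_local_endpoint("http://localhost:8000/v1"))  # prints True
print(is_local_endpoint("https://example.com/v1"))    # prints False
```

Calling enforce_local_only(base_url) inside before_call would stop a misconfigured BASE_URL from sending prompts off the device.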

Part 5: Production Deployment and Scaling

Goal: Deploy the intelligent router with monitoring and Azure AI Foundry integration.

📋 Note: The local implementation in samples/06/model_router.ipynb includes comprehensive examples of production deployment patterns.

Step 1) Production router with monitoring (see samples/06/router.py)

# production/router.py
from router.intelligent_router import ModelRouter
from typing import Any, Dict  # needed for the type hints used below
import json
import time
import sys

class ProductionModelRouter(ModelRouter):
    """Production-ready model router with monitoring and logging."""
    
    def __init__(self):
        super().__init__()
        self.request_count = 0
        self.error_count = 0
        self.start_time = time.time()
    
    def route_and_run_with_monitoring(self, prompt: str) -> Dict[str, Any]:
        """Route with comprehensive monitoring and error handling."""
        start_time = time.time()
        self.request_count += 1
        
        try:
            result = self.route_and_run(prompt)
            processing_time = time.time() - start_time
            
            # Log successful request
            self._log_request({
                "status": "success",
                "tool": result["tool"],
                "model": result["model"],
                "processing_time": processing_time,
                "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
            })
            
            result["processing_time"] = processing_time
            return result
            
        except Exception as e:
            self.error_count += 1
            error_result = {
                "status": "error",
                "error": str(e),
                "processing_time": time.time() - start_time,
                "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
            }
            
            self._log_request(error_result)
            return error_result
    
    def _log_request(self, data: Dict[str, Any]):
        """Log request data for monitoring."""
        print(f"📊 {json.dumps(data)}")
    
    def get_stats(self) -> Dict[str, Any]:
        """Get router statistics."""
        uptime = time.time() - self.start_time
        return {
            "uptime_seconds": uptime,
            "total_requests": self.request_count,
            "error_count": self.error_count,
            "success_rate": (self.request_count - self.error_count) / max(1, self.request_count),
            "requests_per_minute": self.request_count / max(1, uptime / 60)
        }

def main():
    """Production router demo."""
    router = ProductionModelRouter()
    
    # Health check
    health = router.check_service_health()
    if health["status"] == "error":
        print(f"❌ Service health check failed: {health['error']}")
        sys.exit(1)
    
    print(f"✅ Service healthy with {len(health['available_models'])} models")
    
    # Process user query
    user_prompt = " ".join(sys.argv[1:]) or "Write three benefits of on-device AI in JSON format."
    print(f"\n🎯 Processing: {user_prompt}")
    
    result = router.route_and_run_with_monitoring(user_prompt)
    
    if result.get("status") == "error":
        print(f"❌ Error: {result['error']}")
    else:
        print(f"\n📋 Result:")
        print(f"Tool: {result['tool']} -> Model: {result['model']}")
        print(f"Processing Time: {result['processing_time']:.2f}s")
        print(f"Answer: {result['answer']}")
    
    # Show stats
    stats = router.get_stats()
    print(f"\n📊 Statistics: {json.dumps(stats, indent=2)}")

if __name__ == "__main__":
    main()
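
The router above reports a processing_time for each request. If those values are collected, a small summary can complement get_stats(). The following sketch uses only the standard library; summarize_latencies is a hypothetical helper and the sample values are invented.

```python
import statistics
from typing import List


def summarize_latencies(samples: List[float]) -> dict:
    """Summarize request latencies (in seconds) with count, mean, median, and max."""
    if not samples:
        return {"count": 0, "mean": 0.0, "median": 0.0, "max": 0.0}
    return {
        "count": len(samples),
        "mean": statistics.fmean(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


# Example: processing_time values gathered from three routed requests.
summary = summarize_latencies([0.5, 1.5, 1.0])
print(summary)
```

The median is usually a better indicator of typical latency than the mean for local inference, where the first request after a model load can be much slower than the rest.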

Hands-on checklist

Implement the intelligent model router with keyword-based selection (samples/06/router.py)
Configure multiple specialized models (general, reasoning, coding, creative)
Try the interactive notebook (samples/06/model_router.ipynb)
Set up environment-based model configuration
Implement service health monitoring and error handling
Deploy the production router with comprehensive logging

Local sample integration

Run the full implementation:

cd Module08
.\.venv\Scripts\activate

REM Start required models
foundry model run phi-4-mini
foundry model run qwen2.5-7b
foundry model run deepseek-r1-7b

REM Test the intelligent router
python samples\06\router.py "Write a Python function to sort a list"
python samples\06\router.py "Explain step-by-step how bubble sort works"
python samples\06\router.py "Tell me a creative story about robots"

REM Explore the interactive notebook
jupyter notebook samples/06/model_router.ipynb

References and next steps

Local implementation: samples/06/ - complete intelligent router with multi-model support
Microsoft samples: Hello Foundry Local
Integration documentation: Integrate with inference SDKs
Advanced patterns: explore function calling and multi-agent orchestration in Module 5

Summary

Foundry Local enables powerful on-device AI in which models become intelligent, specialized tools. With automatic model selection, comprehensive monitoring, and production-ready patterns, teams can ship sophisticated AI applications that adapt to different task types while preserving privacy and performance. The intelligent router pattern shown here provides a foundation for building AI systems that scale from local development to production deployment.

Disclaimer:
This document was translated using the AI translation service Co-op Translator. While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.