LLM Providers

March 15, 2026 · View on GitHub

🏥 Quick Return to Emergency Room

You are in a specialist desk.
For full triage and doctors on duty, return here:

WFGY Global Fix Map — main Emergency Room, 300+ structured fixes

WFGY Problem Map 1.0 — 16 reproducible failure modes

Think of this page as a sub-room.
If you want full consultation and prescriptions, go back to the Emergency Room lobby.

This page helps you choose between LLM vendors and fix provider-looking bugs that are actually schema, retrieval, orchestration, or eval drift. If you are new, start with the Orientation table and the FAQ. If you are debugging, jump to the Fix Hub.

Orientation: who is who

Provider	What it is	Typical use case	Link
OpenAI	GPT-4/4o from OpenAI Inc.	Direct API, fastest model access	openai.md
Azure OpenAI	Microsoft enterprise wrapper for OpenAI models	VNet, compliance, enterprise billing	azure_openai.md
Anthropic	The company behind Claude	Safety-focused platform	anthropic.md
Claude (Anthropic)	The model family from Anthropic	Long context, tool use, JSON control	anthropic_claude.md
Google Gemini	Google DeepMind multimodal models	Multimodal chat, reasoning	gemini.md
Google Vertex AI	Google Cloud AI platform that hosts Gemini and more	Pipelines, deployment, governance	google_vertex_ai.md
Mistral	EU startup with efficient open-weight models (e.g., Mixtral MoE)	Cost/perf, open ecosystem	mistral.md
Meta LLaMA	Meta open-weight model family	Local or private deployment, llama.cpp	meta_llama.md
Cohere	Enterprise NLP API and embeddings	RAG stacks, enterprise NLP	cohere.md
DeepSeek	CN player with infra-optimized long-context models	Cost-efficient, long windows	deepseek.md
Kimi (Moonshot)	CN chat-first models, very large parameter claims	Consumer chat focus	kimi.md
Groq	Hardware vendor: LPUs for transformer inference	Ultra-low latency serving (not a model)	groq.md
xAI Grok	xAI model family	X/Twitter integration, general chat	grok_xai.md
AWS Bedrock	AWS gateway to many models via one API	Enterprises already on AWS	aws_bedrock.md
OpenRouter	Community model aggregator, OpenAI-style endpoint	Try many models via one API key	openrouter.md
Together AI	Aggregator + infra for open weights and fine-tunes	Fast hosting, tuning services	together.md
MiniMax	CN AI lab with long-context models (204K), OpenAI-compatible API	Cost-efficient chat, RAG, agent workflows	minimax.md

FAQ for newcomers

OpenAI vs Azure OpenAI — are they the same?
Same models, different packaging. OpenAI = direct API and fastest releases. Azure OpenAI = Microsoft billing, VNet, compliance, data residency.

Anthropic vs Claude — why two pages?
Anthropic is the company. Claude is the model family. We separate because “platform issues” and “model quirks” often need different fixes.

Gemini vs Vertex AI — what is the relation?
Gemini is a model. Vertex AI is Google Cloud’s platform that runs Gemini and provides pipelines, eval, and deployment features.

What makes Mistral special?
Efficient open-weights and MoE designs. Good cost/perf. Easy to host in your own infra.

Meta LLaMA vs local LLaMA
Meta releases the weights. Community tools like llama.cpp let you run them locally on CPU or GPU.

Groq LPU vs GPU
GPU is general purpose. LPU is a chip specialized for transformer inference. You get very low latency for chat workloads.

Bedrock vs OpenRouter vs Together
Bedrock is an AWS enterprise gateway. OpenRouter is a community aggregator with OpenAI-style API. Together is an infra host for open weights with training and fine-tune options.

Open these first

Visual map and recovery: RAG Architecture & Recovery
End to end retrieval knobs: Retrieval Playbook
Why this snippet (traceability schema): Retrieval Traceability
Ordering control: Rerankers
Embedding vs meaning: Embedding ≠ Semantic
Hallucination and chunk boundaries: Hallucination
Long chains and entropy: Context Drift, Entropy Collapse
Structural collapse and recovery: Logic Collapse
Snippet and citation schema: Data Contracts
Live ops: Live Monitoring for RAG, Debug Playbook
Boot order issues: Bootstrap Ordering, Deployment Deadlock, Pre-Deploy Collapse

Core acceptance targets

ΔS(question, retrieved) ≤ 0.45
Coverage ≥ 0.70 for the target section
λ remains convergent across three paraphrases and two seeds
E_resonance stays flat on long windows

Fix Hub — typical provider symptoms → exact fix

Symptom	Likely cause	Open this
JSON mode breaks, invalid objects	Schema too loose or nested tool calls	Data Contracts, Logic Collapse
Tool calls loop or stall	Agent role drift, missing timeouts	Multi-Agent Problems, Role-drift deep dive
High similarity yet wrong snippet	Metric mismatch or fragmented store	Embedding ≠ Semantic, Vectorstore Fragmentation
Answers flip between runs	Prompt headers reorder and λ flips	Context Drift, Retrieval Traceability
Hybrid retrievers worse than single	Query parsing split, mis-weighted rerank	Query Parsing Split, Rerankers
Jailbreaks or bluffing	Overconfidence and missing fences	Bluffing Controls, Retrieval Traceability

Fix in 60 seconds

Measure ΔS
Compute ΔS(question, retrieved) and ΔS(retrieved, expected anchor). Stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60.
Probe λ_observe
Vary top-k and prompt headers. If λ flips, lock the schema and apply a BBAM variance clamp.
Apply the module
Retrieval drift → BBMC + Data Contracts
Reasoning collapse → BBCR bridge + BBAM
Dead ends in long runs → BBPF alternate paths
Verify
Coverage ≥ 0.70 on three paraphrases. λ convergent on two seeds.

Quick-Start Downloads

Tool	Link	3-step setup
WFGY 1.0 PDF	Engine Paper	1) Download 2) Upload to your LLM 3) Ask “Answer using WFGY + ”
TXT OS (plain text OS)	TXTOS.txt	1) Download 2) Paste into any LLM chat 3) Type “hello world” to boot

Explore More

Layer	Page	What it’s for
⭐ Proof	WFGY Recognition Map	External citations, integrations, and ecosystem proof
⚙️ Engine	WFGY 1.0	Original PDF tension engine and early logic sketch (legacy reference)
⚙️ Engine	WFGY 2.0	Production tension kernel for RAG and agent systems
⚙️ Engine	WFGY 3.0	TXT based Singularity tension engine (131 S class set)
🗺️ Map	Problem Map 1.0	Flagship 16 problem RAG failure taxonomy and fix map
🗺️ Map	Problem Map 2.0	Global Debug Card for RAG and agent pipeline diagnosis
🗺️ Map	Problem Map 3.0	Global AI troubleshooting atlas and failure pattern map
🧰 App	TXT OS	.txt semantic OS with fast bootstrap
🧰 App	Blah Blah Blah	Abstract and paradox Q&A built on TXT OS
🧰 App	Blur Blur Blur	Text to image generation with semantic control
🏡 Onboarding	Starter Village	Guided entry point for new users

If this repository helped, starring it improves discovery so more builders can find the docs and tools.