Awesome GPT-OSS [](https://awesome.re)
November 5, 2025 ยท View on GitHub
A curated list of awesome GPT-OSS resources, tools, tutorials, and projects. OpenAI's first fully open-source language model family since GPT-2.
GPT-OSS represents OpenAI's return to open-source AI development with two powerful reasoning models: gpt-oss-120b and gpt-oss-20b. Released under the Apache 2.0 license, these models deliver state-of-the-art performance with configurable reasoning effort, full chain-of-thought access, and native tool use capabilities.
๐ Table of Contents
- ๐ข Official Resources
- ๐ค Models
- ๐ Inference Engines
- ๐ป Local Deployment
- โ๏ธ Cloud Deployment
- ๐ ๏ธ Development Tools
- ๐ Integrations
- ๐ฏ Fine-tuning
- ๐ฑ Applications
- ๐ Tutorials
- ๐ฌ Research
- ๐ก๏ธ Safety
- ๐ฅ Community
- ๐ Comparison with Other Models
- ๐ Contributing
- ๐ License
- โญ Star History
๐ข Official Resources
- OpenAI GPT-OSS Announcement - Official release announcement
- GPT-OSS GitHub Repository - Official implementation and reference code
- GPT-OSS Model Card - Comprehensive model documentation
- Open Models Page - OpenAI's dedicated open models page
- OpenAI Harmony - Response format library for GPT-OSS
- Try gpt-oss - gpt-oss playground
๐ค Models
Hugging Face Models
- gpt-oss-120b - 120B parameter model (117B total, 5.1B active)
- gpt-oss-20b - 20B parameter model (21B total, 3.6B active)
Model Specifications
| Model | Parameters | Active Parameters | Memory Requirement | Hardware |
|---|---|---|---|---|
| gpt-oss-120b | 117B | 5.1B | 80GB | Single H100 |
| gpt-oss-20b | 21B | 3.6B | 16GB | Consumer GPU |
Key Features
- Apache 2.0 License - Permissive open-source licensing
- MXFP4 Quantization - Native 4-bit quantization for efficient inference
- Mixture of Experts (MoE) - Optimized for performance and efficiency
- Configurable Reasoning - Adjustable effort levels (low, medium, high)
- Full Chain-of-Thought - Complete access to reasoning process
- Tool Use Capabilities - Web browsing, Python execution, function calling
๐ Inference Engines
vLLM
- vLLM GPT-OSS Support - Official vLLM implementation
- Flash Attention 3 Kernels - Optimized attention kernels for Hopper GPUs
- Installation:
pip install --pre vllm==0.10.1+gptoss
Ollama
- Ollama GPT-OSS Models - Easy local deployment
- OpenAI Cookbook - Ollama Guide - Official tutorial
- Quick start:
ollama pull gpt-oss:20b && ollama run gpt-oss:20b
llama.cpp
- llama.cpp GPT-OSS Support - CPU and GPU inference
- GGUF Models - Quantized models for llama.cpp
Transformers
- Hugging Face Transformers - Official integration
- Transformers Serve - OpenAI-compatible server
๐ป Local Deployment
Consumer Hardware
- plux โ The fastest way to connect your files to AI. Think file explorer + โadd to AIโ button โ discover, send, and manage your files with one click.
- LM Studio - User-friendly desktop application
- Jan - Open-source ChatGPT alternative
- Msty - Multi-platform LLM client
- Cherry Studio - Desktop client with Ollama support
Enterprise Hardware
- NVIDIA RTX Optimization - RTX-optimized deployment
- Apple Metal Implementation - Native Metal support for Apple Silicon
- AMD ROCm Support - AMD GPU compatibility
โ๏ธ Cloud Deployment
Major Cloud Providers
- Azure AI Foundry - Microsoft's AI platform
- Hugging Face Inference Providers - Multi-provider access
- AWS SageMaker - Amazon's ML platform
- Northflank - GPU-optimized deployment
- Fireworks AI - High-performance inference
- Cerebras - Ultra-fast inference (2-4k tokens/sec)
Edge Computing
- Microsoft AI Foundry Local - On-device inference for Windows
- Ollama Turbo - Hosted Ollama service for large models
๐ ๏ธ Development Tools
Python Libraries
- gpt-oss - Official Python package
- OpenAI Python SDK - Compatible with local endpoints
- LangChain - LLM application framework
- LiteLLM - Unified API across providers
JavaScript/TypeScript
- Responses.js - Response API client library
- Vercel AI SDK - React/Next.js integration
- OpenAI JS SDK - Node.js client
APIs and Protocols
- Chat Completions API - Compatible with OpenAI format
- Responses API - Advanced streaming interface
- OpenAI Harmony Format - New response format
๐ Integrations
Chat Interfaces
- plux โ The fastest way to connect your files to AI. Think file explorer + โadd to AIโ button โ discover, send, and manage your files with one click.
- Open WebUI - Feature-rich web interface
- ChatGPT-Next-Web - Self-hosted ChatGPT UI
- LibreChat - Multi-model chat platform
- LobeChat - Modern chat interface
IDE Extensions
- Continue - Open-source AI code assistant
- AI Toolkit for VSCode - Microsoft's official VSCode extension
- CodeGPT - IntelliJ plugin
Agent Frameworks
- OpenAI Agents SDK - Official agent development framework
- AutoGen - Multi-agent conversation framework
- CrewAI - Role-playing AI agents
- LangGraph - Agent workflow orchestration
๐ฏ Fine-tuning
Training Frameworks
- TRL (Transformer Reinforcement Learning) - Hugging Face training library
- OpenAI Cookbook - LoRA Fine-tuning - Official LoRA example
- Unsloth - Fast fine-tuning framework
- QLoRA - Quantized fine-tuning
Hardware Requirements
- gpt-oss-120b: Single H100 node for LoRA fine-tuning
- gpt-oss-20b: Consumer hardware compatible
- Techniques: LoRA, QLoRA, Parameter-Efficient Fine-Tuning (PEFT)
๐ฑ Applications
Chatbots and Assistants
- Anything LLM - Private document chatbot
- Perplexica - AI-powered search engine
- Dify - LLM application development platform
- FlowiseAI - Visual LLM app builder
Coding Assistants
- Aider - AI pair programming
- GPT Engineer - Code generation from specs
- Open Interpreter - Local code interpreter
- MetaGPT - Multi-agent software development
Research and Analysis
- Paper QA - Scientific paper analysis
- LlamaIndex - Document indexing and search
- RAG Flow - Retrieval-Augmented Generation
- Chroma - Vector database for AI
๐ Tutorials
Getting Started
- OpenAI Cookbook - GPT-OSS Guide - Official comprehensive guide
- How to Run GPT-OSS Locally - Step-by-step local setup
- GPT-OSS with vLLM - Production deployment guide
- Harmony Response Format - Understanding the new format
Advanced Usage
- Fine-tuning GPT-OSS - Custom model training
- Building AI Agents - Agent development with GPT-OSS
- Tool Use Examples - Browser and Python tools
Third-party Tutorials
- GPT-OSS Setup on AWS - Complete AWS deployment guide
- GPU Optimization Guide - Hardware-specific optimizations
- Docker Deployment - Containerized deployment
๐ฌ Research
Academic Papers
- GPT-OSS Model Paper - Technical specifications and benchmarks
- Mixture of Experts Research - MoE architecture foundations
- MXFP4 Quantization - 4-bit quantization techniques
Benchmarks and Evaluations
- Reasoning: Near-parity with o4-mini on core benchmarks
- Coding: Strong performance on Codeforces competitions
- Mathematics: Excellent results on AIME 2024 & 2025
- Tool Use: Superior performance on TauBench agentic evaluation
- Health: Outperforms proprietary models on HealthBench
Performance Analysis
- Simon Willison's Analysis - Independent technical review
- Comparative Benchmarks - Performance vs other models
- Enterprise Adoption Study - Market analysis
Educational Implementations
- ProjektJoe GPT-OSS from Scratch - Educational, from-scratch Python implementation of the GPT-OSS architecture.
๐ก๏ธ Safety
Security Features
- Preparedness Framework Testing - Adversarial fine-tuning results
- Red Teaming Challenge - $500,000 safety challenge
- Safety Advisory Group Review - External expert evaluation
Safety Tools
- Content Filtering - Content moderation tools
- Chain-of-Thought Monitoring - Reasoning transparency
- Usage Policy - Model usage guidelines
๐ฅ Community
Discussion Forums
- OpenAI Developer Forum - Official community
- Hugging Face Forums - ML community discussions
- Reddit r/LocalLLaMA - Local model enthusiasts
- Discord Servers - Real-time community chat
GitHub Organizations
- OpenAI - Official repositories
- Hugging Face - ML ecosystem
- vLLM Team - Inference optimization
- Ollama - Local deployment tools
News and Updates
- OpenAI Blog - Official announcements
- Hugging Face Blog - Technical deep-dives
- AI Research Twitter - Latest developments
- Papers with Code - Research tracking
๐ Comparison with Other Models
| Feature | GPT-OSS-120b | GPT-OSS-20b | Meta Llama 3.3 70b | DeepSeek-R1 |
|---|---|---|---|---|
| License | Apache 2.0 | Apache 2.0 | Custom License | MIT |
| Parameters | 117B (5.1B active) | 21B (3.6B active) | 70B | 671B (37B active) |
| Memory | 80GB | 16GB | 140GB | 340GB |
| Reasoning | โ High | โ Medium | โ Limited | โ Excellent |
| Tool Use | โ Native | โ Native | โ ๏ธ Basic | โ Advanced |
| CoT Access | โ Full | โ Full | โ Hidden | โ Full |
๐ Contributing
Contributions are welcome! Please read the contribution guidelines first.
How to Contribute
- Fork this repository
- Create a new branch for your addition
- Add your resource with a brief description
- Ensure it follows the existing format
- Submit a pull request
Criteria for Inclusion
- Must be related to GPT-OSS models
- Should be actively maintained
- Must be publicly available
- Should provide clear value to the community
๐ License
This awesome list is licensed under the CC0 1.0 Universal license.
โญ Star History
Made with โค๏ธ by the community. If you find this list helpful, please โญ star it and share with others!
Note: GPT-OSS models require the harmony response format to function correctly. Always use the provided chat templates or the OpenAI harmony library for proper interaction.