Awesome GPT-OSS [](https://awesome.re)

November 5, 2025 · View on GitHub

A curated list of awesome GPT-OSS resources, tools, tutorials, and projects. OpenAI's first fully open-source language model family since GPT-2.

GPT-OSS represents OpenAI's return to open-source AI development with two powerful reasoning models: gpt-oss-120b and gpt-oss-20b. Released under the Apache 2.0 license, these models deliver state-of-the-art performance with configurable reasoning effort, full chain-of-thought access, and native tool use capabilities.

📋 Table of Contents

🏢 Official Resources
🤖 Models
🚀 Inference Engines
💻 Local Deployment
☁️ Cloud Deployment
🛠️ Development Tools
🔗 Integrations
🎯 Fine-tuning
📱 Applications
📚 Tutorials
🔬 Research
🛡️ Safety
👥 Community
📊 Comparison with Other Models
🎉 Contributing
📄 License
⭐ Star History

🏢 Official Resources

OpenAI GPT-OSS Announcement - Official release announcement
GPT-OSS GitHub Repository - Official implementation and reference code
GPT-OSS Model Card - Comprehensive model documentation
Open Models Page - OpenAI's dedicated open models page
OpenAI Harmony - Response format library for GPT-OSS
Try gpt-oss - gpt-oss playground

🤖 Models

Hugging Face Models

gpt-oss-120b - 120B parameter model (117B total, 5.1B active)
gpt-oss-20b - 20B parameter model (21B total, 3.6B active)

Model Specifications

Model	Parameters	Active Parameters	Memory Requirement	Hardware
gpt-oss-120b	117B	5.1B	80GB	Single H100
gpt-oss-20b	21B	3.6B	16GB	Consumer GPU

Key Features

Apache 2.0 License - Permissive open-source licensing
MXFP4 Quantization - Native 4-bit quantization for efficient inference
Mixture of Experts (MoE) - Optimized for performance and efficiency
Configurable Reasoning - Adjustable effort levels (low, medium, high)
Full Chain-of-Thought - Complete access to reasoning process
Tool Use Capabilities - Web browsing, Python execution, function calling

🚀 Inference Engines

vLLM

vLLM GPT-OSS Support - Official vLLM implementation
Flash Attention 3 Kernels - Optimized attention kernels for Hopper GPUs
Installation: pip install --pre vllm==0.10.1+gptoss

Ollama

Ollama GPT-OSS Models - Easy local deployment
OpenAI Cookbook - Ollama Guide - Official tutorial
Quick start: ollama pull gpt-oss:20b && ollama run gpt-oss:20b

llama.cpp

llama.cpp GPT-OSS Support - CPU and GPU inference
GGUF Models - Quantized models for llama.cpp

Transformers

Hugging Face Transformers - Official integration
Transformers Serve - OpenAI-compatible server

💻 Local Deployment

Consumer Hardware

plux — The fastest way to connect your files to AI. Think file explorer + “add to AI” button — discover, send, and manage your files with one click.
LM Studio - User-friendly desktop application
Jan - Open-source ChatGPT alternative
Msty - Multi-platform LLM client
Cherry Studio - Desktop client with Ollama support

Enterprise Hardware

NVIDIA RTX Optimization - RTX-optimized deployment
Apple Metal Implementation - Native Metal support for Apple Silicon
AMD ROCm Support - AMD GPU compatibility

☁️ Cloud Deployment

Major Cloud Providers

Azure AI Foundry - Microsoft's AI platform
Hugging Face Inference Providers - Multi-provider access
AWS SageMaker - Amazon's ML platform
Northflank - GPU-optimized deployment
Fireworks AI - High-performance inference
Cerebras - Ultra-fast inference (2-4k tokens/sec)

Edge Computing

Microsoft AI Foundry Local - On-device inference for Windows
Ollama Turbo - Hosted Ollama service for large models

🛠️ Development Tools

Python Libraries

gpt-oss - Official Python package
OpenAI Python SDK - Compatible with local endpoints
LangChain - LLM application framework
LiteLLM - Unified API across providers

JavaScript/TypeScript

Responses.js - Response API client library
Vercel AI SDK - React/Next.js integration
OpenAI JS SDK - Node.js client

APIs and Protocols

Chat Completions API - Compatible with OpenAI format
Responses API - Advanced streaming interface
OpenAI Harmony Format - New response format

🔗 Integrations

Chat Interfaces

plux — The fastest way to connect your files to AI. Think file explorer + “add to AI” button — discover, send, and manage your files with one click.
Open WebUI - Feature-rich web interface
ChatGPT-Next-Web - Self-hosted ChatGPT UI
LibreChat - Multi-model chat platform
LobeChat - Modern chat interface

IDE Extensions

Continue - Open-source AI code assistant
AI Toolkit for VSCode - Microsoft's official VSCode extension
CodeGPT - IntelliJ plugin

Agent Frameworks

OpenAI Agents SDK - Official agent development framework
AutoGen - Multi-agent conversation framework
CrewAI - Role-playing AI agents
LangGraph - Agent workflow orchestration

🎯 Fine-tuning

Training Frameworks

TRL (Transformer Reinforcement Learning) - Hugging Face training library
OpenAI Cookbook - LoRA Fine-tuning - Official LoRA example
Unsloth - Fast fine-tuning framework
QLoRA - Quantized fine-tuning

Hardware Requirements

gpt-oss-120b: Single H100 node for LoRA fine-tuning
gpt-oss-20b: Consumer hardware compatible
Techniques: LoRA, QLoRA, Parameter-Efficient Fine-Tuning (PEFT)

📱 Applications

Chatbots and Assistants

Anything LLM - Private document chatbot
Perplexica - AI-powered search engine
Dify - LLM application development platform
FlowiseAI - Visual LLM app builder

Coding Assistants

Aider - AI pair programming
GPT Engineer - Code generation from specs
Open Interpreter - Local code interpreter
MetaGPT - Multi-agent software development

Research and Analysis

Paper QA - Scientific paper analysis
LlamaIndex - Document indexing and search
RAG Flow - Retrieval-Augmented Generation
Chroma - Vector database for AI

📚 Tutorials

Getting Started

OpenAI Cookbook - GPT-OSS Guide - Official comprehensive guide
How to Run GPT-OSS Locally - Step-by-step local setup
GPT-OSS with vLLM - Production deployment guide
Harmony Response Format - Understanding the new format

Advanced Usage

Fine-tuning GPT-OSS - Custom model training
Building AI Agents - Agent development with GPT-OSS
Tool Use Examples - Browser and Python tools

Third-party Tutorials

GPT-OSS Setup on AWS - Complete AWS deployment guide
GPU Optimization Guide - Hardware-specific optimizations
Docker Deployment - Containerized deployment

🔬 Research

Academic Papers

GPT-OSS Model Paper - Technical specifications and benchmarks
Mixture of Experts Research - MoE architecture foundations
MXFP4 Quantization - 4-bit quantization techniques

Benchmarks and Evaluations

Reasoning: Near-parity with o4-mini on core benchmarks
Coding: Strong performance on Codeforces competitions
Mathematics: Excellent results on AIME 2024 & 2025
Tool Use: Superior performance on TauBench agentic evaluation
Health: Outperforms proprietary models on HealthBench

Performance Analysis

Simon Willison's Analysis - Independent technical review
Comparative Benchmarks - Performance vs other models
Enterprise Adoption Study - Market analysis

Educational Implementations

ProjektJoe GPT-OSS from Scratch - Educational, from-scratch Python implementation of the GPT-OSS architecture.

🛡️ Safety

Security Features

Preparedness Framework Testing - Adversarial fine-tuning results
Red Teaming Challenge - $500,000 safety challenge
Safety Advisory Group Review - External expert evaluation

Safety Tools

Content Filtering - Content moderation tools
Chain-of-Thought Monitoring - Reasoning transparency
Usage Policy - Model usage guidelines

👥 Community

Discussion Forums

OpenAI Developer Forum - Official community
Hugging Face Forums - ML community discussions
Reddit r/LocalLLaMA - Local model enthusiasts
Discord Servers - Real-time community chat

GitHub Organizations

OpenAI - Official repositories
Hugging Face - ML ecosystem
vLLM Team - Inference optimization
Ollama - Local deployment tools

News and Updates

OpenAI Blog - Official announcements
Hugging Face Blog - Technical deep-dives
AI Research Twitter - Latest developments
Papers with Code - Research tracking

📊 Comparison with Other Models

Feature	GPT-OSS-120b	GPT-OSS-20b	Meta Llama 3.3 70b	DeepSeek-R1
License	Apache 2.0	Apache 2.0	Custom License	MIT
Parameters	117B (5.1B active)	21B (3.6B active)	70B	671B (37B active)
Memory	80GB	16GB	140GB	340GB
Reasoning	✅ High	✅ Medium	❌ Limited	✅ Excellent
Tool Use	✅ Native	✅ Native	⚠️ Basic	✅ Advanced
CoT Access	✅ Full	✅ Full	❌ Hidden	✅ Full

🎉 Contributing

Contributions are welcome! Please read the contribution guidelines first.

How to Contribute

Fork this repository
Create a new branch for your addition
Add your resource with a brief description
Ensure it follows the existing format
Submit a pull request

Criteria for Inclusion

Must be related to GPT-OSS models
Should be actively maintained
Must be publicly available
Should provide clear value to the community

📄 License

This awesome list is licensed under the CC0 1.0 Universal license.

⭐ Star History

Made with ❤️ by the community. If you find this list helpful, please ⭐ star it and share with others!

Note: GPT-OSS models require the harmony response format to function correctly. Always use the provided chat templates or the OpenAI harmony library for proper interaction.