Large Language Models: A Comprehensive Survey of its Applications, Challenges, Limitations, And Future Prospects (Updated July 2025)

July 21, 2025

The Large Language Models Survey repository is a comprehensive compendium dedicated to the exploration and understanding of Large Language Models (LLMs). It houses an assortment of resources including research papers, blog posts, tutorials, code examples, and more to provide an in-depth look at the progression, methodologies, and applications of LLMs. This repo is an invaluable resource for AI researchers, data scientists, or enthusiasts interested in the advancements and inner workings of LLMs. We encourage contributions from the wider community to promote collaborative learning and continue pushing the boundaries of LLM research.

Timeline of LLMs

[Figure: timeline of LLM evolution]

List of LLMs (Updated July 2025)

Language Model	Organization	Release Date	Checkpoints	Paper/Blog	Params (B)	Context Length	Licence	Try it
2025 Latest Models
Grok 3 / Grok 3 Mini	xAI	2025/02	Grok 3, Grok 3 Mini	Grok 3 Beta — The Age of Reasoning Agents	Undisclosed	1M tokens	Proprietary	xAI Platform
Llama 4 Scout	Meta	2025/04	Llama 4 Scout	The Llama 4 herd: The beginning of a new era	17B active (109B total)	10M tokens	Llama 4 Community License	HuggingFace
Llama 4 Maverick	Meta	2025/04	Llama 4 Maverick	The Llama 4 herd	17B active (400B total)	1M tokens	Llama 4 Community License	HuggingFace
Llama 4 Behemoth	Meta	2025/04 (Training)	In Training	The Llama 4 herd	288B active (~2T total)	TBD	TBD	TBD
Qwen 3 Family	Alibaba	2025/04	Qwen 3 Family	Alibaba unveils Qwen3	0.6B - 235B (22B active)	32K - 131K tokens	Apache 2.0	Qwen Chat
DeepSeek-R1 Family	DeepSeek	2025/01-05	DeepSeek-R1, R1-Zero, R1-0528	DeepSeek-R1: Incentivizing Reasoning Capability	37B active (671B total)	128K tokens	MIT	DeepSeek Platform
o3 / o3-mini / o4-mini	OpenAI	2025/01-04	o3, o3-mini, o4-mini	Introducing OpenAI o3 and o4-mini	Undisclosed	200K tokens	Proprietary	ChatGPT
Claude 4 (Sonnet & Opus)	Anthropic	2025/05	Claude Sonnet 4, Claude Opus 4	Introducing Claude 4	Undisclosed	200K tokens	Proprietary	Claude.ai
Gemini 2.5 Family	Google	2025/03-06	Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite	Gemini 2.5: Our newest Gemini model with thinking	Undisclosed	1M tokens	Proprietary	Gemini
Major 2024 Models
GPT-4o / GPT-4o mini	OpenAI	2024/05-07	GPT-4o, GPT-4o mini	Hello GPT-4o, GPT-4o mini: advancing cost-efficient intelligence	Undisclosed	128K tokens	Proprietary	ChatGPT
o1 / o1-mini	OpenAI	2024/09	o1, o1-mini	Learning to Reason with LLMs	Undisclosed	200K / 128K tokens	Proprietary	ChatGPT
Claude 3 Family	Anthropic	2024/03	Claude 3 Haiku, Claude 3 Sonnet, Claude 3 Opus	Introducing the next generation of Claude	Undisclosed	200K tokens	Proprietary	Claude.ai
Claude 3.5 Sonnet	Anthropic	2024/06	Claude 3.5 Sonnet	Claude 3.5 Sonnet	Undisclosed	200K tokens	Proprietary	Claude.ai
Claude 3.7 Sonnet	Anthropic	2025/02	Claude 3.7 Sonnet	Claude 3.7 Sonnet	Undisclosed	200K tokens	Proprietary	Claude.ai
Gemini 1.5 Pro / Flash	Google	2024/02-05	Gemini 1.5 Pro, Gemini 1.5 Flash	Our next-generation model: Gemini 1.5	Undisclosed	1M-2M / 1M tokens	Proprietary	Gemini
Gemini 2.0 Flash	Google	2024/12	Gemini 2.0 Flash	Gemini 2.0 Flash	Undisclosed	1M tokens	Proprietary	Gemini
Gemma 2	Google	2024/06	Gemma 2 Family	Gemma 2: Improving Open Language Models at a Practical Size	9B, 27B	8K tokens	Gemma Terms of Use	HuggingFace
Llama 3 Family	Meta	2024/04	Llama 3 Weights	Introducing Meta Llama 3	8B, 70B	8K tokens	Custom	HuggingChat
Llama 3.1	Meta	2024/07	Llama 3.1 Weights	The Llama 3 Herd of Models	8B, 70B, 405B	128K tokens	Custom	HuggingChat
Llama 3.2	Meta	2024/09	Llama 3.2 Models	Llama 3.2: Revolutionizing edge AI and vision with open, customizable models	1B, 3B, 11B, 90B	128K tokens	Custom	HuggingChat
Llama 3.3	Meta	2024/12	Llama 3.3 70B	Llama 3.3 70B	70B	128K tokens	Custom	HuggingChat
Phi-3 Family	Microsoft	2024/04-08	Phi-3 Mini, Phi-3 Small, Phi-3 Medium, Phi-3.5 Mini	Phi-3 Technical Report	3.8B - 14B	4K-128K tokens	MIT	Azure AI Studio
IBM Granite 3.0 / 3.1	IBM	2024/10-12	Granite 3.0, Granite 3.1	IBM Introduces Granite 3.0	2B, 8B	4K / 128K tokens	Apache 2.0	IBM watsonx
Command R / R+	Cohere	2024/03-04	Command R, Command R+	Command R: Cohere's scalable generative model	35B / 104B	128K tokens	CC BY-NC 4.0	Cohere Platform
DeepSeek-V3 Family	DeepSeek	2024/12-2025/03	DeepSeek-V3, DeepSeek-V3-0324	DeepSeek-V3 Technical Report	37B active (671B total)	128K tokens	MIT	DeepSeek Platform
Qwen 2.5 Family	Alibaba	2024/09-2025/01	Qwen 2.5 Family, Qwen 2.5-Max	Qwen2.5: A Party of Foundation Models	0.5B - 72B / Undisclosed	32K-128K tokens	Apache 2.0 / Proprietary	Qwen Chat
QwQ-32B	Alibaba	2024/11	QwQ-32B-Preview	QwQ-32B Technical Report	32B	32K tokens	Apache 2.0	Qwen Chat
Mistral Family	Mistral AI	2023/09-2025/05	Mistral-7B, Mistral Large 2, Mistral Medium	Mistral 7B	7B - 123B / Undisclosed	4K-128K tokens	Apache 2.0 / Proprietary	Mistral Platform
Previous Generation Models
GPT-4 / GPT-4.5	OpenAI	2023/03-2025/02	API Access	GPT-4 Technical Report	Undisclosed	8K-128K tokens	Proprietary	ChatGPT
LLaMA 2	Meta	2023/07	LLaMA 2 Weights	Llama 2: Open Foundation and Fine-Tuned Chat Models	7B - 70B	4K tokens	Custom	HuggingChat
PaLM 2	Google	2023/05	PaLM 2	PaLM 2 Technical Report	Undisclosed	8K tokens	Proprietary	Bard
Bard	Google	2023/03	Bard	Bard: An experiment by Google	Undisclosed	8K tokens	Proprietary	Bard
Chinchilla	DeepMind	2022/03	Chinchilla	Training Compute-Optimal Large Language Models	70B	2K tokens	Proprietary	[Research Only]
Sparrow	DeepMind	2022/09	Sparrow	Improving alignment of dialogue agents via targeted human judgements	70B	4K tokens	Proprietary	[Research Only]
Gopher	DeepMind	2021/12	Gopher	Scaling Language Models: Methods, Analysis & Insights from Training Gopher	280B	2K tokens	Proprietary	[Research Only]
YaLM	Yandex	2022/06	YaLM 100B	YaLM 100B	100B	2K tokens	Apache 2.0	GitHub
OPT	Meta	2022/05	OPT Family	OPT: Open Pre-trained Transformer Language Models	0.125B - 175B	2K tokens	MIT	HuggingFace
BLOOM	BigScience	2022/11	BLOOM	BLOOM: A 176B-Parameter Open-Access Multilingual Language Model	176B	2K tokens	OpenRAIL-M v1	HuggingFace
Jurassic-1 / Jurassic-2	AI21 Labs	2021/08 / 2023/03	AI21 Studio	Jurassic-1: Technical Details And Evaluation	178B	2K / 8K tokens	Proprietary	AI21 Studio
Anthropic LM (v4-s3)	Anthropic	2022/12	Anthropic LM	Constitutional AI: Harmlessness from AI Feedback	52B	4K tokens	Proprietary	[Research Only]
GLaM	Google	2021/12	GLaM	GLaM: Efficient Scaling of Language Models with Mixture-of-Experts	1.2T (64B active)	2K tokens	Proprietary	[Research Only]
GPT-J / GPT-NeoX	EleutherAI	2021/06 / 2022/04	GPT-J-6B, GPT-NeoX-20B	GPT-J-6B: 6B JAX-Based Transformer	6B / 20B	2K tokens	Apache 2.0	HuggingFace
Minerva	Google	2022/06	Minerva	Solving Quantitative Reasoning Problems with Language Models	540B	2K tokens	Proprietary	[Research Only]
Galactica	Meta	2022/11	Galactica	Galactica: A Large Language Model for Science	120B	2K tokens	Apache 2.0	[Removed]
Vicuna	LMSYS	2023/03	Vicuna	Vicuna: An Open-Source Chatbot Impressing GPT-4	7B, 13B, 33B	2K tokens	Custom	FastChat
Alpaca	Stanford	2023/03	Stanford Alpaca	Stanford Alpaca: An Instruction-following LLaMA Model	7B	2K tokens	Custom	GitHub
Coding-Specialized Models
Code Llama	Meta	2023/08	Code Llama Models	Code Llama: Open Foundation Models for Code	7B - 34B	4K tokens	Custom	HuggingChat
StarCoder / StarChat	BigCode	2023/05	StarCoder, StarChat	StarCoder: A State-of-the-Art LLM for Code	1.1B - 16B	8K tokens	OpenRAIL-M v1	HuggingFace
CodeGen2 / CodeGen2.5	Salesforce	2023/04-07	CodeGen2, CodeGen2.5	CodeGen2: Lessons for Training LLMs on Programming and Natural Languages	1B - 16B	2K tokens	Apache 2.0	HuggingFace
CodeT5+	Salesforce	2023/05	CodeT5+	CodeT5+: Open Code Large Language Models for Code Understanding and Generation	0.22B - 16B	512 tokens	BSD-3-Clause	GitHub
Replit Code	Replit	2023/05	replit-code-v1-3b	Training a SOTA Code LLM in 1 week	2.7B	Infinity (ALiBi)	CC BY-SA-4.0	HuggingFace
SantaCoder	BigCode	2023/01	SantaCoder	SantaCoder: don't reach for the stars!	1.1B	2K tokens	OpenRAIL-M v1	HuggingFace
DeciCoder	Deci AI	2023/08	DeciCoder-1B	Introducing DeciCoder: The New Gold Standard in Efficient and Accurate Code Generation	1.1B	2K tokens	Apache 2.0	HuggingFace
Additional Historical Models
T5 / Flan-T5	Google	2019/10	T5 & Flan-T5	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer	0.06B - 11B	512 tokens	Apache 2.0	HuggingFace
UL2 / Flan-UL2	Google	2022/10	UL2 & Flan-UL2	UL2 20B: An Open Source Unified Language Learner	20B	512-2K tokens	Apache 2.0	HuggingFace
InstructGPT	OpenAI	2022/03	API Access	Training language models to follow instructions with human feedback	1.3B - 175B	2K tokens	Proprietary	[OpenAI API]
ChatGPT	OpenAI	2022/11	API Access	ChatGPT: Optimizing Language Models for Dialogue	~175B	4K tokens	Proprietary	ChatGPT
Pythia	EleutherAI	2023/04	Pythia 70M - 12B	Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling	0.07B - 12B	2K tokens	Apache 2.0	HuggingFace
Dolly	Databricks	2023/04	dolly-v2-12b	Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM	3B, 7B, 12B	2K tokens	MIT	HuggingFace
RedPajama-INCITE	Together	2023/05	RedPajama-INCITE	Releasing 3B and 7B RedPajama-INCITE family of models	3B - 7B	2K tokens	Apache 2.0	HuggingFace
Falcon	TIIUAE	2023/05	Falcon-180B, Falcon-40B, Falcon-7B	The RefinedWeb Dataset for Falcon LLM	7B, 40B, 180B	2K tokens	Apache 2.0	HuggingFace
MPT Family	MosaicML	2023/05-06	MPT-7B, MPT-30B	Introducing MPT-7B	7B, 30B	2K-8K tokens	Apache 2.0	MosaicML
OpenLLaMA	OpenLM Research	2023/05	OpenLLaMA Models	OpenLLaMA: An Open Reproduction of LLaMA	3B, 7B, 13B	2K tokens	Apache 2.0	HuggingFace
h2oGPT	H2O.ai	2023/05	h2oGPT	Building the World's Best Open-Source Large Language Model	12B - 20B	256-2K tokens	Apache 2.0	h2oGPT
FastChat-T5	LMSYS	2023/04	fastchat-t5-3b-v1.0	FastChat-T5: Compact and Commercial-friendly Chatbot	3B	512 tokens	Apache 2.0	HuggingFace
StableLM	Stability AI	2023/04	StableLM-Alpha	Stability AI Launches StableLM Suite	3B - 65B	4K tokens	CC BY-SA-4.0	HuggingFace
Koala	UC Berkeley	2023/04	Koala	Koala: A Dialogue Model for Academic Research	13B	4K tokens	Custom	BAIR
OpenHermes	Nous Research	2023/09	OpenHermes-7B, OpenHermes-13B	Nous Research OpenHermes	7B, 13B	4K tokens	MIT	HuggingFace
SOLAR	Upstage	2023/12	Solar-10.7B	SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-scaling	10.7B	4K tokens	Apache 2.0	HuggingFace
Phi-2	Microsoft	2023/12	phi-2	Phi-2: The surprising power of small language models	2.7B	2K tokens	MIT	HuggingFace
OpenLM	MLFoundations	2023/09	OpenLM 1B, OpenLM 7B	Open LM: a minimal but performative language modeling repository	1B, 7B	2K tokens	MIT	HuggingFace
RWKV	BlinkDL	2021/08	RWKV Models	The RWKV Language Model	0.1B - 14B	Infinite (RNN)	Apache 2.0	HuggingFace
DLite	AI Squared	2023/05	dlite-v2-1_5b	Announcing DLite V2: Lightweight, Open LLMs	0.124B - 1.5B	1K tokens	Apache 2.0	HuggingFace
Open Assistant	LAION	2023/03	OA-Pythia-12B	Democratizing Large Language Model Alignment	12B	2K tokens	Apache 2.0	HuggingFace
Cerebras-GPT	Cerebras	2023/03	Cerebras-GPT	Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models	0.111B - 13B	2K tokens	Apache 2.0	HuggingFace
XGen	Salesforce	2023/06	XGen-7B-8K-Base	Long Sequence Modeling with XGen	7B	8K tokens	Apache 2.0	HuggingFace

Key Developments in 2024

The year 2024 was transformative for the LLM landscape, with multiple breakthrough releases that established new benchmarks and capabilities:

OpenAI's Major Releases: GPT-4o, launched in May 2024, brought native multimodal capabilities and can respond to audio input in as little as 232 ms. In September, o1 and o1-mini introduced reasoning models that spend more time "thinking" through problems; OpenAI reported that o1 solved 83% of the problems on a qualifying exam for the International Mathematics Olympiad, compared to GPT-4o's 13%.

Anthropic's Claude 3 Family: The Claude 3 series (Haiku, Sonnet, Opus), launched in March 2024, were the first models to challenge GPT-4's dominance on leaderboards. Claude 3.5 Sonnet followed in June and was upgraded in October. Its successor, Claude 3.7 Sonnet, arrived in February 2025 and became particularly popular for coding tasks.

Google's Gemini Evolution: Gemini 1.5 Pro debuted in February 2024 with up to 2M token context windows, followed by Gemini 1.5 Flash in May for faster performance, and Gemini 2.0 Flash in December 2024.

Meta's Llama Progression: Llama 3 (8B, 70B) launched in April 2024, followed by the groundbreaking Llama 3.1 series in July including the massive 405B parameter model - the largest open-source model at the time. Llama 3.2 brought multimodal capabilities in September, and Llama 3.3 concluded the year in December.

Microsoft's Phi Revolution: Microsoft's Phi-3 family proved that smaller models could punch above their weight, with Phi-3 Mini (3.8B parameters) matching much larger models on benchmarks. The series expanded with Phi-3 Small (7B), Phi-3 Medium (14B), and Phi-3.5 Mini throughout 2024.

Enterprise-Focused Models: IBM Granite 3.0, launched in October 2024, focused on enterprise use cases, while Cohere's Command R and Command R+ models excelled in retrieval-augmented generation tasks.

Google's Open Models: Gemma 2 (9B, 27B parameters), launched in June 2024, became highly popular in the open-source community, consistently ranking high in community evaluations.

Key Developments in 2025

The year 2025 has been marked by several breakthrough releases in the LLM landscape. Grok 3, launched by xAI in February 2025, introduced a 1 million token context window and achieved an Elo score of 1402 in the Chatbot Arena, making it the first model to exceed 1400 on that leaderboard. The model was reportedly trained on 12.8 trillion tokens with about 10x the compute of its predecessor.

Meta's Llama 4 family is the first Llama generation built on a Mixture-of-Experts (MoE) architecture. Llama 4 Scout features an unprecedented 10 million token context window, while Llama 4 Maverick achieves an Elo score of 1417 on LMSYS Chatbot Arena, outperforming GPT-4o and Gemini 2.0 Flash.
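The Arena scores quoted above sit on an Elo-style scale, where a rating gap maps to an expected head-to-head preference rate. The following is a minimal sketch, assuming the standard Elo logistic formula with a scale of 400; the leaderboard's actual fitting procedure may differ, and the function name is illustrative only.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the text above.
maverick, grok3 = 1417, 1402

# Probability that a voter prefers the higher-rated model in a single comparison.
p = expected_win_rate(maverick, grok3)
print(f"{p:.3f}")
```

Under this assumption, a 15-point gap corresponds to an expected preference rate of roughly 52%, so small differences near the top of the leaderboard are close to a coin flip in any single comparison.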

DeepSeek-R1 emerged as the first major open-source reasoning model. Its precursor, DeepSeek-R1-Zero, was trained purely through reinforcement learning without supervised fine-tuning, while DeepSeek-R1 itself adds a small amount of cold-start supervised data before reinforcement learning. The model demonstrates performance comparable to OpenAI's o1 across math, code, and reasoning tasks while being released under the MIT license.

Cursor-AI emerged as a vibe coding platform.

Qwen 3, released by Alibaba in April 2025, is a family of "hybrid" reasoning models ranging from 0.6B to 235B parameters, supporting 119 languages and trained on over 36 trillion tokens. The models integrate thinking and non-thinking modes, letting users control the thinking budget.

OpenAI continued its reasoning model series with o3 and o4-mini in April 2025, while Anthropic launched Claude 4 (Opus 4 and Sonnet 4) in May 2025, setting new standards for coding and advanced reasoning with extended thinking capabilities and tool use.

Google's Gemini 2.5 Pro debuted as a thinking model with a 1 million token context window, leading on LMArena leaderboards and excelling in coding, math, and multimodal understanding tasks.

Notable Trends in 2025

Reasoning Models: The emergence of models that can "think" through problems step-by-step, with extended reasoning capabilities becoming standard.
Massive Context Windows: Several frontier models now support context windows of 1M to 10M tokens, enough to process entire codebases or large document collections in a single prompt.
Mixture-of-Experts (MoE) Architecture: More efficient model architectures that activate only a subset of parameters during inference.
Open-Source Reasoning: DeepSeek-R1's success has democratized access to reasoning capabilities previously available only in proprietary models.
Multimodal Integration: Native multimodality becoming standard, with models trained on text, images, audio, and video from the ground up.
Tool Use and Agentic Capabilities: Enhanced ability to use tools, execute code, and perform complex multi-step tasks autonomously.
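The Mixture-of-Experts idea in the list above can be sketched in a few lines of Python. This is a toy illustration, assuming softmax gating with top-k selection (a common scheme, not necessarily what any model listed here uses). The experts, router scores, and function names are hypothetical. The parameter shares at the end use the active and total counts from the model list.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the selected experts
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_logits, k=2):
    """Combine outputs of only the selected experts; the others are never called."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Hypothetical toy experts: each one just scales its scalar input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, -1.0, 1.5]  # made-up router scores for one token

selected = route_top_k(router_logits, k=2)
output = moe_layer(1.0, experts, router_logits, k=2)

# Share of parameters active per token, from the model list above.
active_share = {"Llama 4 Scout": 17 / 109, "DeepSeek-R1": 37 / 671}
for name, share in active_share.items():
    print(f"{name}: {share:.1%} of parameters active")
```

Here only two of the four toy experts run for the token, which is the source of the efficiency gain: compute per token scales with the active parameters rather than the total.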

Performance Benchmarks (2025)

Reasoning Benchmarks (AIME 2025)

Grok 3: 93.3%
DeepSeek-R1-0528: 87.5%
Gemini 2.5 Pro: 86.7%
o3-mini: 86.5%

Coding Benchmarks (SWE-bench Verified)

Claude Sonnet 4: 72.7%
Claude Opus 4: 72.5%
OpenAI Codex 1: 72.1%
Llama 4 Maverick: ~70%

Long Context Windows (1M+ tokens)

Llama 4 Scout: 10M tokens
Grok 3: 1M tokens
Gemini 2.5 Pro: 1M tokens
Llama 4 Maverick: 1M tokens
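As a rough illustration of what these window sizes mean in practice, the sketch below estimates whether a body of text fits in each model's context window. It assumes about 4 characters per token, which is only a crude heuristic (real counts depend on the tokenizer), and a hypothetical reserve for the model's output. The window sizes are the ones listed above.

```python
import math

# Context window sizes (in tokens) as listed above.
CONTEXT_WINDOWS = {
    "Llama 4 Scout": 10_000_000,
    "Grok 3": 1_000_000,
    "Gemini 2.5 Pro": 1_000_000,
    "Llama 4 Maverick": 1_000_000,
}

CHARS_PER_TOKEN = 4  # assumption: crude average, varies by tokenizer and content


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def models_that_fit(total_tokens: int, reserve_for_output: int = 8_000) -> list[str]:
    """Names of models whose window holds the input plus an output reserve."""
    return [
        name
        for name, window in CONTEXT_WINDOWS.items()
        if total_tokens + reserve_for_output <= window
    ]


# Hypothetical codebase of 6 million characters.
tokens = estimate_tokens("x" * 6_000_000)
fits = models_that_fit(tokens)
print(tokens, fits)
```

For anything close to a limit, the estimate should be replaced with a count from the target model's own tokenizer.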

Model Evolution Timeline

2022: Foundation Era

ChatGPT revolutionized conversational AI
InstructGPT introduced instruction following
Large proprietary models dominated (GPT-3, PaLM, Chinchilla)

2023: Open Source Awakening

LLaMA sparked the open-source revolution
Claude introduced constitutional AI
Specialized coding models emerged (Code Llama, StarCoder)
Model sizes optimized for efficiency (Phi, Mistral)

2024: Multimodal & Reasoning Breakthrough

GPT-4o achieved true multimodality
o1 introduced step-by-step reasoning
Claude 3 challenged GPT-4 dominance
Llama 3.1 405B became the largest open model
Gemini 1.5 pushed context limits to 2M tokens

2025: The Reasoning Revolution

Grok 3 achieved highest Arena scores
DeepSeek-R1 democratized reasoning capabilities
Llama 4 introduced 10M token contexts
Claude 4 set new coding standards
Qwen 3 pioneered hybrid reasoning modes

Citation

If you find our survey useful for your research, please cite the following paper:

@article{hadi2024large,
  title={Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects},
  author={Hadi, Muhammad Usman and Al Tashi, Qasem and Shah, Abbas and Qureshi, Rizwan and Muneer, Amgad and Irfan, Muhammad and Zafar, Anas and Shaikh, Muhammad Bilal and Akhtar, Naveed and Wu, Jia and others},
  journal={Authorea Preprints},
  year={2024},
  publisher={Authorea}
}

Model Organization Summary

By Company/Organization:

🔴 Proprietary Models:

OpenAI: GPT-4, GPT-4.5, GPT-4o, o1, o3, o4-mini, ChatGPT, InstructGPT
Anthropic: Claude 3 Family, Claude 3.5, Claude 3.7, Claude 4, Anthropic LM
Google/DeepMind: Gemini 2.5, Gemini 2.0, Gemini 1.5, PaLM 2, Bard, T5, UL2, Chinchilla, Sparrow, Gopher, GLaM, Minerva
xAI: Grok 3, Grok 3 Mini
AI21 Labs: Jurassic-1, Jurassic-2
Mistral AI: Mistral 7B, Mistral Large 2, Mistral Medium

🟢 Open Source Models:

Meta: Llama 4, Llama 3.x, Llama 2, OPT, Code Llama, Galactica
Alibaba: Qwen 3, Qwen 2.5, QwQ-32B
DeepSeek: DeepSeek-R1, DeepSeek-V3
Microsoft: Phi-3 Family, Phi-2
IBM: Granite 3.0, Granite 3.1
Google: Gemma 2
Cohere: Command R, Command R+
BigScience: BLOOM
EleutherAI: GPT-J, GPT-NeoX, Pythia
BigCode: StarCoder, StarChat, SantaCoder
Salesforce: CodeGen2, CodeT5+, XGen
TIIUAE: Falcon
Upstage: SOLAR

🎓 Academic/Research:

LMSYS: Vicuna, FastChat-T5
Stanford: Alpaca
UC Berkeley: Koala
LAION: Open Assistant
OpenLM Research: OpenLLaMA
MLFoundations: OpenLM

🏢 Other Companies:

Yandex: YaLM
Replit: Replit Code
H2O.ai: h2oGPT
Databricks: Dolly
Together: RedPajama-INCITE
MosaicML: MPT Family
Stability AI: StableLM
Nous Research: OpenHermes
Cerebras: Cerebras-GPT
Deci AI: DeciCoder
AI Squared: DLite
BlinkDL: RWKV

By Model Type:

🧠 Reasoning Models (2024-2025):

OpenAI: o1, o1-mini, o3, o3-mini, o4-mini
DeepSeek: DeepSeek-R1 Family
Alibaba: QwQ-32B, Qwen 3 (hybrid reasoning)
Google: Gemini 2.5 (thinking models)

💬 Conversational Models:

OpenAI: ChatGPT, GPT-4o
Anthropic: Claude 3/4 Family
Google: Bard, Gemini
xAI: Grok 3

💻 Code-Specialized:

Meta: Code Llama
BigCode: StarCoder, SantaCoder
Salesforce: CodeGen2, CodeT5+
Replit: Replit Code
Deci AI: DeciCoder

🌐 Multimodal:

OpenAI: GPT-4o
Google: Gemini 2.0/2.5
Meta: Llama 4, Llama 3.2

⚡ Efficient/Small:

Microsoft: Phi-3 Family, Phi-2
Google: Gemma 2
AI Squared: DLite
Upstage: SOLAR

Last updated: July 2025
Original paper: https://www.techrxiv.org/doi/full/10.36227/techrxiv.23589741.v3