Curated & Opinionated

AI Models Hub

Not every model — the right models for each use case. Benchmarks, pricing, context windows, and our honest take on each.

Models tracked: 13
Open source: 6
Proprietary: 7
Providers: 9


OpenAI, May 2024

GPT-4o

Closed
Multimodal + General

Best all-around model with strong vision capabilities

Params: ~200B
Context: 128K tokens
Input price (per 1M tokens): $2.50
MMLU: 88.7%
HumanEval: 90.2%
GSM8K: 95.8%
Anthropic, Jun 2024

Claude 3.5 Sonnet

Closed
Coding + Analysis

Top coding model with industry-leading instruction following

Params: ~70B
Context: 200K tokens
Input price (per 1M tokens): $3
MMLU: 88.3%
HumanEval: 92%
GSM8K: 96.4%
Anthropic, 2025

Claude 4 Opus

Closed
Complex Reasoning + Research

Most capable model for deep analysis and research tasks

Params: ~400B
Context: 500K tokens
Input price (per 1M tokens): $15
MMLU: 91.2%
HumanEval: 96.2%
GSM8K: 98.1%
Google, 2025

Gemini 3 Ultra

Closed
Massive Context + Documents

Largest context among the closed models listed here (2M tokens): entire codebases at once

Params: ~1T (MoE)
Context: 2M tokens
Input price (per 1M tokens): $5
MMLU: 90.8%
HumanEval: 91.5%
GSM8K: 97.2%
Google, Feb 2025

Gemini 2.0 Flash

Closed
Speed + Cost Efficiency

Fastest model per dollar — perfect for high-volume apps

Params: ~8B
Context: 1M tokens
Input price (per 1M tokens): $0.10
MMLU: 81.2%
HumanEval: 78.4%
GSM8K: 89.3%
OpenAI, Dec 2024

o1 Pro

Closed
Math + Hard Reasoning

Best for STEM, math olympiads, and PhD-level problems

Params: Unknown
Context: 200K tokens
Input price (per 1M tokens): $60
MMLU: 90.3%
HumanEval: 92.4%
GSM8K: 99.1%
DeepSeek, Dec 2024

DeepSeek V3

Open
Open Source Coding + Value

GPT-4 performance at 1/10th the cost — open weights

Params: 671B (MoE, 37B active)
Context: 128K tokens
Input price (per 1M tokens): $0.27
MMLU: 87.5%
HumanEval: 89.6%
GSM8K: 94.8%
Meta, 2025

Llama 4 Scout

Open
Open Source + Huge Context

10M token context window — free to self-host

Params: 109B (MoE, 17B active)
Context: 10M tokens
Input price (per 1M tokens): Free
MMLU: 84.8%
HumanEval: 78.2%
GSM8K: 90.5%
Meta, 2025

Llama 4 Maverick

Open
Open Source General Purpose

Meta's flagship — rivals GPT-4o on most benchmarks

Params: 400B (MoE, 17B active)
Context: 1M tokens
Input price (per 1M tokens): Free
MMLU: 87.5%
HumanEval: 85.5%
GSM8K: 93.7%
Mistral, 2025

Mistral Large 3

Open
EU/GDPR + Enterprise

Best EU-made model: deployable on-prem, which simplifies GDPR compliance

Params: ~123B
Context: 128K tokens
Input price (per 1M tokens): $2
MMLU: 84%
HumanEval: 84.2%
GSM8K: 91.3%
Alibaba, Sep 2024

Qwen 2.5 72B

Open
Open Source Budget + Multilingual

Best open-source model under 100B params for multilingual tasks

Params: 72B
Context: 128K tokens
Input price (per 1M tokens): Free
MMLU: 86%
HumanEval: 86.7%
GSM8K: 95.2%
Microsoft, Dec 2024

Phi-4

Open
Lightweight / Edge Deployment

Remarkable at 14B — best small model for on-device AI

Params: 14B
Context: 16K tokens
Input price (per 1M tokens): Free
MMLU: 84.8%
HumanEval: 82.6%
GSM8K: 91.5%
xAI, Feb 2025

Grok 3

Closed
Real-time Info + Reasoning

Real-time web access + strong reasoning with Think mode

Params: ~314B
Context: 131K tokens
Input price (per 1M tokens): $3
MMLU: 87.5%
HumanEval: 88.9%
GSM8K: 94.8%
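
As a rough illustration of how the input prices and context windows listed above might be used together, the Python sketch below estimates the input cost of a prompt and picks the cheapest model whose context window can hold it. The `MODELS` table copies four of the listed entries; the function names are hypothetical, "K" and "M" are assumed to mean 1,000 and 1,000,000 tokens, and output-token pricing (not listed on this page) is ignored.

```python
# Input price (USD per 1M tokens) and context window (tokens), copied from
# four cards above. Assumption: "128K" = 128,000 and "1M" = 1,000,000.
MODELS = {
    "GPT-4o": {"input_per_1m": 2.5, "context": 128_000},
    "Claude 3.5 Sonnet": {"input_per_1m": 3.0, "context": 200_000},
    "Gemini 2.0 Flash": {"input_per_1m": 0.1, "context": 1_000_000},
    "DeepSeek V3": {"input_per_1m": 0.27, "context": 128_000},
}


def input_cost_usd(model, prompt_tokens):
    """Estimated input-side cost of one prompt; output tokens are not counted."""
    return MODELS[model]["input_per_1m"] * prompt_tokens / 1_000_000


def fits_context(model, prompt_tokens):
    """True if the prompt alone fits in the model's listed context window."""
    return prompt_tokens <= MODELS[model]["context"]


def cheapest_fitting(prompt_tokens):
    """Name of the cheapest model that can hold the prompt, or None if none can."""
    candidates = [m for m in MODELS if fits_context(m, prompt_tokens)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: input_cost_usd(m, prompt_tokens))


if __name__ == "__main__":
    tokens = 150_000  # hypothetical prompt size
    for name in MODELS:
        status = "fits" if fits_context(name, tokens) else "too long"
        print(f"{name}: ${input_cost_usd(name, tokens):.4f} ({status})")
    print("Cheapest that fits:", cheapest_fitting(tokens))
```

A real comparison would also account for output-token prices and each provider's tokenizer, since the same text can produce different token counts across models.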

Want to learn how to build with these models? Our course covers GPT-4o, Claude, Gemini, Llama, and more — hands-on.

Also check our LLM Comparison Guide for deeper benchmark analysis.
