LLM WebDev Arena Leaderboard
Last updated: March 26, 2025 (data verified by a human)
Models: 22 total

| Rank | Model | Provider | Score | 95% CI | Votes | Input Cost ($/1M tokens) | Output Cost ($/1M tokens) | License |
|------|-------|----------|-------|--------|-------|--------------------------|---------------------------|---------|
| 1 | Claude 3.7 Sonnet (20250219) | Anthropic | 1354.01 | +12.49 / -10.75 | 4,825 | $3.00 | $15.00 | Proprietary |
| 2 | Gemini-2.5-Pro-Exp-03-25 | Google | 1267.7 | +15.58 / -15.39 | 1,654 | N/A | N/A | Proprietary |
| 3 | Claude 3.5 Sonnet (20241022) | Anthropic | 1245.4 | +4.51 / -4.90 | 21,059 | $3.00 | $15.00 | Proprietary |
| 4 | DeepSeek-R1 | DeepSeek | 1203.8 | | | | | |
| 5 | early-grok-3 | xAI | 1144.94 | | | | | |
| 6 | o3-mini-high (20250131) | OpenAI | 1144.14 | | | | | |
| 7 | Claude 3.5 Haiku (20241022) | Anthropic | 1136.02 | | | | | |
| 8 | Gemini-2.0-Pro-Exp-02-05 | Google | 1099.19 | | | | | |
| 9 | o3-mini (20250131) | OpenAI | 1097.7 | | | | | |
| 10 | o1 (20241217) | OpenAI | 1049.23 | | | | | |
| 11 | o1-mini (20240912) | OpenAI | 1046.16 | | | | | |
| 12 | Gemini-2.0-Flash-Thinking-01-21 | Google | 1033.91 | | | | | |
| 13 | Gemini-2.0-Flash-001 | Google | 1030.44 | | | | | |
| 14 | Gemini-2.0-Flash-Thinking-1219 | Google | 1023.15 | | | | | |
| 15 | Gemini-Exp-1206 | Google | 1022.31 | | | | | |
| 16 | Gemini-2.0-Flash-Exp | Google | 983.52 | | | | | |
| 17 | Qwen2.5-Max | Alibaba | 977.59 | | | | | |
| 18 | GPT-4o-2024-11-20 | OpenAI | 964 | | | | | |
| 19 | DeepSeek-V3 | DeepSeek | 963.43 | | | | | |
| 20 | Qwen2.5-Coder-32B-Instruct | Alibaba | 903.53 | | | | | |
| 21 | Gemini-1.5-Pro-002 | Google | 894.57 | | | | | |
| 22 | Llama-3.1-405B-Instruct | Meta | 811.92 | | | | | |
Data sourced from: web.lmarena.ai/leaderboard
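
The 95% CI column gives the uncertainty band around each arena score. As a rough reading aid, two models can be treated as clearly separated when their confidence intervals do not overlap. The Python sketch below applies that heuristic to the top three rows using the values from the table; it is an illustration of how to read the Score and 95% CI columns, not the arena's official ranking procedure, and the interval-overlap rule is an assumption used here for clarity.

```python
# Minimal sketch (not an official LMArena tool): compare adjacent leaderboard
# entries by checking whether their 95% confidence intervals overlap.
# Values are copied from the top three rows of the table above.

top3 = [
    {"model": "Claude 3.7 Sonnet (20250219)", "score": 1354.01, "ci_plus": 12.49, "ci_minus": 10.75},
    {"model": "Gemini-2.5-Pro-Exp-03-25",     "score": 1267.70, "ci_plus": 15.58, "ci_minus": 15.39},
    {"model": "Claude 3.5 Sonnet (20241022)", "score": 1245.40, "ci_plus": 4.51,  "ci_minus": 4.90},
]

def interval(row):
    """Return the (low, high) bounds of a model's 95% confidence interval."""
    return row["score"] - row["ci_minus"], row["score"] + row["ci_plus"]

def clearly_separated(a, b):
    """True when the two models' confidence intervals do not overlap."""
    a_low, a_high = interval(a)
    b_low, b_high = interval(b)
    return a_low > b_high or b_low > a_high

for a, b in zip(top3, top3[1:]):
    status = "clear gap" if clearly_separated(a, b) else "overlapping CIs"
    print(f"{a['model']} vs {b['model']}: {status}")
```

Run on the values above, both adjacent pairs report a clear gap, which is consistent with the ordering of the top three rows.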