WebDev Arena Leaderboard

WebDev Arena is an open-source benchmark from LMArena that evaluates AI models' web-development capabilities through head-to-head battles judged by human votes.

Leaderboard

| Rank | Model | Arena Score | 95% CI | Votes | License |
|------|-------|------------:|--------|------:|---------|
| — | — | 1252.24 | +4.91 / -6.09 | 12,356 | Proprietary |
| — | — | 1138.14 | +6.33 / -6.88 | 6,288 | Proprietary |
| — | — | 1068.48 | +6.10 / -5.58 | 9,224 | Proprietary |
| — | — | 1060.55 | +7.32 / -9.25 | 4,463 | Proprietary |
| — | — | 1029.45 | +6.21 / -5.47 | 8,010 | Proprietary |
| — | — | 1025.87 | +5.42 / -4.43 | 11,388 | Proprietary |
| — | — | 978.07 | +4.95 / -4.99 | 11,046 | Proprietary |
| #7 | DeepSeek v3 (DeepSeek) | 967.88 | +8.46 / -9.30 | 5,345 | DeepSeek |
| — | — | 964.00 | +4.16 / -4.89 | 11,977 | Proprietary |
| — | — | 902.78 | +6.16 / -5.16 | 10,788 | Apache 2.0 |
| — | — | 894.46 | +5.42 / -4.77 | 10,488 | Proprietary |
| — | — | 811.76 | +13.01 / -16.72 | 1,117 | Llama 3.1 |
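Arena Scores are Elo-style ratings fit to pairwise human votes, so the gap between two scores implies an expected head-to-head win rate. A minimal sketch, assuming the standard Elo logistic with a scale of 400 (LMArena reports Bradley-Terry ratings on an Elo-like scale; the exact scale used here is an assumption):

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Probability that model A beats model B under an Elo-style logistic model."""
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))

# Top listed score vs. DeepSeek v3, using scores from the leaderboard above.
p = expected_win_rate(1252.24, 967.88)
print(f"{p:.2f}")  # roughly 0.84 under these assumptions
```

A ~284-point gap thus translates to the higher-rated model being expected to win roughly five of every six non-tied battles, which is why the leaderboard also reports confidence intervals: with a few thousand votes per model, small score differences may not be statistically meaningful.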

More Statistics for WebDev Arena (Overall)

Figure 1: Confidence Interval for Model Strength
Figure 2: Average Win Rate Against All Other Models (Assuming Uniform Sampling and No Ties)
Figure 3: Fraction of Model A Wins for All Non-tied A vs. B Battles
Figure 4: Battle Count for Each Combination of Models (without Ties)