Leaderboard
Rank | Model | Score | Organization |
---|---|---|---|
#1 | R1 | 94.0 | DeepSeek |
#2 | o3-mini-high | 92.0 | OpenAI |
#3 | o3-mini | 91.0 | OpenAI |
#4 | o1 | 88.0 | OpenAI |
#5 | Claude Sonnet 3.5 (new) | 86.0 | Anthropic |
#6 | GPT-4o | 82.0 | OpenAI |
#7 | Grok 2 | 56.0 | x.ai |
#8 | GPT-4o-mini | 49.0 | OpenAI |