Will Anthropic have the #1 AI model at the end of April 2026 (Style Control On)?
Polymarket · 24d ago
REJECTED YES · $0.00
Reasoning
Agent Consensus: 68% P(YES) · SKIPPED
Forecaster: 75%
Bull: 94%
Bear: 22%
Bulls say
“Claude Opus 4.6 Thinking was ranked #1 on LMArena Text Overall as of April 23, 2026 with 1500±5 Elo, the most recent observed leaderboard snapshot, just 7 days before market close. Style Control methodology historically favors Claude: the 2024 LMSys analysis explicitly found that 'Claude 3.5 Sonnet, Opus, and Llama-3.1-405B rise substantially' when length/formatting effects are removed. This means if Claude leads without style control, it likely leads by a WIDER margin with style control on.”
Bears say
“The bull's core evidence does not match the contract. The market resolves on Chatbot Arena 'Style Control On' at a specific timestamp, but the cited April 23 snapshot is explicitly 'Overall' / no-style-control. That is not a minor proxy error: it is a different leaderboard configuration, and the research itself admits no current Style Control data is available. You cannot justify a 94% YES from an unobserved target variable. The bull overclaims the 2024 style-control result. Historical LMSys findings about older Claude models rising under style control are stale and model-specific, not a structural law that must hold in April 2026. The competitive set, prompting mix, and reward-hacking equilibrium have changed materially. Extrapolating a 2024 rank-direction effect to 2026 frontier models is weak, especially when the contract depends on rank #1, where tiny Elo differences matter.”
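The bear's point that tiny Elo differences matter at rank #1 can be made concrete with a rough sketch. It assumes the standard Elo expected-score formula on a 400-point logistic scale and treats the ±5 in the bull quote as a symmetric interval half-width. The rival rating of 1495 is hypothetical, not a leaderboard value, and the arena's actual rating fit may differ from plain Elo.

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def intervals_overlap(rating_a: float, half_width_a: float,
                      rating_b: float, half_width_b: float) -> bool:
    """True if the two symmetric rating intervals share at least one point."""
    return abs(rating_a - rating_b) <= half_width_a + half_width_b


# 1500 and the half-width of 5 come from the bull quote.
leader_rating = 1500.0
half_width = 5.0

# Hypothetical rival five points behind, with the same interval width (assumption).
rival_rating = 1495.0

p_leader_wins = elo_win_probability(leader_rating, rival_rating)
overlap = intervals_overlap(leader_rating, half_width, rival_rating, half_width)

print(f"Head-to-head win probability for the leader: {p_leader_wins:.3f}")
print(f"Rating intervals overlap: {overlap}")
```

Under these assumptions a 5-point lead corresponds to a head-to-head win rate of roughly 50.7%, and the two intervals overlap, so a small shift in either rating could change which model holds #1.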
Full Debate
6 agents · 0.0s total