Gemini 3 and Grok 4.1 currently top the LMArena leaderboard. This public scoreboard ranks today’s major AI models based on real user battles. It’s run by LMSYS, the same team behind the Chatbot Arena, and has become one of the most trusted ways to see how models stack up in the real world.

I put Gemini 3 and Grok 4.1 head-to-head through nine distinct challenges, spanning logic puzzles, coding tasks, creative writing and self-reflection, to see how each handles the range of demands users typically bring to AI assistants. The results reveal interesting contrasts in style, depth and reliability.
