AI Wins Gold at the Hardest Math Competition on Earth — And the Proofs Are Beautiful

For the first time in history, an AI system has achieved gold medal performance at the International Mathematical Olympiad — the world's most prestigious mathematics competition. Google DeepMind's Gemini Deep Think scored 35 out of 42 points, solving 5 of 6 problems perfectly.

But the score isn't what made mathematicians sit up. It's how the AI did it.

What Makes This Different from Everything Before

Previous AI math systems needed heavy formal-language scaffolding: problems had to be translated into machine-readable formats like Lean or Isabelle before the AI could work on them. The 2024 system (AlphaProof + AlphaGeometry 2) reached silver-medal standard with 28 points, but humans had to translate the problems by hand first.

Gemini Deep Think worked entirely in natural language. It read the problem statements in English, reasoned through them, and produced elegant written proofs that human mathematicians could read, follow, and verify — all within the competition's 4.5-hour time limit per day.

IMO President Prof. Dr. Gregor Dolinar confirmed: "Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow."

The Beauty of Problem 1: Sunny Lines on a Lattice

To appreciate what the AI achieved, let's look at one of the problems it solved — IMO 2025 Problem 1. This is a combinatorial geometry problem that mixes counting, convexity, and elegant reduction.

The problem: Fix an integer n ≥ 3 and consider the triangular grid of lattice points P_n = {(a, b) : a, b are positive integers and a + b ≤ n + 1}. A line is called "sunny" if it is not parallel to the x-axis, the y-axis, or the line x + y = 0. Find all nonnegative integers k for which there exist n distinct lines that together cover every point of P_n, with exactly k of them being sunny.

The answer: k = 0, 1, or 3. Notably, k = 2 is impossible.
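The stated answer can be sanity-checked by brute force on small grids. The Python sketch below is an illustration only, not part of any published solution, and all function names in it are hypothetical. It relies on two assumptions spelled out in the comments: a line matters only through the set of grid points it contains, and any unused line can be parked away from the grid.

```python
from itertools import combinations


def grid(n):
    """Lattice points (a, b) with a, b >= 1 and a + b <= n + 1."""
    return [(a, b) for a in range(1, n + 1) for b in range(1, n + 2 - a)]


def candidate_lines(n):
    """Return (sunny, plain): the point sets that lines can cut out of the grid."""
    pts = grid(n)
    # Assumption: a sunny line can always be tilted to hit one chosen point only.
    sunny = {frozenset([p]) for p in pts}
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        dx, dy = x2 - x1, y2 - y1
        if dx != 0 and dy != 0 and dx + dy != 0:  # not parallel to an axis or x + y = 0
            sunny.add(frozenset(
                (x, y) for x, y in pts if (x - x1) * dy == (y - y1) * dx
            ))
    # Non-sunny lines: columns x = c, rows y = c, anti-diagonals x + y = c.
    plain = set()
    for c in range(1, n + 2):
        for line in (
            frozenset(p for p in pts if p[0] == c),
            frozenset(p for p in pts if p[1] == c),
            frozenset(p for p in pts if p[0] + p[1] == c),
        ):
            if line:
                plain.add(line)
    return sunny, plain


def feasible(n, k):
    """True if n distinct lines, exactly k of them sunny, can cover the grid."""
    sunny, plain = candidate_lines(n)

    def search(uncovered, sunny_left, plain_left):
        if not uncovered:
            # Assumption: leftover lines can be placed so they miss the grid.
            return True
        p = min(uncovered)  # some chosen line has to pass through this point
        if sunny_left and any(
            p in line and search(uncovered - line, sunny_left - 1, plain_left)
            for line in sunny
        ):
            return True
        return bool(plain_left) and any(
            p in line and search(uncovered - line, sunny_left, plain_left - 1)
            for line in plain
        )

    return search(frozenset(grid(n)), k, n - k)


def sunny_counts(n):
    """All k in 0..n for which a valid arrangement exists."""
    return [k for k in range(n + 1) if feasible(n, k)]


for n in (3, 4):
    print(n, sunny_counts(n))
```

For n = 3 and n = 4 the search reports [0, 1, 3], matching the answer above. It is exhaustive only for the grid sizes actually run, so it supports the claim without proving it.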

The beautiful counting argument the AI discovered:

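The convex hull of the lattice points is a triangle with n points on each side, so its boundary contains exactly 3n − 3 lattice points. Now here's the key insight: any line that does not run along one of the three edges meets this boundary in at most 2 points, and a sunny line can never run along an edge, because the edges are parallel to the y-axis, the x-axis, and the line x + y = 0.

So if none of the n lines runs along an edge, covering all boundary points requires:

3n − 3 ≤ 2n ⇒ n ≤ 3

For n ≥ 4, then, at least one line must lie along an edge, and that line is not sunny. Removing it leaves n − 1 lines covering a triangular grid one size smaller, with the same number of sunny lines. Repeating this step reduces every case to n = 3, where there are only three lines, so this single inequality eliminates all values of k above 3 in one stroke. The AI then constructed explicit line arrangements for k = 0, 1, and 3 and showed that k = 2 is impossible, which after the reduction comes down to the six points of the n = 3 grid.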
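The two facts behind the inequality, the boundary count and the two-point limit for sunny lines, can be checked numerically on small grids. This is a minimal Python sketch; the helper names and the parameter `size` (the number of grid points along each side of the triangle) are illustrative choices and do not come from the source.

```python
from itertools import combinations


def grid(size):
    """Lattice points (a, b) with a, b >= 1 and a + b <= size + 1."""
    return [(a, b) for a in range(1, size + 1) for b in range(1, size + 2 - a)]


def boundary(size):
    """Grid points on the three sides x = 1, y = 1 and x + y = size + 1."""
    return [
        (a, b) for a, b in grid(size)
        if a == 1 or b == 1 or a + b == size + 1
    ]


def max_boundary_hits_by_sunny_line(size):
    """Largest number of boundary points found on any single sunny line."""
    pts = boundary(size)
    best = 1  # a sunny line through just one boundary point always exists
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 or dy == 0 or dx + dy == 0:
            continue  # parallel to an axis or to x + y = 0, so not sunny
        hits = sum((x - x1) * dy == (y - y1) * dx for x, y in pts)
        best = max(best, hits)
    return best


for size in range(2, 9):
    count = len(boundary(size))
    hits = max_boundary_hits_by_sunny_line(size)
    # Lines that avoid the edges can reach at most 2 * size boundary points.
    print(size, count, hits, count <= 2 * size)
```

In each row the boundary count equals 3 * size − 3 and no sunny line contains more than two boundary points. The last column turns False from size 4 onward, which is the point where the counting argument bites.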

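For k = 3 with n = 3, one elegant construction uses three sunny lines: y = x, 2x + y = 5, and x + 2y = 5. Each passes through exactly two of the six grid points, and together they cover all six. For larger n, adding the non-sunny lines x + y = 5 through x + y = n + 1 covers the rest of the grid.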
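The three lines can be checked directly against the six points of the n = 3 grid. The sketch below does that in Python and also pads the arrangement with the anti-diagonals x + y = 5, ..., x + y = n + 1 for larger n. That padding is one natural extension assumed for this example, not necessarily the construction in DeepMind's write-up, and the function names are hypothetical.

```python
def grid(n):
    """Lattice points (a, b) with a, b >= 1 and a + b <= n + 1."""
    return [(a, b) for a in range(1, n + 1) for b in range(1, n + 2 - a)]


# Each line is stored as (p, q, r), meaning p*x + q*y = r.
SUNNY_LINES = [
    (1, -1, 0),  # y = x
    (2, 1, 5),   # 2x + y = 5
    (1, 2, 5),   # x + 2y = 5
]


def is_sunny(line):
    """Not parallel to the x-axis (p == 0), the y-axis (q == 0) or x + y = 0 (p == q)."""
    p, q, _ = line
    return p != 0 and q != 0 and p != q


def arrangement(n):
    """Three sunny lines plus the anti-diagonals x + y = 5, ..., n + 1 (assumed padding)."""
    return SUNNY_LINES + [(1, 1, s) for s in range(5, n + 2)]


def covers(lines, n):
    """True if every grid point lies on at least one of the lines."""
    return all(
        any(p * x + q * y == r for p, q, r in lines)
        for x, y in grid(n)
    )


for n in range(3, 8):
    lines = arrangement(n)
    sunny = sum(is_sunny(line) for line in lines)
    print(n, len(lines), sunny, covers(lines, n))
```

Every row shows n lines, exactly 3 of them sunny, and full coverage, so k = 3 is attainable for each grid size tested.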

This is the kind of clean, insightful counting argument that top human mathematicians love — reducing a seemingly complex combinatorial problem to a simple inequality through geometric insight. And the AI found it on its own.

How the AI Actually Works

Deep Think mode: The model spends extended "thinking time" exploring multiple solution paths in parallel — similar to how human mathematicians try different approaches
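Reinforcement learning: According to DeepMind, this version of Gemini was trained with new reinforcement learning techniques on multi-step reasoning, problem-solving, and theorem-proving data, and was also given a curated set of high-quality solutions and general hints on approaching IMO problems
No formal-language step: Unlike the 2024 system, which worked in Lean, this model ran end to end in natural language; its written proofs were graded by IMO coordinators rather than checked by a proof assistant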
Natural language output: Produces human-readable proofs, not machine code

The Timeline: Silver to Gold in One Year

2024: AlphaProof + AlphaGeometry 2 = Silver medal (28/42 points, 4 problems solved)
2025: Gemini Deep Think = Gold medal (35/42 points, 5 problems solved)

A 25% improvement in score (from 28 to 35 points) and a jump of one full medal tier in a single year. At this rate, the question isn't whether AI will achieve a perfect score, but when.

Why Mathematicians Are Excited (and Nervous)

The excitement: AI can now serve as a genuine collaborator in mathematical research. Imagine a tool that can explore thousands of proof strategies in parallel, verify each one rigorously, and present the most elegant solution in plain language.

The nervousness: if AI can produce creative, original proofs at this level, what does that mean for the role of human mathematicians? The field is grappling with questions it never expected to face this soon.

But perhaps the most profound implication is for science broadly. Mathematics is the language of physics, chemistry, biology, and engineering. An AI that can reason mathematically at this level can potentially accelerate discovery across every scientific discipline.

Sources: Google DeepMind, Art of Problem Solving, DeepMind Solutions PDF