Thread Reader
Axiom

Axiom
@axiommathai

Tweet

1/ AxiomProver got 12/12 of Putnam 2025. Today we release the Lean proofs AxiomProver generated autonomously. We also provide our take of the problems, proof visualizations, and compare how humans vs AI approach differently. Tons of fun math and Lean! Our findings in thread.

2/ We sorted the proofs into three categories: i. problems that are easy for humans, but painstaking for AxiomProver ii. problems that are surprisingly within AxiomProver's reach iii. problems that got two completely different math approaches from humans vs AxiomProver
3/ Problems that are easy for humans but painstaking for AxiomProver are calculus (A2, B2), and combinatorial construction (A5). A simple positivity lemma for human takes chunks of Lean. What's obvious for humans needs to satisfy the formal checker. Analysis makes AI grumpy!
4/ Combinatorial constructions that you just see could take lots of work in Lean. Every little corner case and bookkeeping detail has to be spelled out explicitly. The current Lean formalization is 2054 lines long and took 518 minutes to generate! What a friendly beast.
5/ Since AxiomProver doesn't have a Euclidean geometry engine, we are surprised to see it being able to get geometry (B1). Geometry is turned to symbols. It also sailed through the combinatorial game theory question (A3) while combinatorics is usually viewed as challenging.
6/ A6 is the problem that we are stunned by - none of our in-house mathematicians figured out how to do it, but AxiomProver did. While we know p-adic power series expansions are needed, AxiomProver won the fight of going from the right idea to actually finishing the problem.
7/ For A4 and B4, AxiomProver and our experts arrive at completely different solutions! This divergence in approaches between humans and AI is especially interesting to us. For A4, while our mathematicians thought algebraically, AxiomProver clearly took the geometric route.
8/ The irony is that after it saw the key geometric intuition, the formal Lean verification spent most of its lines carefully checking that this circle of vectors actually works. This is exactly the part that a human mathematician would wave their hands at and call intuitive.
9/ For B4, we have one picture that creatively settles the problem. AxiomProver, via 1061 lines of Lean, chose a bookkeeping argument and grinded through the combinatorial properties of rows and columns until the result falls out. No elegant picture, just raw case analysis.
10/ In a time of AI solving math, what's hard for humans and what's hard for AI are different. What makes a math problem hard for a machine? This is an interesting direction in itself. Perhaps it should be answered by applying AI to actual research math, the north star.
11/ Grothendieck says: don't attack the problem, raise the water level until it surrounds land masses. When will AI bring in definitions, build theories, and choose the right lens to view the problem or the abstraction to make it dissolve? Not there yet, but let's keep trying.
12/ Read our full blog here (we promise it's a fun read): axiommath.ai/territory/from Lean code and visualization graphs here: github.com/AxiomMath/putn Apply here if this mission excites you: axiommath.ai/join-us
Axiom

Axiom

@axiommathai
The Starting Point for Reasoning
Follow on 𝕏
Missing some tweets in this thread? Or failed to load images or videos? You can try to .