Solving maths won’t solve intelligence
History shows what happens when axioms hit reality
It’s been quite a month for AI in mathematical research.
A couple of weeks ago, an internal OpenAI model disproved an 80-year-old conjecture in geometry. What was the conjecture in question? In short, suppose you scatter a given number of points on a flat plane. Then you ask: what is the largest possible number of pairs of points that are exactly one unit apart?
In 1946, Paul Erdős conjectured that even the best possible arrangement of n points in the plane can create only slightly more than n pairs of points exactly one unit apart. But the OpenAI model showed that this wasn’t true: with the right arrangement, you can fit in far more one-unit gaps between points than previously thought.
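To make the counting question concrete, here is a minimal Python sketch that counts one-unit pairs by brute force. The function names (`count_unit_pairs`, `square_grid`) and the floating-point tolerance are illustrative assumptions. The square grid is only a familiar arrangement to experiment with, not the construction from the OpenAI result.

```python
from itertools import combinations
from math import dist, isclose


def count_unit_pairs(points, tol=1e-9):
    """Count unordered pairs of points whose distance is 1 (within tol)."""
    return sum(
        1
        for p, q in combinations(points, 2)
        if isclose(dist(p, q), 1.0, abs_tol=tol)
    )


def square_grid(k):
    """Return the k*k points of an integer grid with spacing 1."""
    return [(x, y) for x in range(k) for y in range(k)]


# Compare the number of points n with the number of unit-distance pairs.
for k in (2, 3, 4, 5):
    pts = square_grid(k)
    print(len(pts), count_unit_pairs(pts))
```

On a k-by-k grid the only one-unit pairs are horizontal or vertical neighbours, so the count is 2k(k-1), which is a little under twice the number of points. The brute-force check compares every pair, so it is only practical for small arrangements.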
As mathematician Tim Gowers put it: “if a human had written the paper and submitted it to the Annals of Mathematics and I had been asked for a quick opinion, I would have recommended acceptance without any hesitation. No previous AI-generated proof has come close to that.”
The discovery follows the news a few weeks ago that an amateur had solved another Erdős problem, with nothing more than a GPT-5.4 Pro subscription. The raw output was quite hard to follow, but it suggested the problem might have been easier than previously assumed. Still, human researchers hadn’t managed to spot the new trick GPT had converged on. Terence Tao told Scientific American that humans had seemingly hit a collective mental block: ‘This one is a bit different because people did look at it, and the humans that looked at it just collectively made a slight wrong turn at move one.’
From maths to minds
We’re seeing some impressive progress, and it’s occurring against a background of wider ambition. Several startups are now emerging with a focus on generating verifiable mathematical proofs with AI. The underlying pitch is that if machines can master mathematical reasoning, they can master intelligence.
Yet this isn’t a totally new idea. Centuries ago, Enlightenment thinkers tried to place mathematics at the centre of human knowledge. Mathematical formalisations found their way into everything from politics (‘we hold these truths to be self-evident’) to justice (‘balance of probabilities’). The idea was that core human principles – from equality, to justice, to beauty – could be indisputably formalised, just as they had been in mathematics.
In the first draft of the Declaration of Independence, Thomas Jefferson had written that ‘we hold these truths to be sacred and undeniable’. It was only after Jefferson had shown the draft to his colleague Benjamin Franklin that the crucial edit was made. With heavy pen slashes, Franklin cut the words ‘sacred and undeniable’ – with their implied reliance on religion – and replaced them with the scientific claim of ‘self-evident’.
Voltaire once voiced his frustration with the ‘quarter proofs’ and ‘eighth proofs’ that had made their way into parts of the mathematically-inspired French legal system, and the strange arithmetic that followed. ‘Eight hearsays, which, in fact, are no other than the echo of a groundless report, constitute a full proof,’ he complained.
In the 18th Century, the Prussian Code had even aimed to construct a detailed logical structure to cover any possible legal situation. The final version, which came into effect in 1794, contained over 17,000 articles of law. However, technical terms in the code were left undefined; to understand the new code, readers often needed to refer to the previous system of Roman Law.
There were also many scenarios for which the code provided no answer. Judges were not allowed to refer to the precedent of historical decisions, so instead they had to use their own interpretation. Legal uncertainty, in other words, could not be removed through sheer mathematical brute force.
As I wrote in Proof, the quest for mathematical perfection clashed with real-life societies. There’s more to knowledge and decision-making than mathematical logic. What is ‘good’? What is ‘fair’? What is ‘equal’?
In November 1863, Abraham Lincoln gave an address in Gettysburg, Pennsylvania, which had recently been the site of the bloodiest battle in the American Civil War. In his opening sentence, Lincoln referred to a nation ‘dedicated to the proposition that all men are created equal’.
Rather than stating equality as a self-evident truth – which he knew many denied when it came to slavery – he was reframing it as a proposition, which the nation would need to prove together. As Lincoln put it, the civil war was ‘testing whether that nation, or any nation, so conceived and so dedicated, can long endure’.
Many of the key decision-making processes in life cannot be neatly solved like a mathematical challenge, with a right answer that can be indisputably proven. I’ve previously written about how even some coding tasks can be difficult to evaluate. Ensuring that code is ‘hallucination-free’ isn’t just a matter of checking it runs or produces the expected output for a given input when tested. If we’re developing a model or an analysis, it’s also about deciding what we are happy with:
Building a model isn’t just about having something that can output predictions in a consistent way. It’s about ensuring assumptions are defensible and design decisions are appropriate.
Back in 2022, AI researcher Paul Christiano pointed out that even if humans can build AI that does exactly what its designers want, the next challenge would be agreeing on what is the ‘correct’ thing to want:
‘If we succeed at the technical problem of AI alignment, AI developers would have the ability to decide whether their systems generate sexual content or opine on current political events, and different developers can make different choices. Customers would be free to use whatever AI they want, and regulators and legislators would make decisions about how to restrict AI.’
History has shown that it is not enough to simply define something, or prove something, or know something. People have to choose what they want to know, and what they want to do with that knowledge. Mathematics can be a valuable tool to make life’s journey faster, but it can’t decide where we’re going.
Cover image: Bozhin Karaivanov


I’d also argue that intelligence isn’t so much about the answer as it is about the taste of questions. As you point out with modeling, choosing what we’re happy with requires evaluating the social reception of our work (e.g. will reviewers be okay with density-dependent transmission?). Asking questions is one step further removed, a subtle art of intuiting which answers might compel people, or sometimes which questions alone are enough to scramble others into a hunt for solutions.
I’d love to see AI make conjectures that inspire us. Even if AI doesn’t prove them, it would be quite intelligent if it could point us in a motivating direction for meaningful proofs.