It appears that even Google researchers use ChatGPT for their math homework, so lazy students everywhere may have finally found the opportunity they have been waiting for. And the truly amazing part? The artificial intelligence (AI) may have performed better than its developers.
What’s the breakthrough, exactly?
As Pushmeet Kohli, head of AI for science at Google DeepMind, told the Guardian: “When we started the project there was no indication that it would produce something genuinely new. To the best of our knowledge, this is the first time a large language model has produced a true, novel scientific finding.”
That’s correct: according to Google’s AI engineers, one of the sharpest minds in the infamously vexing field of combinatorics is now a chatbot. The project was originally intended only as a proof of concept, but the AI went on to solve open problems more effectively than any previous algorithm, thanks to a new system the team called FunSearch.
“FunSearch found new solutions for a longstanding open problem in mathematics: the cap set problem,” Alhussein Fawzi and Bernardino Romera Paredes wrote on the DeepMind blog.
“Finding the largest set of points (referred to as a cap set) in a high-dimensional grid where no three points lie on a line is the problem,” they said.
Here’s where an example might be helpful. In the game Set (no relation), twelve cards are dealt, each marked with a different combination of shape, color, shading, and quantity. Players must then find a “set” of three cards in which each of those four characteristics is either identical on all three cards or different on all three. For instance, a card with one red solid diamond, a card with two blue striped diamonds, and a card with three green empty diamonds form a set: all three show diamonds, while their colors, shadings, and quantities all differ.
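That “all the same or all different” rule is easy to check mechanically, which is worth a quick sketch. The encoding of cards as tuples below is my own illustration, not anything from the article:

```python
def is_set(card_a, card_b, card_c):
    """Three cards form a set if every attribute is either identical
    on all three cards or different on all three."""
    return all(
        len({a, b, c}) in (1, 3)  # 1 distinct value = all same; 3 = all different
        for a, b, c in zip(card_a, card_b, card_c)
    )

# Cards encoded as (quantity, color, shading, shape) tuples.
print(is_set((1, "red", "solid", "diamond"),
             (2, "blue", "striped", "diamond"),
             (3, "green", "empty", "diamond")))  # True
```

Swap one card for, say, a second red card and the color attribute has two distinct values, so the check fails.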
If nobody can find a set among the 12 cards on the table, more cards are dealt until one turns up. Because mathematicians are cunning jerks, someone decided to work out the maximum size of a cap set in ℤ₃⁴ or, to put it less mathematically, how many cards can be dealt before a set becomes unavoidable.
While that specific problem was solved in 1971 (the answer is 20, incidentally), larger versions are far more challenging. As is sadly typical in combinatorics, the number of possibilities explodes incredibly quickly: with eight features instead of four, you’re looking at a search space of roughly 3^1600 possible collections of “cards.”
Naturally, people haven’t figured that one out yet, which raises the question of how you would even try. That question is not rhetorical: mathematicians disagree on how to approach the cap set problem for n = 8, let alone what the actual solution is.
That’s why it’s so amazing that Google’s AI appears to have cracked it, discovering a previously unknown cap set of size 512 in the eight-dimensional version of the problem.
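One reason this kind of problem suits an AI search is that candidate cap sets are cheap to verify: finding a large one is hard, but checking one only means testing every triple of points. As a sketch (the tuple encoding is my own, not DeepMind’s), three distinct points a, b, c in ℤ₃ⁿ lie on a common line exactly when a + b + c ≡ 0 (mod 3) in every coordinate:

```python
from itertools import combinations

def is_cap_set(points, n):
    """Return True if no three distinct points in Z_3^n are collinear.
    Distinct points a, b, c in Z_3^n share a line exactly when
    a + b + c == 0 (mod 3) in every coordinate."""
    for a, b, c in combinations(points, 3):
        if all((a[i] + b[i] + c[i]) % 3 == 0 for i in range(n)):
            return False
    return True

# A maximum cap set in the tiny two-dimensional case (a 3x3 grid) has 4 points:
cap = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(is_cap_set(cap, 2))             # True
print(is_cap_set(cap + [(2, 2)], 2))  # False: (0,0), (1,1), (2,2) are collinear
```

This brute-force check over all triples is fine for small examples; it is the ease of scoring, not the ease of searching, that matters here.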
Kohli told Nature, “This is the first time that someone has demonstrated that an LLM-based system can go beyond what was known by mathematicians and computer scientists.” “It’s not just new; it works better than anything on the market right now.”
How to instruct your chatbot
If it holds up, it’s major news. Large language models, or LLMs, are the neural networks behind all those chatbots that have become so frighteningly popular recently. For all the talk of LLMs replacing humans in the creative industries and eliminating the need for human-made art, music, or any of the other wonderful things that define our species, the reality is that they are nowhere near as sophisticated as an Ex Machina or an I, Robot. Instead, they work by essentially scraping enormous volumes of human-created text and data and repackaging it in an uncannily realistic way.
That’s a serious issue overall, and not just because the bots are taking advantage of all the genuine artists. The LLMs driving these chatbots aren’t concerned with truth or falsity, only with identifying patterns in speech and text, which means the responses they give frequently seem reasonable at first glance but are completely wrong.
So how did the DeepMind scientists steer clear of this issue in their mathematical endeavors? Well, they didn’t, exactly. Instead, they paired Codey, Google’s LLM-based code-generation model, with a second algorithm that checks and scores whatever Codey comes up with. The combination is called FunSearch, a name that refers not to the project being fun but to its searching the space of functions.
The procedure went like this: the team would write a skeleton of code for solving the mathematical problem, but leave out the lines containing the actual core logic. Codey would then step in and suggest completions for those lines. After that, Codey’s work would be marked by the second algorithm, and the best attempts sent back for another round.
“Many will be nonsensical, some will be sensible, and a few will be truly inspired,” Kohli told MIT Technology Review. “You take the truly inspired ones and you say, ‘Okay, take these and repeat.’”
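That generate-mark-repeat loop can be caricatured in a few lines. Everything here is a stand-in: propose plays the role of Codey and merely perturbs a number at random, while evaluate plays the role of DeepMind’s automatic marker using a toy objective. Nothing below is the actual FunSearch code; it only illustrates the shape of the loop.

```python
import random

def evaluate(candidate):
    # Stand-in for FunSearch's automatic marker: score a candidate.
    # This toy objective peaks at candidate == 42; the real system
    # runs the generated program on the problem and scores the result.
    return -abs(candidate - 42)

def propose(parent):
    # Stand-in for the LLM (Codey): suggest a variant of the best
    # candidate so far. Here it just perturbs a number at random.
    return parent + random.randint(-5, 5)

random.seed(0)  # deterministic run for the sake of the example
best, best_score = 0, evaluate(0)
for _ in range(1000):
    candidate = propose(best)    # the model suggests a completion
    score = evaluate(candidate)  # the checker marks its work
    if score > best_score:       # keep the "truly inspired" ones and repeat
        best, best_score = candidate, score
print(best)
```

In the real system the candidates are programs rather than numbers, and a population of promising ones is kept and evolved, but the select-and-iterate principle is the same.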
Not content to just defeat its human masters in one venerable mathematical riddle, FunSearch proceeded to tackle another, known as the “bin packing problem.”
Fawzi and Paredes stated, “We decided to explore the flexibility of FunSearch by applying it to an important practical challenge in computer science, encouraged by our success with the theoretical cap set problem.” “A lot of real-world issues, like filling containers with goods or allocating compute tasks in data centers to reduce costs, have their roots in the ‘bin packing’ problem.”
The “bin packing problem” asks how to pack items into bins or containers in a way that minimizes the number of bins required. Despite its seeming simplicity, the problem is computationally formidable; for technical enthusiasts, the optimization version is NP-hard, and even its yes/no decision version is NP-complete.
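For context, the established heuristics FunSearch was competing against include simple rules like first-fit, which places each item into the first open bin with enough room and opens a new bin when none fits. The sketch below is that textbook baseline, not DeepMind’s evolved program:

```python
def first_fit(items, capacity):
    """First-fit heuristic: put each item in the first open bin that
    still has room; open a new bin when none does."""
    bins = []  # remaining capacity of each open bin
    for item in items:
        for i, remaining in enumerate(bins):
            if item <= remaining:
                bins[i] -= item
                break
        else:  # no break occurred: nothing fit, so open a new bin
            bins.append(capacity - item)
    return len(bins)

print(first_fit([4, 8, 1, 4, 2, 1], capacity=10))  # 2 bins: {4, 1, 4, 1} and {8, 2}
```

FunSearch’s contribution was to evolve problem-specific rules of this general shape that beat such fixed heuristics on particular data distributions.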
However, Fawzi and Paredes noted that “setting up FunSearch for this problem was easy, despite being very different from the cap set problem.” “FunSearch provided an automatically customized program that outperformed established heuristics, utilizing fewer bins to pack the same number of items—adapting to the details of the data.”
The LLMs’ limitations
Even though DeepMind’s discoveries have enormous implications, working mathematicians shouldn’t be concerned about their job security just yet. For the time being, FunSearch is restricted to problems that meet specific requirements: candidate solutions must be easy to evaluate and score automatically, and the problem must fit the same “fill in the missing code” strategy the team employed for the cap set and bin packing problems. Tasks like generating proofs, the researchers point out, are still far too difficult for the AI, since there is no sensible way for a computer to grade them.
It’s a brave new world out there, though, and you never know what enduring mystery will be solved next.
Jordan Ellenberg, a mathematics professor at the University of Wisconsin-Madison and co-author of the paper, told the Guardian, “What I find exciting, even more so than the specific results we found, is the prospects it suggests for the future of human-machine interaction in math.”
FunSearch, he explained, produces a program that finds the solution rather than the solution itself. A solution to one particular problem might offer no help with other, related problems. But a program that solves the problem is something a person can read and understand, potentially sparking ideas for the next problem, and the next, and the next.