Sakana AI Unveils Sudoku Benchmark: A New Frontier for AI Reasoning
Table of Contents
- 1. Sakana AI Unveils Sudoku Benchmark: A New Frontier for AI Reasoning
- 2. Introduction to the Modern Sudoku Challenge
- 3. The Frontier of AI: Reasoning Capabilities
- 4. Looking to Japan for Inspiration
- 5. Cracking The Cryptic: A Partnership for Human Reasoning Data
- 6. Beautiful Hand-Made Sudokus From Nikoli
- 7. Bonus: The Sakana AI Sudoku
- 8. Acknowledgements
- 9. Sakana AI
- 10. What specific challenges and methodologies does Sakana AI employ to train their AI models to solve “Parity Fish” and similar complex Sudoku puzzles?
- 11. Sakana AI Unveils Sudoku Benchmark: Revolutionizing AI Reasoning with Human Insights
Challenging AI with handcrafted Sudoku puzzles and human reasoning data for
next-generation problem-solving.
Can you solve the Sakana AI
Sudoku Puzzle?
Sakana AI is pushing the boundaries of artificial intelligence with a new,
incredibly challenging reasoning benchmark based on Sudoku puzzles. This initiative aims to address the limitations of current AI models in
tackling complex problem-solving scenarios that require human-like
reasoning and creativity.
The three key components of this groundbreaking project include:
- A New Reasoning Benchmark: Introducing a challenging
Sudoku-based benchmark designed to test the limits of AI reasoning.
These puzzles are so intricate that even professional puzzle solvers
find them extremely tough. The technical details are available on
the GitHub Repo.
- Partnership with Cracking The Cryptic: Collaborating with
the popular YouTube channel Cracking The Cryptic to provide
thousands of hours of high-quality human puzzle-solving examples.
This data will be used to train AI reasoning models, mimicking
human thought processes.
- Hand-Made Sudokus from Nikoli: Featuring beautifully
handcrafted Sudokus from Nikoli, the renowned Japanese company that
gave Sudoku its name. These puzzles are designed to require more
varied and creative reasoning than computer-generated puzzles.
Introduction to the Modern Sudoku Challenge
Sudoku, a puzzle played
on a 9×9 grid, challenges players to fill in missing numbers so that
each row, column, and 3×3 box contains the numbers 1 through 9.
Popularized in Japan in the 1980s and later in the UK in the 2000s,
Sudoku has evolved into what are now called ‘Modern Sudokus.’ These
puzzles often include additional rules, adding layers of complexity that
demand more sophisticated problem-solving techniques.
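The classic rule described above can be written down in a few lines. The following is a minimal sketch in Python, not code from Sakana AI's repository; the function name `is_valid_solution` and the list-of-lists grid representation are assumptions made for illustration.

```python
def is_valid_solution(grid):
    """Return True if a completed 9x9 grid satisfies the classic Sudoku rules.

    Assumes `grid` is a list of 9 rows, each a list of 9 integers.
    """
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in range(0, 9, 3)
        for bc in range(0, 9, 3)
    ]
    # Every row, column, and 3x3 box must contain exactly the digits 1..9.
    return all(unit == digits for unit in rows + cols + boxes)


# Test data only: a valid grid generated from a simple shifting pattern.
example = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
print(is_valid_solution(example))  # True
```

Modern Sudokus keep this check and add further constraints on top of it, which is what makes them harder to reason about.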

Pierced Butterfly by
Awedish
While computers and AI have been able to solve standard Sudoku puzzles
for some time, they often rely on brute-force methods that don’t
replicate human reasoning. Modern Sudoku puzzles, with their unique
rules and intricate solution paths, present a greater challenge. The
goal is to develop AI that can approach these puzzles with the same
level of understanding and creativity as a human solver.
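To make "brute force" concrete, here is a minimal sketch of a backtracking search in Python, a common way for a program to solve standard Sudoku. It is an illustration only, not the method used by any model in the benchmark; the names `find_empty`, `can_place`, and `solve`, and the use of 0 for an empty cell, are assumptions.

```python
def find_empty(grid):
    """Return (row, col) of the first empty cell (marked 0), or None."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                return r, c
    return None


def can_place(grid, r, c, d):
    """Check that digit d does not already appear in the row, column, or box."""
    if d in grid[r]:
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[i][j] != d for i in range(br, br + 3) for j in range(bc, bc + 3))


def solve(grid):
    """Fill empty cells in place by trial and error; return True on success."""
    cell = find_empty(grid)
    if cell is None:
        return True  # no empty cells left: solved
    r, c = cell
    for d in range(1, 10):
        if can_place(grid, r, c, d):
            grid[r][c] = d
            if solve(grid):
                return True
            grid[r][c] = 0  # undo the guess and try the next digit
    return False


# Toy example: take a valid grid and blank its main diagonal.
full = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [row[:] for row in full]
for i in range(9):
    puzzle[i][i] = 0
solved = solve(puzzle)
```

The search tries digits and backtracks on contradiction. It never forms the kind of insight a human solver would describe, which is the gap the benchmark is aimed at.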
The Frontier of AI: Reasoning Capabilities
Despite advances in AI, developing robust reasoning capabilities remains
a critically important challenge. Models like OpenAI’s ChatGPT and DeepSeek’s R1,
while remarkable, still struggle with tasks that require sustained,
accurate reasoning or a high degree of creativity.
As AI models advance, evaluation methods must also evolve. Conventional
tests and math competitions are increasingly mastered by modern
reasoning models, necessitating more sophisticated evaluation approaches.
Sakana AI believes that modern Sudokus are perfectly suited for this
purpose.

Llion Jones’ talk on “The Next Reasoning Benchmark” at GTC 2025
At the NVIDIA GTC 2025 event, Jensen Huang highlighted the potential of
puzzles like Sudoku for training AI to reason.

Looking to Japan for Inspiration
Sakana AI, based in Japan, often draws inspiration from Japanese culture.
In this case, they are leveraging the popularity of Sudoku in Japan to
address one of the most pressing issues in AI research: enabling AI
models to reason about difficult problems effectively.
Sudoku, popularized in Japan in the 1980s by Nikoli, provides a
treasure trove of explicit reasoning data.
Cracking The Cryptic: A Partnership for Human Reasoning Data
A critical component of Sakana AI’s benchmark is the partnership with
Cracking The Cryptic, a popular YouTube channel known for its engaging
and insightful Sudoku solving videos. This collaboration provides access
to a vast dataset of human reasoning processes, which is invaluable for
training AI models.
The partnership yields a wealth of data, including:
- Over 2,500 videos of puzzle-solving sessions.
- Over 2,000 hours of high-quality reasoning traces transcribed into
text, totaling ~10 million words.
- Approximately 2 million actions extracted from the solving videos.
Sakana AI is also releasing tools to collect, clean, and preprocess data
for training AI models. See the GitHub Repo for more data.
Cracking The Cryptic’s YouTube channel can be found
here.
Beautiful Hand-Made Sudokus From Nikoli

Nikoli, the famous Japanese puzzle company that gave Sudokus their name,
has provided 100 hand-made Sudokus for the benchmark.
Hand-made puzzles are more interesting and require more varied kinds of
reasoning to solve.
The hand-made Sudokus by Nikoli have a ‘beautiful idea’ that the AI will
need to find to solve the puzzle without brute force. The elegant
insights required to efficiently solve hand-crafted puzzles remain
beyond the capabilities of current AI systems, underscoring the value of
the Nikoli-sourced puzzle collection.

A Beautiful Nikoli Sudoku
Bonus: The Sakana AI Sudoku
Sakana AI commissioned a custom Sudoku by Marty Sears, a puzzle setter
whose puzzles often appear on Cracking The Cryptic. This puzzle, called
‘Parity Fish,’ requires that any two cells adjacent along the red
Sakana AI logo line contain one even and one odd digit.
Try solving it here.
Simon solves it here.

The Sakana AI Sudoku (Parity Fish by Marty Sears). Normal Sudoku rules
apply: Fill the grid with the digits 1-9 so that digits don’t repeat in
any row, column, or marked 3×3 box. Two cells adjacent along the lines
in the Sakana AI logo must contain one even digit and one odd digit.
Two cells connected by a white dot contain consecutive digits. Two
cells connected by a black dot contain digits where one is double the
other.
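The three extra rules of Parity Fish are all pairwise conditions on neighbouring cells, so they are easy to express as small predicates. The sketch below is in Python and is purely illustrative: the function names are hypothetical, and because the positions of the logo line and the dots are not given in the text, it checks only pairs of digits, plus a sequence of digits assumed to lie in order along one non-branching stretch of line.

```python
def parity_ok(a, b):
    """Logo-line rule: one digit is even and the other is odd."""
    return (a % 2) != (b % 2)


def white_dot_ok(a, b):
    """White-dot rule: the two digits are consecutive."""
    return abs(a - b) == 1


def black_dot_ok(a, b):
    """Black-dot rule: one digit is double the other."""
    return a == 2 * b or b == 2 * a


def line_ok(digits):
    """Check the parity rule for digits listed in order along a line segment."""
    return all(parity_ok(a, b) for a, b in zip(digits, digits[1:]))


# All unordered digit pairs from 1-9 that can sit on a black dot.
black_pairs = [(a, b) for a in range(1, 10) for b in range(a + 1, 10) if black_dot_ok(a, b)]
print(black_pairs)  # [(1, 2), (2, 4), (3, 6), (4, 8)]
```

Only four digit pairs can sit on a black dot, and digits along the line must alternate in parity. Restrictions like these are the starting points a human solver typically looks for.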
Acknowledgements
Sakana AI thanks all of the amazing setters that have created all of
these fantastic puzzles for all of us to enjoy. A particularly big
thanks to the setters of the puzzles that appeared in the GTC Talk and
this blog:
Sakana AI
Interested in joining us? Please see our career opportunities for more
information.

What specific challenges and methodologies does Sakana AI employ to train their AI models to solve “Parity Fish” and similar complex Sudoku puzzles?
Sakana AI Unveils Sudoku Benchmark: Revolutionizing AI Reasoning with Human Insights
Archyde News explores Sakana AI’s innovative approach to AI advancement using complex Sudoku puzzles.
Archyde News: Welcome, Dr. Anya Sharma, Lead AI Researcher at Sakana AI, to discuss this exciting new Sudoku benchmark. Can you tell us more about why Sakana AI chose Sudoku for this ambitious project?
Dr. Anya Sharma: Thank you for having me. We recognized that while AI has made extraordinary strides, especially in areas like image recognition, it still struggles with complex reasoning. Sudoku, with its inherent logic and a wide range of difficulty levels, offers an excellent framework for evaluating and advancing AI’s problem-solving capabilities. It’s about teaching AIs not just to find solutions but to understand the thought processes behind them.
Archyde News: That’s fascinating. The partnership with Cracking The Cryptic seems like a key component. What specific benefits does this collaboration bring?
Dr. Anya Sharma: Cracking The Cryptic provides an invaluable resource: a vast dataset of human reasoning processes. The channel’s videos and accompanying transcripts offer a detailed look into how expert solvers approach and solve incredibly complex Sudoku puzzles. This data, including over 2,500 videos and millions of transcribed words, allows us to train AI models in a way that mimics human-like thought processes, enhancing their ability to reason and strategize.
Archyde News: And incorporating hand-made Sudokus from Nikoli is another interesting element. Why are these puzzles so important?
Dr. Anya Sharma: Hand-made puzzles, especially those created by Nikoli, present a unique challenge. They often contain “beautiful ideas” and intricate patterns, and they require more advanced reasoning skills to solve. These aren’t puzzles that can be easily solved with brute-force computing power; they demand insight and strategic thinking. This is where current AI systems often fall short, making Nikoli’s puzzles a perfect testing ground for developing more sophisticated AI.
Archyde News: The Sakana AI Sudoku, “Parity Fish,” commissioned from Marty Sears, adds another layer of intrigue. What makes this particular puzzle a good representation of the benchmark?
Dr. Anya Sharma: “Parity Fish” embodies the complexity we’re aiming to capture. It integrates additional rules, requiring solvers to consider constraints and patterns beyond the basic Sudoku principles. This tests an AI’s ability to handle complex interdependencies and to solve problems adaptively. It’s a beautiful example of how elegant design can create incredibly challenging reasoning tasks.
Archyde News: Looking ahead, what are the long-term implications of this Sudoku-based benchmark for the wider field of AI?
Dr. Anya Sharma: We believe this benchmark is a significant step toward developing AI that can tackle real-world complexities, including more of the nuances of human reasoning. This extends beyond Sudoku, affecting fields that require human-level problem-solving. The insights and advancements from this project will be the foundations for applications in areas like scientific discovery, financial modeling, and even creative problem-solving.
Archyde News: A truly innovative approach! Do you have any final thoughts for our readers, and perhaps a challenge?
Dr. Anya Sharma: I encourage everyone to explore the “Sakana AI Sudoku” and the other puzzles mentioned. See if you can crack them! Perhaps you’ll come up with a new strategy or way of thinking that could inspire advancements in AI. We’d love to get people’s views on effective problem-solving methods. What are your thoughts on the “beautiful ideas” you’ve incorporated in your AI models?