Gemini AI Beats Pokémon Blue With a Strategic Assist

Google’s most advanced AI model, Gemini 2.5 Pro, has just pulled off a nostalgic win: finishing Pokémon Blue, the iconic 1996 Game Boy game.

The announcement came directly from Google CEO Sundar Pichai, who celebrated on X (formerly Twitter), saying: “What a finish! Gemini 2.5 Pro just completed Pokémon Blue!”

But here’s the twist—this wasn’t a corporate showcase. The livestream, dubbed Gemini Plays Pokémon, was created by Joel Z, a 30-year-old software engineer with no ties to Google. Still, the project has caught the attention of Google execs, who’ve been cheering it on. Logan Kilpatrick, Google AI Studio’s product lead, had previously noted Gemini’s progress, celebrating its fifth badge in-game and joking with Pichai about “Artificial Pokémon Intelligence.”

So why Pokémon?

Earlier this year, rival AI startup Anthropic revealed that its Claude models were making steady progress on Pokémon Red, the companion title to Blue. According to Anthropic, Claude’s “extended thinking” and agent training helped it handle the quirks of retro gameplay—a benchmark that’s as entertaining as it is technically demanding. There’s even a Claude Plays Pokémon Twitch channel, which Joel Z credits as his inspiration.

But while Gemini has now completed Pokémon Blue, Claude hasn’t yet crossed the finish line in Red. Does that make Gemini better? Not exactly.

Joel Z was quick to set expectations on his Twitch page. “Please don’t consider this a benchmark for how well an LLM can play Pokémon,” he wrote. “Gemini and Claude receive different inputs and rely on unique agent harnesses.”

And that’s key. Neither AI is playing the game on its own. Both rely on specialized frameworks that feed them annotated screenshots and let them issue commands through custom interfaces. In some cases, they can even summon other agents to help with specific tasks, like navigating a cave or managing a battle.
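To make the idea of an "agent harness" concrete, here is a minimal sketch of the loop described above: the harness hands the model an annotated frame, the model replies with a button press or a request to delegate, and the harness carries it out. Every name in it (Observation, Harness, the DELEGATE: reply convention, the stub model and helper) is a hypothetical stand-in for illustration. It is not the code behind Gemini Plays Pokémon or Claude Plays Pokémon, whose internals this article does not describe.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical button set; a real harness would map these to emulator inputs.
VALID_BUTTONS = {"A", "B", "UP", "DOWN", "LEFT", "RIGHT", "START", "SELECT"}


@dataclass
class Observation:
    screenshot: bytes        # raw frame captured from the emulator
    annotations: List[str]   # text notes the harness attaches to the frame


@dataclass
class Harness:
    # The model returns either a button name or "DELEGATE:<task>" (assumed convention).
    model: Callable[[Observation], str]
    # Helper agents keyed by task name; each returns a list of button presses.
    helpers: Dict[str, Callable[[Observation], List[str]]] = field(default_factory=dict)
    # Log of buttons sent to the game, standing in for a real emulator interface.
    pressed: List[str] = field(default_factory=list)

    def press(self, button: str) -> None:
        if button not in VALID_BUTTONS:
            raise ValueError(f"unknown button: {button!r}")
        self.pressed.append(button)

    def step(self, obs: Observation) -> None:
        reply = self.model(obs).strip()
        if reply.startswith("DELEGATE:"):
            task = reply[len("DELEGATE:"):]
            helper = self.helpers.get(task)
            if helper is None:
                raise KeyError(f"no helper agent registered for task {task!r}")
            for button in helper(obs):
                self.press(button)
        else:
            self.press(reply.upper())


def stub_model(obs: Observation) -> str:
    # Placeholder for an LLM call: delegate cave navigation, otherwise press A.
    return "DELEGATE:cave" if "dark cave" in obs.annotations else "A"


def stub_cave_helper(obs: Observation) -> List[str]:
    # Placeholder for a specialized navigation agent.
    return ["UP", "UP", "LEFT"]
```

In this sketch, calling step with an ordinary frame records a single button press, while a frame annotated "dark cave" routes through the helper instead. The point is only that the harness, not the model alone, decides what the model sees and how its replies become inputs, which is why results from two different harnesses are hard to compare.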

Joel Z admits that he occasionally stepped in to support Gemini. But he insists it wasn’t cheating—more like debugging. “I don’t give it step-by-step hints or walkthroughs,” he said. “The only thing close was flagging a glitch involving the Lift Key from a Team Rocket Grunt, which was fixed in later game versions.”

He also added that Gemini Plays Pokémon is still evolving. The current version of the system is actively being improved to better handle complex reasoning without direct human input.

While this might not be a perfect test of AI gaming skills, it’s still a milestone. It shows how far multimodal AI like Gemini can go when paired with tools that extend its capabilities. And maybe, just maybe, it’s a preview of where AI-powered agents are headed next.
