Creators' AI

Gemini 3: What You Should Know

Review & a LOT of stunning cases

Creators AI
Nov 20, 2025

Hello there!

Lately, while scrolling through X and Reddit, I’ve seen a lot of mixed takes on ChatGPT 5.1, which dropped last week. And…I can’t say the same about Gemini 3 Pro. People say it looks solid in math, code, and understanding. So here is where I’m going with this.

See for yourself what Gemini 3 Pro Thinking and ChatGPT-5.1 Thinking (on the right) did with the prompt about a rotating hexagon with a ball. A user asked both models to “Create an HTML, CSS, and JavaScript where a ball is inside a rotating hexagon. The ball is affected by Earth’s gravity and the friction from the hexagon walls. The bouncing must appear realistic.”

Source: @alex_prompter. Looks like ChatGPT struggled a little here 🤭

Okay, in this piece, we:

  1. Discover what is new in Gemini 3

  2. Break down Gemini Agent and Antigravity

  3. See what people have already tested and the prompts you can run

Before we jump in, I want to give you a quick preview of what this thing can actually pull off (just in case the first example didn’t impress you enough).

A quick experiment by @measure_plan in Google AI Studio using Gemini 3, Three.js, and MediaPipe grew into a Jarvis HUD Interface for Tony Stark.
Gemini 3 turned a single prompt into a smartphone OS without any setup!

What Is Gemini 3

The Gemini 3 family was unveiled on Tuesday, including its “most intelligent model,” Gemini 3 Pro (though, honestly, I’ve seen people claim it’s the best among all models). Long before its official release, the model was surrounded by rumors, and it turned out to be more than just smoke and mirrors.

Btw, this is the first time that a new version of Gemini has been launched simultaneously in Google Search and in the app.

Generally speaking, Gemini 3 is much better at figuring out the context and intent behind your request (less prompting needed). It can work through vast datasets and challenging problems drawn from different information sources, including text, audio, images, video, and entire code repositories. I know, it doesn’t sound so impressive so far, but it actually does break the mold with its two versions:

  • Gemini 3 Pro (the main one, which is available today across Google products)

  • And Gemini 3 Deep Think (enhanced reasoning mode)

Gemini 3 Pro With the Architecture of Experts

Gemini 3 Pro works like a mixture-of-experts (MoE) model, so instead of firing up the whole network every time, it picks a small group of experts for each token. It routes each token to whichever experts are best for the job, which cuts down the compute bill. On top of that, it has a 1 million-token context window!

Put more plainly, dense architectures turn on the entire model for every token. It works, but every token pays for the full network, so it costs more.
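To make the dense vs. MoE difference concrete, here is a minimal sketch of top-k routing in plain Python. Everything in it is a toy assumption for illustration: the eight "experts" are just scalar functions, the router scores are hand-picked, and `route_token`, `moe_forward`, and `dense_forward` are hypothetical names. Google has not published Gemini 3 Pro's router, so this shows the general idea, not its implementation.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k=2):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_forward(token, experts, router_scores, top_k=2):
    """Sparse layer: only the chosen experts are evaluated."""
    weights = route_token(router_scores, top_k)
    output = sum(w * experts[i](token) for i, w in weights.items())
    return output, sorted(weights)


def dense_forward(token, experts):
    """Dense layer: every expert runs for every token."""
    output = sum(f(token) for f in experts) / len(experts)
    return output, list(range(len(experts)))


# Toy setup (assumption): expert k multiplies its input by k + 1.
experts = [lambda x, scale=k + 1: x * scale for k in range(8)]
router_scores = [0.1, 2.0, -1.0, 0.3, 1.5, -0.5, 0.0, 0.2]

moe_out, moe_used = moe_forward(1.0, experts, router_scores)
_, dense_used = dense_forward(1.0, experts)
print(f"MoE ran experts {moe_used} ({len(moe_used)} of {len(experts)})")
print(f"Dense ran {len(dense_used)} of {len(experts)}")
```

With `top_k=2`, only two of the eight toy experts do any work per token, which is where the compute savings come from.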

MoE is obviously not new, but Gemini 3 Pro takes it further with better training on multimodal stuff like text, images, and audio.

Deeper in training

Yeah, Gemini 3 Pro trained on multimodal data from day one. It’s not just good with pictures and audio; it actually learned from them. And there were even videos in the training, which is pretty rare.

Overall, the training data was a mix of synthetic data, human-generated data, instruction-response pairs, and preference data. Everything got filtered hard before it hit the model. It also used data from people using Google’s services (there could be a joke about privacy, but I hope it was consensual).

Reasoning benchmarks

Gemini 3 Pro is crushing it on the numbers. It hit 1501 on LMArena, beating Gemini 2.5 Pro’s 1451 and topping the leaderboard. On Humanity’s Last Exam, one of the toughest AI tests, it scored 37.5 percent without tools, way ahead of GPT-5.1 and Grok 4. With search and code, it jumps to 45.8 percent.

On visual reasoning puzzles like ARC-AGI-2, it hit 31.1 percent, way up from 4.9 percent on Gemini 2.5 Pro and ahead of GPT-5.1 at 17.6 percent. For PhD-level scientific questions on GPQA Diamond, it scored 91.9 percent, slightly ahead of GPT-5.1 and Claude Sonnet.

And it seems we have an example. A user prompted, “Solve this logic puzzle: You have 3 boxes—one contains only apples, one only oranges, and one a mix of both. Each is incorrectly labeled. By picking one fruit from one box, determine how to correctly label all boxes.” And Gemini cracked it like a nut.

Source: @alex_prompter
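If you want to check the puzzle yourself, here is a small brute-force sketch in Python. It is the classic textbook solution (draw from the box labeled "mixed"), not a transcript of Gemini's answer, and the function names `possible_worlds` and `solve` are made up for this example.

```python
from itertools import permutations

LABELS = ("apples", "oranges", "mixed")


def possible_worlds():
    """All label -> true-contents assignments where every label is wrong."""
    return [
        dict(zip(LABELS, contents))
        for contents in permutations(LABELS)
        if all(label != actual for label, actual in zip(LABELS, contents))
    ]


def solve(drawn_fruit):
    """Relabel every box after drawing one fruit from the box labeled 'mixed'.

    That box is mislabeled, so it cannot be the mix: it holds only the
    kind of fruit we just drew.
    """
    matches = [w for w in possible_worlds() if w["mixed"] == drawn_fruit]
    assert len(matches) == 1  # one draw is enough to pin down the answer
    return matches[0]


for fruit in ("apples", "oranges"):
    print(f"drew from 'mixed' and got {fruit}: {solve(fruit)}")
```

There are only two ways to mislabel all three boxes, and a single draw from the "mixed" box tells them apart, which is why one fruit is enough.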

The Smart One

If Gemini 3 Pro is not enough for you, Deep Think is dropping soon for Google AI Ultra users. It’s specially tuned for tough tasks in math, coding, and scientific analysis.

Moreover, this mode boosts reasoning by thinking through multiple solution options using parallel chains of reasoning and self-checking. If your query is simple, the model thinks less and answers faster. If it’s complex, it takes more time to analyze. We’ve already observed the same feature (adaptive thinking) in ChatGPT 5.1, so it’s becoming a trend.
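Here is a rough sketch of what "parallel chains plus self-checking" can look like in Python. It is a loose illustration of the general idea (often called self-consistency), not how Deep Think is actually built. The solver is a hypothetical stand-in for a model: it multiplies two numbers and is deliberately wrong on some chains so the check has something to catch.

```python
from collections import Counter

# Assumption: a fixed error pattern stands in for the randomness of real chains.
CHAIN_ERRORS = [0, 1, 0, -1, 0, 0, 1, 0]


def reasoning_chain(a, b, chain_id):
    """Hypothetical stand-in for one reasoning chain: a * b, sometimes off by one."""
    return a * b + CHAIN_ERRORS[chain_id % len(CHAIN_ERRORS)]


def self_check(a, b, answer):
    """Cheap verification: dividing the answer by b must give back a."""
    return b != 0 and answer % b == 0 and answer // b == a


def deep_think(a, b, chains=8):
    """Run several chains, drop answers that fail the check, then majority-vote."""
    candidates = [reasoning_chain(a, b, i) for i in range(chains)]
    verified = [c for c in candidates if self_check(a, b, c)]
    pool = verified or candidates  # fall back to a plain vote if nothing verifies
    answer, votes = Counter(pool).most_common(1)[0]
    return answer, votes, len(candidates)


answer, votes, total = deep_think(12, 7)
print(f"answer={answer} backed by {votes} of {total} chains")
```

The `chains` parameter is the knob adaptive thinking would turn: fewer chains for a simple query, more for a hard one.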

Deep Think scores 41% on Humanity’s Last Exam (vs. 37.5% for the base model), 93.8% on GPQA Diamond (vs. 91.9%), and 45.1% on ARC-AGI-2 with code execution, an absolute record for this test.

Google Antigravity – a new platform for agent-based development

The interface is nothing new

It’s not a usual IDE, more like a squad of AI agents running across your editor, terminal, and browser at the same time. You throw one prompt at it, and it splits the task, writes code, tests it live, even drops walkthroughs and reports while you chill.

I’ve already seen headlines like “I’m deleting Cursor” and “it kills Cursor!” a few times now, and I don’t know about you, but I find it crazy to draw these conclusions one day after the release.

The cool part is that you can either stay in the normal Editor view, like in a regular IDE, or jump into Manager view and boss around a bunch of agents at once. Each one does its own thing, plans out tasks, and spits out “Artifacts”, so you actually see what’s going on.

Benchmarks say it’s way faster and cleaner than Cursor. Google Antigravity is also more affordable and integrates more smoothly with Google’s ecosystem, but it will take time before we really see which one is more convenient.

Gemini Agent

So the Agent plans out the tasks, digs through your Gmail, sorts your Calendar, and preps stuff like slides or summaries from a single prompt. It also checks with you before doing anything major, like making a purchase.

Rings a bell, doesn’t it? Of course, it reminds me of Atlas, another mini-assistant that is supposed to get stuff done instead of just spitting out answers. By the way, do you still use Atlas? Either way, we’re inching closer to a universal AI helper, but honestly, these assistants still feel a bit overhyped to me.

Still, Gemini 3 Pro is now integrated into many developer products and tools (another example of the company’s infrastructure building), and we need to see how it works.

First Use Cases of Gemini 3 and Prompts You Can Try

Now that we’ve seen the usual boring benchmarks where (surprise, surprise) the model comes out on top, let’s finally dig into what it can do. Even though it’s only been like two days, there are already plenty of interesting real-world cases to check out (even with guides).

Google AI Studio

Let’s cheat a bit and take the first one from Google’s presentation to mention vibe coding (I held back for as long as I could) and Google AI Studio, which I showcased at the beginning.

The devs say Google AI Studio becomes this instant idea-to-prototype machine. You jump into Build mode, toss it a prompt or an image, and you get a fully functional app.

If nothing comes to mind, just hit “I’m feeling lucky” and let Gemini 3 Pro come up with the concept and write the code at the same time.

You can snap a pic of a scrappy napkin sketch, drop it into Google AI Studio, and the model will spin that into a clickable web experience.

Training analysis

It can watch your full-length sports videos, pinpoint the exact mistakes, and hand you the drills to fix them. Reminds me of a ChatGPT-doctor, but in this case it’s a coach and an analyst at once.

Science, Games, and Prompts!
