Codex vs Gemini CLI: A Real-World Case of Token Efficiency

Photo by @willacosta on Unsplash

Table of Contents

TL;DR
Gemini Scenario
Codex Scenario
Results
References

TL;DR

What is TL;DR?

Short for Too Long, Didn't Read. An expression used to refer to a summary of the most important parts of a lengthy piece of text.

Problem: images are not rendering in a certain component after updating the project's libraries. Before the update everything worked perfectly, so the cause is most likely some breaking change.

Results:

Agent	Token Usage	Model	Session Duration
OpenAI Codex	~ 201K	GPT-5.3 Codex	~ 5 minutes
Gemini CLI	~ 1.46M	Gemini 3 Flash Preview	~ 16 minutes

Efficiency Analysis:

Codex was not only more efficient, consuming 86% fewer tokens, but also finished in about 68.8% less time.
Meanwhile, the Gemini session took considerably longer, consumed about 7.27× as many tokens as Codex, and in the end didn't actually solve the problem.

For this experiment I used the following models, both under paid plans for their respective platforms.

Google One AI Pro — Gemini 3 Flash Preview
OpenAI Go — GPT-5.3 Codex

The problem is a straightforward case, but one that let me test the efficiency, consumption, and "intelligence" of each agent in finding a solution.

Before we talk about the problem and how the agents performed, here is the prompt used for both:

Investigate the blogpost content at `.../ai-coding-assistant.mdx`, It isn't rendering the `ImageLayout` component properly. The images property is being received as an `undefined`, so there was an exception at the line 52 on `image-layout.tsx` file.

I've updated some app's packages, maybe it has something to do with the new "next-remote-mdx" version or other peer-dependencies.

This is the error output: """
src/components/layout/image-layout.tsx (52:15) @ map
x TypeError: Cannot read properties of undefined (reading 'map')
at ImageLayout (./src/components/layout/image-layout.tsx:40:26)
at stringify (<anonymous>)
digest: "1222779921"
"""

I had a problem in the ImageLayout component that started failing after updating some of the project's dependencies. The image paths were passed as a list using JSX syntax:

<ImageLayout
	images={[{ ... }, {...}]}
/>

As you can imagine, this component renders images in a dynamic grid. Images are provided through specific parameters such as path, quality, size, etc.

Gemini Scenario

I started with Gemini and noticed early on that its reasoning process wasn't great. The first thing it did was add a console.log to the function that renders the component:

export function ImageLayout({ images }: ImageLayoutProps) {
  console.log('ImageLayout received images:', images)

  return (
    <div className='flex flex-wrap gap-4 my-6 m-auto py-1 w-fit-content'>
      {images?.map((image, index) => { ... })}
    </div>
  )
}

After that, the agent got lost in a series of pointless steps, editing files completely unrelated to the original goal, such as:

lib/get-posts.ts
lib/get-posts-by-slug.ts

These files handle fetching blog posts and have no relationship whatsoever to the problem I asked Gemini to solve.

It also ran several build commands with little apparent purpose:

pnpm-check \
pnpm run build > build_output.txt 2>&1
...

After a good few minutes, here is the final code Gemini produced:

Screenshot Gemini CLI

The fix shown above treats the symptom rather than the cause. Gemini simply added a validation to check whether the value passed to the images prop was actually an array, which prevents the exception but leaves the images unrendered.

Besides editing files completely unrelated to the original goal, it never once attempted to find the actual root cause of the issue. Codex, in contrast, found the cause and resolved it much faster and more accurately.
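To make the limitation concrete, here is a minimal sketch of a guard in that spirit. It is not Gemini's actual code (that is only available in the screenshot); the `ImageItem` type and the `guardImages` name are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape of one image entry; the real props are not shown in the post.
type ImageItem = { src: string; alt?: string };

// A guard in the spirit of the fix described above: it stops the crash,
// but falls back to an empty list whenever the prop is not an array.
function guardImages(images: unknown): ImageItem[] {
  return Array.isArray(images) ? (images as ImageItem[]) : [];
}

// After the dependency update the component received no `images` prop at all.
const received: { images?: ImageItem[] } = {};
const toRender = guardImages(received.images);

console.log(toRender.length); // 0: no exception, but nothing to render either
```

The TypeError goes away, yet the grid stays empty, because the reason the prop arrives as undefined is never addressed.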

Codex Scenario

Codex started by exploring the dependencies used in the repository, which I already suspected to be the problem (and had even mentioned in the initial prompt). Here is the command the agent ran:

 pnpm list next-mdx-remote @mdx-js/mdx @mdx-js/react

After a few interactions, Codex used a very interesting approach to find the solution:

It created two versions of an MDX snippet simulating the same behavior as ImageLayout: one that passed the images as a JSX expression using {}, and another that passed them as a string attribute.

<ImageLayout images={[...]} /> → component received {} (prop missing)
<ImageLayout images="abc" /> → component received { images: "abc" }

By compiling both versions, the agent noticed that the prop passed as a JSX expression arrived as undefined, while the string version arrived intact. That pointed to passing the image list as a JSON string instead:

<ImageLayout images='[{ ... }, { ... }]' />

The agent then wrote code to parse the JSON string back into an array, so the existing rendering code could keep the same structure. All of this was done quickly and directly, without touching files outside this context. It even added a new function to normalize the images prop, a defensive strategy, I would say:

Screenshot Codex CLI
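Since the screenshot is not reproduced as text, here is a minimal sketch of what such a normalizer could look like. It is an assumption about the general approach, not Codex's actual output; the `ImageItem` type and the `normalizeImages` name are hypothetical.

```typescript
// Hypothetical image shape; the post only says entries carry path, quality, size, etc.
type ImageItem = { src: string; [key: string]: unknown };

// Accepts either a real array or a JSON string and always returns an array.
function normalizeImages(images: unknown): ImageItem[] {
  // Already an array (the old JSX-expression style): use it as is.
  if (Array.isArray(images)) return images as ImageItem[];

  // A string attribute (the new style): try to parse it as JSON.
  if (typeof images === "string") {
    try {
      const parsed: unknown = JSON.parse(images);
      return Array.isArray(parsed) ? (parsed as ImageItem[]) : [];
    } catch {
      return []; // malformed JSON: render nothing rather than throw
    }
  }

  return []; // undefined, null, or any other type
}

const fromString = normalizeImages('[{"src":"/a.png"},{"src":"/b.png"}]');
console.log(fromString.length); // 2
```

One practical detail: JSON requires double quotes around keys and string values, which is why wrapping the attribute in single quotes, as in the usage shown in this post, is convenient.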

And it also adjusted the way the component is used to reflect the recent changes:

-<ImageLayout images={[{ ... }, { ... }]} />
+<ImageLayout images='[{ ... }, { ... }]' />

Results

After testing both agents, I reached the following results:

Just a reminder that the same prompt and scenario were used for both agents in order to ensure a fair comparison.

Agent	Token Usage	Model	Effort	Session Duration
OpenAI Codex	~ 201K	GPT-5.3-codex	Medium	~ 5 minutes
Gemini CLI	~ 1.46M	gemini-3-flash-preview	Not Available	~ 16 minutes

Codex was not only more efficient, consuming 86% fewer tokens, but also finished in about 68.8% less time.
Meanwhile, the Gemini session took considerably longer, consumed about 7.27× as many tokens as Codex, and in the end didn't actually solve the problem.
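For anyone who wants to check the arithmetic, the comparison can be reproduced from the rounded figures in the table. This is only a sketch: the exact token counts are not given in the post, so the inputs below are the approximate values (~201K, ~1.46M, ~5 and ~16 minutes), and the ratio comes out slightly different from the reported 7.27×.

```typescript
// Rounded figures from the results table; exact counts are not available.
const codex = { tokens: 201000, minutes: 5 };
const gemini = { tokens: 1460000, minutes: 16 };

// How many times more tokens Gemini used than Codex.
const tokenRatio = gemini.tokens / codex.tokens; // about 7.26

// Fraction of tokens saved by Codex relative to Gemini.
const tokenSavings = 1 - codex.tokens / gemini.tokens; // about 0.862

// Fraction of session time saved by Codex relative to Gemini.
const timeSavings = 1 - codex.minutes / gemini.minutes; // 0.6875

// Equivalent speedup factor.
const speedup = gemini.minutes / codex.minutes; // 3.2

console.log(
  `${tokenRatio.toFixed(2)}x tokens, ` +
    `${(tokenSavings * 100).toFixed(1)}% fewer tokens, ` +
    `${(timeSavings * 100).toFixed(1)}% less time, ` +
    `${speedup.toFixed(1)}x speedup`
);
```

Note that a 68.8% reduction in session time corresponds to a speedup of about 3.2×, so the percentage and the multiplier describe the same gap in two different ways.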

This illustrates that using the right model for a given use case not only solves the problem efficiently, but also helps save time and tokens (which are expensive resources).

I'm not saying you shouldn't use Gemini, nor that you should use Codex for everything. Each model has its strengths and weaknesses for different tasks. You need to test them and draw your own conclusions based on your specific use case.
