Spec-Driven Development: The new frontier of AI agents

Photo by @codioful on Unsplash

Table of Contents

What we can understand by "Spec"
From the problem to the real world
My current usage
References

It may sound like something new, but it isn't. We've been creating specification documents since the early days of engineering — they used to be called "Requirements Documentation", a term you probably remember from a course on "Design Patterns" or something similar.

The thing is, everything is cyclical, and we're once again focused on something that was always true but, let's be honest, was always neglected:

Good documentation changes everything.

This becomes even more important today with the intense use of AI agents focused on code generation. Documents aren't just read by humans anymore — they're also read by these same agents, which will most of the time be the ones responsible for the actual implementations. This is where a relevant use case emerges for strategies centered around SDD, as we'll see ahead.

What we can understand by "Spec"

A spec (short for specification) is a document that captures specific details about business rules, technical decisions, user stories, test scenarios, and so on. It's not very different from what we had some time ago: it literally describes how the system or a given feature should look and behave.

Since SDD is a relatively new concept in the field of agentic workflows, there's some debate about the best moment to use it. Some say Greenfield projects are a better fit, while in Brownfield projects things tend to get a bit more chaotic.

Greenfield and Brownfield

These are terms used to define the scope and state of projects — the first referring to projects built from scratch, and the second to projects already underway.

This relates to another recurring debate on the subject: spec-first, spec-anchored, and spec-as-source. Most existing approaches focus on spec-first: documents are created first and then used by AI agents as the source for implementation.

Spec-anchored and Spec-as-source are evolutionary approaches to the process, in which documents are not just used as initial requirements or rule holders, but as key pieces driving the evolution of workflows, alongside the evolution of the system itself.

Documentation is a living thing. It represents the current state of applications and needs to evolve together with the system, since both humans and AI agents will use it as the source of truth for implementations and validations.

Currently, one of the biggest challenges around using AI coding assistants is precisely generating code that makes sense given the business rules, constraints, and technical decisions involved. Another relevant aspect is token cost — tokens are expensive, and assistants often get lost doing pointless things, burning through a considerable amount of money along the way.

SDD strategies help create focus on what actually needs to be done, and especially on what should not be done. This can yield good results at a lower cost by reducing unnecessary LLM interactions.

Let's look at some examples of SDD strategies in use today:

Kiro: Has a workflow divided into Requirements, Design, and Tasks — each feature to be implemented will have these three files defining what and how it will be done.
Spec-Kit: GitHub's implementation, featuring a CLI that makes it easy to create spec files from templates.
Tessl: Every file or implementation in the codebase has a related specification file with details about what that feature is and how it should behave.
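
As a rough illustration of the three-file layout described for Kiro, the sketch below scaffolds one folder per feature. The `specs/<feature>/` directory, the file names, and the placeholder headings are assumptions made for this example, not the tool's actual output.

```python
from pathlib import Path

# The three documents named above; file names and templates are assumed for illustration.
SPEC_FILES = {
    "requirements.md": "# Requirements\n\nWhat the feature must do.\n",
    "design.md": "# Design\n\nHow the feature will be built.\n",
    "tasks.md": "# Tasks\n\nOrdered implementation steps.\n",
}


def scaffold_feature(root: Path, feature: str) -> list[Path]:
    """Create one folder per feature holding the three spec files."""
    feature_dir = root / "specs" / feature
    feature_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name, template in SPEC_FILES.items():
        path = feature_dir / name
        if not path.exists():  # never overwrite a spec someone already edited
            path.write_text(template, encoding="utf-8")
        created.append(path)
    return created
```

Calling `scaffold_feature(Path("."), "login-form")` would leave `specs/login-form/` with the three files, ready to be filled in by a human or an agent.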

The approaches above aren't the only ones out there, but they're good options for starting to experiment with SDD in your projects. Let's take a look at how one of these spec files is actually structured.

# Feature Specification: Login form

**Feature Branch**: `feat/login-form`
**Created**: 2026-05-09
**Status**: Draft
**Input**: Create a specification for a login screen...

### User Story 1 - Successful Login (Priority: P1)

As a registered user, I want to enter my email and password into the login form so that I can access my account.

**Why this priority**: Core functionality; without successful login, users cannot use the application.

**Independent Test**: Can be fully tested by entering valid credentials and clicking "Login". Success is confirmed when the user is redirected or granted access.

**Acceptance Scenarios**:

1. **Given** the login page is loaded, **When** I enter a valid email and matching password and click "Login", **Then** I should be redirected to the application dashboard.
2. **Given** I am on the login form, **When** I click the password field, **Then** the characters I type should be masked for security.

...

## Requirements *(mandatory)*

### Functional Requirements
- **FR-001**: System MUST provide a clearly labeled text input field for the user's email address.

...

Notice that the file above describes business rules and validation scenarios in detail — there are no technical details or code suggestions yet. This is important for delimiting the scope of implementation for AI agents.
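
Because the acceptance scenarios follow a fixed Given/When/Then shape, they are easy to pull out programmatically, for instance to build a review checklist. The following is a minimal sketch, assuming the bold-marker format shown in the file above; `SCENARIO` and `parse_scenarios` are names made up for this example and are not part of Spec-Kit.

```python
import re

# Matches lines such as: 1. **Given** ..., **When** ..., **Then** ...
SCENARIO = re.compile(
    r"\*\*Given\*\*\s*(?P<given>.+?),\s*"
    r"\*\*When\*\*\s*(?P<when>.+?),\s*"
    r"\*\*Then\*\*\s*(?P<then>.+)"
)


def parse_scenarios(spec_text: str) -> list[dict[str, str]]:
    """Return one dict per acceptance scenario found in the spec text."""
    scenarios = []
    for line in spec_text.splitlines():
        match = SCENARIO.search(line)
        if match:
            scenarios.append({k: v.strip() for k, v in match.groupdict().items()})
    return scenarios


SPEC = """
1. **Given** the login page is loaded, **When** I enter a valid email and matching password and click "Login", **Then** I should be redirected to the application dashboard.
2. **Given** I am on the login form, **When** I click the password field, **Then** the characters I type should be masked for security.
"""

for scenario in parse_scenarios(SPEC):
    print(scenario["given"], "->", scenario["then"])
```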

This file was generated using the CLI provided by Spec-Kit itself. Installation is straightforward and will let you create spec files very quickly. As we can see in the image below, after starting a new project, you create a spec with the /specify command:

Image representing a CLI window with an example of Spec-Kit

From the problem to the real world

Despite being a solid strategy, there are some counterpoints worth mentioning about using SDD — after all, nothing is a silver bullet.

The first issue is that we can fall into overengineering — or, put more bluntly, "using a bazooka to kill an ant" (my apologies to the ants 🐜). The idea of spinning up a spec file for everything and letting Claude sort it out sounds amazing, but at what point are we wasting precious (and extremely expensive) resources just to change a button label on a frontend screen?

Creating a detailed specification document takes time. As I mentioned earlier, documentation needs to evolve alongside the system — so if we end up with countless spec files for every little thing, we'll spend a huge amount of time keeping them updated, since they need to reflect the current state of the system.

So you might be asking yourself: "Why not just have some AI agent generate those documents?" A very fair question — but here's another one to chew on:

To what extent do AI agents have complete mastery over your business domain?

If we're working with a greenfield approach, sure, SDD will be great — we can model the system according to what we're building and gradually guide AI agents on how we want the specs and implementations to look.

But in complex scenarios, like existing projects, things are different. If your system or product already has solid, reliable documentation, then yes, it's entirely possible to use an agent to create specs from what it already knows about the system and feed it that information. But honestly, we know that's rarely the case — very few products have complete documentation, and when they do, it's usually outdated. So I'll turn the question back to you:

In your context, would it be trivial to delegate all spec generation to an AI agent?

This is just one example — we could think of many other scenarios. I believe the ideal approach is finding the right balance between what is and isn't worth writing extensive specification documents for, with user stories, test scenarios, and so on. Balance is key.

My current usage

To wrap up, I'd like to share how I've been using SDD in my day-to-day work. I'll be using an application created specifically for this article as an example — you'll find the repository link at the end.

I'm following a simplified strategy for creating spec files, using a blend of all three concepts already mentioned here: spec-first, spec-anchored, and spec-as-source.

We kick off the process by generating an initial file describing what's in scope for the version and what's out, along with validation scenarios, requirements, and so on. Here's how the file is structured:

# Read-It-Later — Specification

> Spec-Driven Development document. Requirement keywords (**MUST**, **MUST NOT**, **SHOULD**, **SHOULD NOT**, **MAY**, **REQUIRED**, **RECOMMENDED**, **OPTIONAL**) are used as defined in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

---

## 1. Purpose

The system **MUST** provide a minimal personal "read-it-later" application that allows a single user to:

1. Submit a website URL through a web form.
2. Persist the URL together with metadata automatically extracted from the target page.
3. List every previously stored URL on a dedicated page.

The system **MUST NOT** implement authentication, multi-user support, or content synchronization in this iteration. Those concerns are explicitly out of scope (see §13).

...

The key factor here is the use of requirement keywords in the descriptions:

Must: Mandatory requirements — typically those defined for V1.
Should: Important requirements that can only be changed with a good reason (the agent needs to justify it with solid arguments).
May: Optional requirements or instructions.

These and other terms are defined in RFC 2119 and serve to establish the level of importance of each requirement, giving us a consistent way to communicate how critical a given requirement is.
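
One way to get a quick feel for how a spec distributes its requirement levels is to count the keywords. This is only a sketch: it assumes keywords are written in bold uppercase, as in the excerpt above, and `keyword_counts` is a hypothetical helper, not part of any SDD tool.

```python
import re
from collections import Counter

# Only bolded keywords are counted, matching the convention used in the spec excerpt.
KEYWORDS = ["MUST NOT", "SHOULD NOT", "MUST", "SHOULD", "MAY"]
PATTERN = re.compile(r"\*\*(" + "|".join(KEYWORDS) + r")\*\*")


def keyword_counts(spec_text: str) -> Counter:
    """Count bolded RFC 2119 keywords such as **MUST** in a spec."""
    return Counter(PATTERN.findall(spec_text))


sample = (
    "The system **MUST** provide a read-it-later application. "
    "The system **MUST NOT** implement authentication. "
    "The allowed origin **SHOULD** be configurable."
)
print(keyword_counts(sample))
```

A spec where almost everything is a MUST leaves the agent little room to prioritize, so a lopsided count can be a hint that the requirement levels deserve another look.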

In my tests I noticed that agents don't always follow the rules described in the specifications. Even though agents today have larger context windows, they still get lost when files are long and packed with rules. For this reason, we need to stay alert and always revisit the strategies we're using.

Something that has helped me throughout this process is asking the agent to act as a reviewer — to analyze and make suggestions on the spec.md file. Obviously it won't have the full context around that implementation, but the result can be useful, as it was in this case:

- The backend **MUST** enable CORS for the frontend origin (default `http://localhost:8501`). The allowed origin **SHOULD** be configurable via environment variable `FRONTEND_ORIGIN`

+ The frontend **MUST** call the backend from server-side Python (e.g. `requests`/`httpx`) running inside the Streamlit process; the browser never calls the backend directly. The backend therefore **MUST NOT** be required to expose CORS headers, and CORS middleware **SHOULD NOT** be installed unless a browser-side client is introduced in a later iteration.
...

The previous instruction stated that the backend would need to enable CORS, the browser-enforced mechanism that controls which origins may call an API from a web page. After the review, the agent pointed out that this wasn't necessary, since one of the main requirements is to delegate API calls to the server: the frontend code running in the browser doesn't need to, and shouldn't, call the API directly. CORS only applies to requests made by a browser, so the server-to-server calls coming from the Streamlit process are not subject to it.
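
The server-side call pattern from the revised requirement can be sketched as follows. To stay self-contained, this example uses the standard library's `urllib` rather than `requests` or `httpx`; the `BACKEND_URL` environment variable, the port, and the `/urls` endpoint are assumptions for illustration and may not match the actual application.

```python
import json
import os
import urllib.request

# Hypothetical setting and endpoint, chosen only for this example.
BACKEND_URL = os.environ.get("BACKEND_URL", "http://localhost:8000")


def build_list_request(base_url: str = BACKEND_URL) -> urllib.request.Request:
    """Build the GET request that the server-side frontend code would send."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/urls",
        headers={"Accept": "application/json"},
        method="GET",
    )


def fetch_saved_urls(base_url: str = BACKEND_URL, timeout: float = 5.0) -> list:
    """Call the backend from Python; the browser is never involved."""
    request = build_list_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Since the request originates in the Python process and not in a web page, no `Origin` check happens and the backend needs no CORS middleware for it.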

This simple scenario illustrates that even when a generated spec seems good enough, there can still be failure points. The ideal approach is always to have quality gates in place for thorough validation.

Quality Gate

A term used to describe quality standards you aim to meet for a given objective. In our case, we need a few validation steps before moving on to implementations.
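
As a loose sketch of what such validation steps could look like, the function below runs a few automated checks over a spec before it is handed to an agent. The three checks are illustrative assumptions, not a standard; a real gate would likely combine checks like these with human review.

```python
import re


def run_quality_gate(spec_text: str) -> list[str]:
    """Return a list of problems; an empty list means the gate passed."""
    problems = []
    # Gate 1: the spec states at least one mandatory requirement.
    if "**MUST**" not in spec_text:
        problems.append("no **MUST** requirement found")
    # Gate 2: the spec says what is out of scope.
    if "**MUST NOT**" not in spec_text and "out of scope" not in spec_text.lower():
        problems.append("no out-of-scope statement found")
    # Gate 3: no unresolved placeholders are left behind.
    if re.search(r"\b(TODO|TBD)\b", spec_text):
        problems.append("unresolved TODO/TBD placeholder")
    return problems
```

An implementation step would only start when `run_quality_gate(spec_text)` returns an empty list.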

I believe it's worth testing the concepts of SDD and drawing your own conclusions. We don't always know exactly what requirements are needed to implement a given feature, and even when we do, they'll very likely change as the system grows.

That's why I think the idea of documentation that evolves alongside the system is fundamental. It allows us to maintain quality and a better shared understanding of the rules — for both humans and machines.

Everything is still very new, and discussions are ongoing about the best ways to work with and get the most out of SDD. The best thing here is to study these new concepts, test them, and above all question whether or not they make sense for your use case.

I'll leave it here — thank you for your interest and for taking the time to read this. I'm linking the repository of the application I mentioned at the start of this section. Take a special look at the spec.md file, where I apply the concepts covered here. Happy studying, and thanks! ;)

References

GitHub - WillACosta/sdd_lab_app: A simple app to showcase Spec-Driven Development (SDD)