The Progression Nobody Finished

Three years. Four waypoints. Each one was declared the destination on arrival, and each one was quietly demoted the moment something newer showed up.

Prompting
Prompt engineering
Context engineering
Intent engineering

The progression is real. The story we tell about it is not.

The story we tell is that we have been getting better at talking to machines. That story is incomplete.

What each step actually solved

The cleanest framing I have seen comes from outside the series and is worth borrowing. Each waypoint answers a different question. Prompt: what to do. Context: what to know. Intent: what to want. Hold that frame in mind as we walk the four steps.

Prompting was the first cliff. You typed a sentence and the machine answered with a paragraph. The novelty was simply that it answered at all. GPT-3 made it real in 2020, and ChatGPT made it mainstream at the end of 2022. The catch was that the machine answered sometimes well and sometimes badly, and the difference between the two was a black box. Prompting solved the access problem, meaning how you got useful output from a probabilistic system in the first place, and it left the reliability problem wide open.

Prompt engineering was the response. Teams began treating the prompt as a designed artifact, using few-shot examples, role priming, and chain-of-thought scaffolding to coax better behavior out of the model. A small industry of tooling appeared almost immediately. Microsoft shipped PromptFlow, and the rest of the ecosystem shipped equivalents. The discipline was real and the gains were measurable.
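
To make "the prompt as a designed artifact" concrete, here is a minimal sketch in Python. The class name, its fields, and the billing example are illustrative assumptions, not any tool's API. It only shows role priming, few-shot examples, and chain-of-thought scaffolding living in one versioned object a team could review.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTemplate:
    """A prompt treated as a reviewable, versionable artifact (hypothetical)."""
    role: str                                                      # role priming
    examples: list[tuple[str, str]] = field(default_factory=list)  # few-shot pairs
    think_step_by_step: bool = True                                # chain-of-thought cue
    version: str = "0.1.0"

    def render(self, task: str) -> str:
        """Assemble the final prompt text for one task."""
        parts = [f"You are {self.role}."]
        for question, answer in self.examples:
            parts.append(f"Q: {question}\nA: {answer}")
        if self.think_step_by_step:
            parts.append("Think through the problem step by step before answering.")
        parts.append(f"Q: {task}\nA:")
        return "\n\n".join(parts)


template = PromptTemplate(
    role="a careful billing support assistant",
    examples=[("Is a refund possible after 30 days?", "No. Refunds close at 30 days.")],
)
prompt = template.render("Can I get a refund after 45 days?")
```

The template can be diffed, reviewed, and versioned like any other file, which is the whole point of the step.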

But prompt engineering solved the prompt. It did not solve the application. A well-crafted prompt does not know what your customer’s account balance is, what the retrieval index returned, what the previous twelve turns of the conversation said, or which tools the model can call. The bottleneck moved.

The bottleneck has been moving, and the language has never caught up.

Context engineering was the response to that. The work shifted from optimizing the sentence to optimizing what the model could see when the sentence arrived. Retrieval-augmented generation, tool registries, memory systems, and plugin architectures all became first-class concerns. Microsoft renamed the operating table the Kernel and built Semantic Kernel around it. LangChain did the same shape of work in a different idiom and arguably won the mindshare battle. By mid-2025 Andrej Karpathy was tweeting that the new skill in AI was not prompting but context engineering, and by July of that year Gartner was declaring that context engineering is in, prompt engineering is out. A matter of weeks from practitioner observation to industry-analyst proclamation. The discipline was real and the gains were real.

But context engineering solved the input. It did not solve the outcome. A well-furnished context does not, by itself, tell the model what good looks like. A model with perfect retrieval and full tool access can still produce something the human did not ask for and cannot use.
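
A rough sketch of what furnishing the context might look like, assuming a hypothetical ContextBundle with four slots: account facts, retrieved passages, conversation history, and available tools. The names, the section headers, and the twelve-turn cutoff are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ContextBundle:
    """Everything the model can see when the prompt arrives (hypothetical)."""
    account_facts: dict[str, str] = field(default_factory=dict)
    retrieved_passages: list[str] = field(default_factory=list)
    conversation_turns: list[str] = field(default_factory=list)
    available_tools: list[str] = field(default_factory=list)

    def render(self, max_turns: int = 12) -> str:
        """Lay the context out in labeled sections, keeping only recent turns."""
        sections = {
            "ACCOUNT": [f"{key}: {value}" for key, value in self.account_facts.items()],
            "RETRIEVED": self.retrieved_passages,
            "HISTORY": self.conversation_turns[-max_turns:],
            "TOOLS": self.available_tools,
        }
        blocks = [
            f"## {title}\n" + "\n".join(lines)
            for title, lines in sections.items()
            if lines  # skip empty sections entirely
        ]
        return "\n\n".join(blocks)


bundle = ContextBundle(
    account_facts={"balance": "42.10 USD"},
    retrieved_passages=["Refunds close at 30 days."],
    conversation_turns=[f"turn {i}" for i in range(1, 21)],
    available_tools=["lookup_order"],
)
context = bundle.render()
```

Notice what has no slot in the bundle: any statement of what a good answer looks like.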

Intent engineering is the current waypoint. The work shifted again, this time from optimizing the input to specifying the goal. The idea is to tell the agent what you want in terms precise enough that the agent can plan its way to the outcome rather than guess at it. GitHub’s Spec-Kit is positioned around exactly this shift, and the BMAD Method has emerged alongside it as a spec-driven framework with its own following. The rest of the field is converging on the same posture from different directions.

Intent engineering is the right diagnosis. Code is no longer the bottleneck. Specification is.

But here is the part nobody is saying.

The unfinished step

We have named the practice. We have not named the language.

Read any serious treatment of intent engineering and watch what happens. The author defines the problem clearly. The author argues persuasively that intent is now the load-bearing artifact. The author then shows examples written in English prose, in the body of a prompt.

That is the unfinished step.

Most of the current writing on intent engineering frames it at the altitude of governance, meaning the encoding of organizational objectives, cultural values, and policy constraints into agent architectures. That work matters and the writers doing it are not wrong. But the claim this series is making sits at a lower altitude and at finer granularity. It concerns behavioral intent, expressed precisely enough that an agent can execute it and a human can verify it, at the level of the unit of work. That is where the trust loop closes or fails to close. Governance sits above. The unit of work is where the code gets written. The claim is not smaller. It is more consequential, because nothing in the governance layer matters if the unit of work cannot be specified and verified.

Prompting graduated to prompt engineering when prompts became artifacts a team could review and version. Prompt engineering graduated to context engineering when context became infrastructure a platform could manage. Context engineering graduated to intent engineering when the goal became more important than the input.

Intent engineering has not yet graduated, because the artifact it produces, meaning the intent itself, is still being written in unstructured prose.

The discipline has a name. The medium does not.

The decades-long dance

This part is worth slowing down for, because it is the move every senior engineer in the audience has already half-thought.

Humans have been expressing intent to other intelligent executors for as long as software has been built. Product managers express intent to engineers. Engineers receive that intent, enrich it, and translate it into something the machine understands. We have been doing this dance for decades, with a stack of artifacts between the human and the machine. The user story. The acceptance criteria. The design doc. The API contract. The test. The code itself.

The artifacts are not all equal. Some are negotiable English. Some are executable. The act of engineering is the act of moving down that stack, from the squishy to the precise, and producing something a deterministic machine can run.

“Oh, you just got here? I been doin this shit for over ten years.”

— Puffy, I Get Money (Forbes 1, 2, 3 Remix)

AI-augmented engineering changes one variable in that equation. The executor on the other end is no longer deterministic. The agent is probabilistic. It will give you something different on Tuesday than it gave you on Monday, given the same input.

This is the fact every step in the progression has been quietly trying to solve.

That is what the four waypoints have in common. They are all attempts to wring determinism out of an inferential machine. Better prompts to make the output less variable. Better context to make the input more complete. Better intent to make the outcome more constrained.

Each step pushed determinism a little further into the system. None of them closed it.
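
One way to picture pushing determinism into the system: wrap the probabilistic call in a deterministic check and retry until the check passes. This is a sketch, not a real model client. The stub agent simply replays canned outputs so the example runs without a model, and every name in it is hypothetical.

```python
from typing import Callable, Iterator


def run_with_checks(
    agent: Callable[[str], str],
    validate: Callable[[str], bool],
    task: str,
    max_attempts: int = 3,
) -> tuple[str, int]:
    """Call a non-deterministic agent until its output passes a deterministic check."""
    for attempt in range(1, max_attempts + 1):
        output = agent(task)
        if validate(output):
            return output, attempt
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


def make_stub_agent(outputs: list[str]) -> Callable[[str], str]:
    """Stand-in for a model call: same input, a different output on each call."""
    stream: Iterator[str] = iter(outputs)
    return lambda task: next(stream)


def is_money(text: str) -> bool:
    """Deterministic check: digits with at most one decimal point, then ' USD'."""
    return text.endswith(" USD") and text[:-4].replace(".", "", 1).isdigit()


agent = make_stub_agent(["about forty", "42.1", "42.10 USD"])
result, attempts = run_with_checks(agent, is_money, "What is the account balance?")
```

The check narrows what gets through. It does not make the agent itself any less probabilistic.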

The newest move on the field, harness engineering, is a confirmation of this pattern rather than an exception to it. A harness is the system around an agent, including the tools, validators, retry logic, observability, and guardrails that constrain what the agent can do and verify what it produces. A good harness is also a specification-enrichment engine. It takes the user stories and acceptance criteria a human authored, surfaces the gaps, recommends the additions needed to close them, and routes those recommendations back to a human for accept, reject, or redirect. I know this because I am building harnesses, and the harnesses I build do exactly that. They take acceptance criteria from a PRD, harden them into Gherkin, generate feature files, generate BDD tests, and surface the gaps the original criteria did not anticipate.

The human originates intent. The harness pressure-tests it. The specification, hardened by both, is what closes the trust loop.
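
A heavily simplified sketch of that division of labor, assuming criteria arrive as structured given/when/then fields. The Criterion type, the gap rule (a missing clause), and the decide callback standing in for the human are all hypothetical. A real harness would look for far subtler gaps than this.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    REDIRECT = "redirect"


@dataclass
class Criterion:
    """One acceptance criterion as the human wrote it; clauses may be missing."""
    title: str
    given: str = ""
    when: str = ""
    then: str = ""


def to_gherkin(c: Criterion) -> str:
    """Harden a complete criterion into a Gherkin scenario."""
    return (
        f"Scenario: {c.title}\n"
        f"  Given {c.given}\n"
        f"  When {c.when}\n"
        f"  Then {c.then}"
    )


def surface_gaps(c: Criterion) -> list[str]:
    """Toy gap rule: flag any empty clause. Recommendations only, nothing is applied."""
    return [
        f"'{c.title}' has no {part} clause"
        for part in ("given", "when", "then")
        if not getattr(c, part).strip()
    ]


def review(
    criteria: list[Criterion], decide: Callable[[str], Decision]
) -> tuple[list[str], list[tuple[str, Decision]]]:
    """Emit scenarios for complete criteria; route every gap to the human."""
    scenarios, decisions = [], []
    for c in criteria:
        gaps = surface_gaps(c)
        decisions.extend((gap, decide(gap)) for gap in gaps)
        if not gaps:
            scenarios.append(to_gherkin(c))
    return scenarios, decisions


criteria = [
    Criterion("Refund inside window", "an order placed 10 days ago",
              "the customer requests a refund", "the refund is approved"),
    Criterion("Refund outside window", given="an order placed 45 days ago",
              when="the customer requests a refund"),  # no expected outcome yet
]
# The lambda stands in for a person choosing accept, reject, or redirect.
scenarios, decisions = review(criteria, lambda gap: Decision.REDIRECT)
```

The second criterion never becomes a scenario on its own. It goes back to the human, who decides what the outcome should be.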

What closes the loop

Here is the wager that Articles 4 through 13 will defend.

The loop closes when the specification of intent is also the verification of output. When the artifact that says what the agent should do is the same artifact that tests whether the agent did it. When intent and verification are the same document.
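
To show what "the same document" could mean mechanically, here is a toy sketch in which one Gherkin-style text both states the intent and checks the output. The refund scenarios, the thirty-day window, and the tiny parser are invented for illustration. A real project would more likely use an established BDD runner than hand-roll one.

```python
import re
from typing import Callable

# The single artifact: it says what should happen and is also what gets run.
SPEC = """\
Scenario: Refund inside the window
  Given an order placed 10 days ago
  When the customer requests a refund
  Then the refund is approved

Scenario: Refund outside the window
  Given an order placed 45 days ago
  When the customer requests a refund
  Then the refund is denied
"""


def parse(spec: str) -> list[dict[str, str]]:
    """Split the spec into scenarios keyed by title, given, when, then."""
    scenarios: list[dict[str, str]] = []
    for line in spec.splitlines():
        keyword, _, rest = line.strip().partition(" ")
        if keyword == "Scenario:":
            scenarios.append({"title": rest})
        elif keyword in ("Given", "When", "Then"):
            scenarios[-1][keyword.lower()] = rest
    return scenarios


def verify(spec: str, refund_policy: Callable[[int], str]) -> list[str]:
    """Run every scenario against an implementation; return the titles that fail."""
    failures = []
    for scenario in parse(spec):
        # This sketch only understands "N days ago"; the When step is not interpreted.
        days = int(re.search(r"(\d+) days ago", scenario["given"]).group(1))
        expected = scenario["then"].rsplit(" ", 1)[-1]  # "approved" or "denied"
        if refund_policy(days) != expected:
            failures.append(scenario["title"])
    return failures


def correct_policy(days: int) -> str:
    return "approved" if days <= 30 else "denied"


def too_generous_policy(days: int) -> str:
    return "approved"


passing = verify(SPEC, correct_policy)
failing = verify(SPEC, too_generous_policy)
```

Whether a human or an agent wrote the policy makes no difference to the check. The document that asked for the behavior is the one that catches its absence.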

That is not a new idea. It is, in fact, an old one, old enough that most teams have it sitting in their backlog under the name acceptance criteria. They are just not using it the way the moment demands.

Article 1 planted the flag. Article 2 staked out the ground beneath it. This article traced how the industry got here and named the step nobody has finished.

The next article asks the question that makes the rest of the series urgent. If agents can produce code in seconds, what is the scarce resource?

The answer is not code. It never was.

Further reading: Article 1 of this series, “Gherkin May Be the Most Consequential Language of the AI-Augmented Era”, and Article 2, “Defining AI-Augmented Engineering”.