Spec-Driven Development
Replace ad-hoc prompting with structured Live Specs and Context Packets to produce deterministic, evaluable agent outputs through the Specify-Execute-Evaluate cycle.
Overview
Spec-Driven Development is a workflow pattern in which every unit of agent work begins with a structured specification — a Live Spec — rather than an ad-hoc natural-language prompt. The Context Architect authors the spec, an agent executes against it, and an Eval Harness validates the result. This three-phase loop — Specify, Execute, Evaluate — is the Spec-Driven Development methodology described in the Agentic Development Handbook, and it is the primary alternative to Vibe Coding.
Problem
Teams that rely on ad-hoc prompting encounter predictable failure modes:
- Non-reproducible outputs. The same intent phrased differently produces different code. There is no stable artifact to version, diff, or review.
- Missing context. Prompts rarely carry the full system context an agent needs — architecture constraints, interface contracts, quality standards — so the agent guesses, and guesses diverge.
- No evaluation anchor. Without machine-readable acceptance criteria, there is no way to automatically verify whether agent output satisfies the requirement. Review becomes a manual, subjective process.
- Drift across sessions. Knowledge evaporates between agent sessions. Each new conversation starts from zero unless the developer manually re-supplies context.
These problems compound as teams scale the number of agents and tasks. What works for a single developer chatting with a copilot breaks down when multiple agents execute in parallel across a codebase.
Solution
Replace the ad-hoc prompt with a formal specification layer composed of two artifacts:
- Live Spec — A versioned, machine-readable document that defines what the agent must build, including behavioral contracts, acceptance criteria, and references to relevant context.
- Context Packet — A bundled set of files, schemas, examples, and instructions that the agent receives alongside the spec. Context Packets supply the how — architecture decisions, coding standards, API contracts, and Golden Samples that demonstrate expected output quality.
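One way to make the Context Packet concrete is as a small data structure that bundles standards, contracts, and Golden Samples into a single prompt-ready payload. This is an illustrative sketch, not a standard schema: `ContextPacket`, its field names, and `render` are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ContextPacket:
    """Bundle of context an agent receives alongside a Live Spec.

    Field names are illustrative, not a standard schema.
    """
    standards: list[str] = field(default_factory=list)       # coding standards docs
    contracts: list[str] = field(default_factory=list)       # API/interface contracts
    golden_samples: list[str] = field(default_factory=list)  # exemplar outputs

    def render(self) -> str:
        """Flatten the packet into a single block of prompt context,
        emitting a titled section per non-empty category."""
        sections = [
            ("Coding standards", self.standards),
            ("Interface contracts", self.contracts),
            ("Golden samples", self.golden_samples),
        ]
        parts = []
        for title, docs in sections:
            for doc in docs:
                parts.append(f"## {title}\n{doc}")
        return "\n\n".join(parts)

# Assemble a packet and render it for inclusion in the agent's input.
packet = ContextPacket(
    standards=["Use functional React components with TypeScript."],
    golden_samples=["// ExampleCard.tsx\nexport function ExampleCard() { /* ... */ }"],
)
prompt_context = packet.render()
```

The point of the structure is that the packet is assembled once by the Context Architect and reused verbatim across sessions, rather than re-typed into each conversation.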
The Context Architect authors and maintains these artifacts. Execution follows the Triangular Workflow:
- Specify — The Context Architect writes or updates the Live Spec with clear acceptance criteria and attaches the relevant Context Packet.
- Execute — The agent receives the spec and context, then produces code, tests, or documentation.
- Evaluate — The Eval Harness runs automated checks against the acceptance criteria defined in the spec. Failures loop back to the Execute phase with diagnostic context; passes advance the output to human review gates.
This pattern applies Context Engineering principles: the bottleneck in agent performance is not model capability but the quality and completeness of context provided to the model.
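The three phases above can be sketched as a retry loop in which evaluation failures feed back into the next execution attempt. This is a hypothetical skeleton: `run_agent` and `run_checks` stand in for a real agent call and a real Eval Harness.

```python
def run_agent(spec: dict, context: str, feedback: list[str]) -> str:
    """Placeholder for the Execute phase: a real implementation would
    call an agent with the spec, the Context Packet, and any diagnostics
    from previous failed attempts."""
    return f"output for {spec['id']} ({len(feedback)} prior failures)"

def run_checks(spec: dict, output: str) -> list[str]:
    """Placeholder for the Evaluate phase: a real harness would run the
    spec's acceptance criteria and return the ids of failed checks."""
    return []  # empty list means every acceptance criterion passed

def specify_execute_evaluate(spec: dict, context: str, max_attempts: int = 3):
    """Run the Execute/Evaluate loop until the spec passes or attempts run out."""
    feedback: list[str] = []
    for attempt in range(1, max_attempts + 1):
        output = run_agent(spec, context, feedback)   # Execute
        failures = run_checks(spec, output)           # Evaluate
        if not failures:
            return output, attempt                    # advance to human review
        feedback = failures                           # loop diagnostics back
    raise RuntimeError(f"spec {spec['id']} failed after {max_attempts} attempts")

result, attempts = specify_execute_evaluate({"id": "product-card-v2"}, "packet...")
```

The key design choice is that failures return structured diagnostics rather than terminating the run, so the agent's next attempt starts with more context than the last.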
Implementation
Code Examples
An ad-hoc prompt for this task might read:

```text
Can you create a React product card component? It should show the
product image, name, price, and have an add-to-cart button. Use
TypeScript and Tailwind. Make it responsive.
```

This prompt lacks architecture context, has no acceptance criteria, and produces non-reproducible results.
The same request as a Live Spec:

```yaml
# specs/product-card.spec.yaml
spec:
  id: product-card-v2
  title: Product Card Component
  status: active
  author: "@context-architect"

  behavioral_contract:
    description: >
      A presentational React component that displays a single product
      with image, name, formatted price, and an add-to-cart action.
    inputs:
      - name: product
        type: "Product"
        source: "src/types/product.ts"
    outputs:
      - rendered ProductCard component
      - onAddToCart callback invocation with product ID

  acceptance_criteria:
    - id: ac-1
      description: Renders product image with lazy loading
      validation: unit-test
    - id: ac-2
      description: Displays formatted price using currency util
      validation: unit-test
    - id: ac-3
      description: Calls onAddToCart with product.id on button click
      validation: unit-test
    - id: ac-4
      description: Passes axe accessibility audit with zero violations
      validation: a11y-check
    - id: ac-5
      description: Responsive layout at 320px, 768px, and 1024px breakpoints
      validation: visual-regression

  context_references:
    - path: context/frontend-standards.md
    - path: context/component-patterns.md
    - path: src/types/product.ts
    - path: src/components/ExampleCard.tsx # golden sample

  scope:
    includes:
      - ProductCard component implementation
      - Unit tests for all acceptance criteria
    excludes:
      - Cart state management
      - API integration
```

Considerations
- **Reproducibility.** The same spec produces consistent agent output regardless of phrasing, session, or agent model.
- **Evaluability.** Machine-readable acceptance criteria enable automated validation through the [[eval-harness]], reducing reliance on manual review.
- **Knowledge accumulation.** Specs and Context Packets are versioned artifacts that capture institutional knowledge. They survive developer turnover and agent model changes.
- **Parallelization.** Multiple agents can execute against different specs simultaneously because each spec is self-contained with its own context.
- **Governance integration.** Specs provide a natural gate for [[gate-based-governance]] — review the spec before authorizing execution, then review agent output against the spec criteria.
- **Measurable improvement.** Teams can track spec pass rates over time and identify which context gaps cause the most failures.
- **Upfront investment.** Writing a Live Spec takes more time than typing a prompt. The payoff comes from reuse, reproducibility, and reduced rework — but teams must commit to the practice before seeing returns.
- **Spec maintenance.** Specs must evolve with the codebase. Stale specs produce incorrect agent output. Teams need processes (or agents) to keep specs current.
- **Context Packet curation.** Assembling and maintaining high-quality Context Packets requires ongoing effort from the [[context-architect]]. Under-specified context leads to the same problems as ad-hoc prompting.
- **Tooling maturity.** The ecosystem for spec-driven agent workflows is still developing. Teams may need to build custom tooling for spec parsing, context assembly, and eval harness integration.
- **Cultural shift.** Developers accustomed to direct coding or conversational prompting may resist the overhead of writing specs. Leadership must reinforce that specs are the primary engineering artifact in an agentic workflow.
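The evaluability and governance points above assume specs are machine-checkable, which implies tooling that can reject malformed specs before any agent runs. A minimal sketch of such a check, assuming a spec already parsed from YAML into a dict (the required fields and validation types mirror the example spec, but the schema and function names are illustrative):

```python
# Fields and validation types assumed from the example spec; not a standard.
REQUIRED_FIELDS = ("id", "behavioral_contract", "acceptance_criteria", "scope")
ALLOWED_VALIDATIONS = {"unit-test", "a11y-check", "visual-regression"}

def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec is executable."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in spec]
    for ac in spec.get("acceptance_criteria", []):
        # Every criterion must name a validation type the harness knows how to run.
        if ac.get("validation") not in ALLOWED_VALIDATIONS:
            problems.append(f"{ac.get('id', '?')}: unknown validation type")
    return problems

# A well-formed spec passes; a stub missing required fields is rejected.
spec = {
    "id": "product-card-v2",
    "behavioral_contract": {"description": "presentational product card"},
    "acceptance_criteria": [
        {"id": "ac-1", "validation": "unit-test"},
        {"id": "ac-4", "validation": "a11y-check"},
    ],
    "scope": {"includes": [], "excludes": []},
}
```

Running such a check at spec-review time gives the governance gate something concrete to enforce.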