Glossary
Agent Architecture · Emerging

Software Factory

An automated development pipeline where AI agents execute spec-driven tasks under human governance, scaling output through higher operator leverage ratios.

Definition

A software factory is an automated development pipeline in which AI agents execute structured tasks (code generation, testing, refactoring, deployment) under continuous human governance. The Agentic Engineering model treats agents as execution capacity that scales by raising the Operator Leverage Ratio (more agents per human operator), not by removing humans from the process.

The defining characteristic of a software factory is that agents work from Live Spec documents rather than ad-hoc prompts. Specifications define what to build; agents determine how to build it; and an Eval Harness validates the result against machine-readable acceptance criteria. Humans remain responsible for specification authoring, evaluation design, and architectural decisions.
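The split between specification, agent, and evaluation can be illustrated with a minimal sketch. Everything named here (`LiveSpec`, `AcceptanceCriterion`, `run_eval_harness`, `stub_agent`) is hypothetical and chosen for illustration; the glossary does not define a concrete schema or API, and a real harness would likely run tests rather than inspect source text.

```python
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical types for illustration only; not an API defined by this glossary.
@dataclass
class AcceptanceCriterion:
    name: str
    check: Callable[[str], bool]  # receives the agent's output, returns pass/fail


@dataclass
class LiveSpec:
    title: str
    criteria: list[AcceptanceCriterion] = field(default_factory=list)


def run_eval_harness(spec: LiveSpec, artifact: str) -> dict[str, bool]:
    """Check an agent-produced artifact against every criterion in the spec."""
    return {c.name: c.check(artifact) for c in spec.criteria}


def stub_agent(spec: LiveSpec) -> str:
    """Stand-in for an agent: it decides *how* to satisfy the spec."""
    return "def slugify(text):\n    return text.strip().lower().replace(' ', '-')\n"


# The human authors the spec and its machine-readable acceptance criteria.
spec = LiveSpec(
    title="Add a slugify helper",
    criteria=[
        AcceptanceCriterion("defines_slugify", lambda src: "def slugify(" in src),
        AcceptanceCriterion("lowercases_input", lambda src: ".lower()" in src),
    ],
)

artifact = stub_agent(spec)                 # agent determines the implementation
results = run_eval_harness(spec, artifact)  # harness validates the result
accepted = all(results.values())
```

The point of the design is that the human-authored parts (the spec and its criteria) are data the harness can evaluate without a person reading the generated code line by line.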

Maturity Levels

Software factory maturity is measured by the operator leverage ratio, that is, the number of concurrent agent tasks a single human can effectively govern. It is not measured by how far humans have been removed from the process.

| Level | Name | Description | Operator Leverage |
|---|---|---|---|
| L0 | Manual | No agent involvement. Developers write all code directly. | N/A |
| L1 | Assisted | Agents provide inline suggestions and completions. A developer reviews each suggestion before accepting it. | 1:1 |
| L2 | Copilot | Agents generate multi-file changes from natural-language prompts. Developers review outputs before committing. | 1:1 to 1:3 |
| L3 | Spec-Driven | Agents execute against Live Specs with automated evaluation. Human review focuses on spec quality and eval results rather than line-by-line code inspection. | 1:3 to 1:10 |
| L4 | Governed Autonomy | Agents operate continuously on queued specs with Gate Based Governance. Humans define gates, review exceptions, and handle escalations. Routine tasks flow through without manual intervention, but governance gates ensure human oversight at defined checkpoints. | 1:10 to 1:50 |
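As a rough illustration, the leverage ranges in the table can be encoded as a lookup. The sketch below assumes a ratio of 1:N means N agents per operator; the function name `candidate_levels` is hypothetical. Because adjacent ranges share their boundary values, a ratio alone does not always identify a single level, so the function returns every level whose range contains it.

```python
# Leverage ranges copied from the table, expressed as agents per operator.
# L0 involves no agents, so it has no ratio and is handled separately.
LEVELS = {
    "L1": ("Assisted", 1, 1),
    "L2": ("Copilot", 1, 3),
    "L3": ("Spec-Driven", 3, 10),
    "L4": ("Governed Autonomy", 10, 50),
}


def candidate_levels(agents_per_operator: int) -> list[str]:
    """Return every maturity level whose leverage range contains the ratio."""
    if agents_per_operator < 0:
        raise ValueError("agents_per_operator must be non-negative")
    if agents_per_operator == 0:
        return ["L0"]  # no agent involvement
    return [
        level
        for level, (_name, low, high) in LEVELS.items()
        if low <= agents_per_operator <= high
    ]
```

For example, `candidate_levels(3)` matches both L2 and L3. Telling them apart depends on how the work is governed (prompt review versus spec and eval review), which the ratio does not capture.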

At every maturity level, Human In The Loop oversight is present. The nature of that oversight shifts from reviewing individual lines of code (L1–L2) to reviewing specifications and evaluation results (L3) to defining governance policies and handling exceptions (L4). The goal is not to eliminate human judgment but to apply it where it has the highest leverage — at the specification and evaluation layers rather than the implementation layer.
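The L4 pattern of routine tasks flowing through while exceptions reach a human might look like the following sketch. The gate policy shown (escalate on failed evals or on changes flagged as architectural) is an assumption made for illustration, as are the names `TaskResult`, `Decision`, and `governance_gate` and the spec IDs; the glossary does not prescribe specific gate rules.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    ESCALATE = "escalate"


# Hypothetical record of one completed agent task.
@dataclass
class TaskResult:
    spec_id: str
    evals_passed: bool
    touches_architecture: bool  # assumed flag set by an upstream check


def governance_gate(result: TaskResult) -> Decision:
    """Human-defined policy: escalate anything that is not routine."""
    if not result.evals_passed or result.touches_architecture:
        return Decision.ESCALATE
    return Decision.AUTO_APPROVE


queue = [
    TaskResult("SPEC-101", evals_passed=True, touches_architecture=False),
    TaskResult("SPEC-102", evals_passed=False, touches_architecture=False),
    TaskResult("SPEC-103", evals_passed=True, touches_architecture=True),
]

# Only the exceptions reach the operator's review queue.
escalations = [r.spec_id for r in queue if governance_gate(r) is Decision.ESCALATE]
```

Here the operator's judgment is encoded once in the gate and then applied to every task, which is how oversight stays present while the number of tasks per human grows.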

Relationship to Vibe Coding

A software factory is distinct from Vibe Coding, which relies on conversational, ad-hoc interaction with AI models. Vibe coding can be productive for exploration and prototyping but does not scale to multi-agent, multi-task execution because it lacks the structured specifications and automated evaluation that a factory pipeline requires.

Last updated: 3/11/2026