Spec-to-Code Ratio

The Spec-to-Code Ratio (SCR) measures the share of Live Specs that result in a functional pull request without requiring a human code rewrite. It is expressed as a ratio between 0 and 1, computed over agent-generated PRs:

SCR = (PRs merged without human code changes) / (Total agent-generated PRs)

A PR counts as "without human changes" when it passes the Eval Harness, passes human review, and merges with no modifications beyond trivial formatting. Any substantive code edit by a human reviewer — fixing logic, adding missing error handling, restructuring an approach — disqualifies the PR from the numerator.
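The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `AgentPR` record and its field names are hypothetical, and it assumes that "trivial formatting" edits have already been filtered out, so that `substantive_human_edits` counts only disqualifying changes.

```python
from dataclasses import dataclass


@dataclass
class AgentPR:
    """Hypothetical record of one agent-generated PR (field names are assumptions)."""
    merged: bool
    passed_eval_harness: bool
    passed_human_review: bool
    substantive_human_edits: int  # human edits beyond trivial formatting


def counts_toward_numerator(pr: AgentPR) -> bool:
    """A PR qualifies only if it merged, passed both gates, and needed no substantive edit."""
    return (
        pr.merged
        and pr.passed_eval_harness
        and pr.passed_human_review
        and pr.substantive_human_edits == 0
    )


def spec_to_code_ratio(prs: list[AgentPR]) -> float:
    """SCR = qualifying PRs / total agent-generated PRs."""
    if not prs:
        raise ValueError("SCR is undefined when there are no agent-generated PRs")
    qualifying = sum(1 for pr in prs if counts_toward_numerator(pr))
    return qualifying / len(prs)


sample_prs = [
    AgentPR(True, True, True, 0),    # merged untouched: counts
    AgentPR(True, True, True, 0),    # merged untouched: counts
    AgentPR(True, True, True, 2),    # reviewer fixed logic: disqualified
    AgentPR(False, False, False, 0), # never merged: disqualified
]
sample_scr = spec_to_code_ratio(sample_prs)
```

Every agent-generated PR stays in the denominator, including those that never merge, which matches the formula as written. A team that only counts merged PRs would report a higher number.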

SCR ranges, with the top band as the target for mature teams:

Above 0.7 — the team's specifications are precise enough that agents produce merge-ready code more than 70% of the time. This is the target for teams with established agentic workflows.
0.5 to 0.7 — functional but with room to improve. Specs are generally sound, but edge cases or architectural constraints are regularly underspecified.
Below 0.5 — specs are not detailed enough for reliable agent execution. More than half of agent output requires human rewriting, which negates much of the throughput benefit of agentic workflows.
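As a rough sketch, the three ranges above can be turned into a small classifier for dashboards or reports. The function name and band labels are hypothetical, and the treatment of the boundaries is an assumption: a value of exactly 0.7 is placed in the middle band and exactly 0.5 likewise, since the text only says "above 0.7" and "below 0.5".

```python
def scr_band(scr: float) -> str:
    """Map an SCR value in [0, 1] to one of the three ranges (labels are illustrative)."""
    if not 0.0 <= scr <= 1.0:
        raise ValueError("SCR must be between 0 and 1")
    if scr > 0.7:
        return "target"          # specs precise enough for merge-ready output
    if scr >= 0.5:
        return "functional"      # generally sound, edge cases underspecified
    return "underspecified"      # more than half of output needs rewriting


bands = [scr_band(value) for value in (0.82, 0.7, 0.5, 0.31)]
```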

The SCR is the most actionable metric for the Context Architect role. When it drops, the cause almost always traces to specification quality rather than agent capability:

Ambiguous acceptance criteria — the spec does not define clear pass/fail conditions, leaving the agent to guess at intent.
Missing edge cases — the spec covers the happy path but omits error handling, boundary conditions, or concurrency scenarios.
Stale Golden Samples — the Golden Samples included in the Context Packet no longer reflect current codebase patterns, causing the agent to produce structurally outdated code.

Tracking the SCR alongside the Correction Ratio gives a fuller picture: the SCR measures whether the spec was good enough for the agent to get the code right the first time, while the Correction Ratio measures how much effort was needed to fix the code when it was not.
