Evaluation · Emerging

Flow Efficiency

The ratio of active agent compute time to total wall-clock time, measuring how much time agents spend working versus waiting.

Definition

Flow Efficiency measures the ratio of active agent compute time to total wall-clock time for a task. It is calculated as:

Flow Efficiency = Active compute time / Total time from task assignment to PR submission

Active compute time includes all periods when the agent is generating code, running tests, or interacting with the Eval Harness. Total time includes everything from the moment a task is assigned to an agent until the final pull request is submitted — including all wait states, queue time, and human review delays.
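
The ratio defined above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `flow_efficiency`, its parameters, and the representation of active periods as non-overlapping `(start, end)` timestamp pairs are all assumptions made for the example. How active periods are actually recorded is not specified here.

```python
from datetime import datetime


def flow_efficiency(assigned_at, pr_submitted_at, active_intervals):
    """Return active compute time divided by total wall-clock time.

    Assumes active_intervals is a list of non-overlapping (start, end)
    datetime pairs (a hypothetical data shape chosen for this sketch).
    """
    total_seconds = (pr_submitted_at - assigned_at).total_seconds()
    if total_seconds <= 0:
        raise ValueError("PR submission must come after task assignment")

    active_seconds = 0.0
    for start, end in active_intervals:
        # Clip each interval to the assignment-to-PR window.
        start = max(start, assigned_at)
        end = min(end, pr_submitted_at)
        if end > start:
            active_seconds += (end - start).total_seconds()

    return active_seconds / total_seconds


# Hypothetical task: assigned at 09:00, PR submitted at 13:00 (240 minutes),
# with 150 minutes of active work spread over three periods.
example_ratio = flow_efficiency(
    datetime(2026, 3, 11, 9, 0),
    datetime(2026, 3, 11, 13, 0),
    [
        (datetime(2026, 3, 11, 9, 30), datetime(2026, 3, 11, 10, 30)),
        (datetime(2026, 3, 11, 11, 0), datetime(2026, 3, 11, 12, 0)),
        (datetime(2026, 3, 11, 12, 30), datetime(2026, 3, 11, 13, 0)),
    ],
)
print(example_ratio)  # 150 / 240 = 0.625
```

In this made-up example the agent was active for 150 of 240 minutes, giving a Flow Efficiency of 0.625.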

Target ranges:

  • Above 0.6 — agents spend more than 60% of their assigned time actively working. This indicates a well-functioning pipeline with minimal bottlenecks.
  • 0.4 to 0.6 — moderate efficiency with identifiable drag factors. Improvement is possible by addressing specific wait states.
  • Below 0.4 — the bottleneck is in human processes, not agent speed. Agents are spending more time waiting than working, which means adding more agents will not increase throughput.
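
The target ranges above can be expressed as a small classifier. This is a sketch only: the function name and the band labels (`"healthy"`, `"moderate"`, `"human-bottlenecked"`) are hypothetical names chosen for illustration. Because the top range is "above 0.6", a value of exactly 0.6 is treated as part of the middle band.

```python
def classify_flow_efficiency(ratio):
    """Map a Flow Efficiency ratio to one of the three target ranges.

    Band labels are illustrative, not official terminology.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("Flow Efficiency must be between 0 and 1")
    if ratio > 0.6:
        return "healthy"  # above 0.6: minimal bottlenecks
    if ratio >= 0.4:
        return "moderate"  # 0.4 to 0.6: identifiable drag factors
    return "human-bottlenecked"  # below 0.4: adding agents will not help


print(classify_flow_efficiency(0.625))  # healthy
print(classify_flow_efficiency(0.35))   # human-bottlenecked
```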

Common drag factors that reduce Flow Efficiency:

  1. Review Queue Buildup — completed agent work sits waiting for human review. This is the most common cause of low Flow Efficiency and is addressed by improving the Operator Leverage Ratio.
  2. Context Preparation Delays — Live Specs and Context Packets are not ready when agents are available, creating idle time at the start of the pipeline.
  3. Infrastructure Wait Times — provisioning Ephemeral Workbenches, pulling dependencies, or waiting for external service availability adds non-productive time.
  4. Rescue Mission Latency — when an agent raises a Blocker Flag, the time between the flag and the operator's response is pure wait time.
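
One way to find which drag factor dominates is to total the wait time attributed to each one and rank the results. The sketch below assumes wait periods have already been logged as `(factor, minutes)` pairs. That log format, the factor keys, and the function name `rank_drag_factors` are hypothetical and exist only for this example.

```python
from collections import Counter


def rank_drag_factors(wait_events):
    """Sum wait minutes per drag factor and return them largest first.

    wait_events is a list of (factor, minutes) pairs; this shape is an
    assumption made for the sketch.
    """
    totals = Counter()
    for factor, minutes in wait_events:
        totals[factor] += minutes
    return totals.most_common()


# Made-up wait log for a single task.
ranking = rank_drag_factors([
    ("review_queue", 45),
    ("infrastructure", 10),
    ("review_queue", 20),
    ("rescue_latency", 15),
])
print(ranking)
# [('review_queue', 65), ('rescue_latency', 15), ('infrastructure', 10)]
```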

Flow Efficiency is monitored on the AgentOps Dashboard and reviewed during the Daily Flow Sync. It complements Token Budget tracking: a low Flow Efficiency with a low token spend indicates the pipeline is starved for human attention, while a low Flow Efficiency with a high token spend indicates agents are retrying failed approaches.
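
The pairing of Flow Efficiency with token spend can be sketched as a simple decision rule. The cutoffs here are assumptions rather than documented values: "low" Flow Efficiency is taken to mean below 0.4, and "high" token spend is taken to mean at least 80% of the Token Budget. The function and parameter names are likewise hypothetical.

```python
def diagnose(flow_efficiency, tokens_spent, token_budget,
             high_spend_fraction=0.8):
    """Suggest a likely cause of low Flow Efficiency.

    The 0.4 cutoff and the high_spend_fraction default are illustrative
    assumptions, not documented thresholds.
    """
    if token_budget <= 0:
        raise ValueError("token_budget must be positive")
    if flow_efficiency >= 0.4:
        return "no low-efficiency signal"
    if tokens_spent / token_budget >= high_spend_fraction:
        return "agents are likely retrying failed approaches"
    return "pipeline is likely starved for human attention"


print(diagnose(0.3, 20_000, 100_000))
# pipeline is likely starved for human attention
print(diagnose(0.3, 90_000, 100_000))
# agents are likely retrying failed approaches
```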

Last updated: 3/11/2026