Context Engineering
Organizing and maintaining the optimal set of tokens and state data during LLM inference.
Definition
Context engineering is the practice of deliberately designing, organizing, and managing the information that flows into a large language model's context window to maximize the quality and relevance of its outputs. While prompt engineering focuses on crafting individual instructions, context engineering takes a systems-level view of how all the pieces of context, including system prompts, retrieved documents, conversation history, tool outputs, and user inputs, are assembled and prioritized.
Key characteristics of context engineering include:
- Information Architecture: Practitioners decide what information to include, exclude, summarize, or defer, treating the context window as a scarce resource that must be allocated strategically.
- Dynamic Context Assembly: Rather than static prompts, context engineering involves building pipelines that assemble context dynamically based on the current task, user state, and available information.
- State Management: In multi-turn or agentic workflows, context engineers design how conversation history is compressed, which tool outputs are retained, and when to reset or summarize accumulated state.
- Retrieval Integration: Context engineering determines how and when to pull external knowledge via RAG, balancing retrieval relevance against context window capacity.
- Evaluation-Driven: Effective context engineering requires measuring output quality against different context configurations, treating context design as an empirical optimization problem rather than a one-time setup.
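To make the idea of allocating a scarce context window concrete, here is a minimal sketch of budgeted context assembly. It is an illustration under stated assumptions, not a prescribed implementation: `ContextItem`, `estimate_tokens`, and `assemble_context` are hypothetical names, the four-characters-per-token estimate is a rough stand-in for a real tokenizer, and the "lower number means higher priority" convention is chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    source: str    # e.g. "system_prompt", "history", "retrieved_doc"
    text: str
    priority: int  # assumed convention: lower number = more important


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(text) // 4)


def assemble_context(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Greedily keep the highest-priority items that fit in the token budget."""
    chosen: list[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            chosen.append(item)
            used += cost
    return chosen


candidates = [
    ContextItem("retrieved_doc", "x" * 400, priority=2),
    ContextItem("system_prompt", "You are a helpful assistant.", priority=0),
    ContextItem("history", "hello there, how are you?", priority=1),
]
selected = assemble_context(candidates, budget=20)
selected_sources = [item.source for item in selected]
```

With this budget the large retrieved document is left out while the system prompt and recent history are kept. A fuller pipeline might summarize the document instead of dropping it, and would compare output quality across different budgets and priority schemes, in line with the evaluation-driven point above.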
In the Agentic Development Handbook, context engineering is the foundation of the Context-First Architecture pillar. The Context Index serves as the canonical registry that maps every project artifact — architecture decision records, dependency graphs, style guides, and test fixtures — to a retrievable location agents can query at task time. Context Packets are the delivery mechanism: scoped bundles of files, rules, and Live Spec references assembled for a single agent task so that the model receives precisely the information it needs without exhausting its token budget.
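The relationship between the Context Index and a Context Packet can be sketched roughly as follows. This is a hypothetical illustration only: the handbook text above does not specify a data model, so the `ContextPacket` fields, the `CONTEXT_INDEX` dictionary, the `build_packet` helper, and every file path and reference string are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ContextPacket:
    """Scoped bundle of context for a single agent task (hypothetical shape)."""
    task: str
    files: list[str] = field(default_factory=list)
    rules: list[str] = field(default_factory=list)
    live_spec_refs: list[str] = field(default_factory=list)


# Hypothetical Context Index: artifact kind -> retrievable locations.
# All paths here are placeholders, not real project files.
CONTEXT_INDEX: dict[str, list[str]] = {
    "adr": ["docs/adr/0001-example.md"],
    "style_guide": ["docs/style.md"],
    "test_fixture": ["tests/fixtures/sample.json"],
}


def build_packet(task, kinds, rules, live_spec_refs, index=CONTEXT_INDEX):
    """Look up only the requested artifact kinds and bundle them for one task."""
    files: list[str] = []
    for kind in kinds:
        # Unknown kinds are skipped rather than raising, to keep the sketch small.
        files.extend(index.get(kind, []))
    return ContextPacket(
        task=task,
        files=files,
        rules=list(rules),
        live_spec_refs=list(live_spec_refs),
    )


packet = build_packet(
    task="fix lint warnings",
    kinds=["style_guide", "unknown_kind"],
    rules=["follow the project style guide"],
    live_spec_refs=["spec-section-placeholder"],
)
```

The point of the sketch is scoping: the packet carries only the artifacts the task asked for, so the rest of the index never consumes the agent's token budget.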