Prompt Chaining
Break complex tasks into a sequence of simpler prompts where each step's output feeds the next, improving reliability and debuggability.
Overview
Prompt chaining decomposes a complex AI task into a series of smaller, focused prompts. Each prompt in the chain takes the output of the previous one as input, creating a pipeline that is easier to debug, test, and iterate on than a single monolithic prompt.
This pattern is foundational for building reliable AI-assisted workflows. Instead of asking an LLM to do everything at once — where a failure in any part corrupts the whole output — you isolate each concern into its own step.
Problem
Large, complex prompts that try to accomplish multiple goals simultaneously suffer from several issues:
- Unreliable outputs — the model may excel at one sub-task but fail at another, and you get a mixed-quality result
- Difficult debugging — when output is wrong, it's hard to identify which part of the reasoning went off track
- No partial reuse — you can't reuse the good parts of a failed generation
- Context overload — cramming too many instructions into one prompt degrades performance on each individual instruction
Solution
Break the task into discrete steps, each with its own prompt. Pass the output of one step as context to the next. This gives you:
- Clear responsibility per step
- Ability to inspect and validate intermediate outputs
- Easy swapping of individual steps (different models, temperatures, or even non-AI logic)
- Natural checkpoints for human review
A typical chain might look like: Analyze → Plan → Execute → Review.
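The step-by-step flow above can be sketched as a small generic pipeline. This is a minimal illustration, not a library API: the `Step` type and `runChain` helper are hypothetical names, and the stub steps stand in for real LLM calls.

```typescript
// A chain step transforms the previous step's output into the next step's input.
type Step = (input: string) => Promise<string>;

// Run steps in sequence, feeding each output into the next.
// Intermediate outputs are collected in a trace for inspection and debugging.
async function runChain(
  input: string,
  steps: Step[],
): Promise<{ result: string; trace: string[] }> {
  const trace: string[] = [];
  let current = input;
  for (const step of steps) {
    current = await step(current);
    trace.push(current);
  }
  return { result: current, trace };
}

// Stub steps standing in for LLM calls (Analyze -> Plan):
const analyze: Step = async (s) => `analysis of ${s}`;
const plan: Step = async (s) => `plan from ${s}`;
```

Because every intermediate output lands in `trace`, you get the inspection checkpoints described above for free: log the trace, validate entries, or pause for human review between steps.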
Implementation
Code Examples
```typescript
// Assumes an `llm` client exposing generate({ prompt, temperature }); adapt to your SDK.
async function promptChain(topic: string) {
  // Step 1: Research. Low temperature for factual, structured output.
  const research = await llm.generate({
    prompt: `List 5 key points about "${topic}". Output as a JSON array of strings.`,
    temperature: 0.3,
  });

  // Step 2: Outline. Moderate temperature for structure with some creativity.
  const outline = await llm.generate({
    prompt: `Given these key points:\n${research}\n\nCreate a blog post outline with sections and sub-points. Output as markdown.`,
    temperature: 0.5,
  });

  // Step 3: Write. Higher temperature for fluent prose.
  const draft = await llm.generate({
    prompt: `Write a blog post following this outline:\n${outline}\n\nWrite in a professional but accessible tone. 800-1200 words.`,
    temperature: 0.7,
  });

  return draft;
}
```

Validating intermediate outputs lets the chain fail fast instead of passing bad data downstream:

```typescript
import { z } from 'zod';

// Schema for step 1's output: between 3 and 10 key points.
const ResearchSchema = z.array(z.string()).min(3).max(10);

async function validatedChain(topic: string) {
  // Step 1 with validation
  const rawResearch = await llm.generate({
    prompt: `List 5 key points about "${topic}". Output as a JSON array.`,
  });
  const research = ResearchSchema.parse(JSON.parse(rawResearch));

  // Only proceed if research is valid
  const outline = await llm.generate({
    prompt: `Create an outline from these points:\n${JSON.stringify(research)}`,
  });
  return outline;
}
```

Considerations
- **Debuggability** — inspect each intermediate output to find where things go wrong
- **Reliability** — each step is simpler and more likely to succeed
- **Reusability** — individual steps can be reused across different workflows
- **Testability** — each step can be unit tested independently with fixed inputs
- **Flexibility** — swap models, adjust temperatures, or insert non-AI logic at any step
- **Latency** — multiple sequential API calls are slower than a single call
- **Error propagation** — a bad output from an early step corrupts the entire chain
- **Cost** — more API calls means higher token usage and cost; tracking the [[token-budget]] across chain steps is essential
- **Complexity** — managing the chain infrastructure requires additional code
- **Context loss** — later steps may lack context that was available in earlier steps unless explicitly passed
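One way to contain error propagation is to validate each step's output and retry a bounded number of times before failing fast. This is a sketch under assumptions: `stepWithRetry` and its parameters are illustrative names, and `generate` stands in for whatever client call produces a step's output.

```typescript
// Retry a step until its output passes validation, up to maxAttempts.
// Failing fast here keeps a bad intermediate result from corrupting later steps.
async function stepWithRetry(
  generate: () => Promise<string>,
  isValid: (output: string) => boolean,
  maxAttempts = 3,
): Promise<string> {
  let lastOutput = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    lastOutput = await generate();
    if (isValid(lastOutput)) return lastOutput;
  }
  throw new Error(`step failed validation after ${maxAttempts} attempts: ${lastOutput}`);
}
```

Note that retries trade latency and cost for reliability, so they compound the latency and cost considerations above; keep `maxAttempts` small and log each failed attempt.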