Why not just write longer prompts with more detail?

Two reasons. First, models lose retrieval quality past a certain prompt length — past ~2,000 tokens of instruction, recall on early constraints drops. Second, multi-step prompts let you inspect intermediate outputs and correct course before the model commits to a wrong path.

Doesn't chunking double the latency?

Yes — and it's usually worth it. The wrong-direction tax on a single big prompt (you read 800 words of output, realize it's off, and re-prompt) is much higher than the cost of two sequential prompts that each go right the first time.

When is a single big prompt actually better?

When the task is genuinely one job and you have a strong template (see the scaffolded ask). Don't chunk for the sake of it. Chunking helps when the steps require different kinds of thinking; it adds friction when they don't.

Multi-step prompts: when to break a request into chunks

Some prompts produce better answers when you ask for less at once. "Take this report, summarize it, identify the three biggest risks, propose mitigations for each, and write an exec-level memo" is technically one prompt — but it's actually four distinct cognitive jobs, and most models do at least one of them poorly when forced to do all four in one shot. This guide is when to chunk a prompt into steps, and the chunking patterns that work.

Three signs your prompt should be two prompts

Sign 1: The request changes mode mid-prompt. If you're asking for analysis AND synthesis AND output formatting in one go, mode-switching costs you quality. Models do better when they finish one mode before switching.

Sign 2: A later step depends on a judgment the model has to make in an earlier step. "Identify the three biggest risks AND propose mitigations" forces the model to commit to its risk identification silently. If the risks it picked were wrong, the mitigations follow them off a cliff. Chunking lets you intervene.

Sign 3: The output format makes inspection hard. If the final asked-for output is "a polished memo", you can't see whether the analysis was right — it's buried in prose. Chunking surfaces the intermediate work.

The chunking patterns

Pattern A: Analysis → review → synthesis.

Step 1 (analyze):
Read the report. List the five distinct risks it raises, ranked
by severity. Don't propose anything. Don't synthesize.

[Inspect output. Correct course if needed.]

Step 2 (synthesize):
Of those five risks, take the top three. For each, propose
one concrete mitigation.

[Inspect output.]

Step 3 (format):
Now turn the top three risks + mitigations into a 200-word
exec memo. Use the voice in my system prompt examples.

You can do all three in one chat thread, branching if you want to keep the intermediate outputs.

Pattern B: Generate options → choose → refine.

Step 1: Give me five different headlines for <piece>. Don't
explain. Just the five.

Step 2: I picked #3. Give me three variations of #3 that
test different emotional registers.

Step 3: I'm going with #3b. Tighten it to under 60 characters
without losing the meaning.

This works well for any creative-output task. The model produces diverse options when it doesn't have to commit to one; you commit, the model refines.

Pattern C: Draft → diagnose → revise.

Step 1: Draft the piece.

[Branch here.]

Step 2 (in the new branch): Reverse-outline the draft. One-line
summary per paragraph. List the three claims it's making, ranked
by how strongly the evidence supports each.

Step 3: Knowing what you know from the reverse outline, revise
the draft. Fix the unsupported claim first, the muddled paragraph
second.

The branch keeps the draft alive while you diagnose it, so you can return to the original if the revision goes wrong.

When NOT to chunk

If the task is genuinely one job and you have a strong template — the scaffolded ask is a good example — chunking adds friction without adding quality. Don't chunk emails. Don't chunk single-function code requests. Don't chunk "explain X to me" prompts.

The signal: if you can't articulate what the intermediate steps would be, you don't need them.

Where this fits

Chunking pairs naturally with branching — every chunk boundary is a natural branch point if you want to try the next step two different ways. And it pairs with the portable system prompt template — the same instruction set applies across chunks regardless of which model you route each chunk to.

Try oran.chat free if you want the chunking workflow with native branching built in. More practical workflows in Playbooks.