
1M Tokens Don't Fix Workflow Problems

Anthropic just made the 1-million-token context window generally available for Opus 4.6 and Sonnet 4.6. The reactions are predictable: more space, more power, more celebration. But here is what most people are missing. The headline is not the size. It is predictability. You are no longer punished for sending more context. Competitors double your rate the moment you get serious. Claude does not. Whether your request is 9K tokens or 900K, the cost is the same. That changes behavior. You stop trimming useful information just to control cost.

But more input does not mean better output. Only better structure does.

The First 30% Is Where the Work Happens

The early part of any conversation is where the model has the sharpest reasoning. Fill that space with verbose logs, repeated context or noise and you pay for it later. By the time you get to actual work, attention is already diluted. Treat early tokens like prime real estate. Load what matters. Save the deep dives for later.
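The "prime real estate" idea can be sketched as a simple budgeting pass. Everything here is illustrative: the 30% figure is the rule of thumb from above, and the token counts and item names are made up for the example, not measured values.

```python
# Sketch: reserve the early portion of the context window for high-value
# material, and defer or drop low-priority bulk. Numbers are assumptions.

PRIME_FRACTION = 0.30  # treat roughly the first 30% as prime real estate

def plan_context(items, window_tokens):
    """Greedily place high-priority items first.

    `items` is a list of (name, priority, tokens); a lower priority number
    means the item should land earlier in the conversation. Items that
    would overflow the window are skipped (deferred to a later session).
    """
    prime_budget = int(window_tokens * PRIME_FRACTION)
    placed, used = [], 0
    for name, priority, tokens in sorted(items, key=lambda i: i[1]):
        if used + tokens > window_tokens:
            continue  # overflow: defer instead of diluting attention
        used += tokens
        placed.append((name, "prime" if used <= prime_budget else "later"))
    return placed

plan = plan_context(
    [("verbose_logs", 3, 400_000), ("spec", 1, 150_000), ("style_rules", 2, 50_000)],
    window_tokens=1_000_000,
)
# The spec and rules land in the prime region; the logs are pushed later.
```

The point is not the exact numbers but the ordering: the spec loads before the logs, every time.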

Read Big, Execute Small

This is the pattern that works. Use the 1M window to read and understand. Then kill the session and execute in small, fresh windows.

The workflow is straightforward. Load your full spec or codebase into the 1M window and let Claude digest it, spot patterns and build a map. Then have Claude split the implementation into feature groups and create a plan. Finally, spin up a new session for each feature group with clean context and focused work.
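The loop above can be sketched in plain Python. `call_model` stands in for whatever API client you use (in real use, a call through the Anthropic SDK); here it is a stub so the structure itself is runnable, and all names are illustrative rather than taken from any SDK.

```python
# Sketch of the read-big / execute-small workflow. The key structural
# point: each feature group gets a brand-new message list, i.e. a fresh
# session with no carried-over context.

def read_big(call_model, corpus):
    """One large session: digest everything, return a list of feature groups."""
    messages = [{"role": "user", "content": "Map this codebase and split "
                 "the work into feature groups:\n" + corpus}]
    return call_model(messages)

def execute_small(call_model, group):
    """One fresh session per feature group: clean context, focused work."""
    messages = [{"role": "user", "content": f"Implement: {group}"}]
    return call_model(messages)

def run(call_model, corpus):
    results = {}
    for group in read_big(call_model, corpus):
        # A new message list each iteration = a new session.
        results[group] = execute_small(call_model, group)
    return results
```

The design choice doing the work is the fresh `messages` list per group: the plan survives between sessions, the conversational residue does not.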

You are not scaling a conversation. You are scaling a system.

I have tested this both ways. Running numerous features in one massive window consistently degrades output quality. The model loses track of dependencies, misses critical changes and starts hallucinating by the third or fourth feature. The same work, executed in fresh sessions after a thorough read-through, produces reliable results with fewer missed requirements.

Context Decays

Long sessions degrade. Not instantly. Gradually. After a few compaction cycles, you start seeing artifacts: repeated logic, inconsistent decisions, hallucinated references that do not belong. With the old 200K window, Claude would start rushing around the 140K mark, delivering half-baked code because it knew it was running out of room.

The 1M window gives the model breathing room. But that breathing room only helps if you do not fill it with things that do not matter. Keep your working context lean. Move stable rules and repetitive instructions out of the main conversation so your high-value context space stays clean for the actual work.
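One concrete way to keep the working context lean, sketched here with made-up rule text, is to pin stable instructions in the system prompt (the Messages API accepts a separate `system` field) so they are stated once instead of repeated through the conversation:

```python
# Sketch: stable rules live outside the running conversation. The rule
# text and task are illustrative placeholders.

STABLE_RULES = "Follow the style guide. Never edit generated files."

def build_request(task, history=()):
    """Assemble a request where only the actual work enters `messages`."""
    return {
        "system": STABLE_RULES,  # stated once, not re-sent as chat turns
        "messages": list(history) + [{"role": "user", "content": task}],
    }

req = build_request("Refactor the auth module")
```

The conversation itself then carries only the task and its history, so the high-value context space stays clean.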

When the Size Actually Matters

There are scenarios where the 1M window is a genuine advantage. Any time you need to ingest a large body of information, whether a complex codebase or a comprehensive set of requirements, the old approach meant splitting the work across multiple sessions or relying on specialized ingestion workflows. A project with 160+ requirement documents, for example, would previously need careful orchestration just to get the model to see the full picture. Now that can fit comfortably in a single session. The model can read everything, cross-reference across documents and maintain context without the overhead of stitching together fragmented sessions.

But here is the honest part: Opus will still forget. Even with everything loaded in context, it will miss things. A workflow detail, a constraint buried in file 47, a decision made in a passing comment. This is not a context window problem; it is an attention problem: the larger the input, the more likely something slips through the cracks. On one of my recent projects, I watched it hold over 160 documents and still lose track of a requirement that was clearly stated. Bigger context does not mean you can stop paying attention. It removes the upfront cost of getting the model to see everything. It does not remove the need to check whether it actually understood it.
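Checking whether it actually understood can be as blunt as a checklist diff. This is a minimal sketch with made-up requirement IDs and summary text; a real check would be fuzzier, but the principle is the same: verify coverage explicitly instead of trusting the read-through.

```python
# Sketch: after a big read-through, diff the model's summary against an
# explicit requirement list. IDs and the summary are illustrative.

def missing_requirements(requirement_ids, model_summary):
    """Return the requirement IDs the model never mentioned."""
    return [rid for rid in requirement_ids if rid not in model_summary]

gaps = missing_requirements(
    ["REQ-012", "REQ-047", "REQ-103"],
    "Covered REQ-012 and REQ-103; see sections 2 and 5.",
)
# `gaps` flags REQ-047: the constraint the model silently dropped.
```

Anything in `gaps` goes back to the model as a direct question before implementation starts.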

Outside of those scenarios, smaller windows with clean context will usually produce better results.

The Simple Test

Are you using the large window to understand the system? Or to avoid thinking about structure?

One leads to better products. The other leads to bigger mistakes.

PS: I have only been able to test the 1M window with Opus 4.6 so far. Curious to hear from others who have started working with it. How are you using it?