TOKEN STRATEGY
The second face of the environment is "how you spend tokens." Run the same loop differently — efficiency, balance, aggressive — and allocate people to the stage.
How will you spend tokens?
If business decides the money that keeps the loop alive, the token strategy decides how to spend it. Saving tokens and burning them are both strategies. When you read which mode the business situation calls for and place people and validation accordingly, the technical side supports the business.
Three token-usage modes
Efficiency — spend less
Minimize token consumption. Press unit cost to the floor with lighter models, aggressive caching, short context, and batching.
- When margin pressure is high, or you handle large volumes of low-risk, repetitive traffic. When turning self-sustaining is urgent.
- The ceiling on quality and autonomy is low. It falls short on paths that need hard judgment or differentiation.
Balance — split by path
Differentiate by path. Assign high-performance models to high-risk, high-value decisions and lighter models to low-risk paths like classification and filtering.
- When operating steadily and managing quality and cost at once. The default for most production.
- Per-path policy and monitoring add operational complexity. Set the boundaries wrong and you miss on both sides.
Aggressive — spare no tokens
Spare no tokens. Deploy top-performance models, extended thinking, and multi-agent verification to push quality and speed up.
- When differentiation is the point, or high-value decisions and early trust matter more than cost. The phase of buying time with investment.
- High token unit cost pressures self-sufficiency. Sustained without margin discipline, cost overtakes revenue.
The people and quality that execute the strategy
Choosing a mode alone does not turn the loop. Organize people as roles, not org charts — name at each OCLS stage who owns (OWN), what gets approved (human approval), and where it is tuned (SHARPEN). A role is a point of responsibility on the loop, not a job title, and it requires no reorganization.
Quality is not built anew. Evaluation and guardrails (pass@k/pass^k, capability and regression evals), private evals (business-outcome criteria), decision traceability, and governance lint already exist. The token strategy's job is to wire this validation permanently into the loop — whichever mode you pick — so honesty holds whether you burn more tokens or fewer.
Stage guide — mode and staffing
As the business stage changes, the recommended mode and staffing change with it.
| Evolution stage | Recommended mode | Staffing |
|---|---|---|
| 1. Single-Agent Start | Efficiency — watch unit cost first | One person owns direction, execution, and review together. |
| 2. Responsibility Separation | Efficiency→Balance — begin per-path differentiation | Designate an outcome owner per role and separate review. |
| 3. Multi-Agent Collaboration | Balance — go aggressive on high-risk paths | People running approval gates and evaluation stay resident on the loop. |
| 4. Governance by Design | Situational and dynamic — let margin decide the mode | Validation, operations, and cost management are internalized as roles (not a reorg). |