Token Strategy

OVERVIEW

How will you spend tokens?

If business decides the money that keeps the loop alive, the token strategy decides how to spend it. Saving tokens and burning them are both strategies. When you read which mode the business situation calls for and place people and validation accordingly, the technical side supports the business.

MODES

Three token-usage modes

Efficiency — spend less

Minimize token consumption. Press unit cost to the floor with lighter models, aggressive caching, short context, and batching.

When: When margin pressure is high, or you handle large volumes of low-risk, repetitive traffic. When turning self-sustaining is urgent.
Trade-off: The ceiling on quality and autonomy is low. It falls short on paths that need hard judgment or differentiation.

Balance — split by path

Differentiate by path. Assign high-performance models to high-risk, high-value decisions and lighter models to low-risk paths like classification and filtering.

When: When operating steadily and managing quality and cost at once. The default for most production.
Trade-off: Per-path policy and monitoring add operational complexity. Set the boundaries wrong and you miss on both sides.

Aggressive — spare no tokens

Spare no tokens. Deploy top-performance models, extended thinking, and multi-agent verification to push quality and speed up.

When: When differentiation is the point, or high-value decisions and early trust matter more than cost. The phase of buying time with investment.
Trade-off: High token unit cost pressures self-sufficiency. Sustained without margin discipline, cost overtakes revenue.

OPS

The people and quality that execute the strategy

Choosing a mode alone does not turn the loop. Organize people as roles, not org charts — name at each OCLS stage who owns (OWN), what gets approved (human approval), and where it is tuned (SHARPEN). A role is a point of responsibility on the loop, not a job title, and it requires no reorganization.

Quality is not built anew. Evaluation and guardrails (pass@k/pass^k, capability and regression evals), private evals (business-outcome criteria), decision traceability, and governance lint already exist. The token strategy's job is to wire this validation permanently into the loop — whichever mode you pick — so honesty holds whether you burn more tokens or fewer.

STAGE GUIDE

Stage guide — mode and staffing

As the business stage changes, the recommended mode and staffing change with it.

Evolution stage	Recommended mode	Staffing
1. Single-Agent Start	Efficiency — watch unit cost first	One person owns direction, execution, and review together.
2. Responsibility Separation	Efficiency→Balance — begin per-path differentiation	Designate an outcome owner per role and separate review.
3. Multi-Agent Collaboration	Balance — go aggressive on high-risk paths	People running approval gates and evaluation stay resident on the loop.
4. Governance by Design	Situational and dynamic — let margin decide the mode	Validation, operations, and cost management are internalized as roles (not a reorg).