ARCHITECTURE
Execution unit, owner, collaboration rules, governance — the four layers stack to complete the structure. Each layer answers a question the previous one cannot.
Premises of the AI-Native Product
The reopt architecture rests on the premise that AI-native products differ fundamentally from traditional software.
Non-deterministic execution
The same input can produce different outputs. The design discipline is therefore not to guarantee outputs but to declare judgment conditions and refusal boundaries.
Duality of ownership
The owner of an outcome can be a human or an AI. Human ownership allows contextual judgment; AI ownership requires every condition to be made explicit. This distinction determines how strict the governance must be.
Balance of autonomy and control
AI's value comes from autonomous judgment, but unchecked autonomy produces agentic debt. To raise autonomy, you must first define the structure.
Speed is the default, structure is the choice
In an era where AI can ship an MVP in a day, speed is not the differentiator. Only structure separates a sustainable product from a one-off demo.
Structure is governance
Adding governance as a separate process invites teams to bypass it. Governance must come from the product's own structure to operate naturally.
Goals of this architecture
- Keep the answer to "who owns which outcome" explainable at any scale.
- Declare judgment conditions and failure paths as contracts, so that structure itself becomes governance.
- Use the OCLS loop so that operational data continuously corrects the structure.
- Let the whole team (product, engineering, operations) communicate in the same structural vocabulary.
Non-goals
- Does not prescribe the usage of any particular framework or library.
- Does not cover prompt-engineering technique.
- Does not recommend model selection or fine-tuning strategy.
- Does not require changes to organizational structure or HR systems.
The Trap of Speed Without Structure
AI can ship an MVP in a day. But three months later, when nobody can explain why the product decided what it decided, speed has turned into poison.
Prompts alone do not become structure
No matter how carefully you tune the prompt, it still cannot answer who owns the outcome or what happens when it fails.
The faster you ship, the faster it becomes a black box
A demo takes a day. But once cost, quality, approval, and accountability become entangled, no one can explain the product three months later.
There is no path from demo to production
The first demo is easy. Without a structural path to team-scale operation and iterative improvement, the project stops at the demo.
Why four layers
Each layer answers a question the previous one cannot. Module alone cannot answer "who owns this outcome," so Agent is required. Agent alone cannot answer "how do they collaborate," so Collaboration rules are required. Collaboration alone cannot answer "is the product evolving safely," so Governance is required.
| Layer | Responsibility | Contract | Operations | Observed metrics (examples) | Protocol reference |
|---|---|---|---|---|---|
| Module | The product's minimum execution unit. Performs one capability and returns a result. | Declares input conditions, output format, authority scope, and refusal conditions as a contract. | Tracks call count, cost, output quality, and failure reasons. | Average cost per call, failure rate, average response time, contract-violation count | MCP Tool Definition — input/output schemas play the same role as the MCP tool schema |
| Agent | The actor that owns and judges outcomes. Uses one or more modules to reach a goal. | Declares goal, authority scope, delegation policy, and termination condition. | Records judgment rationale, execution steps, and approval events. | Goal-completion rate, average handle time, collaboration frequency, escalation rate | A2A Agent Card — declares the agent's goal, authority, and termination condition |
| Collaboration | Defines role allocation, information transfer, and flow coordination between actors. | Defines collaboration rules, information-transfer scope, and cost-attribution rules. | Monitors bottlenecks, wait time, rework, and collaboration failures. | Collaboration-failure rate, average wait time, rework frequency | A2A Task Lifecycle — collaboration flow, state transitions, information transfer |
| Governance | Enforces evaluation, approval, cost control, and policy so the product evolves safely. | Provides guardrails, approval policy, evaluation criteria, and correction criteria. | Monitors quality variance, policy violations, and human-intervention points. | Guardrail-block count, approval wait time, quality-score distribution, policy-violation rate | Infrastructure-level guardrails (OPA, RBAC) — a policy-enforcement layer above protocol |
The point of the four layers is not to slice the product more finely. It is to make explainable who owns the outcome, who absorbs the failure, and how improvement iterates.
Design Principles
If the principles below stop holding, the reopt architecture regresses into yet another bundle of feature-driven automation.
Own Every Outcome
Design around responsibility units, not features
An agent is not a function caller — it is a continuously explainable owner of outcomes.
Contract First
A module is not a callable tool — it is an execution unit with a contract
Only when a module has a contract does the structure support evaluation, replacement, authority control, and testing.
Layer, Then Scale
Scale is possible only when governed through classification and structure
As agents multiply, structuring them into categories, layers, and boundaries is what keeps governance intact and scale sustainable.
Sharpen in Operation
Architecture is a system that evolves in operation
Responsibility boundaries and policies must keep adjusting from failure logs and evaluation outcomes.
Thinking Vocabulary
Four thinking tools. Each concept is a lens for design decisions and corresponds to a phase of the OCLS governance loop.
OWN
Own Every Outcome
Assign an owner to every outcome, and name whether that owner is a human or an AI. Human ownership leaves room for contextual judgment; AI ownership requires every condition to be declared as a contract. The type of owner determines how strict the governance must be.
Before building, assign an owner to every outcome and mark whether it's a human or an AI.
Is the owner of this outcome a human or an AI? And when something goes wrong, who is the escalation target?
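The OWN questions above can be sketched as a small ownership check. This is an illustration only: `Outcome`, `OwnerKind`, and `check_ownership` are hypothetical names, and the rule that an AI owner needs at least one declared condition is a simplified reading of the paragraph above, not a reopt API.

```python
from dataclasses import dataclass, field
from enum import Enum


class OwnerKind(Enum):
    HUMAN = "human"
    AI = "ai"


@dataclass
class Outcome:
    """One outcome and its declared owner (all fields are illustrative)."""
    name: str
    owner: str
    owner_kind: OwnerKind
    escalation_target: str = ""  # who is contacted when something goes wrong
    declared_conditions: list[str] = field(default_factory=list)


def check_ownership(outcome: Outcome) -> list[str]:
    """Return ownership problems; an empty list means the outcome passes."""
    problems = []
    if not outcome.owner:
        problems.append("no owner assigned")
    if not outcome.escalation_target:
        problems.append("no escalation target")
    # Assumption: an AI owner cannot lean on contextual judgment, so it must
    # carry at least one explicitly declared condition.
    if outcome.owner_kind is OwnerKind.AI and not outcome.declared_conditions:
        problems.append("AI owner without declared conditions")
    return problems


refund = Outcome("refund decision", "refund-agent", OwnerKind.AI)
print(check_ownership(refund))
# ['no escalation target', 'AI owner without declared conditions']
```

A human-owned outcome with an escalation target passes with an empty list, which reflects the looser governance the text allows for human ownership.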
CONTRACT
Contract First
Declaring input, output, authority, and refusal conditions before implementation makes evaluation, replacement, and control possible. The more an AI owns the outcome, the stricter and more explicit the contract must be. The contract is the language of governance.
If you can write the contract, you've understood the responsibility.
What input must this module refuse? When you can answer, the contract is complete.
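A contract written before implementation might look like the sketch below. Every name here (`ModuleContract`, `Refused`, `call_with_contract`, the `summarize` module, and the 10,000-character limit) is hypothetical and chosen for illustration. Only the input side is enforced in this sketch; the output and authority fields are declared but not checked.

```python
from dataclasses import dataclass
from typing import Callable


class Refused(Exception):
    """Raised when an input hits a declared refusal condition."""


@dataclass(frozen=True)
class ModuleContract:
    name: str
    required_inputs: frozenset
    output_keys: frozenset
    authority_scope: frozenset
    # Each refusal condition pairs a human-readable reason with a predicate.
    refusal_conditions: tuple


def call_with_contract(contract: ModuleContract,
                       impl: Callable[[dict], dict],
                       payload: dict) -> dict:
    """Check the input against the contract, then run the implementation."""
    missing = contract.required_inputs - set(payload)
    if missing:
        raise Refused(f"missing inputs: {sorted(missing)}")
    for reason, predicate in contract.refusal_conditions:
        if predicate(payload):
            raise Refused(reason)
    return impl(payload)


summarize_contract = ModuleContract(
    name="summarize",
    required_inputs=frozenset({"text"}),
    output_keys=frozenset({"summary"}),
    authority_scope=frozenset({"read:documents"}),
    refusal_conditions=(
        ("text exceeds 10,000 characters", lambda p: len(p["text"]) > 10_000),
    ),
)


def summarize(payload: dict) -> dict:
    # Stand-in for a model call: truncate instead of summarizing.
    return {"summary": payload["text"][:40]}


print(call_with_contract(summarize_contract, summarize, {"text": "hello"}))
```

Because the refusal conditions are data rather than code buried in the implementation, the implementation can be swapped while the contract stays the fixed point for evaluation.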
LAYER
Layer, Then Scale
Even as agents multiply, structuring them into categories, layers, and boundaries preserves governance. Layering is the scaling strategy. Multi-agent scaling research from Google and MIT (2026, as reported by InfoQ) showed empirically that task-dependency structure is the deciding factor — for tasks with heavy tool coordination, multi-agent overhead actually degrades performance, and the optimal collaboration strategy varies by task.
Before creating a new agent, decide where it sits within the existing categories.
Which of the four layers does this agent operate in, and what is its relationship to the existing agents?
SHARPEN
Sharpen in Operation
Tuning boundaries from operational data lets governance evolve with reality. Anthropic's "Demystifying Evals for AI Agents" frames eval-driven development: defining evaluation first surfaces requirement ambiguity before implementation. The moment a capability eval graduates to a regression eval is the signal that the boundary has stabilized.
Start with provisional boundaries and let operational data drive splits, merges, and re-categorization.
Do recent operational signals call for boundary adjustment? Handoff failures, accountability gaps, and cost anomalies are those signals.
OCLS Loop
Own → Contract → Layer → Sharpen. A cyclical model for governance design. Each pass sharpens the governance boundaries.
Contract First and Sharpen in Operation are not in conflict. Contracts are provisional. Operational data updates the contract. Signals surfaced in SHARPEN (handoff failure, accountability gaps, cost anomalies) trigger the next OWN phase, where existing contracts and boundaries are revisited.
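The cycle described above can be read as a small state machine. The sketch below is one possible reading, not a prescribed implementation: the signal names are hypothetical identifiers for the three signals named in the text, and staying in SHARPEN when no signal fires is an assumption.

```python
PHASES = ("OWN", "CONTRACT", "LAYER", "SHARPEN")

# Hypothetical identifiers for the signals the text names.
REVISIT_SIGNALS = {"handoff_failure", "accountability_gap", "cost_anomaly"}


def next_phase(current: str, signals: set[str]) -> str:
    """Advance one step through the loop.

    OWN, CONTRACT, and LAYER always move forward. SHARPEN moves back to OWN
    only when an operational signal fired; otherwise it keeps observing
    (an assumption of this sketch).
    """
    if current != "SHARPEN":
        return PHASES[PHASES.index(current) + 1]
    return "OWN" if signals & REVISIT_SIGNALS else "SHARPEN"


phase = "OWN"
trail = [phase]
for signals in (set(), set(), set(), {"handoff_failure"}):
    phase = next_phase(phase, signals)
    trail.append(phase)
print(trail)
# ['OWN', 'CONTRACT', 'LAYER', 'SHARPEN', 'OWN']
```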
OCLS in the Development Lifecycle
The traditional SDLC does not handle the non-deterministic execution and governance requirements of agents. The OCLS loop operates as the governance layer of the Agentic SDLC (ASDLC). OWN + CONTRACT run inside the Design Loop, LAYER inside the Run Loop, and SHARPEN inside the Governance Loop. Key distinction: ASDLC declares not only "what the agent must do" but "what it must never do" — refusal conditions and guardrails — at design time.
Design Review Checklist
Applying each concept's One Question (the closing question of its Thinking Vocabulary entry) to the matching layer yields concrete validation questions. Use this checklist in design reviews, sprint kickoffs, and incident retrospectives.
CONTRACT
Module
- Are this module's input, output, and failure conditions explicitly defined?
- What input must this module refuse?
- Can a contract violation be detected at runtime?
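For the last question in the list above, one lightweight option is to wrap each module so that its output is checked on every call. The decorator name `checked`, the `violations` counter, and the `classify` module are all hypothetical; a real system would likely use a schema validator rather than a key check.

```python
from collections import Counter
from functools import wraps

violations = Counter()  # contract-violation count per module


def checked(module_name: str, required_output_keys: set[str]):
    """Decorator that verifies a module's output against its declared keys."""
    def wrap(func):
        @wraps(func)
        def inner(payload: dict) -> dict:
            result = func(payload)
            missing = required_output_keys - set(result)
            if missing:
                violations[module_name] += 1
                raise ValueError(f"{module_name}: output missing {sorted(missing)}")
            return result
        return inner
    return wrap


@checked("classify", {"label", "confidence"})
def classify(payload: dict) -> dict:
    # Deliberately incomplete output to show a detected violation.
    return {"label": "spam"}


try:
    classify({"text": "buy now"})
except ValueError as err:
    print(err)
print(violations["classify"])
```

The counter corresponds to the contract-violation count listed as a Module metric in the layer table.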
OWN
Agent
- Can this agent's responsibility scope be stated in one sentence?
- Are its goal, authority scope, and termination condition declared?
- When this agent fails, who is the escalation target?
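The three agent questions above map onto a declaration plus a bounded run loop. In this sketch `AgentDeclaration` and `run_agent` are hypothetical names, and the termination condition is simplified to a step budget, which is only one of many possible termination conditions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentDeclaration:
    goal: str                    # one-sentence responsibility scope
    authority_scope: frozenset   # what the agent may touch
    max_steps: int               # termination condition, simplified to a budget
    escalation_target: str       # who takes over when the agent fails


def run_agent(decl: AgentDeclaration, step: Callable[[int], bool]) -> str:
    """Call `step` until it reports success or the step budget is spent."""
    for attempt in range(1, decl.max_steps + 1):
        if step(attempt):
            return f"done after {attempt} step(s)"
    return f"escalated to {decl.escalation_target}"


decl = AgentDeclaration(
    goal="Resolve one support ticket",
    authority_scope=frozenset({"read:tickets", "write:replies"}),
    max_steps=3,
    escalation_target="support-lead",
)
print(run_agent(decl, lambda attempt: attempt == 2))  # done after 2 step(s)
print(run_agent(decl, lambda attempt: False))         # escalated to support-lead
```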
LAYER
Collaboration
- Is the context passed at handoff the minimum required?
- Are inter-agent collaboration rules (order, conditions, recovery paths) explicit?
- Is a retry or alternate path defined for handoff failure?
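The handoff questions above can be sketched as two small functions: one that trims context to what the receiver declared it needs, and one that retries before taking an alternate path. The function names, the choice of `RuntimeError` as the transient failure, and the default of two attempts are assumptions of this example.

```python
from typing import Callable

Handler = Callable[[dict], str]


def minimal_context(full_context: dict, allowed_keys: set[str]) -> dict:
    """Pass only the keys the receiving agent has declared it needs."""
    return {k: v for k, v in full_context.items() if k in allowed_keys}


def hand_off(context: dict, primary: Handler, fallback: Handler,
             attempts: int = 2) -> str:
    """Try the primary receiver up to `attempts` times, then use the alternate path."""
    for _ in range(attempts):
        try:
            return primary(context)
        except RuntimeError:
            continue  # treated as a transient failure: retry
    return fallback(context)


def always_down(ctx: dict) -> str:
    raise RuntimeError("receiver unavailable")


def human_queue(ctx: dict) -> str:
    return f"queued {ctx['ticket_id']} for human review"


context = minimal_context({"ticket_id": "T-1", "debug_dump": "..."}, {"ticket_id"})
print(context)                                     # {'ticket_id': 'T-1'}
print(hand_off(context, always_down, human_queue)) # queued T-1 for human review
```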
SHARPEN
Governance
- Are evaluation criteria defined quantitatively?
- Are human-approval criteria for high-risk actions explicit?
- Can every decision be traced after the fact?
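A minimal way to make approval criteria explicit and decisions traceable is a gate that records every call. The sketch assumes risk is a score between 0 and 1 and uses 0.7 as an arbitrary example threshold; `DecisionLog` and `gate` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumption: risk is a score in [0, 1]; 0.7 is an arbitrary example threshold.
APPROVAL_THRESHOLD = 0.7


@dataclass
class DecisionLog:
    entries: list[dict] = field(default_factory=list)

    def record(self, action: str, risk: float, decision: str) -> None:
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "risk": risk,
            "decision": decision,
        })


def gate(action: str, risk: float, log: DecisionLog) -> str:
    """Route high-risk actions to a human and record every decision."""
    decision = "needs_human_approval" if risk >= APPROVAL_THRESHOLD else "auto_approved"
    log.record(action, risk, decision)
    return decision


log = DecisionLog()
print(gate("delete customer account", 0.9, log))  # needs_human_approval
print(gate("send reminder email", 0.1, log))      # auto_approved
print(len(log.entries))                           # 2
```

Both the approved and the blocked action land in the log, so the approval rule itself can be reviewed against what actually happened.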
Agentic Debt — the debt that accrues when speed outpaces structure
Technical debt accrues at low speed. Agentic debt accrues at high speed. The faster AI builds, the faster a structureless product becomes a black box.
OWN
Authority Sprawl
As agents multiply, who holds which authority becomes untraceable. Agent identities, access scopes, and execution rights expand tacitly, producing security incidents and accountability gaps at the same time.
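Authority sprawl can be made visible by comparing what each agent was declared to need against what it has actually been granted. The agent names, permission strings, and `find_sprawl` below are hypothetical, and the sketch assumes both views are available as plain dictionaries.

```python
def find_sprawl(declared: dict[str, set[str]],
                granted: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per agent, permissions that are granted but were never declared."""
    sprawl = {}
    for agent, permissions in granted.items():
        extra = permissions - declared.get(agent, set())
        if extra:
            sprawl[agent] = extra
    return sprawl


declared = {"billing-agent": {"read:invoices"}}
granted = {
    "billing-agent": {"read:invoices", "write:refunds"},
    "temp-agent": {"read:customers"},  # no declaration at all
}
print(find_sprawl(declared, granted))
```

An agent with no declaration at all shows up with its full grant set, which is the accountability gap the paragraph above describes.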
CONTRACT
Contract Gap
Modules and agents run without documented input/output, refusal conditions, or failure modes. Side effects of a module swap or prompt change become unpredictable, and there is no basis to define evaluation criteria.
LAYER
Observability Gap
Reasoning paths, decision rationale, and handoff reasons go unrecorded. When something fails, the cause cannot be pinpointed and the whole system becomes a black box.
SHARPEN
Validation Gap
Evaluation criteria and guardrails are missing or exist only as ritual. Quality variance widens, dangerous behavior is found only after the fact, and the feedback loop for improvement is broken.
Evolution Path
1. Single-Agent Start
Start AI automation in a small scope and accumulate baseline contracts and logs.
Transition signals
- One agent juggles several roles and you can no longer tell which role caused a failure.
- Prompt changes spill into unrelated capabilities.
- You have enough per-module logs (call count, cost, success/failure) to track each module independently.
Core deliverables
- Per-module draft input/output contracts
- Baseline execution logs (call count, cost, success/failure)
- List of points where roles conflict
Enterprise: Security baseline (agent identity, least privilege, baseline audit logging)
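The baseline execution logs named in this stage are enough to compute per-module figures such as average cost per call and failure rate. The log shape (`module`, `cost`, `ok`) and the function name `summarize_calls` are assumptions made for this sketch.

```python
from collections import defaultdict


def summarize_calls(log: list[dict]) -> dict[str, dict[str, float]]:
    """Aggregate per-module call count, average cost per call, and failure rate."""
    totals = defaultdict(lambda: {"calls": 0, "cost": 0.0, "failures": 0})
    for entry in log:
        t = totals[entry["module"]]
        t["calls"] += 1
        t["cost"] += entry["cost"]
        t["failures"] += 0 if entry["ok"] else 1
    return {
        module: {
            "calls": t["calls"],
            "avg_cost": t["cost"] / t["calls"],
            "failure_rate": t["failures"] / t["calls"],
        }
        for module, t in totals.items()
    }


log = [
    {"module": "summarize", "cost": 0.02, "ok": True},
    {"module": "summarize", "cost": 0.04, "ok": False},
    {"module": "classify", "cost": 0.01, "ok": True},
]
stats = summarize_calls(log)
print(stats["summarize"]["calls"], stats["summarize"]["failure_rate"])  # 2 0.5
```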
2. Responsibility Separation
Separate conflicting roles such as planning, execution, review, and deployment to sharpen responsibility boundaries.
Transition signals
- Handoffs between separated agents repeatedly drop or duplicate context.
- The work requires parallelism that a single agent cannot provide.
- Each agent reaches a state where it can be evaluated and improved independently.
Core deliverables
- Per-agent responsibility statements
- Handoff rules and context-passing schemas
- Per-agent independent evaluation criteria
Enterprise: Compliance boundaries (per-role data access scope, PII handling policy)
3. Multi-Agent Collaboration
Define collaboration rules and information flow to stabilize the product flow.
Transition signals
- Collaboration flow is stable but quality variance is wide or cost is unpredictable.
- Agents start acting beyond their authority, or situations that require human approval keep recurring.
- Scale demands explicit enforcement of governance rules.
Core deliverables
- Stabilized collaboration flow diagrams
- Context-routing rules
- State/memory separation policy
Enterprise: Integration with existing systems (SSO/OIDC, existing approval workflow integration, API gateway)
4. Governance by Design
Bake evaluation, approval, cost control, and policy enforcement into the product's baseline structure.
Transition signals
- The OCLS loop runs as a standing operating cadence; boundaries and contracts update on a regular schedule.
- Governance metrics (quality score, cost, policy-violation rate) are tracked continuously.
Core deliverables
- Automated evaluation and guardrail pipelines
- Approval classification matrix and escalation rules
- OCLS-based recurring review process
Enterprise: Automated audit (RBAC, audit-trail automation, recurring governance reviews)
Operating Loops
Design Loop
Lock responsibility boundaries first — role design, contract definition, failure-scenario specification.
Run Loop
Collect execution logs, quality scores, and cost signals to identify bottlenecks and unnecessary collaboration.
Governance Loop
Adjust evaluation criteria, approval policies, and exception handling to scale the system safely.