ARCHITECTURE

Architecture

Four layers stack to complete the structure: execution unit, owner, collaboration rules, and governance. Each layer answers a question the previous one cannot.


Premises of the AI-Native Product

The reopt architecture rests on the premise that AI-native products differ fundamentally from traditional software.

P.01

Non-deterministic execution

The same input can produce different outputs. The design discipline is therefore not to guarantee outputs but to declare judgment conditions and refusal boundaries.

P.02

Duality of ownership

The owner of an outcome can be a human or an AI. Human ownership allows contextual judgment; AI ownership requires every condition to be made explicit. This distinction determines how strict the governance must be.

P.03

Balance of autonomy and control

AI's value comes from autonomous judgment, but unchecked autonomy produces agentic debt. To raise autonomy, you must first define the structure.

P.04

Speed is the default, structure is the choice

In an era where AI can ship an MVP in a day, speed is not the differentiator. Only structure separates a sustainable product from a one-off demo.

P.05

Structure is governance

Adding governance as a separate process invites teams to bypass it. Governance must come from the product's own structure to operate naturally.

GOAL

Goals of this architecture

  • Keep ownership of every outcome explainable at any scale.
  • Declare judgment conditions and failure paths as contracts, so that structure itself becomes governance.
  • Use the OCLS loop so that operational data continuously corrects the structure.
  • Let the whole team (product, engineering, operations) communicate in the same structural vocabulary.

NON-GOAL

Non-goals

  • Does not prescribe the use of any particular framework or library.
  • Does not cover prompt-engineering technique.
  • Does not recommend model selection or fine-tuning strategy.
  • Does not require changes to organizational structure or HR systems.

The Trap of Speed Without Structure

AI can ship an MVP in a day. But three months later, when nobody can explain why the product decided what it decided, speed has turned into poison.

Prompts alone do not become structure

No matter how carefully you tune the prompt, it still cannot answer who owns the outcome or what happens when it fails.

The faster you ship, the faster it becomes a black box

A demo takes a day. But when cost, quality, approval, and accountability tangle, within three months no one can explain the product.

There is no path from demo to production

The first demo is easy. Without a structural path to team-scale operation and iterative improvement, the project stops at the demo.

Why four layers

Each layer answers a question the previous one cannot. Module alone cannot answer "who owns this outcome," so Agent is required. Agent alone cannot answer "how do they collaborate," so Collaboration rules are required. Collaboration alone cannot answer "is the product evolving safely," so Governance is required.

4-LAYER · STACK VIEW [ BOTTOM → TOP, INCREASING ABSTRACTION ]

  • L4 Governance: Is the system evolving safely?
  • L3 Collaboration: How do multiple agents collaborate?
  • L2 Agent: Who owns this outcome?
  • L1 Module: What is the I/O contract?

Each layer resolves a question the previous one cannot.

Module

  • Responsibility: The product's minimum execution unit. Performs one capability and returns a result.
  • Contract: Declares input conditions, output format, authority scope, and refusal conditions as a contract.
  • Operations: Tracks call count, cost, output quality, and failure reasons.
  • Observed metrics (examples): Average cost per call, failure rate, average response time, contract-violation count.
  • Protocol reference: MCP Tool Definition (input/output schemas play the same role as the MCP tool schema).

Agent

  • Responsibility: The actor that owns and judges outcomes. Uses one or more modules to reach a goal.
  • Contract: Declares goal, authority scope, delegation policy, and termination condition.
  • Operations: Records judgment rationale, execution steps, and approval events.
  • Observed metrics (examples): Goal-completion rate, average handle time, collaboration frequency, escalation rate.
  • Protocol reference: A2A Agent Card (declares the agent's goal, authority, and termination condition).

Collaboration

  • Responsibility: Defines role allocation, information transfer, and flow coordination between actors.
  • Contract: Defines collaboration rules, information-transfer scope, and cost-attribution rules.
  • Operations: Monitors bottlenecks, wait time, rework, and collaboration failures.
  • Observed metrics (examples): Collaboration-failure rate, average wait time, rework frequency.
  • Protocol reference: A2A Task Lifecycle (collaboration flow, state transitions, information transfer).

Governance

  • Responsibility: Enforces evaluation, approval, cost control, and policy so the product evolves safely.
  • Contract: Provides guardrails, approval policy, evaluation criteria, and correction criteria.
  • Operations: Monitors quality variance, policy violations, and human-intervention points.
  • Observed metrics (examples): Guardrail-block count, approval wait time, quality-score distribution, policy-violation rate.
  • Protocol reference: Infrastructure-level guardrails (OPA, RBAC), a policy-enforcement layer above protocol.

The point of the four layers is not to slice the product more finely. It is to make explainable who owns the outcome, who absorbs the failure, and how improvement iterates.
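
The example metrics listed for the Module layer can be derived from a plain call log. The following is a minimal sketch, not a prescribed schema: `CallRecord`, its field names, and `module_metrics` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallRecord:
    """One module invocation. All field names are hypothetical."""
    module: str
    cost_usd: float
    latency_ms: float
    failure_reason: Optional[str] = None  # None means the call succeeded
    contract_violation: bool = False


def module_metrics(records: list[CallRecord]) -> dict[str, float]:
    """Aggregate the example Module-layer metrics over one module's records."""
    if not records:
        raise ValueError("no call records to aggregate")
    n = len(records)
    return {
        "calls": n,
        "avg_cost_usd": sum(r.cost_usd for r in records) / n,
        "failure_rate": sum(r.failure_reason is not None for r in records) / n,
        "avg_latency_ms": sum(r.latency_ms for r in records) / n,
        "contract_violations": sum(r.contract_violation for r in records),
    }


# Illustrative data only.
log = [
    CallRecord("summarize", 0.02, 800.0),
    CallRecord("summarize", 0.04, 1200.0, failure_reason="timeout"),
    CallRecord("summarize", 0.03, 1000.0, contract_violation=True),
    CallRecord("summarize", 0.03, 1000.0),
]
metrics = module_metrics(log)
```

Keeping the failure reason on each record, rather than a bare success flag, is what lets the same log later answer which role or module caused a failure.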

Design Principles

If the principles below stop holding, reopt architecture regresses into yet another bundle of feature-driven automation.

Own Every Outcome

Design around responsibility units, not features

An agent is not a function caller — it is a continuously explainable owner of outcomes.

Contract First

A module is not a callable tool — it is an execution unit with a contract

Only when a module has a contract does the structure support evaluation, replacement, authority control, and testing.

Layer, Then Scale

Scale is possible only when governed through classification and structure

As agents multiply, structuring them into categories, layers, and boundaries is what keeps governance intact and scale sustainable.

Sharpen in Operation

Architecture is a system that evolves in operation

Responsibility boundaries and policies must keep adjusting from failure logs and evaluation outcomes.

Thinking Vocabulary

Four thinking tools. Each concept is a lens for design decisions and corresponds to a phase of the OCLS governance loop.

OWN

Own Every Outcome

Assign an owner to every outcome, and name whether that owner is a human or an AI. Human ownership leaves room for contextual judgment; AI ownership requires every condition to be declared as a contract. The type of owner determines how strict the governance must be.

Before building, assign an owner to every outcome and mark whether it's a human or an AI.

Is the owner of this outcome a human or an AI? And when something goes wrong, who is the escalation target?
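
One way to make the OWN question checkable is an outcome registry that is scanned for gaps before anything is built. This is a sketch under stated assumptions: the `Outcome` fields, the example names, and the two gap rules (every outcome needs an escalation target; an AI-owned outcome needs a declared contract) are an illustrative reading of the text above, not a fixed specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OwnerType(Enum):
    HUMAN = "human"
    AI = "ai"


@dataclass(frozen=True)
class Outcome:
    """A named outcome and its owner. Field names are hypothetical."""
    name: str
    owner: str
    owner_type: OwnerType
    escalation_target: Optional[str] = None
    contract_declared: bool = False


def ownership_gaps(outcomes: list[Outcome]) -> list[str]:
    """List outcomes that cannot yet answer the OWN question."""
    gaps = []
    for o in outcomes:
        if o.escalation_target is None:
            gaps.append(f"{o.name}: no escalation target")
        # AI ownership requires explicit conditions, so a contract is mandatory.
        if o.owner_type is OwnerType.AI and not o.contract_declared:
            gaps.append(f"{o.name}: AI-owned but no contract declared")
    return gaps


# Illustrative registry only.
outcomes = [
    Outcome("refund_decision", "refund-agent", OwnerType.AI,
            escalation_target="support-lead", contract_declared=True),
    Outcome("ticket_summary", "summary-agent", OwnerType.AI),
    Outcome("pricing_change", "product-manager", OwnerType.HUMAN,
            escalation_target="head-of-product"),
]
gaps = ownership_gaps(outcomes)
```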

CONTRACT

Contract First

Declaring input, output, authority, and refusal conditions before implementation makes evaluation, replacement, and control possible. The more an AI owns the outcome, the stricter and more explicit the contract must be. The contract is the language of governance.

If you can write the contract, you've understood the responsibility.

What input must this module refuse? When you can answer, the contract is complete.
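
A contract of this kind can be written down before any model is wired in. The sketch below is one possible shape, assuming a text-in, text-out module: `ModuleContract`, `run_with_contract`, the refusal rules, and the 200-character output limit are hypothetical, and `fake_summarizer` stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


class Refused(Exception):
    """The input fell inside a declared refusal condition."""


class ContractViolation(Exception):
    """The output did not meet the declared output format."""


@dataclass(frozen=True)
class ModuleContract:
    name: str
    # Declared for reviewers; enforcing authority is outside this sketch.
    authority_scope: frozenset[str]
    # Maps a human-readable reason to a predicate over the input.
    refusal_conditions: dict[str, Callable[[str], bool]]
    output_ok: Callable[[str], bool]


def run_with_contract(contract: ModuleContract,
                      impl: Callable[[str], str],
                      text: str) -> str:
    """Check refusal conditions, call the implementation, validate the output."""
    for reason, matches in contract.refusal_conditions.items():
        if matches(text):
            raise Refused(f"{contract.name}: {reason}")
    result = impl(text)
    if not contract.output_ok(result):
        raise ContractViolation(f"{contract.name}: output format not met")
    return result


summarize_contract = ModuleContract(
    name="summarize",
    authority_scope=frozenset({"read:ticket"}),
    refusal_conditions={
        "empty input": lambda t: not t.strip(),
        "input over 2000 characters": lambda t: len(t) > 2000,
    },
    output_ok=lambda out: 0 < len(out) <= 200,
)


def fake_summarizer(text: str) -> str:
    """Stand-in for a model call: returns the first 50 characters."""
    return text[:50]
```

Because the wrapper raises distinct exceptions for refusal and for a bad output, a contract violation is detectable at runtime and the implementation behind `impl` can be swapped without touching the contract.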

LAYER

Layer, Then Scale

Even as agents multiply, structuring them into categories, layers, and boundaries preserves governance. Layering is the scaling strategy. Multi-agent scaling research from Google and MIT (2026, as reported by InfoQ) showed empirically that task-dependency structure is the deciding factor — for tasks with heavy tool coordination, multi-agent overhead actually degrades performance, and the optimal collaboration strategy varies by task.

Before creating a new agent, decide where it sits within the existing categories.

Which of the four layers does this agent operate in, and what is its relationship to the existing agents?

SHARPEN

Sharpen in Operation

Tuning boundaries from operational data lets governance evolve with reality. Anthropic's "Demystifying Evals for AI Agents" frames eval-driven development: defining evaluation first surfaces requirement ambiguity before implementation. The moment a capability eval graduates to a regression eval is the signal that the boundary has stabilized.

Start with provisional boundaries and let operational data drive splits, merges, and re-categorization.

Do recent operational signals call for boundary adjustment? Handoff failures, accountability gaps, and cost anomalies are those signals.
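
The three signals named above can be checked mechanically against per-boundary operational statistics. The following sketch assumes a weekly roll-up per boundary; `WeeklyStats`, its fields, and both thresholds are hypothetical values chosen for illustration, and a real team would tune them from its own data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeeklyStats:
    """Operational roll-up for one boundary. Field names are hypothetical."""
    boundary: str
    handoffs: int
    handoff_failures: int
    unowned_incidents: int      # incidents where no owner could be named
    cost_usd: float
    baseline_cost_usd: float


# Hypothetical thresholds, not recommendations.
MAX_HANDOFF_FAILURE_RATE = 0.05
MAX_COST_RATIO = 1.5


def sharpen_signals(s: WeeklyStats) -> list[str]:
    """Return which boundary-adjustment signals fired for this boundary."""
    signals = []
    if s.handoffs and s.handoff_failures / s.handoffs > MAX_HANDOFF_FAILURE_RATE:
        signals.append("handoff_failure")
    if s.unowned_incidents > 0:
        signals.append("accountability_gap")
    if s.baseline_cost_usd > 0 and s.cost_usd / s.baseline_cost_usd > MAX_COST_RATIO:
        signals.append("cost_anomaly")
    return signals


stats = WeeklyStats("planner->executor", handoffs=200, handoff_failures=18,
                    unowned_incidents=0, cost_usd=90.0, baseline_cost_usd=100.0)
signals = sharpen_signals(stats)
```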

OCLS Loop

Own → Contract → Layer → Sharpen. A cyclical model for governance design. Each pass sharpens the governance boundaries.

Contract First and Sharpen in Operation are not in conflict. Contracts are provisional. Operational data updates the contract. Signals surfaced in SHARPEN (handoff failure, accountability gaps, cost anomalies) trigger the next OWN phase, where existing contracts and boundaries are revisited.
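
The cycle described above can be sketched as a small state machine. The phase order comes from the text; the rule that SHARPEN keeps observing until at least one signal fires is an assumption made for this illustration, and `next_phase` is a hypothetical helper.

```python
from enum import Enum


class Phase(Enum):
    OWN = 1
    CONTRACT = 2
    LAYER = 3
    SHARPEN = 4


def next_phase(phase: Phase, signals: list[str]) -> Phase:
    """Advance the OCLS loop by one step.

    OWN, CONTRACT, and LAYER always advance. SHARPEN loops back to OWN only
    when operational signals were surfaced; otherwise it stays in SHARPEN
    (an assumption of this sketch).
    """
    if phase is Phase.SHARPEN:
        return Phase.OWN if signals else Phase.SHARPEN
    order = list(Phase)  # Enum iteration follows definition order
    return order[order.index(phase) + 1]
```

For example, `next_phase(Phase.SHARPEN, ["cost_anomaly"])` returns `Phase.OWN`, which is the point where existing contracts and boundaries are revisited.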

OCLS · CYCLE VIEW [ 4 STAGES · ∞ LOOP ]

  • 01 OWN: Who owns this outcome?
  • 02 CONTRACT: What are the governance conditions?
  • 03 LAYER: Which category and layer does it fit?
  • 04 SHARPEN: Where do operational data call for adjustment?

OWN → CONTRACT → LAYER → SHARPEN → ↻

OCLS in the Development Lifecycle

The traditional SDLC does not handle the non-deterministic execution and governance requirements of agents. The OCLS loop operates as the governance layer of the Agentic SDLC (ASDLC). OWN + CONTRACT run inside the Design Loop, LAYER inside the Run Loop, and SHARPEN inside the Governance Loop. Key distinction: ASDLC declares not only "what the agent must do" but "what it must never do" — refusal conditions and guardrails — at design time.

Design Review Checklist

Applying the One Question to each layer turns it into a concrete validation question. Use this checklist in design reviews, sprint kickoffs, and incident retrospectives.

CONTRACT

Module

  • Are this module's input, output, and failure conditions explicitly defined?
  • What input must this module refuse?
  • Can a contract violation be detected at runtime?

OWN

Agent

  • Can this agent's responsibility scope be stated in one sentence?
  • Are its goal, authority scope, and termination condition declared?
  • When this agent fails, who is the escalation target?

LAYER

Collaboration

  • Is the context passed at handoff the minimum required?
  • Are inter-agent collaboration rules (order, conditions, recovery paths) explicit?
  • Is a retry or alternate path defined for handoff failure?
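
The three Collaboration questions above can be exercised in a few lines. This is a minimal sketch, assuming the receiving agent declares the context keys it needs and signals failure with `RuntimeError`; `REQUIRED_CONTEXT`, `minimal_context`, and `hand_off` are hypothetical names, and the retry count is arbitrary.

```python
from typing import Callable

# Hypothetical: the keys the receiving agent has declared it needs.
REQUIRED_CONTEXT = ("ticket_id", "customer_tier", "issue_summary")


def minimal_context(full_context: dict) -> dict:
    """Pass only the declared keys; fail loudly if any are missing."""
    missing = [k for k in REQUIRED_CONTEXT if k not in full_context]
    if missing:
        raise KeyError(f"handoff context missing: {missing}")
    return {k: full_context[k] for k in REQUIRED_CONTEXT}


def hand_off(context: dict,
             primary: Callable[[dict], str],
             fallback: Callable[[dict], str],
             max_attempts: int = 2) -> str:
    """Try the primary receiver up to max_attempts times, then the alternate path."""
    payload = minimal_context(context)
    for _ in range(max_attempts):
        try:
            return primary(payload)
        except RuntimeError:
            continue  # retry the primary receiver
    return fallback(payload)
```

In this sketch the alternate path might be a human queue, for example `hand_off(ctx, executor_agent, lambda p: "queued for human")`, where `executor_agent` is whatever callable represents the receiving agent.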

SHARPEN

Governance

  • Are evaluation criteria defined quantitatively?
  • Are human-approval criteria for high-risk actions explicit?
  • Can every decision be traced after the fact?
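
Explicit approval criteria and after-the-fact traceability can share one code path: every decision is logged, and high-risk actions are held until a human approver is named. The sketch below is illustrative only; the `HIGH_RISK_ACTIONS` set, the log entry fields, and the status strings are all hypothetical.

```python
from typing import Optional

# Hypothetical classification of actions that need human approval.
HIGH_RISK_ACTIONS = {"issue_refund", "delete_account"}


def decide(log: list[dict],
           agent: str,
           action: str,
           rationale: str,
           approved_by: Optional[str] = None) -> str:
    """Record the decision and return its status.

    High-risk actions without an approver are held as "pending_approval";
    everything else is marked "executed". Every call appends one log entry,
    so the decision can be traced later.
    """
    if action in HIGH_RISK_ACTIONS and approved_by is None:
        status = "pending_approval"
    else:
        status = "executed"
    log.append({
        "agent": agent,
        "action": action,
        "rationale": rationale,
        "approved_by": approved_by,
        "status": status,
    })
    return status


audit_log: list[dict] = []
first = decide(audit_log, "refund-agent", "issue_refund", "duplicate charge")
second = decide(audit_log, "refund-agent", "issue_refund", "duplicate charge",
                approved_by="support-lead")
third = decide(audit_log, "reply-agent", "send_reply", "standard answer")
```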

Agentic Debt — the debt that accrues when speed outpaces structure

Technical debt accrued at low speed. Agentic debt accrues at high speed. The faster AI builds, the faster a structureless product becomes a black box.

OWN

Authority Sprawl

As agents multiply, who holds which authority becomes untraceable. Agent identities, access scopes, and execution rights expand tacitly, producing security incidents and accountability gaps at the same time.
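
Authority sprawl becomes visible once declared scopes are compared with what is actually granted. The following is a sketch with hypothetical agent names and permission strings; in practice the two mappings would come from agent declarations and from the access-control system.

```python
def authority_sprawl(declared: dict[str, set[str]],
                     granted: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per agent, the permissions granted beyond its declared scope.

    An agent with grants but no declaration at all is reported with every
    permission it holds.
    """
    report = {}
    for agent, perms in granted.items():
        extra = perms - declared.get(agent, set())
        if extra:
            report[agent] = extra
    return report


# Illustrative data only.
declared = {
    "summary-agent": {"read:ticket"},
    "refund-agent": {"read:ticket", "write:refund"},
}
granted = {
    "summary-agent": {"read:ticket", "write:refund"},
    "refund-agent": {"read:ticket", "write:refund"},
    "legacy-bot": {"read:billing"},
}
sprawl = authority_sprawl(declared, granted)
```

Here `summary-agent` holds a refund permission it never declared, and `legacy-bot` has access with no declared scope at all; both are the tacit expansion described above.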

CONTRACT

Contract Gap

Modules and agents run without documented input/output, refusal conditions, or failure modes. Side effects of a module swap or prompt change become unpredictable, and there is no basis to define evaluation criteria.

LAYER

Observability Gap

Reasoning paths, decision rationale, and handoff reasons go unrecorded. When something fails, the cause cannot be pinpointed and the whole system becomes a black box.

SHARPEN

Validation Gap

Evaluation criteria and guardrails are missing or exist only as ritual. Quality variance widens, dangerous behavior is found only after the fact, and the feedback loop for improvement is broken.

Evolution Path

1. Single-Agent Start

Start AI automation in a small scope and accumulate baseline contracts and logs.

Transition signals

  • One agent juggles several roles and you can no longer tell which role caused a failure.
  • Prompt changes spill into unrelated capabilities.
  • You have enough per-module logs (call count, cost, success/failure) to track them independently.

Core deliverables

  • Per-module draft input/output contracts
  • Baseline execution logs (call count, cost, success/failure)
  • List of points where roles conflict

Enterprise: Security baseline (agent identity, least privilege, baseline audit logging)

2. Responsibility Separation

Separate conflicting roles such as planning, execution, review, and deployment to sharpen responsibility boundaries.

Transition signals

  • Handoffs between separated agents repeatedly drop or duplicate context.
  • Parallel work is required that a single agent cannot perform.
  • Each agent reaches a state where it can be evaluated and improved independently.

Core deliverables

  • Per-agent responsibility statements
  • Handoff rules and context-passing schemas
  • Per-agent independent evaluation criteria

Enterprise: Compliance boundaries (per-role data access scope, PII handling policy)

3. Multi-Agent Collaboration

Define collaboration rules and information flow to stabilize the product flow.

Transition signals

  • Collaboration flow is stable but quality variance is wide or cost is unpredictable.
  • Agents start acting beyond their authority, or human-approval situations recur.
  • Scale demands explicit enforcement of governance rules.

Core deliverables

  • Stabilized collaboration flow diagrams
  • Context-routing rules
  • State/memory separation policy

Enterprise: Integration with existing systems (SSO/OIDC, existing approval workflow integration, API gateway)

4. Governance by Design

Bake evaluation, approval, cost control, and policy enforcement into the product's baseline structure.

Transition signals

  • The OCLS loop runs as a standing operating cadence; boundaries and contracts update on a regular schedule.
  • Governance metrics (quality score, cost, policy-violation rate) are tracked steadily.

Core deliverables

  • Automated evaluation and guardrail pipelines
  • Approval classification matrix and escalation rules
  • OCLS-based recurring review process

Enterprise: Automated audit (RBAC, audit-trail automation, recurring governance reviews)

Operating Loops

Design Loop

Lock responsibility boundaries first — role design, contract definition, failure-scenario specification.

Run Loop

Collect execution logs, quality scores, and cost signals to identify bottlenecks and unnecessary collaboration.

Governance Loop

Adjust evaluation criteria, approval policies, and exception handling to scale the system safely.