Implementing Reopt Agentic Governance: from assessment to system
Case Study
Eric Han
How a production SaaS running eight AI agents used reopt architecture's principles to diagnose its governance gaps and close them with universal audit logging, persistent approval records, MCP tracing and rate limiting, agent constraints, and an admin dashboard.
Background — assess first, then implement
Reopt is a B2B SaaS platform that manages brand, content, and customer operations in a single workspace. It is built on Next.js 16 and React 19 in a Turborepo monorepo of 35 packages and four apps. Eight AI agents handle document generation, brand strategy, customer analysis, page building, and workflow automation.
Diagnosing the existing system through reopt architecture's eight patterns made four debt types stand out cleanly. Audit logging existed only on BrandDefinition, so domain-level consistency was missing (Observability Gap); approval decisions lived only in UI state and were never persisted (Contract Gap); MCP handlers had neither usage tracking nor rate limiting (Authority Sprawl); and agent execution had no structural limits (Validation Gap).
This article walks through the implementation that built a governance system on top of those diagnostic results.
Responsibility Partitioning — Agent Registry + Versioning
Reopt's eight agents each carry their own responsibility scope and register themselves with a global registry. What the governance work added was version management and execution constraints. Every agent now carries a semver version and limits its execution scope through a constraints field.
type AgentDefinition = {
  id: string;
  displayName: string;
  description: string;
  version?: string; // semver, e.g. "1.0.0"
  buildSystemPrompt: (ctx: AgentContext) => string | Promise<string>;
  createTools: (params: { context: AgentContext; dataStream: ... }) => ToolSet;
  activeToolNames: (ctx: AgentContext) => string[];
  defaultModel: string;
  allowedModels?: string[] | null;
  uiFeatures: AgentUIFeatures;
  constraints?: {
    maxToolCallsPerTurn?: number; // upper bound on tool calls per turn (default 10)
    maxTokensPerSession?: number; // upper bound on tokens per session
    maxSteps?: number; // maximum execution steps (default 5)
  };
};
AgentDefinition — the agent contract extended with version and execution constraints
AgentContext now carries agentId, so agent identity threads through the entire call chain from tool execution to audit logging. Across route handler → agent selection → context creation → tool factory → audit log, the agentId is never dropped.
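The article does not show how the constraints field is enforced at runtime. The following is a minimal sketch of one way it could work, assuming a hypothetical resolveConstraints helper and TurnBudget class; only the field names and the defaults (10 tool calls per turn, 5 steps) come from the type above.

```typescript
// Hypothetical enforcement sketch; only the constraint fields and their
// documented defaults are taken from AgentDefinition.
type AgentConstraints = {
  maxToolCallsPerTurn?: number;
  maxTokensPerSession?: number;
  maxSteps?: number;
};

type ResolvedConstraints = {
  maxToolCallsPerTurn: number;
  maxTokensPerSession: number; // Infinity when the agent declares no cap (assumption)
  maxSteps: number;
};

// Fill in the documented defaults for any constraint the agent leaves unset.
function resolveConstraints(c: AgentConstraints = {}): ResolvedConstraints {
  return {
    maxToolCallsPerTurn: c.maxToolCallsPerTurn ?? 10,
    maxTokensPerSession: c.maxTokensPerSession ?? Number.POSITIVE_INFINITY,
    maxSteps: c.maxSteps ?? 5,
  };
}

// Counts tool calls within a single turn and refuses once the cap is reached.
class TurnBudget {
  private toolCalls = 0;
  private readonly limits: ResolvedConstraints;

  constructor(limits: ResolvedConstraints) {
    this.limits = limits;
  }

  // Returns true and counts the call if it fits the budget; false otherwise.
  tryConsumeToolCall(): boolean {
    if (this.toolCalls >= this.limits.maxToolCallsPerTurn) return false;
    this.toolCalls += 1;
    return true;
  }
}
```

A tool factory could create one TurnBudget per turn and check tryConsumeToolCall() before each tool execution, so the limit lives in the agent definition rather than in each tool.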
Module Contract — Tool Registry expansion
As the Tool Registry grew from 15 to 32 entries, the metadata grew alongside it. On top of the existing input/output declarations, the contract now includes failure modes, whether the tool calls an LLM internally, and a per-hour rate limit.
type ToolDefinition = {
  id: string;
  displayName: string;
  category: ToolCategory;
  action: ToolAction;
  needsApproval: boolean;
  parameters?: ToolParamDefinition[];
  outputFields?: ToolParamDefinition[];
  executeSummary?: string;
  detailUrlPattern?: string;
  // fields added in the governance implementation
  involvesLLM?: boolean; // a tool that calls Claude internally
  rateLimitPerHour?: number; // per-tool hourly call cap
  failureModes?: Array<{
    code: string;
    retryable: boolean;
    userMessage: string;
  }>;
};
ToolDefinition — the contract with failure modes, LLM-call flag, and rate limit
The involvesLLM flag is the key to cost tracking. Tools like requestSuggestions, createDocument, and createPostDraft call Claude internally, so a single tool call is billed twice: once for the agent's own model turn and again for the nested call inside the tool. With this flag, the cost dashboard can identify the LLM-in-LLM pattern.
failureModes moves refusal conditions that used to be scattered through the code into the registry. reopt architecture's judgment question, "under what conditions must this module refuse or fail?", now has a declarative answer.
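To make the declarative contract concrete, here is a small sketch of how a caller might read these two fields. The ToolContract type is a trimmed-down stand-in for ToolDefinition, and the helper names, the QUOTA_EXCEEDED code, and the messages are invented for illustration; they are not from the Reopt codebase.

```typescript
// Trimmed-down stand-in for ToolDefinition; helpers and sample data are hypothetical.
type FailureMode = { code: string; retryable: boolean; userMessage: string };
type ToolContract = { id: string; involvesLLM?: boolean; failureModes?: FailureMode[] };

const sampleRegistry: ToolContract[] = [
  {
    id: "createDocument",
    involvesLLM: true,
    failureModes: [
      { code: "QUOTA_EXCEEDED", retryable: false, userMessage: "The workspace is out of credits." },
    ],
  },
  { id: "updateCustomerTag" }, // flag left unset here purely for the example
];

// Look up the declared failure mode for an error code, falling back to a
// generic non-retryable failure when the tool did not declare that code.
function describeFailure(tool: ToolContract, code: string): FailureMode {
  const declared = tool.failureModes?.find((mode) => mode.code === code);
  return declared ?? { code, retryable: false, userMessage: "The tool failed unexpectedly." };
}

// IDs of tools that make a nested LLM call, e.g. for a cost breakdown.
function llmToolIds(tools: ToolContract[]): string[] {
  return tools.filter((tool) => tool.involvesLLM === true).map((tool) => tool.id);
}
```

Because the lookup works purely from registry data, a UI layer can render userMessage and a retry button without knowing anything about the tool's internals.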
Human Approval — persistent approval records
Previously, an approval decision lived only in the UI message state (approval-pending → approval-responded). Once the chat ended there was no way to trace who had approved what. The AiToolApprovalRecord model fills that gap.
model AiToolApprovalRecord {
  id          String    @id @default(cuid())
  workspaceId String
  chatId      String    // in which conversation
  messageId   String    // on which message
  agentId     String    // which agent
  toolId      String    // requested which tool
  toolArgs    Json      // with which arguments
  status      String    @default("pending") // pending | approved | denied
  resolvedAt  DateTime? // when the decision was made
  createdAt   DateTime  @default(now())

  @@index([workspaceId, status])
  @@index([chatId])
}
AiToolApprovalRecord — the persistent record of an approval decision
When the approval flow detects a tool call that needs approval, a pending record is created; the user's response updates it to approved or denied. With this, questions like "what was last month's rejection rate for the cx agent?" and "which tool is rejected most often?" become answerable.
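The lifecycle and the rejection-rate question can be sketched over plain in-memory records. This is an illustration under assumptions: ApprovalRecord mirrors only a subset of the model's fields, and resolveApproval and denialRate are hypothetical helpers, not functions from the Reopt codebase. A real implementation would run the equivalent query against the database.

```typescript
// In-memory sketch; field names follow AiToolApprovalRecord, helpers are hypothetical.
type ApprovalStatus = "pending" | "approved" | "denied";

type ApprovalRecord = {
  id: string;
  agentId: string;
  toolId: string;
  status: ApprovalStatus;
  createdAt: Date;
  resolvedAt: Date | null;
};

// Move a pending record to its final state; a record is resolved at most once.
function resolveApproval(
  record: ApprovalRecord,
  decision: "approved" | "denied",
  now: Date = new Date(),
): ApprovalRecord {
  if (record.status !== "pending") {
    throw new Error(`Approval ${record.id} is already ${record.status}`);
  }
  return { ...record, status: decision, resolvedAt: now };
}

// Share of resolved approvals for one agent that were denied.
// Returns null when the agent has no resolved approvals yet.
function denialRate(records: ApprovalRecord[], agentId: string): number | null {
  const resolved = records.filter((r) => r.agentId === agentId && r.status !== "pending");
  if (resolved.length === 0) return null;
  const denied = resolved.filter((r) => r.status === "denied").length;
  return denied / resolved.length;
}
```

Pending records are excluded from the denominator so that unanswered requests do not dilute the rate.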
Decision Traceability — universal audit logging
Change history used to be recorded only in the BrandDefinitionChange table. Customer tags, content publishing, and CMS property changes were not tracked. AiToolAuditLog provides consistent audit logging across every data-mutating tool.
model AiToolAuditLog {
  id          String   @id @default(cuid())
  workspaceId String
  agentId     String   // which agent
  toolId      String   // with which tool
  userId      String   // in which user's session
  entityType  String   // which entity
  entityId    String   // with which ID
  action      String   // create | update | delete
  before      Json?    // state before the change
  after       Json?    // state after the change
  approvalId  String?  // reference to the approval record
  durationMs  Int?     // execution duration
  createdAt   DateTime @default(now())

  @@index([workspaceId, createdAt(sort: Desc)])
  @@index([agentId, createdAt(sort: Desc)])
  @@index([entityType, entityId])
}
AiToolAuditLog — universal audit log that threads agent, tool, and entity together
// logToolAction: audit helper that works inside and outside a transaction
export async function logToolAction(
  client: PrismaClient | PrismaTransaction,
  params: {
    workspaceId: string;
    agentId: string;
    toolId: string;
    userId: string;
    entityType: string;
    entityId: string;
    action: "create" | "update" | "delete";
    before?: unknown;
    after?: unknown;
    approvalId?: string;
    durationMs?: number;
  },
) {
  await client.aiToolAuditLog.create({ data: params });
}
logToolAction — an audit helper compatible with Prisma transactions
Two design decisions matter. First, no foreign key to the workspace: audit logs must survive workspace deletion (retention-first). Second, a logging failure must not stop tool execution (fire-and-forget). Audit is observation, not blocking.
It currently applies to six data-mutating tools: updateCustomerTag, createCustomerTask, updatePostTags, updatePostProperties, savePostDraft, publishPost. The before/after snapshot shows exactly what changed, and the approvalId traces each change back to the approval it ran under.
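The helper shown above awaits the write, so the fire-and-forget behaviour has to come from how it is called. A minimal sketch of one way to do that follows, assuming a hypothetical AuditWriter function type in place of the Prisma client; logSafely and runAudited are illustrative names, not part of the Reopt codebase.

```typescript
// Sketch of a fire-and-forget call site; AuditWriter stands in for a function
// such as (entry) => logToolAction(prisma, entry). All helper names are hypothetical.
type AuditEntry = {
  workspaceId: string;
  agentId: string;
  toolId: string;
  userId: string;
  entityType: string;
  entityId: string;
  action: "create" | "update" | "delete";
  before?: unknown;
  after?: unknown;
  approvalId?: string;
  durationMs?: number;
};

type AuditWriter = (entry: AuditEntry) => Promise<void>;

// Write an audit entry without ever rejecting: both synchronous throws and
// rejected promises from the writer are routed to onError.
function logSafely(
  write: AuditWriter,
  entry: AuditEntry,
  onError: (err: unknown) => void = () => {},
): Promise<void> {
  return Promise.resolve()
    .then(() => write(entry))
    .catch(onError);
}

// Run a mutation, then log it in the background with the measured duration.
// The caller gets the mutation result even if the audit write fails.
async function runAudited<T>(
  write: AuditWriter,
  entry: Omit<AuditEntry, "durationMs">,
  mutate: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  const result = await mutate();
  void logSafely(write, { ...entry, durationMs: Date.now() - start });
  return result;
}
```

This pattern fits calls made outside a transaction. Inside a transaction, some databases abort the whole transaction once any statement fails, so swallowing the error in application code may not be enough there.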
MCP Governance — tracing, rate limiting, audit
MCP handlers previously had no usage tracking, rate limiting, or audit logging. External agents (Claude Code and others) acted in a complete black box. Three layers closed this gap.
// 1. McpToolInvocation: call records
model McpToolInvocation {
  id          String   @id @default(cuid())
  workspaceId String
  clientId    String   // which client
  userId      String
  toolName    String   // which tool
  status      String   // success | error
  durationMs  Int?
  createdAt   DateTime @default(now())
  // args and result are intentionally excluded for sensitive-data risk

  @@index([workspaceId, createdAt(sort: Desc)])
  @@index([clientId, createdAt(sort: Desc)])
}
McpToolInvocation — records of external-agent tool calls
// 2. MCP rate limiting — Redis sliding window
const READ_LIMIT = 100; // 100 per minute
const WRITE_LIMIT = 10; // 10 per minute
// Separate read and write for differentiated limits
const WRITE_TOOLS = new Set(["reopt_eav_record_bulk_update"]);
// Fail-open on Redis failure — a rate-limiter outage does not block tools
// 429 responses include the Retry-After header
MCP Rate Limiting — differentiated read/write rate limits
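The constants above only outline the limiter. The sketch below shows the sliding-window and fail-open logic with an in-memory store swapped in for Redis so that it runs on its own. The WindowStore interface, the per-client key, and the 60-second window constant are assumptions made for the example; the article does not say what the real limiter keys on.

```typescript
// In-memory stand-in for the Redis sliding window; store interface and key
// scheme are assumptions, the limits and WRITE_TOOLS come from the article.
const READ_LIMIT = 100; // per minute
const WRITE_LIMIT = 10; // per minute
const WINDOW_MS = 60_000;
const WRITE_TOOLS = new Set(["reopt_eav_record_bulk_update"]);

type Decision = { allowed: true } | { allowed: false; retryAfterSeconds: number };

interface WindowStore {
  recent(key: string, since: number): Promise<number[]>; // hit timestamps newer than `since`
  record(key: string, at: number): Promise<void>;
}

class InMemoryWindowStore implements WindowStore {
  private readonly hitsByKey = new Map<string, number[]>();

  async recent(key: string, since: number): Promise<number[]> {
    const kept = (this.hitsByKey.get(key) ?? []).filter((t) => t > since);
    this.hitsByKey.set(key, kept); // drop hits that left the window
    return kept;
  }

  async record(key: string, at: number): Promise<void> {
    const hits = this.hitsByKey.get(key) ?? [];
    hits.push(at);
    this.hitsByKey.set(key, hits);
  }
}

async function checkRateLimit(
  store: WindowStore,
  clientId: string,
  toolName: string,
  now: number = Date.now(),
): Promise<Decision> {
  const isWrite = WRITE_TOOLS.has(toolName);
  const limit = isWrite ? WRITE_LIMIT : READ_LIMIT;
  const key = `${clientId}:${isWrite ? "write" : "read"}`;
  try {
    const recent = await store.recent(key, now - WINDOW_MS);
    if (recent.length >= limit) {
      // The oldest hit in the window decides when a slot frees up again.
      const oldest = Math.min(...recent);
      const retryAfterSeconds = Math.max(1, Math.ceil((oldest + WINDOW_MS - now) / 1000));
      return { allowed: false, retryAfterSeconds };
    }
    await store.record(key, now);
    return { allowed: true };
  } catch {
    // Fail-open: a store outage must not block tool calls.
    return { allowed: true };
  }
}
```

A denied decision maps naturally onto a 429 response with retryAfterSeconds as the Retry-After header value.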
// 3. withAudit: non-blocking audit wrapper
export function withAudit(handler: McpHandler): McpHandler {
  return async (params) => {
    const start = Date.now();
    try {
      const result = await handler(params);
      // fire-and-forget: audit-log failure does not delay the response;
      // the empty catch keeps a rejected log write from becoming an unhandled rejection
      void logMcpInvocation({ ...params, status: "success", durationMs: Date.now() - start }).catch(() => {});
      return result;
    } catch (err) {
      void logMcpInvocation({ ...params, status: "error", durationMs: Date.now() - start }).catch(() => {});
      throw err;
    }
  };
}
withAudit — a non-blocking wrapper that prevents audit failures from blocking tool execution
The composition order of the three layers is rate-limit → audit → handler. Block excessive calls with the rate limit first, apply audit to the calls that pass, then run the handler. Each layer is independent, and a failure in one layer does not collapse the others.
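The ordering can be illustrated with a small middleware composer. This is a sketch under assumptions: McpParams, Middleware, compose, and the two demo layers are hypothetical names that only record the order in which they run. They stand in for the real rate limiter and withAudit.

```typescript
// Hypothetical composition sketch; the layers only trace execution order.
type McpParams = { clientId: string; toolName: string };
type McpHandler = (params: McpParams) => Promise<unknown>;
type Middleware = (next: McpHandler) => McpHandler;

// compose(a, b)(handler) builds a(b(handler)): the first layer is outermost
// and therefore runs first on every call.
function compose(...layers: Middleware[]): Middleware {
  return (handler) =>
    layers.reduceRight<McpHandler>((next, layer) => layer(next), handler);
}

// Demo layer that records its name, then delegates inward.
function tracingLayer(name: string, trace: string[]): Middleware {
  return (next) => async (params) => {
    trace.push(name);
    return next(params);
  };
}

// Stand-in rate limiter: rejects before any inner layer runs.
function rateLimitLayer(isAllowed: (p: McpParams) => boolean, trace: string[]): Middleware {
  return (next) => async (params) => {
    trace.push("rate-limit");
    if (!isAllowed(params)) throw new Error("429 Too Many Requests");
    return next(params);
  };
}
```

With the rate limiter outermost, a rejected call never reaches the audit layer, which matches the description that audit applies only to the calls that pass.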
Governance Dashboard — operational visibility
Audit logs and MCP traces are not governance if you cannot view them. Two dashboards were added to the admin app.
- AI Audit page — audit logs and approval records separated into tabs. Filter by agentId, toolId, status, and date. Expandable before/after JSON view. 50 entries per page.
- MCP Usage page — summary cards for total calls, success rate, and average response time. A chart of the top 20 tools by call frequency. Recent call logs. 7 / 30 / 90 day range selector.
Agent Usage Statistics rolls up daily totals (session count, tokens, credits) from the AiCreditLedger and Chat tables. This is the foundation for tracking cost per agent.
These dashboards enable the SHARPEN phase of the OCLS loop. To adjust boundaries from operational data, you must first be able to see operational data.
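The MCP Usage summary cards reduce to a few aggregates over invocation rows. The sketch below computes them in memory; the Invocation type mirrors a subset of McpToolInvocation, while summarizeUsage and its return shape are assumptions made for the example. The dashboard's actual queries are not shown in the article.

```typescript
// In-memory aggregation sketch; summarizeUsage and UsageSummary are hypothetical.
type Invocation = {
  toolName: string;
  status: "success" | "error";
  durationMs: number | null;
};

type UsageSummary = {
  totalCalls: number;
  successRate: number | null; // null when there are no calls
  avgDurationMs: number | null; // null when no call recorded a duration
  topTools: Array<{ toolName: string; calls: number }>;
};

function summarizeUsage(invocations: Invocation[], topN = 20): UsageSummary {
  const totalCalls = invocations.length;
  const successes = invocations.filter((i) => i.status === "success").length;

  // Average only over rows that actually recorded a duration.
  const durations = invocations
    .map((i) => i.durationMs)
    .filter((d): d is number => d !== null);
  const avgDurationMs =
    durations.length === 0 ? null : durations.reduce((sum, d) => sum + d, 0) / durations.length;

  // Count calls per tool, then keep the most frequently called ones.
  const counts = new Map<string, number>();
  for (const i of invocations) {
    counts.set(i.toolName, (counts.get(i.toolName) ?? 0) + 1);
  }
  const topTools = Array.from(counts.entries())
    .map(([toolName, calls]) => ({ toolName, calls }))
    .sort((a, b) => b.calls - a.calls || a.toolName.localeCompare(b.toolName))
    .slice(0, topN);

  return {
    totalCalls,
    successRate: totalCalls === 0 ? null : successes / totalCalls,
    avgDurationMs,
    topTools,
  };
}
```

Filtering the input rows by createdAt before calling the function would give the 7, 30, and 90 day views.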
Remaining work
The governance implementation closed much of the Observability Gap and Contract Gap, but work remains.
- No risk-tier classification — every approval-target tool requires the same level of approval. A low/medium/high matrix is needed.
- No Cross-Agent Collaboration — agent-to-agent handoffs and context transfer are not supported. The transition to Stage 3 (Multi-Agent Collaboration) is still ahead.
- No rationale — audit logs record what changed, not why the decision was made.
- The other data-mutating tools — only 6 of 32 registered tools have audit logging applied. Expansion to the CUD tools in the Canvas and Document domains is needed.
On reopt architecture's evolution scale, Reopt is at the transition point between Stage 2 (Responsibility Split) and Stage 3 (Multi-Agent Collaboration). With the governance infrastructure laid down, the foundation now exists to design inter-agent collaboration rules.
Conclusion — structure isn't finished in one pass
Reopt's governance implementation proceeded in two steps. First, the existing system was diagnosed with reopt architecture's patterns to identify the gaps. Second, those gaps were filled at the infrastructure level. The universal audit log (AiToolAuditLog), persistent approval records (AiToolApprovalRecord), MCP tracing and rate limiting (McpToolInvocation + Redis rate limiter), agent constraints, and the admin dashboard are the result.
Three core design principles drove it: retention-first (logs outlive the original data), fire-and-forget (observation does not block execution), fail-open (an infrastructure failure does not collapse the feature). These principles came from the production-SaaS reality that governance must not sacrifice performance or stability.
As reopt architecture's OCLS loop says, governance is not finished in one pass; it improves iteratively against operational data. This implementation completed the first loop. The data collected in the SHARPEN phase will trigger the next OWN phase — risk-tier classification, cross-agent collaboration, and rationale recording.