Abstract
AI agents are climbing a permission ladder: from code access, to environment access, to financial access. Each rung changes what the agent can do and what can go wrong. The harness engineering community has built sophisticated systems for coding agents: sandboxes, evaluators, feedback loops, architectural constraints. But financial access introduces failures that coding harnesses were never designed for. Transactions are irreversible; errors cost real money. And the space of valid actions is defined by user intent, which is a much larger space than any set of rules can govern.
More fundamentally, agent spending doesn’t look like human spending. The agent isn’t executing a payment. It’s executing a task that produces payments: high-frequency, small, distributed, non-enumerable. Traditional payment systems and risk models were built for a different structure of spending entirely.
Giving AI money isn’t the problem. Making AI spend according to human intent is.
This post proposes financial harness engineering as a new design discipline: building constraint systems that make AI agents safe to operate with financial permissions. We introduce three core constructs: intent-based control, the mandate-as-sandbox, and the spending validation loop.
Background
The way we engineer AI systems has evolved in layers. Prompt engineering dealt with single-turn interactions: how to express a need clearly. Context engineering extended this to multi-turn planning: ensuring the AI has an accurate picture of the current situation across a session. Harness engineering encompasses both and goes further. It builds the foundational environment so that agents get the context they need by default, without constant human supervision every time they execute. The effort migrates from the user’s repeated labor into the agent system.
OpenAI’s Codex team and Anthropic’s Labs team both demonstrated what good coding harnesses look like at scale. Three principles emerged: constraints that multiply agent performance, separation of generation and evaluation so agents don’t judge their own work, and feedback loops that close so errors get corrected without human intervention.
These principles work when the worst-case outcome is a broken build, and they rest on a few assumptions: failures are detectable before deployment, reversible after detection, and bounded in cost. Financial systems break all three. A misrouted payment doesn’t throw a compiler error. It might look valid at the protocol level (correct format, valid recipient, sufficient balance) yet be completely wrong in intent. Chargebacks are slow and contested. On-chain settlement is final by design. And the cost of a failure isn’t lost time; it’s the actual amount transacted, plus cascading effects across the system.
There’s a deep misunderstanding about agentic payments. An agent isn’t executing a payment. It’s executing a task that produces payments along the way. The spending pattern is high-frequency, small, distributed, and non-enumerable. Traditional payment systems were designed for the opposite: one known transaction, one known amount, one confirmation. And traditional risk control asks “is this really you?” But when an AI agent is the spender, identity isn’t the question. The agent is authorized. The question is whether what it’s doing is correct given what you asked for. This is the shift from identity-based risk to intent-based risk: from questioning the spender to questioning the underlying intent. Existing harnesses and payment systems weight heavily toward the former.

We are here to address that problem: is the agent spending in alignment with the user’s intent?
Agents evolve by permission, not by parameter count
There’s a misconception in how we talk about AI progress. We say models are getting more capable. True, but something else is also changing the game.
Agents are built on models plus tooling. Advances in what agents can do come from what they’re allowed to access. Give a model filesystem access and it becomes a coding agent. Give it a browser and API keys and it becomes a research agent. Give it a credit card and it becomes a financial agent, something that can act in the real human economy.
This is already happening. The moment you hand a coding agent a credit card, it can subscribe to services, purchase API access, pay for compute, buy access to research papers. The permission unlocked the capability with no model upgrade involved. The credit card turned a coding tool into a general-purpose economic agent.
This is the permission ladder:
- Code access: read and write files, run commands in a sandbox
- Environment access: browse the web, call APIs, manage data
- Financial access: spend money, execute transactions, move value
Each rung doesn’t just add risk. It fundamentally changes the agent’s capability. And at each rung, the harness has to be carefully redesigned.
We have good harnesses for rung one. Rung two is being explored. Rung three, where agents handle real money, is largely unaddressed. We think this is the most important infrastructure problem in AI right now.
The Financial Harness

We’ve been building what we call a financial harness, a system that lets AI agents operate with financial access while keeping that access controlled and auditable.
The core insight: don’t give agents rules. Make them understand intent.
Traditional payment authorization works per-transaction. The user confirms a specific amount to a specific recipient. One transaction, one approval. The authorization object is the payment itself.
Intent-based authorization works differently. The user authorizes a task, not a transaction. “Book me a flight to Tokyo under $2,000” is an authorization to complete a goal with a budget. The agent might make five transactions or fifty in the process. The authorization object is the intent, and the payments are a means of pursuing it.
This is a paradigm change. Users no longer authorize “spending money.” They authorize “completing a task and allowing money to be spent in the process.”
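The difference is easiest to see in code. This is a minimal sketch, not a real payments API: the `Intent` class and its `authorize` method are illustrative names, standing in for whatever object actually holds the task-level authorization. The point is structural: one authorization object, many payments drawn against it.

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    """The authorization object is the intent, not a single payment."""
    objective: str            # what the user asked for
    budget_usd: float         # hard ceiling for the whole task
    spent_usd: float = 0.0
    payments: list = field(default_factory=list)

    def authorize(self, amount: float, memo: str) -> bool:
        """Any number of payments may draw on the same authorization,
        as long as the running total stays under the task budget."""
        if self.spent_usd + amount > self.budget_usd:
            return False
        self.spent_usd += amount
        self.payments.append((memo, amount))
        return True

task = Intent(objective="book a flight to Tokyo", budget_usd=2000)
task.authorize(35.00, "seat-map API access")    # small tool purchase: allowed
task.authorize(1480.00, "round-trip fare")      # the actual booking: allowed
task.authorize(600.00, "second fare")           # rejected: total would exceed $2,000
```

Note that the user never approved any of these three amounts individually; they approved the objective and the ceiling, and the object enforces both.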
In coding harnesses, constraints are rules: “imports follow this dependency order,” “files stay under 500 lines.” These work because the domain is formal and mechanically checkable.
Financial behaviour doesn’t decompose into rules that cleanly. “Don’t spend too much” is meaningless without context. “Only pay approved vendors” requires knowing what “approved” means in this workflow. The space of valid financial actions is defined by what the agent is trying to accomplish, not by a static policy.
So instead of rules, our harness starts with intent. This is the shift from rule-based to intent-based financial control.
Intent as the constraint layer
Intent-based control requires two capabilities working together: expression and understanding.
The user expresses what they want. The system understands that expression well enough to evaluate every downstream action against it. This is the same bidirectional capability that makes agents useful in the first place: comprehending a goal and acting on it, now turned inward as a control mechanism.
When a user says “research and book the best flight to Tokyo under $2,000,” that sentence contains everything the harness needs:
- Objective: book a flight to Tokyo
- Budget boundary: $2,000
- Implied constraints: spending should be travel-related
The harness doesn’t need a predefined vendor list or per-transaction limits. It understands the intent and evaluates actions against it. A $400 airline charge? Consistent. A $400 SaaS subscription? Inconsistent. Blocked.
Rules require someone to anticipate every edge case in advance. Intent requires only that the user can say what they want and the system can understand what they said.
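The evaluation above can be sketched as a function of intent scope and budget rather than of vendor lists. This is a toy, with loud assumptions: a category set stands in for real semantic understanding, and `evaluate`, `TRAVEL_SCOPE`, and the category names are all invented for illustration.

```python
# Assumption: a category set is a crude proxy for "understanding" the
# intent. A production system would evaluate semantic consistency.
TRAVEL_SCOPE = {"airline", "hotel", "rail", "visa_fee"}

def evaluate(amount: float, category: str, *, scope: set, remaining: float) -> str:
    """Judge an action against the intent, not against a vendor list."""
    if category not in scope:
        return "blocked: inconsistent with intent"
    if amount > remaining:
        return "blocked: exceeds budget boundary"
    return "consistent"

# "research and book the best flight to Tokyo under $2,000"
evaluate(400, "airline", scope=TRAVEL_SCOPE, remaining=2000)  # consistent
evaluate(400, "saas", scope=TRAVEL_SCOPE, remaining=2000)     # blocked
```

The same $400 passes or fails depending on what it is for, which is exactly the property per-transaction limits cannot express.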
The mandate as sandbox

Think about how filesystem permissions work. A process asks for access to /data/reports. The OS doesn’t give it the whole filesystem. It scopes the access: read from this directory, write to that one, everything else is invisible. The process operates freely within its granted scope.
The financial equivalent is the mandate. The mandate is the sandbox.
It’s not a rule layered on top of an open system. It’s the boundary of the agent’s financial reality. When a user says “book a flight to Tokyo under $2,000,” the system constructs a mandate:
- A $2,000 budget ceiling, enforced at infrastructure level
- An intent scope limited to travel transactions
- A time window for validity
The agent asked for access to “booking flights.” It got a scoped directory: travel spending, $2,000, this week. Everything outside that directory doesn’t exist from the agent’s perspective.
The mandate isn’t a restriction on top of financial access. The mandate is the only financial access the agent has. Just as a sandbox is the coding agent’s entire filesystem, the mandate is the financial agent’s entire wallet.
Budget boundaries are the hard walls. Intent matching creates the soft boundaries within. The agent operates freely inside and hits resistance the moment it tries to step outside. Not because a rule said “no,” but because the world outside the mandate isn’t accessible.
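As a data structure, a mandate like this can be very small. The sketch below assumes three fields matching the list above (budget ceiling, intent scope, time window); the `Mandate` class and `permits` method are hypothetical names, not a real schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)          # immutable: the agent cannot widen its own sandbox
class Mandate:
    budget_usd: float            # hard wall, enforced at infrastructure level
    scope: frozenset             # soft boundary: categories consistent with the intent
    valid_until: datetime        # time window for validity

    def permits(self, amount: float, category: str, spent: float,
                now: datetime) -> bool:
        """The check is the boundary of the agent's financial world,
        not a veto layered on top of open access."""
        return (now <= self.valid_until
                and category in self.scope
                and spent + amount <= self.budget_usd)

mandate = Mandate(
    budget_usd=2000,
    scope=frozenset({"airline", "hotel", "rail"}),
    valid_until=datetime.now() + timedelta(days=7),
)
```

Freezing the mandate matters: like a sandbox, it is constructed by the harness at authorization time, and nothing the agent does during execution can mutate it.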
Spending validation: the missing feedback loop

In coding, the feedback loop is: build, test, catch error, feed back, fix. The agent iterates until the work passes review. In finance, the equivalent loop is harder because the “mistake” may have already settled. A wire transfer cannot be rescinded. The financial harness needs validation that operates before settlement.
This is the spending validation loop. It operates in three layers, from cheap to expensive:
Layer 1: Rule-based filtering. Fast, deterministic, low cost. Blacklisted categories (gift cards, gambling). Obvious deviations from the mandate scope. Repeated identical transactions that suggest a loop or waste. This catches the obvious errors before anything more expensive runs. Most traditional risk systems stop here. We don’t.
Layer 2: Intent matching. Every transaction carries an intent ID that links it back to the original mandate. The system checks: is this transaction semantically consistent with the stated objective? Is the spending path reasonable given what the agent is trying to accomplish? A $400 airline charge under a travel mandate passes. A $400 SaaS subscription under the same mandate won’t. This is the core of what makes the system different from traditional payment validation. It’s checking semantic meaning.
Layer 3: Model evaluation. When rules can’t decide and intent matching is ambiguous, a model evaluates the agent’s behaviour. This is agent-to-agent supervision. The evaluator has access to the full context: the original intent, the agent’s action history, the current transaction, and the state of the mandate. It makes a judgment call on whether the behaviour is reasonable, hallucinatory, or adversarial.
Each layer also enforces what we call forced semantic alignment. Every transaction must justify itself before execution: which intent does it serve, why is it needed, is it reasonable? An error code alone tells the agent nothing. An accurate, specific message tells it what’s going on in its environment. The validation loop doesn’t just block; it explains. The explanation flows back as corrective feedback, giving the agent enough context to try a better approach.
This creates a closed loop. The agent proposes, the harness validates, feedback refines, the agent iterates. All before any money moves.
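The three layers compose into one short-circuiting check, cheapest first. This is a sketch under stated assumptions: the model layer is stubbed with a trivial repetition heuristic (in the real system it is an evaluator agent with full context), and every name here is illustrative.

```python
BLACKLIST = {"gift_cards", "gambling"}

def layer1_rules(tx):
    """Layer 1: fast, deterministic filtering."""
    if tx["category"] in BLACKLIST:
        return "rule: blacklisted category"
    return None

def layer2_intent(tx, mandate):
    """Layer 2: semantic consistency with the originating mandate."""
    if tx["category"] not in mandate["scope"]:
        return "intent: inconsistent with the stated objective"
    return None

def layer3_model(tx, mandate, history):
    """Layer 3: stand-in for agent-to-agent supervision over full context."""
    if history.count(tx["merchant"]) >= 3:
        return "model: repeated identical charges suggest a loop"
    return None

def validate(tx, mandate, history):
    """Return (approved, explanation). Later, more expensive layers only
    run if earlier ones pass; a rejection always carries an explanation
    that flows back to the agent as corrective feedback."""
    for check in (lambda: layer1_rules(tx),
                  lambda: layer2_intent(tx, mandate),
                  lambda: layer3_model(tx, mandate, history)):
        reason = check()
        if reason:
            return False, reason
    return True, "approved"

travel_mandate = {"scope": {"airline", "hotel", "rail"}}
ok, why = validate({"category": "saas", "merchant": "AcmeCloud"},
                   travel_mandate, history=[])
```

The return shape is the point: `(False, reason)` is not an error code, it is the corrective feedback that closes the loop.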
The Architecture
The financial harness is essentially an operating system for AI spending. The structural mapping:
Intent layer = the API. The user expresses a goal. The system parses it into a structured objective with implicit constraints, translating expression into understanding. This is the interface between human intent and machine execution.
Mandate layer = the filesystem. The sandbox is constructed: budget ceiling, intent scope, time window, authorization boundaries. This is the agent’s financial directory. Everything inside is accessible, everything outside doesn’t exist.
Execution agent = the CPU. The agent operates freely within the mandate: calling APIs, comparing options, preparing transactions. Full autonomy within its scoped world. It explores non-deterministic paths, and the payments it generates are part of that exploration.
Risk control = the kernel. Every proposed transaction passes through the three-layer validation: rules, intent matching, model evaluation. The kernel doesn’t just enforce permissions. It understands the semantics of what’s being requested and whether it aligns with the mandate.
Spending validation. Flagged transactions are classified, explained, and fed back with corrective guidance. The agent knows why it was blocked and what to do differently.
Feedback loop. The agent adjusts based on validation feedback and retries within the mandate. The loop continues until the task completes or the agent escalates to the user.
The wallet is the filesystem. The mandate is the sandbox. Every transaction is a write operation. Every write is validated before it commits. Every rejected write comes with enough context to try a better approach.
What comes next
There’s an honest risk we need to name. This entire system rests on the assumption that AI agents will remain not-fully-trustworthy for a meaningful period. If models become deterministic enough, reliable enough, and the execution environment becomes closed enough, the harness layer compresses. The sandbox becomes unnecessary when the process never misbehaves.
We don’t think that’s happening soon. The trajectory points the other direction: agents are getting more capable, taking on longer tasks, operating in more complex environments. More autonomy demands more guardrails, not fewer. And even if base reliability improves, the attack surface grows with it. A more capable agent that gets prompt-injected is a more dangerous agent.
Better environments produce better agents, and the design space doesn’t shrink as models improve. It moves. Stronger models handle more complex tasks, which means the harness needs to handle more complex scenarios.
Same applies to financial harnesses. As agents handle multi-step transactions, manage portfolios, negotiate contracts, the harness evolves. Intent understanding gets deeper. Risk models get more nuanced. Feedback loops get tighter.
Giving AI money isn’t the problem. Making AI spend according to your intent is the problem. That’s what we’re solving.
The question isn’t whether AI agents will handle money. They will. The question is whether we build the harness first, or clean up the mess after.
We’d rather build the harness.
FAQ
What is financial harness engineering?
Financial harness engineering is a design discipline for building constraint systems that make AI agents safe to operate with financial permissions. It combines three constructs: intent-based control (authorizing tasks instead of transactions), the mandate-as-sandbox (bounded financial reality), and the spending validation loop (three-layer transaction verification).
How is AI agent spending different from human spending?
AI agent spending is high-frequency, small, distributed, and non-enumerable — fundamentally different from traditional single-transaction human payments. An agent isn’t executing a payment; it’s executing a task that produces payments as side effects. Traditional risk control asks "is this really you?" but for agents, the real question is whether the spending aligns with user intent.
What is a financial mandate in the context of AI agents?
A financial mandate is a scoped sandbox that defines an agent’s entire financial reality — including a hard budget ceiling, intent scope, and time window. Unlike rules layered on top of open access, the mandate constitutes the agent’s complete financial world. The agent operates freely within it and encounters resistance when stepping outside.
How does intent-based authorization differ from traditional payment authorization?
Traditional authorization works per-transaction: one payment, one approval. Intent-based authorization works per-task: "Book me a flight to Tokyo under $2,000" authorizes goal completion with financial parameters. The agent might make many transactions in pursuit of the goal, and each is evaluated for semantic consistency with the original intent.
What are the three layers of spending validation?
Layer 1 is rule-based filtering — fast, deterministic checks for blacklisted categories and obvious deviations. Layer 2 is intent matching — checking each transaction for semantic consistency with the originating mandate. Layer 3 is model evaluation — when rules and intent matching are ambiguous, a model evaluates the full context including original intent, action history, and current transaction.