NORNR: Mandates, approvals and evidence for autonomous agents.
Comparison guide
Agent spend governance vs API rate limiting: what's the difference?
Rate limiting caps request volume. Agent spend governance controls which payments move and keeps a human-readable audit trail. They solve different problems and belong in different layers of your agent architecture.
1. What API rate limiting does
API rate limiting is a mechanism enforced by the API provider: OpenAI, Anthropic, Stripe, or any other vendor. It caps how much your account can consume in a given time window, for example requests per minute, tokens per day, or API calls per month. When your agent exceeds the limit, the provider returns an HTTP 429 (Too Many Requests) error and your code must back off and retry.
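The back-off-and-retry behavior described above can be sketched as follows. This is a minimal illustration, not any provider's SDK: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on a 429, and the retry count and delays are assumed values.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider SDK's HTTP 429 exception."""


def call_with_backoff(fn, max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the 429 to the caller
            # The base delay doubles on each attempt; random jitter keeps
            # concurrent agents from retrying in lockstep.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting. Many provider SDKs ship their own retry settings, which are usually preferable to a hand-rolled loop.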
Rate limiting exists to protect provider infrastructure and ensure fair access across all customers. It typically operates at the account level: if you have ten agents running simultaneously under one account, they all share the same rate limit bucket. It does not know which agent made which request, what the purpose of the request was, or whether a human approved it. It is a blunt, provider-enforced instrument that reacts only once a threshold is crossed, not before.
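To make the shared-bucket point concrete, here is a toy model of an account-level limiter. It is an illustrative sketch only: `AccountRateLimiter` and its fixed window of five requests are assumptions, and real providers use more elaborate schemes.

```python
class AccountRateLimiter:
    """Toy fixed-window limiter keyed by account, not by agent (illustrative only)."""

    def __init__(self, max_requests_per_window):
        self.max_requests = max_requests_per_window
        self.used = 0

    def allow(self, agent_id):
        # agent_id is accepted but deliberately ignored: the provider only
        # sees the account, so every agent draws from the same counter.
        if self.used >= self.max_requests:
            return 429
        self.used += 1
        return 200


limiter = AccountRateLimiter(max_requests_per_window=5)
# agent-a consumes the whole window...
statuses_a = [limiter.allow("agent-a") for _ in range(5)]
# ...so agent-b is rejected on its very first request.
status_b = limiter.allow("agent-b")
```

In this model one busy agent can starve the others, and nothing in the limiter's state records which agent used the quota.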
Handling rate limits correctly keeps your agents from failing on 429 errors and causing downstream failures. Rate limiting itself is not a governance tool. It does not record who spent what, on what, and why. It does not route spend to a human reviewer. It does not distinguish between an agent that has explicit authorization and one that has gone rogue.
2. What spend governance does
Agent spend governance is a control layer you implement inside your own code, before any money moves. Unlike rate limiting, which is enforced externally after the fact, spend governance runs proactively: your agent checks a policy engine before it initiates any paid action, receives a decision, and only proceeds if the decision is approved.
A spend governance system tracks budget at the agent level, not the account level. Each agent has its own wallet with its own daily limit, per-call threshold, and counterparty allowlist. A human reviewer can see every decision — approved, queued, or rejected — in a centralized audit trail. When a request exceeds the automatic-approval threshold, it is queued for human review rather than silently allowed through or silently blocked.
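A per-agent wallet of this kind might be modeled as below. This is a sketch under stated assumptions, not NORNR's implementation: the `WalletPolicy` and `Wallet` classes, the field names, and the order of checks are all hypothetical. In particular, rejecting a request that would exceed the daily limit (rather than queuing it) is an assumed policy choice.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class WalletPolicy:
    daily_limit: Decimal
    auto_approve_threshold: Decimal
    allowed_counterparties: frozenset


@dataclass
class Wallet:
    agent_id: str
    policy: WalletPolicy
    spent_today: Decimal = Decimal("0")

    def evaluate(self, amount, counterparty):
        """Return 'approved', 'queued', or 'rejected' for a spend intent."""
        if counterparty not in self.policy.allowed_counterparties:
            return "rejected"  # vendor is not on this wallet's allowlist
        if self.spent_today + amount > self.policy.daily_limit:
            return "rejected"  # assumed choice: over-budget requests are refused
        if amount > self.policy.auto_approve_threshold:
            return "queued"  # held for human review; no money moves yet
        self.spent_today += amount
        return "approved"
```

Because each `Wallet` carries its own `spent_today`, one agent exhausting its budget has no effect on another agent's limits.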
Spend governance provides accountability. It answers the question "which agent spent how much, on what vendor, for what purpose, and who approved it?" Rate limiting answers only "how many requests were made from this account?" The two serve fundamentally different purposes and operate on fundamentally different data.
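The accountability question above maps naturally onto a structured decision record. The sketch below is hypothetical: the `Decision` fields and the `approved_spend_by_agent` helper are illustrative names, not a documented NORNR schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Decision:
    agent_id: str
    amount: Decimal
    counterparty: str
    purpose: str
    outcome: str  # "approved", "queued", or "rejected"
    approved_by: Optional[str] = None  # human reviewer, if one signed off


def approved_spend_by_agent(log):
    """Sum approved amounts per agent; queued and rejected entries are skipped."""
    totals = defaultdict(Decimal)
    for decision in log:
        if decision.outcome == "approved":
            totals[decision.agent_id] += decision.amount
    return dict(totals)
```

A provider's usage log cannot answer this kind of query, because it has no notion of agent, purpose, or approver.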
3. Key differences
| Dimension | API rate limiting | Agent spend governance |
|---|---|---|
| Enforced by | Provider (external) | You (inside agent code) |
| Timing | After the limit is hit (reactive) | Before the action executes (proactive) |
| Granularity | Account-level, all agents share one limit | Per-agent, per-wallet, per-call |
| Human approval | None | Yes — queued decisions require human sign-off |
| Audit trail | Provider usage logs only | Full decision log: agent, amount, counterparty, purpose, outcome |
| Counterparty control | None | Yes — allowlist per wallet |
| Policy language | Requests per minute (integer) | Budget, threshold, counterparty, mandate |
| Failure mode | 429 error — retry | rejected or queued — stop or escalate |
4. Why both matter and how they coexist
Rate limiting and spend governance are not alternatives. They are complementary controls that belong in different layers of your agent architecture. Rate limit handling belongs at the SDK retry layer: configure exponential backoff in your OpenAI or Anthropic client so 429 errors are handled cleanly without crashing the agent. Spend governance belongs at the business logic layer: call wallet.pay() before each paid action so every spend intent is evaluated, recorded, and routed appropriately.
An agent that has spend governance but no rate limit handling will crash on 429 errors from a busy provider. An agent that has rate limit handling but no spend governance will spend freely within the provider's limits, with no per-agent accountability and no human escalation path. You need both.
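The layering can be sketched as a single wrapper: the governance check runs first, and only approved calls reach the retry loop. Everything here is an assumption for illustration. `StubWallet` is a hypothetical stand-in, the arguments and return value of `pay()` are guesses rather than the real wallet API, and `RateLimitError` stands in for a provider SDK's 429 exception.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider SDK's HTTP 429 exception."""


class StubWallet:
    """Hypothetical wallet; the real wallet.pay() signature may differ."""

    def __init__(self, auto_approve_threshold):
        self.auto_approve_threshold = auto_approve_threshold

    def pay(self, amount, counterparty, purpose):
        return "approved" if amount <= self.auto_approve_threshold else "queued"


def governed_call(wallet, amount, counterparty, purpose, provider_call,
                  max_retries=3, base_delay=0.5, sleep=time.sleep):
    # Business logic layer: ask for a decision before any money moves.
    decision = wallet.pay(amount, counterparty, purpose)
    if decision != "approved":
        return decision, None  # stop or escalate; the provider is never called

    # Retry layer: approved calls go to the provider, and 429s back off.
    for attempt in range(max_retries + 1):
        try:
            return decision, provider_call()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
```

Keeping the two concerns in separate steps means a queued or rejected decision never consumes provider quota, and a 429 never bypasses the policy check.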
5. When to use which
- Use rate limit handling when: your agent is hitting 429 errors from a provider; you need to stay within provider-enforced quotas; you are optimizing throughput across multiple concurrent agents.
- Use spend governance when: you need per-agent budget caps; you need a human approval path for large or unusual spend; you need an audit trail for compliance or accountability; you want to restrict which vendors an agent can pay.
- Use both when: you operate autonomous agents in production, which is nearly always the right answer. Rate limiting prevents infrastructure failures; spend governance prevents financial and reputational risk.
6. NORNR's approach
NORNR implements the spend governance layer. It provides a wallet API that your agent code calls before any paid action. The wallet evaluates the request against your policy and returns a decision in milliseconds. Every decision — approved, queued, or rejected — is written to an immutable audit trail that you can review in the NORNR control room or export for compliance purposes.
NORNR does not replace your rate limit handling. It sits above it in the stack. Your SDK handles 429s; NORNR handles governance. Together they give you both the operational resilience of proper retry logic and the business controls of a governed spend layer that keeps humans informed and in control.