// AI Agent Rate Control Protocol
A plain-text file convention for defining rate limits and cost controls in AI agent projects. Define token throughput ceilings, API call rates, spend limits, and automatic slow-down behaviour — before your agent hits a hard wall.
THROTTLE.md is a plain-text Markdown file you place in the root of any repository that contains an AI agent. It defines the rate limits and cost controls your agent must respect — and what to do when it approaches them.
AI agents consume tokens, make API calls, write files, and spend money — at whatever rate the underlying model and tools allow. Without explicit rate controls, a busy agent can exhaust a daily budget in minutes, hammer a rate-limited API until it's blocked, or overwhelm a database with concurrent writes.
Drop THROTTLE.md in your repo root and define: token and API call rate ceilings, hourly and daily cost limits, concurrency caps, and the behaviour at each threshold — warn at 80%, slow at 95%, pause at 100% and hand off to ESCALATE.md. The agent reads it on startup. Your compliance team reads it in the audit.
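A minimal sketch of what such a file can look like. The field names and values below are illustrative assumptions for this page, not the published template:

```markdown
# THROTTLE.md

## Limits
- tokens_per_minute: 50000
- api_calls_per_minute: 60
- max_concurrent_tasks: 4
- cost_per_hour_usd: 5.00
- cost_per_day_usd: 40.00

## Thresholds
- warn_at: 80      <!-- log and reduce rate by 25% -->
- throttle_at: 95  <!-- cut rate by 50%, notify operator -->
- pause_at: 100    <!-- stop new tasks, hand off to ESCALATE.md -->
```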
Enterprise AI governance frameworks require documented resource controls. The EU AI Act (effective 2 August 2026) mandates resource consumption reporting and control mechanisms for high-risk AI systems. Gartner's AI Agent Report identifies governance and resource control as critical deployment requirements. THROTTLE.md gives you the documented controls and the audit trail.
Copy the template from GitHub and place it in your project root:
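The template URL isn't reproduced here, so the snippet below writes a minimal illustrative stand-in instead; swap in the real template from the spec's GitHub repository:

```shell
# Stand-in for the published template (field names are assumptions).
# Replace this heredoc with the actual template file from the spec repo.
cat > THROTTLE.md <<'EOF'
# THROTTLE.md
- tokens_per_minute: 50000
- api_calls_per_minute: 60
- cost_per_day_usd: 40.00
EOF
test -f THROTTLE.md && echo "THROTTLE.md created"
```

Commit the file alongside your code so the limits are version-controlled with the agent they govern.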
Before THROTTLE.md, rate control rules were scattered: hardcoded in the system prompt, buried in config files, missing entirely, or documented in a Notion page no one reads. THROTTLE.md makes rate controls version-controlled, auditable, and co-located with your code.
The AI agent reads it on startup. Your engineer reads it during code review. Your compliance team reads it during audits. Your regulator reads it if something goes wrong. One file serves all four audiences.
THROTTLE.md is one file in a complete open specification for AI agent safety. Each file addresses a different level of intervention.
A plain-text Markdown file defining rate limits and cost controls for AI agents. It sets ceilings on token throughput, API call rates, concurrent tasks, and spend per hour and per day. When an agent approaches a limit, it slows automatically. When it hits a limit, it pauses and hands off to the escalation protocol.
API rate limits are enforced externally by the service provider — they cut your agent off without warning. THROTTLE.md is your own proactive control layer. It slows the agent gracefully before an external limit is hit, preserves queued work, and notifies you before things go wrong rather than after.
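One common way to implement this kind of proactive, graceful slowdown is a client-side token bucket, which makes the agent wait briefly instead of bursting into an external limit. The sketch below is illustrative, not part of the spec:

```python
import time

class TokenBucket:
    """Client-side rate limiter: tokens refill at a steady rate, so the
    agent paces itself before an external provider limit is ever reached.
    Illustrative sketch; class and method names are assumptions."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def acquire(self, cost: float = 1.0) -> float:
        """Block until `cost` tokens are available; return seconds waited."""
        waited = 0.0
        while True:
            now = time.monotonic()
            # Refill based on elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= cost:
                self.tokens -= cost
                return waited
            # Sleep just long enough for the shortfall to refill.
            shortfall = (cost - self.tokens) / self.rate
            time.sleep(shortfall)
            waited += shortfall
```

Calls that fit within the configured rate return immediately; calls that exceed it are delayed rather than rejected, which is exactly the "slow gracefully, don't drop" behaviour described above.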
With queue enabled (the default), tasks are buffered — not dropped. The agent processes them at the reduced rate. Priority tasks (human responses, safety checks) skip the queue entirely. Tasks older than the configured timeout are dropped and logged.
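A minimal sketch of that buffered-queue behaviour, using hypothetical `Task` and `ThrottleQueue` names (the spec defines the policy; this code is one possible enforcement):

```python
import heapq
import time
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    priority: int                          # 0 = priority lane (human responses, safety checks)
    enqueued_at: float = field(compare=False)
    name: str = field(compare=False)

class ThrottleQueue:
    """Buffer used while throttled: priority tasks jump ahead, and tasks
    older than `timeout_s` are dropped and logged. Illustrative only."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self._heap = []
        self.dropped = []                  # names of timed-out tasks, for the log

    def put(self, task: Task):
        heapq.heappush(self._heap, task)   # ordered by priority

    def get(self):
        while self._heap:
            task = heapq.heappop(self._heap)
            if time.monotonic() - task.enqueued_at > self.timeout_s:
                self.dropped.append(task.name)   # stale: drop and log
                continue
            return task
        return None                        # queue drained
```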
The spec supports priority task lists that bypass queue restrictions, and its limit fields cover distinct resource types (tokens, API calls, file writes, database queries, cost), so you can tune each one independently per project.
What happens at each threshold:
- Warning (default 80%): the agent logs the event and reduces its rate by 25%, but continues.
- Throttle (default 95%): the agent cuts its rate by 50% and notifies the operator.
- Limit breach (100%): the agent pauses all new tasks and hands off to ESCALATE.md for human intervention.
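The three default thresholds can be sketched as a simple lookup; function name and return values here are illustrative:

```python
def throttle_action(usage: float, limit: float,
                    warn_at: float = 0.80, throttle_at: float = 0.95):
    """Map current resource usage to the spec's default thresholds.
    Returns (state, rate_multiplier). Illustrative sketch."""
    ratio = usage / limit
    if ratio >= 1.0:
        return ("pause", 0.0)        # stop new tasks, hand off to ESCALATE.md
    if ratio >= throttle_at:
        return ("throttle", 0.50)    # cut rate by 50%, notify operator
    if ratio >= warn_at:
        return ("warn", 0.75)        # reduce rate by 25%, keep going
    return ("ok", 1.0)
```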
The spec is framework-agnostic: it defines the policy, and your agent implementation enforces it. It works with LangChain, AutoGen, CrewAI, Claude Code, custom agents, or any AI system that can read its own configuration files.
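Because enforcement lives in your implementation, the read-on-startup step can be a few lines. This sketch assumes a simple `- key: value` list format (an assumption; the published template may differ):

```python
import re
from pathlib import Path

def load_limits(path: str = "THROTTLE.md") -> dict:
    """Parse numeric `- key: value` entries from a THROTTLE.md file.
    The line format here is an assumption for illustration."""
    limits = {}
    for line in Path(path).read_text().splitlines():
        m = re.match(r"^\s*-?\s*([a-z_]+)\s*:\s*([0-9.]+)", line)
        if m:
            limits[m.group(1)] = float(m.group(2))
    return limits
```

An agent would call `load_limits()` once on startup and feed the values into whatever rate-limiting mechanism its framework provides.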
This domain is available for acquisition. It is the canonical home of the THROTTLE.md specification — the rate control layer of the AI agent safety stack, essential for any production AI deployment.
Inquire about acquisition, or email directly: info@throttle.md