ATP
Agent Tool Protocol

Why AI Agents Need to Write & Execute Code, Not Call Tools

The evolution from tool calling to code execution. Built by the team that scaled monday.com's MCP gateway to hundreds of developers.

The Journey: From MCP Gateway Builder to Protocol Designer

After building monday.com's internal MCP gateway, which now serves hundreds of developers and hundreds of tools, I started noticing something weird: every time I talked to friends at other companies, they were building the exact same thing.

Turns out, MCP was designed as a server protocol. monday.com exposes their MCP server, Slack exposes theirs, GitHub exposes theirs. But actually using them together? You need a gateway that aggregates them all. So everyone was building their own gateway, over and over.

Then the real scaling problems showed up:

  • Keeping stateful connections stable across multiple providers is complex and resource-heavy
  • Managing hundreds of tools requires smarter discovery and ranking, or your context and latency explode
  • Ensuring security and auditability means layering in access control, external guardrails, logging, and credential management

That's when I read Cloudflare's Code Mode article, and it blew my mind. Everything clicked: AI agents shouldn't call pre-defined tools - they should write code.

So I built Agent Tool Protocol (ATP).

The Real Problem: OpenAPI Was Built for Code, Not 'Tools'

OpenAPI was designed for developers writing code. Look at any REST API - it's generic on purpose. A /users endpoint returns 50 fields because different use cases need different data. Perfect when a developer writes:

const users = await fetch('/users').then(r => r.json());
const activeEmails = users
  .filter(u => u.status === 'active')
  .map(u => u.email);

But try wrapping that /users endpoint as an MCP "tool"? Disaster:

❌ Too generic: Returns 50 fields, bloats context with data you don't need
❌ Too specific: When you need just one more field, tough luck
❌ No composition: Want to filter + map + reduce? That means multiple tool calls with LLM roundtrips in between (compare the code sketch below)
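
For contrast, here is what composition looks like when the agent writes code instead (api.users.list is a hypothetical ATP-style binding over the /users endpoint above, used purely for illustration):

// Filter + map + reduce in one pass - no LLM roundtrips, no chained tool calls.
const users = await api.users.list();               // hypothetical binding for GET /users
const activeCountByDomain = users
  .filter(u => u.status === 'active')
  .map(u => u.email.split('@')[1])
  .reduce((acc, domain) => {
    acc[domain] = (acc[domain] ?? 0) + 1;
    return acc;
  }, {});
return activeCountByDomain;                         // e.g. { 'company.com': 37, ... }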

Why OpenAPI is Perfect for This

OpenAPI isn't some new experimental spec. It's the industry standard for REST APIs:

  • Battle-tested: Created in 2011, evolved for 14+ years
  • Widely adopted: Google, Microsoft, Stripe, GitHub, AWS, Twilio - basically everyone
  • Built-in output schemas: Every endpoint defines its response structure with JSON Schema, so agents know exactly what they'll get back (unlike MCP's unstructured content) - see the fragment below
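
To make "built-in output schemas" concrete, here is a minimal, hypothetical OpenAPI fragment (written as a JS object) for the /users endpoint from earlier. The 200 response declares its JSON Schema, so the output shape is known before the call is ever made:

// Minimal, hypothetical OpenAPI fragment for GET /users.
// The 200 response carries a JSON Schema describing exactly what comes back.
const usersSpecFragment = {
  paths: {
    '/users': {
      get: {
        responses: {
          '200': {
            description: 'List of users',
            content: {
              'application/json': {
                schema: {
                  type: 'array',
                  items: {
                    type: 'object',
                    properties: {
                      id: { type: 'string' },
                      email: { type: 'string', format: 'email' },
                      status: { type: 'string', enum: ['active', 'inactive'] }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
};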

Five Problems I Hit at monday.com

1. The Strawberry Problem: Agents Can't Build Their Own Solutions

There's a ritual of testing every new model with the strawberry test: ask it to count the r's in "strawberry" and watch it fail. Not because the model is dumb, but because LLMs work on tokens - they're simply not built for counting characters.

But imagine your agent could write code. It could just run this:

const count = "strawberry".split('').filter(c => c === 'r').length; // 3

✓ ATP lets agents write code. Problem solved.

2. Everyone's Building the Same Gateway

MCP is a server protocol - each service exposes their own:

  • monday.com has monday-mcp
  • GitHub has github-mcp
  • Slack has slack-mcp

Want to use them together? Build a gateway that aggregates them all, handles auth, manages stdio processes, provides unified discovery.

✓ ATP has aggregation built-in. One server, all your APIs.

3. MCP Discovery Happens Too Early

Here's the fatal flaw with MCP's design: list_tools runs before the agent even starts reasoning.

1. Agent connects to MCP server
2. Server runs list_tools → returns ALL tools with descriptions
3. ALL tool descriptions get loaded into the prompt
4. Agent finally starts reasoning with bloated context

The fundamental issue: MCP requires you to decide which tools the agent can see before the agent tells you what it's trying to do. It's backwards.

// ATP's approach:
const tools = await fetch('/api/search', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: "send email" })
}).then(r => r.json());
// Gets back: [{ name: "email.send", signature: "...", description: "..." }]
// Only relevant tools, only when needed

4. Security & Control: Code Execution with Guardrails

Here's the real problem with MCP adoption: community servers run locally with stdio.

# Install community MCP server
npx awesome-notion-mcp
# You just gave it: file system, network, env vars, process execution
❌ Full file system access (read your SSH keys, API tokens, etc.)
❌ Network access (exfiltrate your data)
❌ Environment variables (all your secrets)
❌ Process execution (install malware)

✓ ATP provides multiple layers of protection:

  • Isolated sandbox: Code runs in isolated-vm with no file/network/process access
  • API annotations: Mark APIs as destructive, sensitive, or safe
  • Black/whitelist controls: Explicitly allow or block specific APIs
  • Human approval: Destructive APIs automatically trigger approval before execution
  • Rules file: Define custom guidelines for when agents should pause or ask for approval

Controlled Execution

Unlike exposing all_monday_api blindly, ATP lets you annotate each endpoint. Create/read operations can run automatically, while delete/update operations require approval. You get the power of code execution with the control of granular tool permissions.
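
A minimal sketch of what that could look like in the server config (the annotations option, risk labels, and spec URL below are assumptions for illustration, not ATP's final API):

// Hypothetical per-endpoint annotations: 'safe' operations run automatically,
// while 'destructive' ones pause for human approval before executing.
import { createServer } from '@agent-tool-protocol/server';
import { openapi } from '@agent-tool-protocol/openapi';

const server = createServer();

server.use(
  openapi.fromSpec('monday', {
    url: 'https://api.monday.com/openapi.json',   // assumed spec URL
    auth: { scheme: 'bearer', envVar: 'MONDAY_TOKEN' },
    annotations: {
      'items.read':   { risk: 'safe' },
      'items.create': { risk: 'safe' },
      'items.update': { risk: 'destructive' },    // requires approval
      'items.delete': { risk: 'destructive' }     // requires approval
    }
  })
);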

5. Multi-Level Agents Need Nested LLM Calls

With atp.llm.call(), you can turn any agent into a supervisor with multiple sub-agents:

// Main agent gets task: "Process these 1000 customer feedback messages"

const feedback = await api.feedback.list({ limit: 1000 });

// Spawn sub-agents to process in parallel
const results = await Promise.all(
  feedback.map(item =>
    atp.llm.extract({
      prompt: `Extract key issues from: ${item.message}`,
      schema: { category: 'string', severity: 'number', issue: 'string' }
    })
  )
);

// Only aggregated results go back to main context
return { totalIssues: results.length, byCategory: groupBy(results, 'category') }; // groupBy: assume lodash or a small local helper

Why this matters

  • Main agent context stays clean (no 1000 messages bloating it)
  • Parallel LLM calls with Promise.all (up to 1000x faster than running them sequentially)
  • Sub-agents can have different context/prompts
  • Multi-level reasoning (main agent strategizes, sub-agents execute)
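
The example above uses atp.llm.extract for structured output; atp.llm.call, mentioned at the start of this section, is the free-form counterpart. A hedged sketch of how it might be used (the exact signature and the api.tickets endpoint are assumptions):

// Delegate a classification step to a sub-agent and get plain text back,
// without pulling the raw feedback into the main agent's context.
const feedback = await api.feedback.list({ limit: 100 });

for (const item of feedback) {
  const verdict = await atp.llm.call({
    prompt: `Does this feedback describe a billing problem? Answer yes or no.\n\n${item.message}`
  });

  if (verdict.trim().toLowerCase().startsWith('yes')) {
    await api.tickets.create({ type: 'billing', sourceId: item.id }); // hypothetical endpoint
  }
}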

Real-World Benchmarks: Where ATP Dominates

I tested ATP against MCP using real tasks from our monday.com gateway. Here's what happened:

Note: All ATP code shown below was dynamically written by the LLM during execution - not pre-defined templates or scripts.

1. Email Filtering: 200 Emails by Assignee

Filter 200 support emails to find all assigned to 'sarah@company.com'

MCP Approach

Tool calls: 1
LLM calls: 1
Context tokens: ~85,000 tokens
Cost: $0.85
Time: 18 seconds

ATP Approach

Tool calls: 1
LLM calls: 0
Context tokens: ~1,500 tokens
Cost: $0.015
Time: 1.2 seconds

57x cheaper, 15x faster, 98% less context

const emails = await api.email.list({ limit: 200 });
const sarahEmails = emails.filter(e => e.assignee === 'sarah@company.com');
return sarahEmails; // Only 12 emails

2. Article Summary Pipeline: Fetch → Summarize → Send

Get last 20 tech articles, summarize each, post summary to Slack

MCP Approach

Tool calls: 22
LLM calls: 21
Context tokens: ~45,000 tokens
Cost: $2.10
Time: 4 minutes

ATP Approach

Tool calls: 2
LLM calls: 20
Context tokens: ~2,800 tokens
Cost: $0.28
Time: 12 seconds

7.5x cheaper, 20x faster, 94% less context

const articles = await api.news.fetchLatest({ limit: 20 });

// Parallel summarization (multi-level agents)
const summaries = await Promise.all(
  articles.map(article =>
    atp.llm.extract({
      prompt: `Summarize: ${article.content}`,
      schema: { title: 'string', summary: 'string', keyPoints: 'string[]' }
    })
  )
);

await api.slack.postMessage({
  channel: '#tech-news',
  text: summaries.map(s => `*${s.title}*: ${s.summary}`).join('\n\n')
});

3. Customer Data Enrichment: 150 Records

Fetch 150 customer records, enrich each with Clearbit, update CRM

MCP Approach

Tool calls: 301
LLM calls: 150
Context tokens: ~95,000 tokens
Cost: $9.50
Time: 25 minutes

ATP Approach

Tool calls: 152
LLM calls: 0
Context tokens: ~1,200 tokens
Cost: $0.12
Time: 8 seconds

79x cheaper, 188x faster, 99% less context

const customers = await api.crm.getCustomers({ limit: 150 });

// Parallel enrichment
const enriched = await Promise.all(
  customers.map(async customer => {
    const companyData = await api.clearbit.enrich({ domain: customer.domain });
    return { ...customer, ...companyData };
  })
);

await api.crm.batchUpdate(enriched);

4. Multi-Source Data Aggregation

Get sales from Stripe, support tickets from Zendesk, user activity from Mixpanel → Create weekly report

MCP Approach

Tool calls: 4
LLM calls: 1
Context tokens: ~48,000 tokens
Cost: $2.40
Time: 22 seconds

ATP Approach

Tool calls: 4
LLM calls: 1
Context tokens: ~3,200 tokens
Cost: $0.32
Time: 4 seconds

7.5x cheaper, 5.5x faster, 93% less context

// Fetch in parallel
const [sales, tickets, activity] = await Promise.all([
  api.stripe.getSales({ period: 'week' }),
  api.zendesk.getTickets({ period: 'week', status: 'closed' }),
  api.mixpanel.getEvents({ period: 'week', event: 'feature_used' })
]);

// Sub-agent analyzes (keeps main context clean)
const insights = await atp.llm.extract({
  prompt: `Analyze: Sales ${sales.total}, Tickets ${tickets.length}, Active users ${activity.uniqueUsers}`,
  schema: {
    salesTrend: 'string',
    topIssues: 'string[]',
    userEngagement: 'string',
    recommendations: 'string[]'
  }
});

const report = {
  period: 'Last 7 days',
  metrics: { sales: sales.total, tickets: tickets.length, users: activity.uniqueUsers },
  insights
};

await api.email.send({ 
  to: 'team@company.com', 
  subject: 'Weekly Report', 
  body: JSON.stringify(report) 
});

5. GitHub PR Review Automation

Find all open PRs, check if tests pass, summarize changes, post review comments

MCP Approach

Tool calls: 181
LLM calls: 45
Context tokens: ~125,000 tokens
Cost: $12.50
Time: 8 minutes

ATP Approach

Tool calls: 60
LLM calls: 20
Context tokens: ~2,100 tokens
Cost: $0.21
Time: 15 seconds

60x cheaper, 32x faster, 98% less context

const prs = await api.github.listPRs({ state: 'open', repo: 'company/product' });

// Filter PRs with passing tests (no LLM needed)
const passingPRs = prs.filter(pr => pr.checks === 'passing');

// Parallel analysis
const reviews = await Promise.all(
  passingPRs.map(async pr => {
    const diff = await api.github.getDiff({ prNumber: pr.number });
    
    // Sub-agent analyzes
    const analysis = await atp.llm.extract({
      prompt: `Review this PR: ${diff}`,
      schema: { summary: 'string', concerns: 'string[]', lgtm: 'boolean' }
    });
    
    // Post comment if needed
    if (analysis.concerns.length > 0) {
      await api.github.postComment({
        prNumber: pr.number,
        body: `Automated review:\n${analysis.concerns.join('\n')}`
      });
    }
    
    return analysis;
  })
);

return { 
  reviewed: reviews.length, 
  prsNeedingAttention: reviews.filter(r => !r.lgtm).length 
};

Summary: ATP vs MCP

Scenario         | MCP Cost | ATP Cost | MCP Time | ATP Time | Savings
Email Filtering  | $0.85    | $0.015   | 18s      | 1.2s     | 57x cheaper, 15x faster
Article Pipeline | $2.10    | $0.28    | 4min     | 12s      | 7.5x cheaper, 20x faster
Data Enrichment  | $9.50    | $0.12    | 25min    | 8s       | 79x cheaper, 188x faster
Data Aggregation | $2.40    | $0.32    | 22s      | 4s       | 7.5x cheaper, 5.5x faster
PR Review        | $12.50   | $0.21    | 8min     | 15s      | 60x cheaper, 32x faster
Average          | $5.47    | $0.19    | 7.5min   | 8s       | 29x cheaper, 56x faster

Why ATP dominates:

  • Code handles filtering/logic → No LLM needed for simple operations
  • Parallel execution → Promise.all runs operations simultaneously
  • Multi-level agents → Sub-agents keep main context clean
  • No context pollution → Raw data never enters main context, only results
  • Real composition → Chain operations in code, not through tool calls

Getting Started: Stop Building Gateways

import { createServer } from '@agent-tool-protocol/server';
import { openapi } from '@agent-tool-protocol/openapi';
// mcpConnector (used below) needs an import as well; the package name here is an assumption
import { mcpConnector } from '@agent-tool-protocol/mcp';

const server = createServer();

server.use(
  // Official APIs with built-in output schemas
  openapi.fromSpec('stripe', {
    url: 'https://api.stripe.com/openapi.json',
    auth: { scheme: 'bearer', envVar: 'STRIPE_KEY' }
  }),
  
  openapi.fromSpec('github', {
    url: 'https://api.github.com/openapi.json',
    auth: { scheme: 'bearer', envVar: 'GITHUB_TOKEN' }
  }),
  
  // Your existing MCPs (if you have them)
  mcpConnector.connect('monday', { command: 'monday-stdio-mcp' })
);

await server.listen(3000);

One server. No gateway building. Official APIs. Safe execution.
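
On the agent side, the flow is: search for the operations relevant to the task, let the model write code against them, then run that code in the sandbox. A rough sketch (the /api/execute endpoint and its payload shape are assumptions, not the published ATP API; /api/search mirrors the discovery example earlier):

// 1. Discover only the relevant operations (as in the discovery example above).
const tools = await fetch('http://localhost:3000/api/search', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'create github issue' })
}).then(r => r.json());

// 2. The LLM writes code against those signatures; the server runs it in the
//    isolated sandbox. Endpoint name and payload shape are assumptions.
const result = await fetch('http://localhost:3000/api/execute', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    code: `
      const issue = await api.github.createIssue({
        repo: 'company/product',
        title: 'Flaky test in CI',
        body: 'Filed automatically by the triage agent.'
      });
      return issue.url;
    `
  })
}).then(r => r.json());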

Why This Matters

OpenAPI was designed for code

It's battle-tested (14+ years), widely adopted (every major API), has built-in output schemas, and expects composition.

MCP discovery happens too early

list_tools runs before the agent reasons, forcing you to either overflow context or guess which tools are relevant.

Security requires official APIs

Community MCP servers with stdio = risk. Official REST APIs with 14+ years of hardening = secure.

Agents need autonomy

Not pre-defined tools. Code execution. Multi-level reasoning. Parallel execution.

The strawberry problem isn't about strawberries. It's about whether agents can build solutions we never anticipated.

ATP lets them.

Get Beta Access

ATP is heading into beta soon! Stop building gateways. Start building agents.