Tool Invocation via Code Generation—and Beyond
November 10, 2025

How OneMCP extends the next stage of agent efficiency

In today’s agent systems, the model itself is responsible for invoking tools — issuing every call, selecting parameters, and processing results directly through the model’s context window. This works at small scale but doesn’t hold up as agents connect to dozens of tools. Each tool definition and intermediate result must pass through the model, inflating token cost and reducing responsiveness.

A new paradigm is emerging: instead of invoking tools directly, the agent generates code that invokes the tools.

This is the key insight behind Anthropic’s recent article, “Code Execution with MCP: Building More Efficient Agents.”

The idea is elegant and powerful — let the model write executable code that calls MCP tools, executes logic locally, and returns only the final results. It shifts effort away from expensive in-context reasoning toward compact, externalized computation.
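To make the shift concrete, here is a minimal sketch of what such generated code might look like. The tool name and data are hypothetical stand-ins for real MCP calls; the point is that filtering and aggregation happen in code, so only the compact final result returns to the model's context.

```python
# Illustrative sketch: the model emits a script like this instead of issuing
# each tool call through its context window. fetch_orders stands in for a
# real MCP tool (e.g. crm.get_orders); here it is stubbed with sample data.

def fetch_orders(customer_id):
    """Stub for an MCP tool call; a real runtime would hit the MCP server."""
    return [
        {"id": 1, "total": 120.0, "status": "shipped"},
        {"id": 2, "total": 80.0, "status": "pending"},
        {"id": 3, "total": 200.0, "status": "shipped"},
    ]

def run():
    orders = fetch_orders("cust-42")
    # Intermediate data never enters the model's context: filtering and
    # aggregation happen locally, in generated code.
    shipped = [o for o in orders if o["status"] == "shipped"]
    return {
        "shipped_count": len(shipped),
        "shipped_total": sum(o["total"] for o in shipped),
    }

print(run())
```

Only the two-field summary crosses back into the context window, regardless of how many orders the underlying call returned.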

The article frames this as an optimization for MCP-based systems, where tools are defined and accessed through the Model Context Protocol. But the underlying concept — separating reasoning from execution — is broader and more transformative than the MCP layer itself.

From Idea to Implementation: How OneMCP Extends the Approach

The Anthropic post concisely captures the fundamental shift that also lies at the heart of OneMCP’s architecture: moving from direct invocation to invocation through a generated or reusable execution layer.

However, OneMCP takes this idea much further — addressing key limitations that code execution alone cannot solve, and expanding the scope beyond MCP to include any API, even when no MCP server or toolset exists.

Think of OneMCP as transforming the same insight into a general-purpose runtime — one that can represent and execute API logic efficiently, deterministically, and at scale.

The Three Major Advancements in OneMCP

1. From MCP Tools to Raw APIs

Anthropic’s code execution approach assumes an existing MCP server — meaning every system you want to connect must already have MCP tools designed, implemented, and deployed. That’s still significant engineering work.

OneMCP removes that step entirely.

You simply provide an API specification and reference materials (collectively called a handbook), and OneMCP automatically exposes the API as a single, natural-language MCP interface.

In other words, OneMCP works not only with MCP tools, but also directly with raw APIs — eliminating the need to manually create or maintain MCP servers. This allows developers to plug in APIs immediately and still benefit from the same agent-level abstraction and caching.

This is more than convenience — it means that the code execution principle can now apply to any backend, not just those that have been MCP-wrapped.
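A rough sketch of the handbook idea, with all names and structures hypothetical rather than OneMCP's actual interface: an API spec plus reference notes is registered once, and the result is a single entry point that accepts natural-language requests.

```python
# Hypothetical sketch of a "handbook": an API spec fragment plus notes,
# exposed as one natural-language tool. Nothing here is OneMCP's real API.

handbook = {
    "name": "weather",
    "spec": {  # minimal OpenAPI-style fragment
        "paths": {
            "/forecast": {"get": {"params": ["city", "days"]}},
            "/alerts":   {"get": {"params": ["region"]}},
        }
    },
    "notes": "Use /forecast for predictions, /alerts for active warnings.",
}

def expose_as_mcp_tool(handbook):
    """Return a single callable accepting a natural-language request."""
    def tool(request: str):
        # A real runtime would plan the call from the spec and notes;
        # this stub simply routes on a keyword for illustration.
        if "alert" in request.lower():
            return {"endpoint": "/alerts"}
        return {"endpoint": "/forecast"}
    return tool

weather_tool = expose_as_mcp_tool(handbook)
print(weather_tool("Any storm alerts for the coast?"))
```

The agent-facing surface is one tool, however many endpoints the raw API has, and no MCP server had to be written by hand.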

2. Eliminating Latency and Non-Determinism with a Semantic Cache

While code execution reduces token use, it introduces two new challenges:

  • Latency: Each new query still requires the model to generate and validate fresh code.
  • Non-determinism: The generated code may differ slightly between runs, even for identical tasks.

OneMCP solves both through a semantic execution cache.

Instead of generating code on each request, OneMCP retrieves or constructs a cached execution plan — a structured description of how to call relevant API endpoints and extract their parameters. These plans are semantically indexed and reusable across queries, providing:

  • Deterministic behavior: The same intent always maps to the same plan.
  • Low latency: Plans execute instantly, without regenerating logic.
  • Auto-synchronization: When APIs change, OneMCP automatically detects schema mismatches and updates affected plans.

This turns what was once a dynamic reasoning problem into a fast, predictable runtime lookup.
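The cache can be pictured with a short sketch. This is an assumed design, not OneMCP's implementation: plans are structured call descriptions keyed by intent, and the crude keyword signature below stands in for real semantic indexing such as embeddings.

```python
# Minimal sketch of a semantic execution cache (assumed design). Identical
# intents always resolve to the same stored plan, giving determinism and
# lookup-speed latency instead of fresh code generation per request.

from dataclasses import dataclass

@dataclass(frozen=True)
class ExecutionPlan:
    endpoint: str
    method: str
    params: tuple  # parameter names to extract from the user query

class SemanticCache:
    def __init__(self):
        self._plans = {}

    def _intent_key(self, query: str) -> str:
        # Stand-in for semantic indexing (e.g. embeddings): normalize the
        # query to a sorted keyword signature.
        words = {w for w in query.lower().split() if len(w) > 3}
        return " ".join(sorted(words))

    def get_or_build(self, query: str, builder) -> ExecutionPlan:
        key = self._intent_key(query)
        if key not in self._plans:        # cache miss: build the plan once
            self._plans[key] = builder(query)
        return self._plans[key]           # cache hit: instant, deterministic

cache = SemanticCache()
build = lambda q: ExecutionPlan("/orders", "GET", ("customer_id",))
plan1 = cache.get_or_build("list orders for this customer", build)
plan2 = cache.get_or_build("list orders for this customer", build)
assert plan1 is plan2  # same intent -> the very same cached plan
```

Auto-synchronization would then amount to invalidating or rebuilding stored plans when a schema mismatch is detected against the current API spec.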

3. Delegating Execution to a Subagent

In Anthropic’s design, the model remains responsible for generating, executing, and interpreting the code itself — all within the agent’s cognitive workload.

In contrast, OneMCP deliberately decouples this responsibility by introducing a subagent model.

Here’s how it works:

  • The main agent issues high-level goals, not individual operations.
  • OneMCP acts as a subagent dedicated to a particular service or API.
  • The main agent sees only a single MCP tool (for example, onemcp.salesforce) with the instruction that “everything related to Salesforce should go here.”

The subagent handles the rest — retrieving the right execution plan, performing the call, managing caching, and returning structured results.

This architectural separation reduces cognitive load on the primary agent, allowing it to reason more effectively about goals and planning, while OneMCP ensures precise, efficient API execution in the background.
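The division of labor can be sketched as follows. The tool name onemcp.salesforce comes from the article; the routing and execution logic here is hypothetical and stubbed for illustration.

```python
# Illustrative sketch of the subagent split: the main agent sees one tool
# and issues a high-level goal; the subagent selects a plan, performs the
# call, and returns only structured results. Logic below is hypothetical.

class SalesforceSubagent:
    """Everything Salesforce-related goes through this one tool."""

    def handle(self, goal: str) -> dict:
        plan = self._select_plan(goal)        # retrieve the execution plan
        raw = self._execute(plan)             # perform the API call
        return {"plan": plan, "result": raw}  # structured result only

    def _select_plan(self, goal: str) -> str:
        if "opportunit" in goal.lower():
            return "get_open_opportunities"
        return "generic_query"

    def _execute(self, plan: str):
        # Stubbed; a real subagent would call the Salesforce API here.
        return [{"name": "Acme renewal", "stage": "Negotiation"}]

# The main agent's entire view of Salesforce: a single MCP tool.
tools = {"onemcp.salesforce": SalesforceSubagent().handle}
print(tools["onemcp.salesforce"]("List my open opportunities"))
```

The main agent never sees individual endpoints, plans, or caching; it hands off a goal and receives a structured answer.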

Summary

Anthropic’s “Code Execution with MCP” highlights a pivotal conceptual shift — moving from agents that directly call tools to agents that generate code that calls tools.

OneMCP embraces that same shift but advances it dramatically by addressing its practical limitations and expanding its reach:

  1. Beyond MCP Tools: Works directly with raw APIs, eliminating the need to design or run MCP servers.
  2. Beyond Code Generation: Uses a semantic cache to make plan execution deterministic and instant.
  3. Beyond Single-Agent Workloads: Introduces a subagent layer that handles all operations for a given service, reducing cognitive overhead.

Code execution makes agents faster by delegating tool use to code.

OneMCP makes them scalable, stable, and universal — by turning that code into a reusable execution layer that works with both MCP tools and any API directly.

Writer’s Note

This article was authored by me based on practical experience designing and implementing LLM-backed systems. I used GPT to assist with editing and refining the language for clarity, structure, and tone — but all architectural decisions, patterns, and examples reflect my own work and reasoning.

Note: Cross-posted from Medium

Patrick Chan

Further Reading

Turn Your OpenAPI Specs Into MCP Tools—Instantly
Introducing a powerful new feature in Gentoro that lets you automatically generate MCP Tools from any OpenAPI spec—no integration code required.
April 22, 2025
