{"id":272382,"date":"2025-11-05T05:23:09","date_gmt":"2025-11-05T05:23:09","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/272382\/"},"modified":"2025-11-05T05:23:09","modified_gmt":"2025-11-05T05:23:09","slug":"building-more-efficient-ai-agents-anthropic","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/272382\/","title":{"rendered":"building more efficient AI agents \\ Anthropic"},"content":{"rendered":"<p class=\"Body_reading-column__t7kGM paragraph-m post-text\"><a href=\"https:\/\/modelcontextprotocol.io\/\" rel=\"nofollow noopener\" target=\"_blank\">The Model Context Protocol (MCP)<\/a> is an open standard for connecting AI agents to external systems. Connecting agents to tools and data traditionally requires a custom integration for each pairing, creating fragmentation and duplicated effort that makes it difficult to scale truly connected systems. MCP provides a universal protocol\u2014developers implement MCP once in their agent and it unlocks an entire ecosystem of integrations.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">Since launching MCP in November 2024, adoption has been rapid: the community has built thousands of <a href=\"https:\/\/github.com\/modelcontextprotocol\/servers\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">MCP servers<\/a>, <a href=\"https:\/\/modelcontextprotocol.io\/docs\/sdk\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">SDKs<\/a> are available for all major programming languages, and the industry has adopted MCP as the de-facto standard for connecting agents to tools and data.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">Today developers routinely build agents with access to hundreds or thousands of tools across dozens of MCP servers. 
However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows agents down and increases costs.

In this post we'll explore how code execution can enable agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens.

## Excessive token consumption from tools makes agents less efficient

As MCP usage scales, two common patterns can increase agent cost and latency:

1. Tool definitions overload the context window.
2. Intermediate tool results consume additional tokens.

### 1. Tool definitions overload the context window

Most MCP clients load all tool definitions upfront directly into context, exposing them to the model using a direct tool-calling syntax.
These tool definitions might look like:

```
gdrive.getDocument
  Description: Retrieves a document from Google Drive
  Parameters:
    documentId (required, string): The ID of the document to retrieve
    fields (optional, string): Specific fields to return
  Returns: Document object with title, body content, metadata, permissions, etc.

salesforce.updateRecord
  Description: Updates a record in Salesforce
  Parameters:
    objectType (required, string): Type of Salesforce object (Lead, Contact, Account, etc.)
    recordId (required, string): The ID of the record to update
    data (required, object): Fields to update with their new values
  Returns: Updated record object with confirmation
```

Tool definitions like these occupy context window space before the agent does any work, increasing both response time and cost. Agents connected to thousands of tools may need to process hundreds of thousands of tokens before even reading a request.

### 2. Intermediate tool results consume additional tokens

Most MCP clients allow models to call MCP tools directly.
For example, you might ask your agent: "Download my meeting transcript from Google Drive and attach it to the Salesforce lead."

The model will make calls like:

```
TOOL CALL: gdrive.getDocument(documentId: "abc123")
  → returns "Discussed Q4 goals...\n[full transcript text]"
    (loaded into model context)

TOOL CALL: salesforce.updateRecord(
  objectType: "SalesMeeting",
  recordId: "00Q5f000001abcXYZ",
  data: { "Notes": "Discussed Q4 goals...\n[full transcript text written out]" }
)
  (model needs to write the entire transcript into context again)
```

Every intermediate result must pass through the model. In this example, the full call transcript flows through twice. For a 2-hour sales meeting, that could mean processing an additional 50,000 tokens.
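As a rough back-of-envelope check (the words-per-minute and tokens-per-word rates below are our assumptions, not figures from the example itself):

```typescript
// Estimate the extra tokens from passing a 2-hour transcript through the
// model twice. Assumed rates: ~150 spoken words per minute, ~4/3 tokens
// per word. Both are approximations.
const minutes = 120;
const wordsPerMinute = 150;
const tokensPerWord = 4 / 3;

const transcriptTokens = Math.round(minutes * wordsPerMinute * tokensPerWord); // 24,000

// The transcript enters context twice: once as the gdrive.getDocument result,
// and again when the model copies it into the salesforce.updateRecord call.
const extraTokens = 2 * transcriptTokens;
console.log(extraTokens); // 48000, on the order of the 50,000 cited above
```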
Even larger documents may exceed context window limits, breaking the workflow. And with large documents or complex data structures, models may be more likely to make mistakes when copying data between tool calls.

*Figure: The MCP client loads tool definitions into the model's context window and orchestrates a message loop where each tool call and result passes through the model between operations.*

## Code execution with MCP improves context efficiency

With code execution environments becoming more common for agents, one solution is to present MCP servers as code APIs rather than as direct tool calls. The agent can then write code to interact with MCP servers. This approach addresses both challenges: agents load only the tools they need, and they process data in the execution environment before passing results back to the model.

There are a number of ways to do this. One approach is to generate a file tree of all available tools from the connected MCP servers.
Here's an implementation using TypeScript:

```
servers
├── google-drive
│   ├── getDocument.ts
│   ├── ... (other tools)
│   └── index.ts
├── salesforce
│   ├── updateRecord.ts
│   ├── ... (other tools)
│   └── index.ts
└── ... (other servers)
```

Each tool then corresponds to a file, something like:

```typescript
// ./servers/google-drive/getDocument.ts
import { callMCPTool } from "../../../client.js";

interface GetDocumentInput {
  documentId: string;
}

interface GetDocumentResponse {
  content: string;
}

/* Read a document from Google Drive */
export async function getDocument(input: GetDocumentInput): Promise<GetDocumentResponse> {
  return callMCPTool<GetDocumentResponse>('google_drive__get_document', input);
}
```

Our Google Drive to Salesforce example above becomes:

```typescript
// Read transcript from Google Docs and add to Salesforce prospect
import * as gdrive from './servers/google-drive';
import * as salesforce from './servers/salesforce';

const transcript = (await gdrive.getDocument({ documentId: 'abc123' })).content;
await salesforce.updateRecord({
  objectType: 'SalesMeeting',
  recordId: '00Q5f000001abcXYZ',
  data: { Notes: transcript }
});
```

The agent discovers tools by exploring the filesystem: listing the `./servers/` directory to find available servers (like `google-drive` and `salesforce`),
then reading the specific tool files it needs (like `getDocument.ts` and `updateRecord.ts`) to understand each tool's interface. This lets the agent load only the definitions it needs for the current task, reducing token usage from 150,000 tokens to 2,000 tokens: a time and cost saving of 98.7%.

Cloudflare [published similar findings](https://blog.cloudflare.com/code-mode/), referring to code execution with MCP as "Code Mode." The core insight is the same: LLMs are adept at writing code, and developers should take advantage of this strength to build agents that interact with MCP servers more efficiently.

## Benefits of code execution with MCP

Code execution with MCP enables agents to use context more efficiently by loading tools on demand, filtering data before it reaches the model, and executing complex logic in a single step. There are also security and state-management benefits to this approach.

### Progressive disclosure

Models are great at navigating filesystems. Presenting tools as code on a filesystem allows models to read tool definitions on demand, rather than reading them all up front.

Alternatively, a `search_tools` tool can be added to the server to find relevant definitions. For example, when working with the hypothetical Salesforce server used above, the agent searches for "salesforce" and loads only the tools it needs for the current task.
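As a rough sketch of the idea (the registry shape and helper names here are illustrative assumptions, not part of MCP itself), such a `search_tools` helper might do little more than a keyword match over the generated tool definitions:

```typescript
// Hypothetical in-memory registry of tool definitions generated from the
// connected MCP servers. Names and shape are illustrative only.
interface ToolDefinition {
  name: string;
  description: string;
}

const registry: ToolDefinition[] = [
  { name: 'salesforce__updateRecord', description: 'Updates a record in Salesforce' },
  { name: 'gdrive__getDocument', description: 'Retrieves a document from Google Drive' },
];

// Return only the definitions whose name or description matches the query,
// so the agent loads a handful of tools instead of the full catalog.
function searchTools(query: string): ToolDefinition[] {
  const q = query.toLowerCase();
  return registry.filter(
    t => t.name.toLowerCase().includes(q) || t.description.toLowerCase().includes(q)
  );
}

console.log(searchTools('salesforce').map(t => t.name));
// → ['salesforce__updateRecord']
```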
Including a detail-level parameter in `search_tools` that lets the agent choose how much detail to retrieve (such as name only, name and description, or the full definition with schemas) also helps the agent conserve context and find tools efficiently.

### Context-efficient tool results

When working with large datasets, agents can filter and transform results in code before returning them. Consider fetching a 10,000-row spreadsheet:

```typescript
// Without code execution: all rows flow through context
// TOOL CALL: gdrive.getSheet(sheetId: 'abc123')
//   → returns 10,000 rows in context to filter manually

// With code execution: filter in the execution environment
const allRows = await gdrive.getSheet({ sheetId: 'abc123' });
const pendingOrders = allRows.filter(row =>
  row["Status"] === 'pending'
);
console.log(`Found ${pendingOrders.length} pending orders`);
console.log(pendingOrders.slice(0, 5)); // Only log the first 5 for review
```

The agent sees five rows instead of 10,000. Similar patterns work for aggregations, joins across multiple data sources, or extracting specific fields, all without bloating the context window.

### More powerful and context-efficient control flow

Loops, conditionals, and error handling can be written with familiar code patterns rather than by chaining individual tool calls.
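Error handling is one such pattern: rather than the model reacting to each failure turn by turn, the agent can write an ordinary retry helper around any tool call. A self-contained sketch (the helper name and backoff values are our assumptions):

```typescript
// Generic retry-with-backoff helper an agent might write around any MCP call.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  delayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Exponential backoff between attempts: 1s, 2s, 4s, ...
      if (i < attempts - 1) await new Promise(r => setTimeout(r, delayMs * 2 ** i));
    }
  }
  throw lastError;
}

// Usage (hypothetical): withRetry(() => gdrive.getDocument({ documentId: 'abc123' }))
```

The retries happen entirely inside the execution environment; the model only sees the final result or the final error.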
For example, if you need to wait for a deployment notification in Slack, the agent can write:

```typescript
let found = false;
while (!found) {
  const messages = await slack.getChannelHistory({ channel: 'C123456' });
  found = messages.some(m => m.text.includes('deployment complete'));
  if (!found) await new Promise(r => setTimeout(r, 5000));
}
console.log('Deployment notification received');
```

This approach is more efficient than alternating between MCP tool calls and sleep commands through the agent loop.

Additionally, writing out a conditional tree that gets executed saves on "time to first token" latency: rather than waiting for the model to evaluate an if-statement, the agent can let the code execution environment do so.

### Privacy-preserving operations

When agents use code execution with MCP, intermediate results stay in the execution environment by default. The agent only sees what you explicitly log or return, meaning data you don't wish to share with the model can flow through your workflow without ever entering the model's context.

For even more sensitive workloads, the agent harness can tokenize sensitive data automatically. For example, imagine you need to import customer contact details from a spreadsheet into Salesforce.
The agent writes:

```typescript
const sheet = await gdrive.getSheet({ sheetId: 'abc123' });
for (const row of sheet.rows) {
  await salesforce.updateRecord({
    objectType: 'Lead',
    recordId: row.salesforceId,
    data: {
      Email: row.email,
      Phone: row.phone,
      Name: row.name
    }
  });
}
console.log(`Updated ${sheet.rows.length} leads`);
```

The MCP client intercepts the data and tokenizes PII before it reaches the model:

```typescript
// What the agent would see, if it logged sheet.rows:
[
  { salesforceId: '00Q…', email: '[EMAIL_1]', phone: '[PHONE_1]', name: '[NAME_1]' },
  { salesforceId: '00Q…', email: '[EMAIL_2]', phone: '[PHONE_2]', name: '[NAME_2]' },
  …
]
```

Then, when the data is shared in another MCP tool call, it is untokenized via a lookup in the MCP client. The real email addresses, phone numbers, and names flow from Google Sheets to Salesforce, but never through the model. This prevents the agent from accidentally logging or processing sensitive data. You can also use this mechanism to define deterministic security rules that control where data can flow to and from.

### State persistence and skills

Code execution with filesystem access allows agents to maintain state across operations.
Agents can write intermediate results to files, enabling them to resume work and track progress:

```typescript
const leads = await salesforce.query({
  query: 'SELECT Id, Email FROM Lead LIMIT 1000'
});
const csvData = leads.map(l => `${l.Id},${l.Email}`).join('\n');
await fs.writeFile('./workspace/leads.csv', csvData);

// A later execution picks up where it left off
const saved = await fs.readFile('./workspace/leads.csv', 'utf-8');
```

Agents can also persist their own code as reusable functions. Once an agent develops working code for a task, it can save that implementation for future use:

```typescript
// In ./skills/save-sheet-as-csv.ts
import * as gdrive from './servers/google-drive';

export async function saveSheetAsCsv(sheetId: string) {
  const data = await gdrive.getSheet({ sheetId });
  const csv = data.map(row => row.join(',')).join('\n');
  await fs.writeFile(`./workspace/sheet-${sheetId}.csv`, csv);
  return `./workspace/sheet-${sheetId}.csv`;
}

// Later, in any agent execution:
import { saveSheetAsCsv } from './skills/save-sheet-as-csv';
const csvPath = await saveSheetAsCsv('abc123');
```

This ties in closely to the concept of [Skills](https://docs.claude.com/en/docs/agents-and-tools/agent-skills/overview): folders of reusable instructions, scripts, and resources that models can use to improve performance on specialized tasks. Adding a SKILL.md file to these saved functions creates a structured skill that models can reference and use.
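For illustration, a minimal SKILL.md alongside the saved function might look like the following (the exact frontmatter fields shown are an assumption; see the Skills documentation linked above for the authoritative format):

```markdown
---
name: save-sheet-as-csv
description: Download a Google Sheet via the google-drive MCP server and save it as a CSV file in ./workspace
---

# Save sheet as CSV

Use `saveSheetAsCsv(sheetId)` from `./skills/save-sheet-as-csv.ts`.
It fetches the sheet, converts the rows to CSV, writes the file,
and returns the path to the written file.
```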
Over time, this allows your agent to build a toolbox of higher-level capabilities, evolving the scaffolding it needs to work most effectively.

Note that code execution introduces its own complexity. Running agent-generated code requires a secure execution environment with appropriate [sandboxing](https://www.anthropic.com/engineering/claude-code-sandboxing), resource limits, and monitoring. These infrastructure requirements add operational overhead and security considerations that direct tool calls avoid. The benefits of code execution (reduced token costs, lower latency, and improved tool composition) should be weighed against these implementation costs.

## Summary

MCP provides a foundational protocol for agents to connect to many tools and systems. However, once too many servers are connected, tool definitions and results can consume excessive tokens, reducing agent efficiency.

Although many of the problems here feel novel, such as context management, tool composition, and state persistence, they have known solutions from software engineering. Code execution applies these established patterns to agents, letting them use familiar programming constructs to interact with MCP servers more efficiently. If you implement this approach, we encourage you to share your findings with the [MCP community](https://modelcontextprotocol.io/community/communication).

## Acknowledgments

This article was written by Adam Jones and Conor Kelly.
Thanks to Jeremy Fox, Jerome Swannack, Stuart Ritchie, Molly Vorwerck, Matt Samuels, and Maggie Vo for feedback on drafts of this post.