{"id":538767,"date":"2026-04-19T03:17:21","date_gmt":"2026-04-19T03:17:21","guid":{"rendered":"https:\/\/www.newsbeep.com\/uk\/538767\/"},"modified":"2026-04-19T03:17:21","modified_gmt":"2026-04-19T03:17:21","slug":"cloudflare-can-remember-it-for-you-wholesale-the-register","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/uk\/538767\/","title":{"rendered":"Cloudflare can remember it for you wholesale \u2022 The Register"},"content":{"rendered":"<p>Not only is hardware memory scarce these days, but context memory, the conversational data exchanged with AI models, can be an issue too.<\/p>\n<p>Cloudflare&#8217;s answer to this particular problem is Agent Memory, a managed service for siphoning AI conversations when space is scarce, then injecting the data back on demand.<\/p>\n<p>&#8220;It gives AI agents persistent memory, allowing them to recall what matters, forget what doesn&#8217;t, and get smarter over time,&#8221; said Tyson Trautmann, senior director of engineering, and Rob Sutter, engineering manager, in a <a href=\"https:\/\/blog.cloudflare.com\/introducing-agent-memory\/\" rel=\"nofollow noopener\" target=\"_blank\">blog post<\/a>.<\/p>\n<p>AI models can accept a limited amount of input, referred to as context. Measured in tokens, the amount varies by model.<\/p>\n<p>Anthropic&#8217;s Claude Opus 4.7, for example, has a <a href=\"https:\/\/platform.claude.com\/docs\/en\/about-claude\/models\/overview\" rel=\"nofollow noopener\" target=\"_blank\">1M token context window<\/a>, which can accommodate ~555,000 words or ~2.5 million Unicode characters. 
Claude Sonnet 4.6 also has a 1M context window, but it holds ~750,000 words or ~3.4 million Unicode characters because it relies on a different tokenizer.<\/p>\n<p>Google&#8217;s Gemma 4 family of models has context windows of 128,000 tokens for the smaller models and 256,000 tokens for the larger ones.<\/p>\n<p>That may seem like an ample amount of space for model prompts, but there&#8217;s a lot of extra text that accompanies every prompt \u2013 the system prompt, system tools, custom agents, memory files, skills, messages, and the <a href=\"https:\/\/claudelog.com\/faqs\/what-is-auto-compact-buffer\/\" rel=\"nofollow noopener\" target=\"_blank\">auto-compact buffer<\/a>. So the context space actually available might be 10 to 20 percent less.<\/p>\n<p>Storing prompts and responses as &#8220;memories&#8221; makes the most of available space by providing a place to offload useful chat details that may not be needed for every conversational turn (prompt).<\/p>\n<p>At the same time, <a href=\"https:\/\/arxiv.org\/html\/2510.05381v1\" rel=\"nofollow noopener\" target=\"_blank\">more context isn&#8217;t always better<\/a> \u2013 there may be times when AI models provide better results when given less context. So memory is potentially useful for pulling data out of a conversation as a quality enhancement as well as a storage management option.<\/p>\n<p>There are already various <a href=\"https:\/\/github.com\/modelcontextprotocol\/servers\/tree\/main\/src\/memory\" rel=\"nofollow noopener\" target=\"_blank\">software<\/a> <a href=\"https:\/\/github.com\/neo4j-contrib\/mcp-neo4j\/tree\/main\/servers\/mcp-neo4j-memory\" rel=\"nofollow noopener\" target=\"_blank\">projects<\/a> and <a href=\"https:\/\/platform.claude.com\/docs\/en\/agents-and-tools\/tool-use\/memory-tool\" rel=\"nofollow noopener\" target=\"_blank\">integrated memory tools<\/a> available to help remember AI conversations. 
Cloudflare is proposing that AI memory should be a managed service.<\/p>\n<p>&#8220;Agents running for weeks or months against real codebases and production systems need memory that stays useful as it grows \u2014 not just memory that performs well on a clean benchmark dataset that may fit entirely into a newer model&#8217;s context window,&#8221; wrote Trautmann and Sutter, arguing that this can be done quickly, at a reasonable per-query cost, and in a way that doesn&#8217;t block the conversation.<\/p>\n<p>Basically, they&#8217;re talking about asynchronous CRUD operations. For example, after storing a memory about the user&#8217;s preferred package manager (e.g. pnpm), that memory could be recalled with the following calls:<\/p>\n<pre><code>const results = await profile.recall(\"What package manager does the user prefer?\");\nconsole.log(results.result); \/\/ \"The user prefers pnpm over npm.\"<\/code><\/pre>\n<p>Agent Memory can be accessed via a binding to a Cloudflare Worker, and also via a REST API for those outside the Cloudflare Worker ecosystem. It&#8217;s currently in <a target=\"_blank\" rel=\"nofollow noopener\" href=\"https:\/\/forms.gle\/RAXbK6gN9Yy89ECw8\">private beta<\/a>.<\/p>\n<p>And in case anyone is possessive about their AI chat logs, Trautmann and Sutter offer reassurance that the memory data belongs to the customer.<\/p>\n<p>&#8220;Agent Memory is a managed service, but your data is yours,&#8221; they wrote. &#8220;Every memory is exportable, and we&#8217;re committed to making sure the knowledge your agents accumulate on Cloudflare can leave with you if your needs change.&#8221;<\/p>\n<p>That&#8217;s a touching thought, though some work might be required to turn a text dump of conversations into functional memories on another platform. 
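<p>The post doesn&#8217;t spell out the storage side of that API, but the store-then-recall pattern it describes can be sketched with a toy in-memory mock. This is a hypothetical illustration only: the <code>MemoryProfile<\/code> class, its <code>remember<\/code> method, and the word-overlap scoring are assumptions for the sketch, not Cloudflare&#8217;s implementation.<\/p>

```javascript
// Toy in-memory stand-in for the store/recall pattern described above.
// NOT Cloudflare's Agent Memory API: the class, the 'remember' method, and
// the word-overlap scoring are illustrative assumptions only.
class MemoryProfile {
  constructor() {
    this.memories = []; // stored as plain text snippets
  }

  // Store a memory; a real managed service would do this asynchronously,
  // off the conversation's critical path.
  async remember(text) {
    this.memories.push(text);
  }

  // Recall the stored memory that shares the most words with the query.
  async recall(query) {
    const words = query.toLowerCase().split(' ');
    let best = null;
    let bestScore = 0;
    for (const m of this.memories) {
      const lower = m.toLowerCase();
      const score = words.filter((w) => lower.includes(w)).length;
      if (score > bestScore) {
        bestScore = score;
        best = m;
      }
    }
    return { result: best };
  }
}

// Usage mirroring the snippet above:
(async () => {
  const profile = new MemoryProfile();
  await profile.remember('The user prefers pnpm over npm.');
  const results = await profile.recall('What package manager does the user prefer?');
  console.log(results.result); // 'The user prefers pnpm over npm.'
})();
```

<p>A real implementation would presumably score matches semantically (embeddings or similar) rather than by word overlap, and persist memories outside the process \u2013 the sketch only shows the shape of the CRUD round trip.<\/p>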
\u00ae<\/p>\n","protected":false},"excerpt":{"rendered":"Not only is hardware memory scarce these days, but context memory, the conversational data exchanged with AI models,&hellip;\n","protected":false},"author":2,"featured_media":538768,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[554,733,4308,86,56,54,55],"class_list":{"0":"post-538767","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-technology","12":"tag-uk","13":"tag-united-kingdom","14":"tag-unitedkingdom"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/538767","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/comments?post=538767"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/538767\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media\/538768"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media?parent=538767"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/categories?post=538767"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/tags?post=538767"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}