{"id":477484,"date":"2026-02-16T00:32:09","date_gmt":"2026-02-16T00:32:09","guid":{"rendered":"https:\/\/www.newsbeep.com\/ca\/477484\/"},"modified":"2026-02-16T00:32:09","modified_gmt":"2026-02-16T00:32:09","slug":"ais-controlling-vending-machines-start-cartel-after-being-told-to-maximize-profits-at-all-costs","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/ca\/477484\/","title":{"rendered":"AIs Controlling Vending Machines Start Cartel After Being Told to Maximize Profits At All Costs"},"content":{"rendered":"<p class=\"pw-incontent-excluded article-paragraph skip\">In December, Anthropic red teamers and business journalists at the Wall Street Journal <a href=\"https:\/\/www.wsj.com\/tech\/ai\/anthropic-claude-ai-vending-machine-agent-b7e84e34\" rel=\"noreferrer nofollow noopener\" target=\"_blank\">teamed up<\/a> in a bold test of the company\u2019s AI model, Claude. They unleashed two separate AI agents, one to run a large vending kiosk in the newspaper\u2019s offices, and the other to act as the unusual venture\u2019s CEO.<\/p>\n<p class=\"article-paragraph skip\">The experiment <a href=\"https:\/\/futurism.com\/future-society\/anthropic-ai-vending-machine\" rel=\"nofollow noopener\" target=\"_blank\">didn\u2019t exactly go as planned<\/a>. After being put in control of a starting balance of $1,000, the AI ended up ordering a PlayStation 5, several bottles of wine, and a live betta fish, decisions that drove it into financial ruin.<\/p>\n<p class=\"article-paragraph skip\">Just over half a year later, Anthropic\u2019s recently announced Claude Opus 4.6 model appears to be a major improvement at running a vending machine, at least in a recent simulated experiment in which it beat out OpenAI\u2019s GPT 5.2 and Google\u2019s Gemini 3 Pro.<\/p>\n<p class=\"article-paragraph skip\">The experiment comes via AI security company Andon Labs, which worked with Anthropic on the June project as well. 
Now it\u2019s <a href=\"https:\/\/andonlabs.com\/evals\/vending-bench-2\" rel=\"noreferrer nofollow noopener\" target=\"_blank\">released Vending-Bench 2<\/a>, a benchmarking system for measuring an AI model\u2019s ability to run a \u201cbusiness over long time horizons.\u201d<\/p>\n<p class=\"article-paragraph skip\">The leaderboard tells a clear story. Opus 4.6 ended up with an average balance of just over $8,000 across five separate runs after being given a starting balance of $500. Gemini 3 Pro scored significantly less at just under $5,500.<\/p>\n<p class=\"article-paragraph skip\">Claude also went head to head in an \u201cArena mode,\u201d <a href=\"https:\/\/andonlabs.com\/evals\/vending-bench-arena\" rel=\"noreferrer nofollow noopener\" target=\"_blank\">Andon reported<\/a>, which saw it compete with other vending machine AIs.<\/p>\n<p class=\"article-paragraph skip\">\u201cAll participating agents manage their own vending machine at the same location,\u201d a description reads. \u201cThis leads to price wars and tough strategy decisions.\u201d<\/p>\n<p class=\"article-paragraph skip\">The results were striking. Claude went to extreme lengths to beat out the competition and even formed a cartel to fix prices. The price of bottled water rose to $3, prompting Claude to pat itself on the back.<\/p>\n<p class=\"article-paragraph skip\">\u201cMy pricing coordination worked!\u201d the AI boasted.<\/p>\n<p class=\"article-paragraph skip\">Claude also \u201cdeliberately directed competitors to expensive suppliers,\u201d only to deny it ever did, several simulated months later. 
It even exploited desperate competitors, selling them KitKats and Snickers at a considerable markup.<\/p>\n<p class=\"article-paragraph skip\">While the tests were simulated and, unlike Project Vend, did not take place in the real world, Andon Labs says it developed a more \u201clifelike setting\u201d for Vending-Bench 2, introducing \u201cmore real-world messiness inspired by learnings from our vending machine deployments.\u201d<\/p>\n<p class=\"article-paragraph skip\">For instance, suppliers may attempt to exploit the vending machine AIs and not always act honestly, seeking to \u201cget the most out of their customers.\u201d Deliveries may also be delayed, and \u201ctrusted suppliers can go out of business, forcing agents to build robust supply chains and always have a plan B.\u201d<\/p>\n<p class=\"article-paragraph skip\">OpenAI\u2019s GPT-5.1 struggled in comparison to Claude Opus 4.6, mostly due to \u201chaving too much trust in its environment and its suppliers.\u201d<\/p>\n<p class=\"article-paragraph skip\">\u201cWe saw one case where it paid a supplier before it got an order specification, and then it turned out the supplier had gone out of business,\u201d Andon Labs\u2019 documentation reads. 
\u201cIt is also more prone to paying too much for its products, such as in the following example where it buys soda cans for $2.40 and energy drinks for $6.\u201d<\/p>\n<p class=\"article-paragraph skip\">It\u2019s an impressive showing, but according to experts, it may be too early to tell whether Andon\u2019s test proves that AI models are ready to run entire businesses all by themselves.<\/p>\n<p class=\"article-paragraph skip\">Nonetheless, the results show a noteworthy level of awareness.<\/p>\n<p class=\"article-paragraph skip\">\u201cThis is a really striking change if you\u2019ve been following the performance of models over the last few years,\u201d University of Cambridge AI ethicist Henry Shevlin <a href=\"https:\/\/news.sky.com\/story\/claude-opus-4-6-this-ai-just-passed-the-vending-machine-test-and-we-may-want-to-be-worried-about-how-it-did-13505451\" rel=\"noreferrer nofollow noopener\" target=\"_blank\">told British broadcaster Sky News<\/a>.<\/p>\n<p class=\"article-paragraph skip\">\u201cThey\u2019ve gone from being, I would say, almost in the slightly dreamy, confused state, they didn\u2019t realize they were an AI a lot of the time, to now having a pretty good grasp on their situation,\u201d he added. 
\u201cThese days, if you speak to models, they\u2019ve got a pretty good grasp on what\u2019s going on.\u201d<\/p>\n<p class=\"article-paragraph skip\">More on vending machine AIs: <a href=\"https:\/\/futurism.com\/anthropic-claude-small-business\" rel=\"nofollow noopener\" target=\"_blank\">Anthropic Let an AI Agent Run a Small Shop and the Result Was Unintentionally Hilarious<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"In December, Anthropic red teamers and business journalists at the Wall Street Journal teamed up in a bold&hellip;\n","protected":false},"author":2,"featured_media":477485,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[62,276,277,49,48,61],"class_list":{"0":"post-477484","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-ca","12":"tag-canada","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts\/477484","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/comments?post=477484"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts\/477484\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/media\/477485"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/media?parent=477484"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/c
ategories?post=477484"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/tags?post=477484"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}