{"id":135832,"date":"2025-11-16T05:19:07","date_gmt":"2025-11-16T05:19:07","guid":{"rendered":"https:\/\/www.newsbeep.com\/il\/135832\/"},"modified":"2025-11-16T05:19:07","modified_gmt":"2025-11-16T05:19:07","slug":"ai-code-doesnt-survive-in-production-heres-why","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/il\/135832\/","title":{"rendered":"AI Code Doesn&#8217;t Survive in Production: Here&#8217;s Why"},"content":{"rendered":"<p>I see a new demo every day that looks something like this: A single prompt generates a complete application. A few lines of natural language and ta-da: a polished product emerges. Yet despite the viral trends, a confusing fact endures: We aren\u2019t seeing an increase in shipped products or the pace of innovation we expected.<\/p>\n<p>A <a href=\"https:\/\/www.linkedin.com\/posts\/bmadams_i-was-having-drinks-with-a-vp-of-engineering-activity-7384597639838838784-huwZ?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAAuFsHsBP_HwHsuxCO0h3fGdkfU-2O13GPQ\" class=\"ext-link\" rel=\"external  nofollow noopener\" onclick=\"this.target=&#039;_blank&#039;;\" target=\"_blank\">vice president of engineering at Google <\/a>was recently quoted as saying: \u201cPeople would be shocked if they knew how little code from LLMs actually makes it to production.\u201d Despite impressive demos and billions in funding, there\u2019s a massive gap between AI-generated prototypes and production-ready systems. But why? The truth lies in these three fundamental challenges:<\/p>\n<p>Greenfield vs. existing tech stacks: AI excels at unconstrained prototyping but struggles to integrate with existing systems. Beyond that, operating in production environments imposes hard limits that turn prototypes brittle.<br \/>\nThe Dory problem: AI struggles to debug its own code because it lacks persistent understanding. 
It can\u2019t <a href=\"https:\/\/thenewstack.io\/future-proofing-ai-repeating-mistakes-or-learning-from-the-past\/\" data-wpil-monitor-id=\"3315\" class=\"local-link\" rel=\"nofollow noopener\" target=\"_blank\">learn from past mistakes<\/a>, and it doesn\u2019t have enough context to troubleshoot systems.<br \/>\nInconsistent tool maturity: While AI code generation tools are evolving rapidly, deployment, maintenance, code review, quality assurance and customer support functions still operate at pre-AI velocities.<\/p>\n<p>Greenfield vs. Existing Tech Stacks<\/p>\n<p>Large language models (LLMs) can quickly draft a new microservice that performs well in isolation. But operating in production demands integration with messy realities: legacy code, service boundaries, data contracts, authorization middleware, protobuf schemas, CI\/CD pipelines, observability stacks, service-level objectives (SLOs), performance budgets\u2026 I could go on. This is all before unpredictable users interact with the software.<\/p>\n<p>When you build new software, you engage in what might be called an artistic process. You start with a vision of expected behavior: Data should flow from this initial state to this end state, transformed in this particular way through a specific control flow. You\u2019re painting with possibility, creating something from nothing.<\/p>\n<p>This is why AI coding assistants produce such impressive prototypes. They\u2019re phenomenal at this forward-looking, unconstrained creative generation. But to run high-quality software, more code isn\u2019t the answer. You need code that can operate within a very specific set of parameters. The challenge is that communicating these many nuanced parameters to an LLM is not a simple task.<\/p>\n<p>Because LLMs excel at communicating with us in our natural language, we overestimate their ability to write quality software. But while language is flexible and forgiving, code is not. 
Code is executable and compositional: Correctness depends on precise contracts across files\/services. The compiler and runtime are unforgiving; small errors cause cascading failures, security holes or performance regressions.<\/p>\n<p>The Dory Problem<\/p>\n<p>We\u2019ve established that current LLMs struggle to write code that will operate outside of controlled greenfield environments. But why can\u2019t we use AI to troubleshoot and debug that code?<\/p>\n<p>To debug properly, you need to wrap your head around the entire architecture. You need to understand how data actually flows through the system, not just how it was supposed to flow. You need the ability to reverse-engineer a system starting with a defect. You need models that can consume massive, complex architectures built over decades and understand why they behave the way they do. You need an understanding of what already exists, what came before, the paths not taken.<\/p>\n<p>Unfortunately, most LLMs operate a lot like the character Dory in \u201cFinding Nemo\u201d: They have no context from one query to the next and have extremely short memories.<\/p>\n<p>Many companies operate codebases accumulated over 20, 30 or 40 years. These systems have emergent behaviors, implicit dependencies and historical workarounds \u2014 compound interest on their technical debt. Without a broad understanding of the entire system architecture, the interconnections of multiple code repos, past decisions and deploys, it\u2019s nearly impossible to troubleshoot complex issues.<\/p>\n<p>Inconsistent Tool Maturity<\/p>\n<p>The last reason AI code struggles in production is that the AI tools to support the <a href=\"https:\/\/thenewstack.io\/how-ai-is-reshaping-the-software-development-life-cycle\/\" data-wpil-monitor-id=\"3314\" class=\"local-link\" rel=\"nofollow noopener\" target=\"_blank\">software delivery life cycle<\/a> (SDLC) have not all matured at the same rate. Take, for example, the evolution of the digital camera. 
The first digital cameras looked a lot like their analog counterparts \u2014 we couldn\u2019t imagine another way to apply the technology. But soon we learned we could embed cameras everywhere: from laptops to phones to doorbells to cars. Cameras aren\u2019t just for taking pictures anymore; they can also help us get from point A to point B.<\/p>\n<p>Even though it\u2019s only been a few years, AI code generation tools have already gone through a rapid transformation. Our first attempts at integrating AI into our SDLC looked a lot like slapping AI into our IDE \u2014 the equivalent of a digital SLR camera. The initial version of GitHub Copilot was essentially enhanced IDE autocomplete.<\/p>\n<p>But over the past few years, tools like <a href=\"https:\/\/thenewstack.io\/using-cursor-ai-as-part-of-your-development-workflow\/\" class=\"local-link\" rel=\"nofollow noopener\" target=\"_blank\">Cursor<\/a>, <a href=\"https:\/\/thenewstack.io\/windsurf-an-agentic-ide-that-thinks-and-codes-with-you\/\" class=\"local-link\" rel=\"nofollow noopener\" target=\"_blank\">Windsurf<\/a> and <a href=\"https:\/\/thenewstack.io\/anthropics-claude-code-comes-to-web-and-mobile\/\" class=\"local-link\" rel=\"nofollow noopener\" target=\"_blank\">Claude Code<\/a> took over with a very different approach. They imagined a whole new workflow where you\u2019re not actually writing the code at all. Instead of working in the code editor, you\u2019re working in a chat box, expressing your intent, and the changes in the code happen naturally.<\/p>\n<p>Today\u2019s standard for AI code generation is a second-generation product that changed the entire workflow. But when we look beyond code generation at the rest of the SDLC, we\u2019re still in the first generation of these products. If we really want to improve engineering velocity, we need to look beyond code generation. 
We need tools that will help us reimagine and manage the complete end-to-end SDLC with AI.<\/p>\n<p>The Path Forward<\/p>\n<p>There are many tools that tackle <a href=\"https:\/\/thenewstack.io\/why-ai-alone-fails-at-large-scale-code-modernization\/\" data-wpil-monitor-id=\"3313\" class=\"local-link\" rel=\"nofollow noopener\" target=\"_blank\">modern code operations<\/a>, but they are looking at each step in the process in a silo. They are building very effective digital cameras, but don\u2019t have the imagination to rethink entire processes from scratch.<\/p>\n<p>You can get incremental gains from a better AI-powered code review system or an agentic site reliability engineer, but the biggest advances will come from tools that rethink the entire software operations process, not just enhance an existing one.<\/p>\n<p>The AI tools that succeed in helping operate production environments will be those that can reverse-engineer complex systems, enumerate states systematically and help developers pin down the specific conditions that produce unexpected behavior. They\u2019ll need to be more than artistic builders \u2014 they must also be scientific investigators. And they will look at the problem holistically, not in silos.<\/p>\n<p>Until then, expect to see impressive prototypes and frustrating production experiences. The cognitive mismatch isn\u2019t going away \u2014 it\u2019s fundamental to how these systems work.<\/p>\n<p>Animesh Koratana has been immersed in software startups from a young age, involved with technical support, QA and testing for his father\u2019s companies. Later, during his time at the Stanford DAWN lab, he would have an insider view into the&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"I see a new demo every day that looks something like this: A single prompt generates a 
complete&hellip;\n","protected":false},"author":2,"featured_media":135833,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[345,343,344,43392,85,46,125],"class_list":{"0":"post-135832","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-contributed","12":"tag-il","13":"tag-israel","14":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts\/135832","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/comments?post=135832"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts\/135832\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/media\/135833"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/media?parent=135832"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/categories?post=135832"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/tags?post=135832"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}