{"id":471592,"date":"2026-02-13T03:52:10","date_gmt":"2026-02-13T03:52:10","guid":{"rendered":"https:\/\/www.newsbeep.com\/ca\/471592\/"},"modified":"2026-02-13T03:52:10","modified_gmt":"2026-02-13T03:52:10","slug":"openai-sidesteps-nvidia-with-unusually-fast-coding-model-on-plate-sized-chips","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/ca\/471592\/","title":{"rendered":"OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips"},"content":{"rendered":"<p>But 1,000 tokens per second is actually modest by Cerebras standards. The company has <a href=\"https:\/\/www.cerebras.ai\/blog\/cerebras-inference-3x-faster\" rel=\"nofollow noopener\" target=\"_blank\">measured<\/a> 2,100 tokens per second on Llama 3.1 70B and <a href=\"https:\/\/insidehpc.com\/2025\/08\/cerebras-reports-3000-tokens-per-second-inference-on-openai-gpt-oss-120b-model\/\" rel=\"nofollow noopener\" target=\"_blank\">reported<\/a> 3,000 tokens per second on OpenAI\u2019s own open-weight gpt-oss-120B model, suggesting that Codex-Spark\u2019s comparatively lower speed reflects the overhead of a larger or more complex model.<\/p>\n<p>AI coding agents have had a <a href=\"https:\/\/arstechnica.com\/information-technology\/2026\/01\/10-things-i-learned-from-burning-myself-out-with-ai-coding-agents\/\" rel=\"nofollow noopener\" target=\"_blank\">breakout year<\/a>, with tools like <a href=\"https:\/\/arstechnica.com\/ai\/2026\/01\/openai-spills-technical-details-about-how-its-ai-coding-agent-works\/\" rel=\"nofollow noopener\" target=\"_blank\">OpenAI\u2019s Codex<\/a> and Anthropic\u2019s <a href=\"https:\/\/arstechnica.com\/ai\/2026\/02\/sixteen-claude-ai-agents-working-together-created-a-new-c-compiler\/\" rel=\"nofollow noopener\" target=\"_blank\">Claude Code<\/a> reaching a new level of usefulness for rapidly building prototypes, interfaces, and boilerplate code. OpenAI, Google, and Anthropic have all been racing to ship more capable coding agents, and latency has become what separates the winners; a model that codes faster lets a developer iterate faster.<\/p>\n<p>With fierce competition from Anthropic, OpenAI has been iterating on its Codex line at a rapid rate, <a href=\"https:\/\/arstechnica.com\/information-technology\/2025\/12\/openai-releases-gpt-5-2-after-code-red-google-threat-alert\/\" rel=\"nofollow noopener\" target=\"_blank\">releasing<\/a> GPT-5.2 in December after CEO Sam Altman issued an internal \u201ccode red\u201d memo about competitive pressure from Google, then shipping GPT-5.3-Codex just days ago.<\/p>\n<p>Diversifying away from Nvidia<\/p>\n<p>Spark\u2019s deeper hardware story may be more consequential than its benchmark scores. The model runs on Cerebras\u2019 Wafer Scale Engine 3, a chip the size of a dinner plate that Cerebras has <a href=\"https:\/\/arstechnica.com\/information-technology\/2022\/11\/hungry-for-ai-new-supercomputer-contains-16-dinner-plate-size-chips\/\" rel=\"nofollow noopener\" target=\"_blank\">built<\/a> its business around since at least 2022. OpenAI and Cerebras <a href=\"https:\/\/techcrunch.com\/2026\/02\/12\/a-new-version-of-openais-codex-is-powered-by-a-new-dedicated-chip\/\" rel=\"nofollow noopener\" target=\"_blank\">announced<\/a> their partnership in January, and Codex-Spark is the first product to come out of it.<\/p>\n<p>OpenAI has spent the past year systematically reducing its dependence on Nvidia. 
The company <a href=\"https:\/\/arstechnica.com\/ai\/2025\/10\/amd-wins-massive-ai-chip-deal-from-openai-with-stock-sweetener\/\" rel=\"nofollow noopener\" target=\"_blank\">signed<\/a> a massive multi-year deal with AMD in October 2025, <a href=\"https:\/\/arstechnica.com\/ai\/2025\/11\/openai-signs-massive-ai-compute-deal-with-amazon\/\" rel=\"nofollow noopener\" target=\"_blank\">struck<\/a> a $38 billion cloud computing agreement with Amazon in November, and has been <a href=\"https:\/\/arstechnica.com\/ai\/2025\/02\/openais-secret-weapon-against-nvidia-dependence-takes-shape\/\" rel=\"nofollow noopener\" target=\"_blank\">designing<\/a> its own custom AI chip for eventual fabrication by TSMC.<\/p>\n<p>Meanwhile, a planned $100 billion infrastructure deal with Nvidia has <a href=\"https:\/\/arstechnica.com\/information-technology\/2026\/02\/five-months-later-nvidias-100-billion-openai-investment-plan-has-fizzled-out\/\" rel=\"nofollow noopener\" target=\"_blank\">fizzled<\/a> so far, though Nvidia has since committed to a $20 billion investment. Reuters <a href=\"https:\/\/arstechnica.com\/information-technology\/2026\/02\/five-months-later-nvidias-100-billion-openai-investment-plan-has-fizzled-out\/\" rel=\"nofollow noopener\" target=\"_blank\">reported<\/a> that OpenAI grew unsatisfied with the speed of some Nvidia chips for inference tasks, which is exactly the kind of workload that OpenAI designed Codex-Spark for.<\/p>\n<p>Regardless of which chip is under the hood, speed matters, though it may come at the cost of accuracy. For developers who spend their days inside a code editor waiting for AI suggestions, 1,000 tokens per second may feel less like carefully piloting a jigsaw and more like running a rip saw. Just watch what you\u2019re cutting.<\/p>\n","protected":false},"excerpt":{"rendered":"But 1,000 tokens per second is actually modest by Cerebras standards. 