OpenAI's new GPT‑5.1-Codex-Max — all about the agentic coding model that can work for long hours

OpenAI this week rolled out GPT-5.1-Codex-Max, a new agentic coding model built for long-running, detailed software development tasks.

Described as “faster, more intelligent, and more token-efficient at every stage of the development cycle,” the new model is available across all Codex surfaces.

This launch comes shortly after Google unveiled Antigravity, its agentic, developer-focused AI platform, setting the stage for a head-to-head battle between the two biggest players in AI over the future of software development.

What is GPT-5.1-Codex-Max

Codex Max is OpenAI’s newest and most advanced coding model, built on top of the GPT-5.1 architecture. The model was trained using real software engineering work — things like creating pull requests, reviewing code, building websites and answering technical questions.

Because of this training, it performs better than OpenAI’s previous coding models on frontier coding evaluations. It is also more useful in everyday situations: according to the company’s blog post, GPT‑5.1-Codex-Max is the first version that can work smoothly in Windows, and it was trained to be a much better teammate when using the Codex command-line tool.

OpenAI also said the model can work on its own for many hours at a stretch. In internal tests, Codex-Max kept improving its code, fixing errors, and ultimately delivered a successful result, even when the task ran for more than 24 hours.

CEO Sam Altman praised the team behind the rapid progress of the new model, calling them “beasts.”

“The product/model is already so good and will get much better; I believe they will create the best and most important product in the space, and enable so much downstream work,” he wrote on X (formerly Twitter).

Who can access this new model?

The GPT-5.1-Codex-Max model is accessible to users on ChatGPT Plus, Pro, Business, Edu, and Enterprise plans. Developers who use Codex CLI with an API key will be able to access the new model once API support rolls out.

GPT-5.1-Codex-Max will now replace GPT-5.1-Codex as the default in all Codex interfaces.

OpenAI also said that 95% of its internal engineering team uses Codex weekly and that engineers “ship roughly 70% more pull requests since adopting Codex.”

Accuracy and efficiency

OpenAI says GPT-5.1-Codex-Max is much better than the previous versions. In one coding test (SWE-Lancer), it got 79.9% of answers right, compared with 66.3% for the older GPT-5.1-Codex.

In another test (SWE-bench Verified), the newer model solved more problems at the same reasoning level while using about 30% fewer “thinking tokens,” meaning it works faster and more efficiently.

OpenAI said the efficiency gains translate to lower costs for developers. In one example, the model used 27,000 thinking tokens to generate a full browser-based CartPole reinforcement learning sandbox, compared with 37,000 for the earlier Codex model.
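
To put the CartPole figures in perspective, the short Python sketch below works out the relative saving from the two token counts quoted above. The function name `token_savings` is purely illustrative and is not part of any OpenAI tool, and the sketch makes no assumption about pricing; it only shows the arithmetic.

```python
def token_savings(old_tokens: int, new_tokens: int) -> float:
    """Return the fractional reduction in tokens when moving from old to new."""
    if old_tokens <= 0:
        raise ValueError("old_tokens must be positive")
    return (old_tokens - new_tokens) / old_tokens


# Figures quoted in the article for the CartPole sandbox example.
cartpole_savings = token_savings(old_tokens=37_000, new_tokens=27_000)
print(f"Thinking-token reduction: {cartpole_savings:.1%}")  # prints 27.0%
```

That works out to roughly 27% fewer thinking tokens for this single task, in line with the "about 30%" figure reported for SWE-bench Verified.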

The company is also introducing a new extra-high reasoning option for non-latency-sensitive tasks, which it said allows the model to think longer and better before producing output.
