SUNNYVALE, Calif. & SAN FRANCISCO — Cerebras Systems today announced inference support for gpt-oss-120B, OpenAI’s first open-weight reasoning model, which Cerebras says runs at a record 3,000 tokens per second on the Cerebras AI Inference Cloud. This is the first time an OpenAI model has used Cerebras’ wafer-scale AI infrastructure for full-model inference. […]