OpenAI Introduces Jalapeño, Its First AI Inference Chip, Built with Broadcom

In recent hours, OpenAI presented Jalapeño, its first chip for artificial intelligence, developed together with Broadcom and officially named Intelligence Processor. It is an ASIC designed exclusively for language-model inference, the phase in which a system responds to user requests, rather than for model training.

The figure that Sam Altman's company emphasizes is development time: only nine months passed from initial design to tape-out, a cycle that OpenAI and Broadcom describe as the fastest ever achieved for an advanced high-performance semiconductor, where the norm is around one and a half to two years. Part of that acceleration, the two companies claim, comes from using OpenAI's own AI models to optimize the chip design.

The two companies officially announced their collaboration in October 2025, about eighteen months after the actual work began. For some time now, OpenAI has been spreading its inference workloads beyond Nvidia, relying also on Cerebras, AMD, and AWS's Trainium. The company clarifies that Nvidia remains a key partner for training: Jalapeño concerns inference and represents diversification at the margins, not a complete replacement for GPUs.

A Chip Optimized Around Models

On the technical side, Richard Ho, head of OpenAI's hardware program, stated that the architecture has been "optimized around the kernels, memory movements, networking, and serving patterns that matter most for cutting-edge AI models," aiming to execute the most critical workloads near the theoretical limits of the hardware.

Regarding performance, OpenAI cites performance per watt that is "significantly better than the current state of the art," but warns that definitive measurements are not yet complete and that a detailed technical report will arrive in the coming months. For now, there are no public benchmarks to support the claim. Some engineering samples are already running in the company’s labs at production clock frequency and power consumption, executing real workloads including GPT-5.3-Codex-Spark.

From Prototyping Scale to Ten Gigawatts

OpenAI manages the chip design, Broadcom handles silicon implementation along with connectivity and networking (including Tomahawk switching technology), and Celestica takes care of boards, racks, and systems. Jalapeño is described as the first piece of a multi-generation compute platform, intended for gigawatt-scale data centers alongside Microsoft and other partners.

The timeline calls for an initial deployment at prototyping scale by the end of 2026, volume production during 2027, and full operation in the first half of 2028. The company's long-term goal is ambitious: to have its custom chips support 10 gigawatts of computing capacity by 2029, roughly the output of ten nuclear reactors.

Behind this move is a precise reading of the market, which Broadcom has made explicit. In statements to CNBC, CEO Hock Tan described the computing demand from the company's customers as "much more than we can currently fulfill," expected to remain high even in 2028, and argued that relying on third-party GPUs for such a crucial component is not sustainable in the long run.

Greg Brockman, also speaking to CNBC, added that the contribution of AI models to accelerating the design "has truly surprised us." The underlying logic, reiterated by OpenAI, is to control the entire stack, from chip architecture to product experience.