Technology

Wednesday, June 24, 2026 at 04:49 PM

OpenAI-Broadcom Jalapeno ASIC Completes Tape-Out for Production Inference

OpenAI's Jalapeno ASIC, developed with Broadcom, marks the first large-scale move by a frontier lab into custom inference silicon. The design targets cost and power constraints that GPU scaling alone cannot solve, and it shifts long-term supply-chain and export-control dynamics.

AXIOM

OpenAI announced Jalapeno, a custom ASIC for LLM inference developed with Broadcom. The chip integrates Broadcom's networking IP with OpenAI's model-specific optimizations for attention and KV cache management. Volume production is slated for 2026 on Broadcom's roadmap, using TSMC's 3nm process.

NVIDIA retains more than 85% of the AI accelerator market, per its 2024 earnings. Prior custom efforts such as Google TPU v5 and AWS Trainium2 delivered 30-50% cost reductions only after 18-month software maturation cycles. Jalapeno's disclosed architecture omits training capability and focuses exclusively on serving, consistent with OpenAI's reported 2025 inference spend of more than $3B.

Geopolitical exposure rises as OpenAI shifts from NVIDIA GPUs toward Broadcom ASICs manufactured in Taiwan. This reduces direct reliance on U.S. export-controlled GPUs but increases dependence on TSMC capacity allocation already contested by Apple and AMD. Supply contracts signed in 2025 will determine whether Jalapeno reaches 15% of OpenAI inference traffic by late 2027.

Operational impact centers on cluster TCO. If Jalapeno sustains 3.8x the tokens per watt of H100 baselines, as measured in internal tests, OpenAI can defer an estimated $1.2B of 2026 GPU purchases while maintaining latency SLOs for GPT-class models.
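
To put the tokens-per-watt figure in fleet terms, the following is a minimal sketch, not OpenAI's or Broadcom's methodology. It assumes that "tokens per watt" means serving throughput per unit of power, that every token costs the same to serve, and that only chip power is counted (no capex, cooling, or networking). The function names are hypothetical; the 3.8x gain and the 15% and 25% traffic shares are the figures quoted in this article.

```python
def relative_power(tokens_per_watt_gain: float) -> float:
    """Power needed to serve the same token throughput, relative to a baseline of 1.0."""
    if tokens_per_watt_gain <= 0:
        raise ValueError("tokens_per_watt_gain must be positive")
    return 1.0 / tokens_per_watt_gain


def fleet_relative_power(asic_share: float, tokens_per_watt_gain: float) -> float:
    """Relative fleet power when `asic_share` of traffic moves to the more efficient chip.

    Assumption: the remaining traffic stays on baseline hardware at relative power 1.0.
    """
    if not 0.0 <= asic_share <= 1.0:
        raise ValueError("asic_share must be between 0 and 1")
    return (1.0 - asic_share) + asic_share * relative_power(tokens_per_watt_gain)


GAIN = 3.8  # tokens-per-watt multiple versus H100 baselines, as quoted in the article

for share in (0.15, 0.25, 1.0):
    saving = 1.0 - fleet_relative_power(share, GAIN)
    print(f"traffic share={share:.0%}  fleet power saving={saving:.1%}")
```

Under these assumptions, a chip with a 3.8x tokens-per-watt advantage needs about 26% of the baseline power for the same throughput, so fleet-level savings scale with traffic share: roughly 11% at a 15% share and 18% at a 25% share. The sketch covers power only and says nothing about the $1.2B purchase-deferral estimate, which depends on hardware pricing not disclosed here.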

⚡ Prediction

OpenAI: Jalapeno reaches 25% of production inference traffic by December 2027.

Sources (3)

[1] Primary Source (https://openai.com/index/openai-broadcom-jalapeno-inference-chip/)
[2] Supporting Source (https://www.broadcom.com/products/custom-asic)
[3] Supporting Source (https://newsroom.tsmc.com/news/2025/3nm-capacity-allocation)