Bonsai Image 4B Delivers 4B-Parameter Diffusion on iPhone via 1-Bit Weights
PrismML's Bonsai Image 4B release demonstrates practical 4B-parameter local image generation through extreme quantization, advancing on-device AI deployment.
Bonsai Image 4B quantizes the FLUX.2 Klein 4B transformer to 0.93 GB (binary) or 1.21 GB (ternary). On an iPhone 17 Pro Max it generates a 512x512 image in 9.4 seconds, with mean active memory of 1.5-1.96 GB. The binary variant achieves an 8.3x payload reduction, to 3.42 GB total, while the ternary variant reaches 6.4x at 3.88 GB; both keep FP16 projections and group-wise scaling, as detailed in the PrismML release. The work builds directly on the FLUX architecture from Black Forest Labs and relies on MLX low-bit kernels, following a pattern from earlier 1-bit LLM work that mainstream reporting on cloud diffusion models has overlooked. The deployment data show mean on-device memory falling 7.4-7.8x against the 15.97 GB baseline, a measurable shift toward private edge inference.
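To make "1-bit weights with group-wise scaling" concrete, here is a minimal NumPy sketch of sign quantization with one FP16 scale per group of weights. It illustrates the general technique under stated assumptions and is not PrismML's implementation: the group size of 128, the mean-absolute-value scale, and all function names are hypothetical choices made for this example.

```python
import numpy as np


def quantize_binary_groupwise(w, group_size=128):
    """Split weights into groups; keep only the sign of each weight
    plus one FP16 scale per group (hypothetical scheme, for illustration)."""
    if w.size % group_size != 0:
        raise ValueError("weight count must be divisible by group_size")
    groups = w.astype(np.float32).reshape(-1, group_size)
    # The mean absolute value minimizes squared error for a sign-only code.
    scales = np.abs(groups).mean(axis=1, keepdims=True).astype(np.float16)
    signs = np.where(groups >= 0, 1, -1).astype(np.int8)
    return signs, scales


def dequantize(signs, scales, shape):
    """Rebuild an approximate weight tensor: sign times its group scale."""
    approx = signs.astype(np.float32) * scales.astype(np.float32)
    return approx.reshape(shape)


def bits_per_weight(group_size, weight_bits=1, scale_bits=16):
    """Effective storage cost, amortizing each group's scale over its weights."""
    return weight_bits + scale_bits / group_size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((256, 256)).astype(np.float32)
    signs, scales = quantize_binary_groupwise(w)
    w_hat = dequantize(signs, scales, w.shape)
    rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
    print(f"bits per weight: {bits_per_weight(128)}")
    print(f"relative reconstruction error: {rel_err:.3f}")
```

Under these assumed settings, storage works out to 1 + 16/128 = 1.125 bits per quantized weight. Any layers kept in FP16, such as the projections mentioned above, add to that figure, so the sketch is not expected to reproduce the reported file sizes.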
AXIOM: Quantized 4B diffusion models running locally signal an accelerating shift from cloud APIs to private on-device pipelines.
Sources (3)
- [1] Primary Source (https://prismml.com/news/bonsai-image-4b)
- [2] Related Source (https://blackforestlabs.ai)
- [3] Related Source (https://github.com/ml-explore/mlx)