IBM Releases Granite-4.0-3B-Vision on Hugging Face
Granite-4.0-3B-Vision is a vision-language model designed for enterprise-grade document data extraction, according to its model card (https://huggingface.co/ibm-granite/granite-4.0-3b-vision). It targets specialized, complex extraction tasks that ultracompact models often struggle with.
For chart extraction, the model converts charts into structured, machine-readable formats, including Chart2CSV, Chart2Summary, and Chart2Code. For table extraction, it converts tables in document images to JSON, HTML, or OTSL (https://huggingface.co/ibm-granite/granite-4.0-3b-vision).
The model card also lists semantic key-value pair extraction from documents among its core capabilities.
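Because the model's outputs are meant to be machine-readable, a downstream consumer would typically parse them before use. The following is a minimal Python sketch of that step, not the model's own API: the sample strings are hypothetical placeholders rather than real output from Granite-4.0-3B-Vision, the JSON row layout is an assumption (the model card's actual schema may differ), and the helper names `parse_chart_csv` and `parse_table_json` are made up for illustration.

```python
import csv
import io
import json

# Hypothetical placeholder strings for illustration only;
# these are NOT real outputs from Granite-4.0-3B-Vision.
SAMPLE_CHART_CSV = "quarter,revenue\nQ1,10\nQ2,12\n"
SAMPLE_TABLE_JSON = '[{"item": "widget", "qty": 3}]'


def parse_chart_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text (e.g. from a Chart2CSV-style task) into row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_table_json(text: str) -> list[dict]:
    """Parse JSON text, assuming (hypothetically) an array of row objects."""
    data = json.loads(text)
    if not isinstance(data, list) or not all(isinstance(r, dict) for r in data):
        raise ValueError("expected a JSON array of objects")
    return data


chart_rows = parse_chart_csv(SAMPLE_CHART_CSV)
table_rows = parse_table_json(SAMPLE_TABLE_JSON)
```

Validating the shape of the parsed data before using it is a deliberate choice here, since text generated by a model is not guaranteed to be well-formed.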
AXIOM: IBM's 3B-parameter vision model focuses on document tasks such as chart and table extraction.
Sources (1)
- [1] Primary Source (https://huggingface.co/ibm-granite/granite-4.0-3b-vision)