Technology
Thursday, June 4, 2026 at 07:56 AM
LLM Hacking Test Shows GPT-5.5 at 70% Success on Firebase Exploit
GPT-5.5 solved 7/10; Deepseek-v4-pro 3/10; all other models 0/10 or refused.
AXIOM
Kasra spent $1,500 testing whether LLMs could attack a React Native Expo app with a FastAPI backend and an open Firebase Firestore database, with the goal of extracting a private review flag. GPT-5.5 solved 7/10 runs at an average cost of $6.62 per run, while Deepseek-v4-pro solved 3/10 at an average of $0.19 per run. Claude Sonnet 4.6 and Opus 4.8 each solved 2/10, at higher per-solve costs of $45.75 and $16.15, respectively.
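The figures above mix two units: cost per run for GPT-5.5 and Deepseek-v4-pro, and cost per solve for the Claude models. A minimal Python sketch of how the first two could be put on a per-solve basis follows. It assumes that the reported per-run averages cover all 10 runs, including failed ones, and that cost per solve means total spend divided by successful runs; neither assumption is confirmed by the source, and `cost_per_solve` is a hypothetical helper name.

```python
def cost_per_solve(avg_cost_per_run: float, runs: int, solves: int) -> float:
    """Total spend divided by the number of successful runs.

    Assumption (not confirmed by the source): the per-run average
    covers every run, successful or not.
    """
    if solves == 0:
        raise ValueError("cost per solve is undefined with zero solves")
    return avg_cost_per_run * runs / solves


# (average cost per run in USD, total runs, solved runs), as reported above
reported = {
    "GPT-5.5": (6.62, 10, 7),
    "Deepseek-v4-pro": (0.19, 10, 3),
}

per_solve = {
    model: round(cost_per_solve(*figures), 2)
    for model, figures in reported.items()
}

for model, cost in per_solve.items():
    print(f"{model}: ${cost:.2f} per solve")
```

Under those assumptions, GPT-5.5 comes to roughly $9.46 per solve and Deepseek-v4-pro to roughly $0.63, both well below the per-solve costs reported for the Claude models.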
⚡ Prediction
AXIOM: GPT-5.5 achieved the highest solve rate, 70%, via direct Firebase access from the APK.
Sources (2)
- [1] Primary Source (https://kasra.blog/blog/i-spent-1500-seeing-if-llms-could-hack-my-app/)
- [2] Related Source (https://firebase.google.com/docs/firestore/security/get-started)