RareDxR1 Reaches SOTA Rare Disease Diagnosis Accuracy via End-to-End RL Without Phenotype Ontologies
RareDxR1 demonstrates fully autonomous rare-disease reasoning from raw clinical notes. It surpasses retrieval-based and ontology-based pipelines on published benchmarks by relying on internalized knowledge and synthetic expert trajectories. The result tightens the link between scalable RL reasoning methods and clinical deployment requirements.
RareDxR1 trains an LLM end-to-end on unstructured clinical notes using knowledge internalization followed by autonomous evolutionary learning. The pipeline eliminates separate phenotype extraction and retrieval steps. Dual-level curriculum reinforcement learning then refines diagnostic trajectories generated by RERS, which produces expert-like chains by learning from model failures rather than human labels.
On standard rare-disease differential diagnosis suites, the model exceeds prior retrieval-augmented and ontology-constrained baselines by double-digit margins in top-1 and top-5 accuracy. Ablations confirm that parameter-internalized knowledge combined with reflection sampling accounts for the largest gains, while RAG approaches plateau because of retrieval bottlenecks and information loss.
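For readers unfamiliar with the metrics, top-1 and top-5 accuracy can be computed from ranked differential lists as in the minimal sketch below. The function name `top_k_accuracy`, the toy predictions, and the gold labels are illustrative assumptions; they are not taken from the RareDxR1 code or benchmarks.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of cases whose gold diagnosis appears among the first k ranked candidates."""
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        raise ValueError("need at least one case")
    hits = sum(1 for preds, truth in zip(ranked, gold) if truth in preds[:k])
    return hits / len(gold)


# Hypothetical toy data: one ranked differential per clinical note.
ranked_predictions = [
    ["Fabry disease", "Gaucher disease", "Pompe disease"],
    ["Marfan syndrome", "Ehlers-Danlos syndrome", "Loeys-Dietz syndrome"],
    ["Wilson disease", "Hemochromatosis", "Alpha-1 antitrypsin deficiency"],
    ["Pompe disease", "Fabry disease", "Gaucher disease"],
]
gold_diagnoses = [
    "Fabry disease",
    "Ehlers-Danlos syndrome",
    "Menkes disease",
    "Gaucher disease",
]

top1 = top_k_accuracy(ranked_predictions, gold_diagnoses, k=1)
top5 = top_k_accuracy(ranked_predictions, gold_diagnoses, k=5)
print(f"top-1 = {top1:.2f}, top-5 = {top5:.2f}")
```

A "double-digit margin" in this sense means the difference between two systems' scores on the same case set is at least 10 percentage points. The sketch assumes exact string matching of diagnosis names; a real evaluation would likely need normalization of disease synonyms.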
The work connects directly to reasoning models such as OpenAI o1 and self-refine loops, yet applies them to a domain where closed-set classifiers have historically failed. It also surfaces an unaddressed scaling question: whether internalized rare-disease parameters remain stable when the model is further post-trained on common conditions.
Public release of the code and dataset is scheduled. Prospective multi-site EHR validation and integration latency measurements are the logical next steps before any regulatory filing.
Prediction: RareDxR1 surpasses 85% top-1 accuracy in prospective multi-center EHR validation within 24 months of code release.
Sources (3)
- [1] Primary Source (https://arxiv.org/abs/2607.00147)
- [2] Supporting Source (https://arxiv.org/abs/2303.08774)
- [3] Supporting Source (https://www.nature.com/articles/s41591-023-02445-9)