arXiv:2604.01193: Simple Self-Distillation for Code Generation
The work introduces a self-distillation process in which a language model generates candidate code solutions that are then filtered and used to retrain the same model, and reports pass@1 gains on HumanEval and MBPP without any external teacher model (arXiv:2604.01193). Experiments show absolute improvements of 4.3-7.1 percentage points across model scales from 1B to 13B parameters, with the method requiring only two additional training epochs (arXiv:2604.01193).
The described pipeline relies on the model's own temperature-sampled outputs and a simple execution-based verifier, differing from prior distillation approaches that depend on larger proprietary teachers such as Codex or GPT-4 (arXiv:2604.01193). Results are presented for both base and instruction-tuned variants, showing consistent uplift independent of initial model strength (arXiv:2604.01193).
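The generate-verify-retrain data construction step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `sample_fn`, the task format, and the subprocess-based verifier are all assumptions made here for clarity.

```python
import os
import subprocess
import sys
import tempfile

def passes_tests(candidate_code: str, test_code: str, timeout: float = 5.0) -> bool:
    """Execution-based verifier (illustrative): run the candidate together
    with its unit tests in a subprocess and accept it only if the process
    exits cleanly. The paper's exact harness is not specified here."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code + "\n" + test_code + "\n")
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=timeout)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.unlink(path)

def build_self_distillation_set(sample_fn, tasks, k=8):
    """For each task, draw k temperature samples from the model itself
    (via the caller-supplied sample_fn) and keep the first candidate
    the verifier accepts as a retraining example."""
    kept = []
    for task in tasks:
        for candidate in sample_fn(task["prompt"], k):
            if passes_tests(candidate, task["tests"]):
                kept.append({"prompt": task["prompt"], "completion": candidate})
                break  # one verified solution per task
    return kept
```

The filtered pairs would then be used for the two extra fine-tuning epochs the paper reports; no larger teacher model is involved at any point.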
The authors limit discussion to code generation tasks and do not extend claims to general reasoning or other modalities, providing exact hyperparameter settings and ablation studies on sampling temperature and filtering thresholds (arXiv:2604.01193).
Sources (1)
- [1] Simple self-distillation improves code generation (https://arxiv.org/abs/2604.01193)