Classical Simulations Help AI Models Learn Chemistry From Far Less Expensive Quantum Data
Transfer learning from cheap classical simulations to quantum data makes GNN interatomic potentials far more data-efficient.
A new preprint on arXiv introduces Transfer-PaiNN (T-PaiNN), a transfer learning framework that pretrains graph neural network models on large datasets generated by cheap classical force fields before fine-tuning them on much smaller sets of density functional theory (DFT) data. Tested on the QM9 molecular dataset and on condensed-phase liquid water simulations, the method achieved up to 25-fold error reductions in low-data regimes and improved predictions of energies, forces, density, and diffusion over models trained only on quantum data; the abstract does not detail the exact sample sizes of the classical pretraining data. The work is a preprint (https://arxiv.org/abs/2603.24752) that has not yet been peer-reviewed, and its limitations include the need for validation on more complex chemical systems.
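The two-stage recipe is straightforward to sketch. Below is a minimal, illustrative PyTorch sketch of the classical-to-quantum transfer idea: pretrain on abundant cheap labels, then fine-tune on a small expensive set. The tiny MLP, random tensors, learning rates, and loop counts are placeholder assumptions for illustration, not the paper's actual PaiNN architecture or training setup.

```python
# Illustrative sketch only: a small MLP stands in for a PaiNN-style GNN,
# and random tensors stand in for molecular descriptors and energy labels
# from a classical force field (pretraining) and DFT (fine-tuning).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in potential: descriptor vector -> scalar energy.
model = nn.Sequential(nn.Linear(32, 64), nn.SiLU(), nn.Linear(64, 1))
loss_fn = nn.MSELoss()

# Stage 1: pretrain on abundant, cheap classical force-field labels.
x_cls = torch.randn(10_000, 32)   # placeholder descriptors
y_cls = torch.randn(10_000, 1)    # placeholder classical energies
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss_fn(model(x_cls), y_cls).backward()
    opt.step()

# Stage 2: fine-tune on a much smaller DFT set. A lower learning rate
# (optionally with frozen early layers) refines the pretrained
# representation instead of overwriting it.
x_dft = torch.randn(200, 32)      # placeholder DFT descriptors
y_dft = torch.randn(200, 1)       # placeholder DFT energies
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for _ in range(50):
    opt.zero_grad()
    loss_fn(model(x_dft), y_dft).backward()
    opt.step()
```

The payoff of this pattern is that the expensive DFT set only has to correct the pretrained model's systematic errors, not teach it chemistry from scratch, which is why far fewer quantum-accuracy labels suffice.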
HELIX: This means researchers can build accurate AI chemistry tools using much less supercomputer time, which could speed up the discovery of new medicines and materials for everyday life.
Sources (1)
- [1] Autotuning T-PaiNN: Enabling Data-Efficient GNN Interatomic Potential Development via Classical-to-Quantum Transfer Learning (https://arxiv.org/abs/2603.24752)