In silico materials design has long faced a fundamental tradeoff between accuracy, universality, and efficiency. In 2022, we pioneered the concept of a universal machine learning interatomic potential (UMLIP) [Chen & Ong, Nat. Comput. Sci., 2022, 2, 718–728], a foundational materials model (FMM) with comprehensive coverage of the periodic table. FMMs enable accurate, large-scale simulations across a broad spectrum of materials, offering transformative potential for materials discovery and design. More recently, the field has seen a trend toward increasingly complex FMM architectures, often with over 10 million parameters, trained on datasets exceeding 100 million structures. This trend has been driven largely by major tech companies such as Google DeepMind and Microsoft, as well as various startups. In this talk, I challenge the prevailing “bigger is better” paradigm in FMM development. I will present MatPES, a foundational, community-curated potential energy surface (PES) dataset of ~400,000 structures. Leveraging MatPES, we demonstrate that gains in FMM performance are primarily driven by data quality, and that there is no “accuracy moat” in FMM architectures. Models trained on MatPES match or exceed the accuracy of previous FMMs across a diverse set of equilibrium, near-equilibrium, and dynamic benchmarks. Finally, I will argue that the key priorities in architectural and algorithmic development should be the parallelization and scaling of such FMMs on high-performance computing systems and their integration into high-throughput materials workflows.
Shyue Ping Ong, Aaron D. Kaplan, Runze Liu, Ji Qi, Tsz Wai Ko, Bowen Deng, Gerbrand Ceder, Kristin A. Persson (2025). Rightsizing AI Models and Datasets for Materials Design. ECS Meeting Abstracts, MA2025-02(7). DOI: https://doi.org/10.1149/ma2025-027989mtgabs
Type: Article
Year: 2025
Authors: 8
DOI: https://doi.org/10.1149/ma2025-027989mtgabs