Raw Data Library
Article
2025

Rightsizing AI Models and Datasets for Materials Design

English
Vol. MA2025-02 (7)
DOI: 10.1149/ma2025-027989mtgabs

Authors: Shyue Ping Ong, Aaron D. Kaplan, Runze Liu, Ji Qi, Tsz Wai Ko, Bowen Deng, Gerbrand Ceder (University of California, Berkeley), Kristin A. Persson

Abstract

In silico materials design has long faced a fundamental tradeoff between accuracy, universality, and efficiency. In 2022, we pioneered the concept of a universal machine learning interatomic potential (UMLIP) [Chen & Ong, Nat. Comput. Sci., 2022, 2, 718–728], a foundational materials model (FMM) with comprehensive coverage of the periodic table. FMMs enable accurate, large-scale simulations across a broad spectrum of materials, offering transformative potential for materials discovery and design. More recently, the field has seen a trend toward increasingly complex FMM architectures, often with over 10 million parameters, trained on datasets exceeding 100 million structures, driven largely by major tech companies such as Google DeepMind and Microsoft and by various startups. In this talk, I challenge the prevailing “bigger is better” paradigm in FMM development. I will present MatPES, a foundational, community-curated potential energy surface (PES) dataset of ~400,000 structures. Leveraging MatPES, we demonstrate that gains in FMM performance are primarily driven by data quality and that there is no “accuracy moat” in FMM architectures. Models trained on MatPES match or exceed the accuracy of previous FMMs across a diverse set of equilibrium, near-equilibrium, and dynamic benchmarks. Finally, I will argue that the key priorities in architectural and algorithmic development should be the parallelization and scaling of such FMMs in high-performance computing and their integration into high-throughput materials workflows.

How to cite this publication

Shyue Ping Ong, Aaron D. Kaplan, Runze Liu, Ji Qi, Tsz Wai Ko, Bowen Deng, Gerbrand Ceder, Kristin A. Persson (2025). Rightsizing AI Models and Datasets for Materials Design. ECS Meeting Abstracts, MA2025-02(7). DOI: https://doi.org/10.1149/ma2025-027989mtgabs


Publication Details

Type: Article
Year: 2025
Authors: 8
Datasets: 0
Total Files: 0
DOI: https://doi.org/10.1149/ma2025-027989mtgabs
