13 Efficient Data Utilization in Training Machine Learning Models for Nanoporous Materials Screening

Diego A. Gómez-Gualdrón1, Cory M. Simon2, and Yamil J. Colón3

1 Chemical and Biological Engineering, Colorado School of Mines, Colorado, USA
2 School of Chemical, Biological, and Environmental Engineering, Oregon State University, Oregon, USA
3 Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana, USA

Machine learning is becoming a key tool in the study of nanoporous materials and is expected to keep playing a crucial role in the discovery of new materials for the foreseeable future. Machine learning is, however, a “data hungry” approach, and its success in any field depends on the ability of the relevant research community to generate and use data as efficiently as possible. In the past ten years, the nanoporous materials community has seen an explosion in the availability of data, mostly from molecular simulations used to calculate adsorption properties across nanoporous materials databases. The prediction of adsorption properties has thus served as a natural “testbed” for applying machine learning to nanoporous materials discovery. Some perspective is needed, though: while “big data” in other areas (e.g., social media) refers to billions of datapoints, in the nanoporous materials community, data generation (even for adsorption) has rarely ...

From AI-Guided Design and Property Prediction for Zeolites and Nanoporous Materials.
