-
High-Tc superconductor candidates proposed by machine learning
Authors:
Siwoo Lee,
Jason Hattrick-Simpers,
Young-June Kim,
O. Anatole von Lilienfeld
Abstract:
We cast the relation between chemical compositions of solid-state materials and their superconducting critical temperature (Tc) in terms of a statistical learning problem with reduced complexity. Training of query-aware similarity-based ridge regression models on experimental SuperCon data with (implicit) and without (ambient) high pressure entries achieves average Tc prediction errors of ~10 K for unseen out-of-sample materials. Subsequent utilization of the approach to scan ~153k materials in the Materials Project enables the ranking of candidates by Tc while taking into account thermodynamic stability and small band gap. Stable top three high-Tc candidate materials with large band gaps for implicit and ambient pressures are predicted to be Cs2Sn(H2N)6 (324 K), CsH5N2 (315 K), Rb2Sn(H2N)6 (305 K), and H15IrBr3N5 (189 K), H12OsN5Cl3O (161 K), B10H13I (151 K), respectively. Stable top three high-Tc candidate materials with small band gaps for implicit and ambient pressures are predicted to be RbLiH12Se3N4 (255 K), CeH14Cl3O7 (246 K), Li(H3N)4 (234 K), and ReH30Ru2(NCl)10 (127 K), AlH18Ru(NF)6 (120 K), Sr(Li2P)2 (117 K), respectively.
Submitted 20 June, 2024;
originally announced June 2024.
-
Probing out-of-distribution generalization in machine learning for materials
Authors:
Kangming Li,
Andre Niyongabo Rubungo,
Xiangyun Lei,
Daniel Persaud,
Kamal Choudhary,
Brian DeCost,
Adji Bousso Dieng,
Jason Hattrick-Simpers
Abstract:
Scientific machine learning (ML) endeavors to develop generalizable models with broad applicability. However, the assessment of generalizability is often based on heuristics. Here, we demonstrate in the materials science setting that heuristics-based evaluations lead to substantially biased conclusions of ML generalizability and benefits of neural scaling. We evaluate generalization performance in over 700 out-of-distribution tasks that feature new chemistry or structural symmetry not present in the training data. Surprisingly, good performance is found in most tasks and across various ML models including simple boosted trees. Analysis of the materials representation space reveals that most tasks contain test data that lie in regions well covered by training data, while poorly-performing tasks contain mainly test data outside the training domain. For the latter case, increasing training set size or training time has marginal or even adverse effects on the generalization performance, contrary to what the neural scaling paradigm assumes. Our findings show that most heuristically-defined out-of-distribution tests are not genuinely difficult and evaluate only the ability to interpolate. Evaluating on such tasks rather than the truly challenging ones can lead to an overestimation of generalizability and benefits of scaling.
Submitted 10 June, 2024;
originally announced June 2024.
-
Efficient first principles based modeling via machine learning: from simple representations to high entropy materials
Authors:
Kangming Li,
Kamal Choudhary,
Brian DeCost,
Michael Greenwood,
Jason Hattrick-Simpers
Abstract:
High-entropy materials (HEMs) have recently emerged as a significant category of materials, offering highly tunable properties. However, the scarcity of HEM data in existing density functional theory (DFT) databases, primarily due to computational expense, hinders the development of effective modeling strategies for computational materials discovery. In this study, we introduce an open DFT dataset of alloys and employ machine learning (ML) methods to investigate the material representations needed for HEM modeling. Utilizing high-throughput DFT calculations, we generate a comprehensive dataset of 84k structures, encompassing both ordered and disordered alloys across a spectrum of up to seven components and the entire compositional range. We apply descriptor-based models and graph neural networks to assess how material information is captured across diverse chemical-structural representations. We first evaluate the in-distribution performance of ML models to confirm their predictive accuracy. Subsequently, we demonstrate the capability of ML models to generalize between ordered and disordered structures, between low-order and high-order alloys, and between equimolar and non-equimolar compositions. Our findings suggest that ML models can generalize from cost-effective calculations of simpler systems to more complex scenarios. Additionally, we discuss the influence of dataset size and reveal that the information loss associated with the use of unrelaxed structures could significantly degrade the generalization performance. Overall, this research sheds light on several critical aspects of HEM modeling and offers insights for data-driven atomistic modeling of HEMs.
Submitted 22 March, 2024;
originally announced March 2024.
-
Accurate predictions of keyhole depths using machine learning-aided simulations
Authors:
Jiahui Zhang,
Runbo Jiang,
Kangming Li,
Pengyu Chen,
Xiao Shang,
Zhiying Liu,
Jason Hattrick-Simpers,
Brian J. Simonds,
Qianglong Wei,
Hongze Wang,
Tao Sun,
Anthony D. Rollett,
Yu Zou
Abstract:
The keyhole phenomenon is widely observed in laser materials processing, including laser welding, remelting, cladding, drilling, and additive manufacturing. Keyhole-induced defects, primarily pores, dramatically affect the performance of final products, impeding the broad use of these laser-based technologies. The formation of these pores is typically associated with the dynamic behavior of the keyhole. So far, the accurate characterization and prediction of keyhole features, particularly keyhole depth, as a function of time has been a challenging task. In situ characterization of keyhole dynamic behavior using a synchrotron X-ray is complicated and expensive. Current simulations are hindered by their poor accuracies in predicting keyhole depths due to the lack of real-time laser absorptance data. Here, we develop a machine learning-aided simulation method that allows us to accurately predict keyhole depth over a wide range of processing parameters. Based on titanium and aluminum alloys, two commonly used engineering materials as examples, we achieve an accuracy with an error margin of 10 %, surpassing those simulated using other existing models (with an error margin in a range of 50-200 %). Our machine learning-aided simulation method is affordable and readily deployable for a large variety of materials, opening new doors to eliminate or reduce defects for a wide range of laser materials processing techniques.
Submitted 25 February, 2024;
originally announced February 2024.
-
Roadmap on Data-Centric Materials Science
Authors:
Stefan Bauer,
Peter Benner,
Tristan Bereau,
Volker Blum,
Mario Boley,
Christian Carbogno,
C. Richard A. Catlow,
Gerhard Dehm,
Sebastian Eibl,
Ralph Ernstorfer,
Ádám Fekete,
Lucas Foppa,
Peter Fratzl,
Christoph Freysoldt,
Baptiste Gault,
Luca M. Ghiringhelli,
Sajal K. Giri,
Anton Gladyshev,
Pawan Goyal,
Jason Hattrick-Simpers,
Lara Kabalan,
Petr Karpov,
Mohammad S. Khorrami,
Christoph Koch,
Sebastian Kokott
, et al. (36 additional authors not shown)
Abstract:
Science is and always has been based on data, but the terms "data-centric" and the "4th paradigm of" materials research indicate a radical change in how information is retrieved, handled and research is performed. It signifies a transformative shift towards managing vast data collections, digital repositories, and innovative data analytics methods. The integration of Artificial Intelligence (AI) and its subset, Machine Learning (ML), has become pivotal in addressing all these challenges. This Roadmap on Data-Centric Materials Science explores fundamental concepts and methodologies, illustrating diverse applications in electronic-structure theory, soft matter theory, microstructure research, and experimental techniques like photoemission, atom probe tomography, and electron microscopy. While the roadmap delves into specific areas within the broad interdisciplinary field of materials science, the provided examples elucidate key concepts applicable to a wider range of topics. The discussed instances offer insights into addressing the multifaceted challenges encountered in contemporary materials research.
Submitted 1 May, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Artificial Intelligence-Enabled Optimization of Battery-Grade Lithium Carbonate Production
Authors:
S. Shayan Mousavi Masouleh,
Corey A. Sanz,
Ryan P. Jansonius,
Samuel Shi,
Maria J. Gendron Romero,
Jason E. Hein,
Jason Hattrick-Simpers
Abstract:
By 2035, the need for battery-grade lithium is expected to quadruple. About half of this lithium is currently sourced from brines and must be converted from a chloride into lithium carbonate (Li2CO3) through a process called softening. Conventional softening methods using sodium or potassium salts contribute to carbon emissions during reagent mining and battery manufacturing, exacerbating global warming. This study introduces an alternative approach using carbon dioxide (CO2(g)) as the carbonating reagent in the lithium softening process, offering a carbon capture solution. We employed an active learning-driven high-throughput method to rapidly capture CO2(g) and convert it to lithium carbonate. The model was simplified by focusing on the elemental concentrations of C, Li, and N for practical measurement and tracking, avoiding the complexities of ion speciation equilibria. This approach led to an optimized lithium carbonate process that capitalizes on CO2(g) capture and improves the battery metal supply chain's carbon efficiency.
Submitted 20 February, 2024; v1 submitted 10 February, 2024;
originally announced February 2024.
-
Reproducibility in Computational Materials Science: Lessons from 'A General-Purpose Machine Learning Framework for Predicting Properties of Inorganic Materials'
Authors:
Daniel Persaud,
Logan Ward,
Jason Hattrick-Simpers
Abstract:
The integration of machine learning techniques in materials discovery has become prominent in materials science research and has been accompanied by an increasing trend towards open-source data and tools to propel the field. Despite the increasing usefulness and capabilities of these tools, developers neglecting to follow reproducible practices creates a significant barrier for researchers looking to use or build upon their work. In this study, we investigate the challenges encountered while attempting to reproduce a section of the results presented in "A general-purpose machine learning framework for predicting properties of inorganic materials." Our analysis identifies four major categories of challenges: (1) reporting computational dependencies, (2) recording and sharing version logs, (3) sequential code organization, and (4) clarifying code references within the manuscript. The result is a proposed set of tangible action items for those aiming to make code accessible to, and useful for, the community.
Submitted 10 October, 2023;
originally announced October 2023.
-
Multi-principal element alloy discovery using directed energy deposition and machine learning
Authors:
Phalgun Nelaturu,
Jason R. Hattrick-Simpers,
Michael Moorehead,
Vrishank Jambur,
Izabela Szlufarska,
Adrien Couet,
Dan J. Thoma
Abstract:
Multi-principal element alloys open large composition spaces for alloy development. The large compositional space necessitates rapid synthesis and characterization to identify promising materials, as well as predictive strategies for alloy design. Additive manufacturing via directed energy deposition is demonstrated as a high-throughput technique for synthesizing alloys in the Cr-Fe-Mn-Ni quaternary system. More than 100 compositions are synthesized in a week, exploring a broad range of compositional space. Uniform compositional control to within +/-5 at% is achievable. The rapid synthesis is combined with conjoint sample heat treatment (25 samples vs 1 sample), and automated characterization including X-ray diffraction, energy-dispersive X-ray spectroscopy, and nano-hardness measurements. The datasets of measured properties are then used for a predictive strengthening model using an active machine learning algorithm that balances exploitation and exploration. A learned parameter that represents lattice distortion is trained using the alloy compositions. This combination of rapid synthesis, characterization, and active learning model results in new alloys that are significantly stronger than previously investigated alloys.
Submitted 6 October, 2023;
originally announced October 2023.
-
JARVIS-Leaderboard: A Large Scale Benchmark of Materials Design Methods
Authors:
Kamal Choudhary,
Daniel Wines,
Kangming Li,
Kevin F. Garrity,
Vishu Gupta,
Aldo H. Romero,
Jaron T. Krogel,
Kayahan Saritas,
Addis Fuhr,
Panchapakesan Ganesh,
Paul R. C. Kent,
Keqiang Yan,
Yuchao Lin,
Shuiwang Ji,
Ben Blaiszik,
Patrick Reiser,
Pascal Friederich,
Ankit Agrawal,
Pratyush Tiwary,
Eric Beyerle,
Peter Minch,
Trevor David Rhone,
Ichiro Takeuchi,
Robert B. Wexler,
Arun Mannodi-Kanakkithodi
, et al. (13 additional authors not shown)
Abstract:
Lack of rigorous reproducibility and validation are major hurdles for scientific development across many fields. Materials science in particular encompasses a variety of experimental and theoretical approaches that require careful benchmarking. Leaderboard efforts have been developed previously to mitigate these issues. However, a comprehensive comparison and benchmarking on an integrated platform with multiple data modalities with both perfect and defect materials data is still lacking. This work introduces JARVIS-Leaderboard, an open-source and community-driven platform that facilitates benchmarking and enhances reproducibility. The platform allows users to set up benchmarks with custom tasks and enables contributions in the form of dataset, code, and meta-data submissions. We cover the following materials design categories: Artificial Intelligence (AI), Electronic Structure (ES), Force-fields (FF), Quantum Computation (QC) and Experiments (EXP). For AI, we cover several types of input data, including atomic structures, atomistic images, spectra, and text. For ES, we consider multiple ES approaches, software packages, pseudopotentials, materials, and properties, comparing results to experiment. For FF, we compare multiple approaches for material property predictions. For QC, we benchmark Hamiltonian simulations using various quantum algorithms and circuits. Finally, for experiments, we use the inter-laboratory approach to establish benchmarks. There are 1281 contributions to 274 benchmarks using 152 methods with more than 8 million data-points, and the leaderboard is continuously expanding. The JARVIS-Leaderboard is available at the website: https://pages.nist.gov/jarvis_leaderboard
Submitted 26 March, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
AutoEIS: automated Bayesian model selection and analysis for electrochemical impedance spectroscopy
Authors:
Runze Zhang,
Robert Black,
Debashish Sur,
Parisa Karimi,
Kangming Li,
Brian DeCost,
John Scully,
Jason Hattrick-Simpers
Abstract:
Electrochemical Impedance Spectroscopy (EIS) is a powerful tool for electrochemical analysis; however, its data can be challenging to interpret. Here, we introduce a new open-source tool named AutoEIS that assists EIS analysis by automatically proposing statistically plausible equivalent circuit models (ECMs). AutoEIS does this without requiring an exhaustive mechanistic understanding of the electrochemical systems. We demonstrate the generalizability of AutoEIS by using it to analyze EIS datasets from three distinct electrochemical systems, including thin-film oxygen evolution reaction (OER) electrocatalysis, corrosion of self-healing multi-principal component alloys, and a carbon dioxide reduction electrolyzer device. In each case, AutoEIS identified competitive or in some cases superior ECMs to those recommended by experts and provided statistical indicators of the preferred solution. The results demonstrated AutoEIS's capability to facilitate EIS analysis without expert labels while diminishing user bias in a high-throughput manner. AutoEIS provides a generalized automated approach to facilitate EIS analysis spanning a broad suite of electrochemical applications with minimal prior knowledge of the system required. This tool holds great potential in improving the efficiency, accuracy, and ease of EIS analysis and thus creates an avenue to the widespread use of EIS in accelerating the development of new electrochemical materials and devices.
Submitted 24 May, 2023; v1 submitted 8 May, 2023;
originally announced May 2023.
-
On the redundancy in large material datasets: efficient and robust learning with less data
Authors:
Kangming Li,
Daniel Persaud,
Kamal Choudhary,
Brian DeCost,
Michael Greenwood,
Jason Hattrick-Simpers
Abstract:
Extensive efforts to gather materials data have largely overlooked potential data redundancy. In this study, we present evidence of a significant degree of redundancy across multiple large datasets for various material properties, by revealing that up to 95 % of data can be safely removed from machine learning training with little impact on in-distribution prediction performance. The redundant data is related to over-represented material types and does not mitigate the severe performance degradation on out-of-distribution samples. In addition, we show that uncertainty-based active learning algorithms can construct much smaller but equally informative datasets. We discuss the effectiveness of informative data in improving prediction performance and robustness and provide insights into efficient data acquisition and machine learning training. This work challenges the "bigger is better" mentality and calls for attention to the information richness of materials data rather than a narrow emphasis on data volume.
Submitted 25 July, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Why is EXAFS analysis for multicomponent metals so hard? Challenges and opportunities for measuring ordering in complex concentrated alloys using x-ray absorption spectroscopy
Authors:
Howie Joress,
Bruce Ravel,
Elaf Anber,
Jonathan Hollenbach,
Debashish Sur,
Jason Hattrick-Simpers,
Mitra L. Taheri,
Brian DeCost
Abstract:
Short range order is a critical driver of properties (e.g. corrosion resistance and tensile strength) in multicomponent alloys such as complex concentrated alloys (CCAs). Extended x-ray absorption fine structure (EXAFS) is a powerful technique well suited for quantifying this short range order. Here, we describe in detail the characteristics of CCAs that make the already challenging task of analyzing EXAFS data even more difficult. We then illustrate novel paths towards robust and scalable quantitative SRO analysis which will accelerate the scientific understanding and development of CCAs.
Submitted 22 March, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
A High Throughput Aqueous Passivation Testing Methodology for Compositionally Complex Alloys using Scanning Droplet Cell
Authors:
Debashish Sur,
Howie Joress,
Jason Hattrick-Simpers,
John R. Scully
Abstract:
Compositionally complex alloy systems containing more than five principal elements allow exploring a wide range of compositions, processing, and structural variables with the hope of identifying unique properties. Such opportunities also apply to designing materials for improved corrosion resistance, regulated by a self-healing passive film. Such a rich landscape in reactivity and protectivity demands the search for high-throughput experimental testing workflows to uncover key metrics, indicative of superior properties. In this communication, one such methodology is demonstrated for evaluating passivation performance of a combinatorial library of Al0.7-x-yCoxCryFe0.15Ni0.15 thin film alloys in deaerated 0.1 mol/L H2SO4(aq), using a scanning droplet cell.
Submitted 18 September, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
An experimental high-throughput to high-fidelity study towards discovering Al-Cr containing corrosion-resistant compositionally complex alloys
Authors:
Debashish Sur,
Emily F. Holcombe,
William H. Blades,
Elaf A. Anber,
Daniel L. Foley,
Brian L. DeCost,
Jing Liu,
Jason Hattrick-Simpers,
Karl Sieradzki,
Howie Joress,
John R. Scully,
Mitra L. Taheri
Abstract:
Compositionally complex alloys hold the promise of simultaneously attaining superior combinations of properties, such as corrosion resistance, light-weighting, and strength. Achieving this goal is a challenge due in part to a large number of possible compositions and structures in the vast alloy design space. High-throughput methods offer a path forward, but a strong connection between the synthesis of an alloy of a given composition and structure with its properties has not been fully realized to date. Here, we present the rapid identification of corrosion-resistant alloys based on combinations of Al and Cr in a base Al-Co-Cr-Fe-Ni alloy. Previously unstudied alloy stoichiometries were identified using a combination of high-throughput experimental screening coupled with key metallurgical and electrochemical corrosion tests, identifying alloys with excellent passivation behavior. The alloy native oxide performance and its self-healing attributes were probed using rapid tests in deaerated 0.1 mol/L H2SO4. Importantly, a correlation was found between the electrochemical impedance modulus of the exposure-modified air-formed film and self-healing rate of the CCAs. Multi-element extended x-ray absorption fine structure analyses connected more ordered type chemical short-range order in the Ni-Al 1st nearest-neighbor shell to poorer corrosion resistance. This report underscores the utility of high throughput exploration of compositionally complex alloys for the identification and rapid screening of a vast stoichiometric space.
Submitted 19 March, 2024; v1 submitted 15 February, 2023;
originally announced February 2023.
-
Rapid Reconstruction of 3-D Membrane Pore Structure Using a Single 2-D Micrograph
Authors:
Hooman Chamani,
Arash Rabbani,
Kaitlyn P. Russell,
Andrew L. Zydney,
Enrique D. Gomez,
Jason Hattrick-Simpers,
Jay R. Werber
Abstract:
Conventional 2-D scanning electron microscopy (SEM) is commonly used to rapidly and qualitatively evaluate membrane pore structure. Quantitative 2-D analyses of pore sizes can be extracted from SEM, but without information about 3-D spatial arrangement and connectivity, which are crucial to the understanding of membrane pore structure. Meanwhile, experimental 3-D reconstruction via tomography is complex, expensive, and not easily accessible. Here, we employ data-science tools to demonstrate a proof-of-principle reconstruction of the 3-D structure of a membrane using a single 2-D image pulled from a 3-D tomographic data set. The reconstructed and experimental 3-D structures were then directly compared, with important properties such as mean pore radius, mean throat radius, coordination number and tortuosity differing by less than 15%. The developed algorithm will dramatically improve the ability of the membrane community to characterize membranes, accelerating the design and synthesis of membranes with desired structural and transport properties.
Submitted 25 January, 2023;
originally announced January 2023.
-
A critical examination of robustness and generalizability of machine learning prediction of materials properties
Authors:
Kangming Li,
Brian DeCost,
Kamal Choudhary,
Michael Greenwood,
Jason Hattrick-Simpers
Abstract:
Recent advances in machine learning (ML) methods have led to substantial improvement in materials property prediction against community benchmarks, but an excellent benchmark score may not imply good generalization of performance. Here we show that ML models trained on the Materials Project 2018 (MP18) dataset can have severely degraded prediction performance on new compounds in the Materials Project 2021 (MP21) dataset. We document performance degradation in graph neural networks and traditional descriptor-based ML models for both quantitative and qualitative predictions. We find the source of the predictive degradation is due to the distribution shift between the MP18 and MP21 versions. This is revealed by the uniform manifold approximation and projection (UMAP) of the feature space. We then show that the performance degradation issue can be foreseen using a few simple tools. Firstly, the UMAP can be used to investigate the connectivity and relative proximity of the training and test data within feature space. Secondly, the disagreement between multiple ML models on the test data can illuminate out-of-distribution samples. We demonstrate that the simple yet efficient UMAP-guided and query-by-committee acquisition strategies can greatly improve prediction accuracy through adding only 1% of the test data. We believe this work provides valuable insights for building materials databases and ML models that enable better prediction robustness and generalizability.
Submitted 24 October, 2022;
originally announced October 2022.
-
Development of an automated millifluidic platform and data-analysis pipeline for rapid electrochemical corrosion measurements: a pH study on Zn-Ni
Authors:
Howie Joress,
Brian DeCost,
Najlaa Hassan,
Trevor M. Braun,
Justin M. Gorham,
Jason Hattrick-Simpers
Abstract:
We describe the development of a millifluidic-based scanning droplet cell platform for rapid and automated corrosion measurements. This system allows for measurement of corrosion properties (e.g., open circuit potential, corrosion current through Tafel and linear polarization resistance measurements, and cyclic voltammograms) on a localized section of a planar sample. Our system is highly automated and flexible, allowing for scripted changing and mixing of solutions and point-to-point motion on the sample. We have also created an automated data analysis pipeline. Here we demonstrate this tool by corroding a plate of electroplated Zn$_{85}$Ni$_{15}$ alloy over a range of pH values and correlating our results with XPS measurements and literature.
Submitted 1 April, 2022;
originally announced April 2022.
-
Towards automated design of corrosion resistant alloy coatings with an autonomous scanning droplet cell
Authors:
Brian DeCost,
Howie Joress,
Suchismita Sarker,
Apurva Mehta,
Jason Hattrick-Simpers
Abstract:
We present an autonomous scanning droplet cell platform designed for on-demand alloy electrodeposition and real-time electrochemical characterization for investigating the corrosion-resistance properties of multicomponent alloys. Automation and machine learning are currently driving rapid innovation in high throughput and autonomous materials design and discovery. We present two alloy design case studies: one focusing on a multi-objective corrosion resistant alloy optimization, and a case study highlighting the complexity of the multimodal characterization needed to provide insight into the underlying structural and chemical factors that drive observed material behavior. This motivates a close coupling between autonomous research platforms and scientific machine learning methodology that blends mechanistic physical models and black box machine learning models. This emerging research area presents new opportunities to accelerate materials synthesis, evaluation, and hence discovery and design.
Submitted 31 March, 2022;
originally announced March 2022.
-
Reflections on the future of machine learning for materials research
Authors:
Naohiro Fujinuma,
Brian L. DeCost,
Jason Hattrick-Simpers,
Samuel E. Lofland
Abstract:
Applied machine learning (ML) has rapidly spread throughout the physical sciences; in fact, ML-based data analysis and experimental decision-making have become commonplace. We suggest a shift in the conversation from proving that ML can be used to evaluating how to equitably and effectively implement ML for science. We advocate a shift from a "more data, more compute" mentality to a model-oriented approach that prioritizes using machine learning to support the ecosystem of computational models and experimental measurements. We also recommend an open conversation about dataset bias to stabilize productive research through careful model interrogation and deliberate exploitation of known biases. Further, we encourage the community to develop ML methods that connect experiments with theoretical models to increase scientific understanding rather than incrementally optimizing materials. Moreover, we envision a future of radical materials innovations enabled by computational creativity tools combined with online visualization and analysis tools that support active outside-the-box thinking inside the scientific knowledge feedback loop. Finally, as a community we must acknowledge ethical issues that can arise from blindly following machine learning predictions and the issues of social equity that will arise if data, code, and computational resources are not readily available to all.
Submitted 17 December, 2021;
originally announced December 2021.
-
Accelerated Discovery of Molten Salt Corrosion-resistant Alloy by High-throughput Experimental and Modeling Methods Coupled to Data Analytics
Authors:
Yafei Wang,
Bonita Goh,
Phalgun Nelaturu,
Thien Duong,
Najlaa Hassan,
Raphaelle David,
Michael Moorehead,
Santanu Chaudhuri,
Adam Creuziger,
Jason Hattrick-Simpers,
Dan J. Thoma,
Kumar Sridharan,
Adrien Couet
Abstract:
Insufficient availability of molten salt corrosion-resistant alloys severely limits the fruition of a variety of promising molten salt technologies that could otherwise have significant societal impacts. To accelerate alloy development for molten salt applications and develop fundamental understanding of corrosion in these environments, here we present an integrated approach using a set of high-throughput alloy synthesis, corrosion testing, and modeling methods coupled with automated characterization and machine learning. Using this approach, a broad range of Cr-Fe-Mn-Ni alloys were evaluated simultaneously for their corrosion resistance in molten salt, demonstrating that corrosion-resistant alloy development can be accelerated by thousands of times. Based on the obtained results, we unveiled a sacrificial mechanism in the corrosion of Cr-Fe-Mn-Ni alloys in molten salts, which can be applied to protect the less unstable elements in the alloy from being depleted, and provided new insights on the design of high-temperature molten salt corrosion-resistant alloys.
Submitted 20 April, 2021;
originally announced April 2021.
-
The Joint Automated Repository for Various Integrated Simulations (JARVIS) for data-driven materials design
Authors:
Kamal Choudhary,
Kevin F. Garrity,
Andrew C. E. Reid,
Brian DeCost,
Adam J. Biacchi,
Angela R. Hight Walker,
Zachary Trautt,
Jason Hattrick-Simpers,
A. Gilad Kusne,
Andrea Centrone,
Albert Davydov,
Jie Jiang,
Ruth Pachter,
Gowoon Cheon,
Evan Reed,
Ankit Agrawal,
Xiaofeng Qian,
Vinit Sharma,
Houlong Zhuang,
Sergei V. Kalinin,
Bobby G. Sumpter,
Ghanshyam Pilania,
Pinar Acar,
Subhasish Mandal,
Kristjan Haule
, et al. (3 additional authors not shown)
Abstract:
The Joint Automated Repository for Various Integrated Simulations (JARVIS) is an integrated infrastructure to accelerate materials discovery and design using density functional theory (DFT), classical force-fields (FF), and machine learning (ML) techniques. JARVIS is motivated by the Materials Genome Initiative (MGI) principles of developing open-access databases and tools to reduce the cost and development time of materials discovery, optimization, and deployment. The major features of JARVIS are: JARVIS-DFT, JARVIS-FF, JARVIS-ML, and JARVIS-Tools. To date, JARVIS consists of 40,000 materials and 1 million calculated properties in JARVIS-DFT, 1,500 materials and 110 force-fields in JARVIS-FF, and 25 ML models for material-property predictions in JARVIS-ML, all of which are continuously expanding. JARVIS-Tools provides scripts and workflows for running and analyzing various simulations. We compare our computational data to experiments or high-fidelity computational methods wherever applicable to evaluate error/uncertainty in predictions. In addition to the existing workflows, the infrastructure can support a wide variety of other technologically important applications as part of the data-driven materials design paradigm. The JARVIS datasets and tools are publicly available at the website: https://jarvis.nist.gov.
Submitted 11 July, 2021; v1 submitted 3 July, 2020;
originally announced July 2020.
-
On-the-fly Closed-loop Autonomous Materials Discovery via Bayesian Active Learning
Authors:
A. Gilad Kusne,
Heshan Yu,
Changming Wu,
Huairuo Zhang,
Jason Hattrick-Simpers,
Brian DeCost,
Suchismita Sarker,
Corey Oses,
Cormac Toher,
Stefano Curtarolo,
Albert V. Davydov,
Ritesh Agarwal,
Leonid A. Bendersky,
Mo Li,
Apurva Mehta,
Ichiro Takeuchi
Abstract:
Active learning, the field of machine learning (ML) dedicated to optimal experiment design, has played a part in science as far back as the 18th century, when Laplace used it to guide his discovery of celestial mechanics [1]. In this work we focus a closed-loop, active learning-driven autonomous system on another major challenge, the discovery of advanced materials against the exceedingly complex synthesis-processes-structure-property landscape. We demonstrate autonomous research methodology (i.e., autonomous hypothesis definition and evaluation) that can place complex, advanced materials in reach, allowing scientists to fail smarter, learn faster, and spend fewer resources in their studies, while simultaneously improving trust in scientific results and machine learning tools. Additionally, this robot science enables science-over-the-network, reducing the economic impact of scientists being physically separated from their labs. We used the real-time closed-loop, autonomous system for materials exploration and optimization (CAMEO) at the synchrotron beamline to accelerate the fundamentally interconnected tasks of rapid phase mapping and property optimization, with each cycle taking seconds to minutes, resulting in the discovery of a novel epitaxial nanocomposite phase-change memory material.
Submitted 10 November, 2020; v1 submitted 10 June, 2020;
originally announced June 2020.
-
Scientific AI in materials science: a path to a sustainable and scalable paradigm
Authors:
Brian DeCost,
Jason Hattrick-Simpers,
Zachary Trautt,
Aaron Kusne,
Eva Campo,
Martin Green
Abstract:
Recently there has been an ever-increasing trend in the use of machine learning (ML) and artificial intelligence (AI) methods by the materials science, condensed matter physics, and chemistry communities. This perspective article identifies key scientific, technical, and social opportunities that the materials community must prioritize to consistently develop and leverage Scientific AI to provide a credible path towards the advancement of current materials-limited technologies. Here we highlight the intersections of these opportunities with a series of proposed paths forward. The opportunities are roughly sorted from scientific/technical (e.g., development of robust, physically meaningful multiscale material representations) to social (e.g., promoting an AI-ready workforce). The proposed paths forward range from developing new infrastructure and capabilities to deploying them in industry and academia. We provide a brief introduction to AI in materials science and engineering, followed by detailed discussions of each of the opportunities and paths forward.
Submitted 18 March, 2020;
originally announced March 2020.
-
A high-throughput structural and electrochemical study of metallic glass formation in Ni-Ti-Al
Authors:
Howie Joress,
Brian L. DeCost,
Suchismita Sarker,
Trevor M. Braun,
Sidra Jilani,
Ryan Smith,
Logan Ward,
Kevin J. Laws,
Apurva Mehta,
Jason Hattrick-Simpers
Abstract:
Based on a set of machine learning predictions of glass formation in the Ni-Ti-Al system, we have undertaken a high-throughput experimental study of that system. We utilized rapid synthesis followed by high-throughput structural and electrochemical characterization. Using this dual-modality approach, we are able to better classify the amorphous portion of the library, which we found to be the portion with a full width at half maximum (FWHM) of 0.42 Å$^{-1}$ for the first sharp x-ray diffraction peak. We demonstrate that the FWHM and corrosion resistance are correlated but that, while chemistry still plays a role, a large FWHM is necessary for the best corrosion resistance.
Submitted 19 December, 2019;
originally announced December 2019.
-
On-the-fly Segmentation Approaches for X-ray Diffraction Datasets for Metallic Glasses
Authors:
Fang Ren,
Travis Williams,
Jason Hattrick-Simpers,
Apurva Mehta
Abstract:
Investment in brighter sources and larger detectors has resulted in an explosive rise in the data collected at synchrotron facilities. Currently, human experts extract scientific information from these data, but they cannot keep pace with the rate of data collection. Here, we present three on-the-fly approaches (attribute extraction, nearest-neighbor distance, and cluster analysis) to quickly segment x-ray diffraction (XRD) data into groups with similar XRD profiles. An expert can then analyze representative spectra from each group in detail in much less time, but without loss of scientific insight. On-the-fly segmentation would, therefore, result in accelerated scientific productivity.
Submitted 26 September, 2017;
originally announced September 2017.