
Robust Machine Learning Inference from X-ray Absorption Near Edge Spectra through Featurization

Authors: Yiming Chen, Chi Chen, Inhui Hwang, Michael J. Davis, Wanli Yang, Chengjun Sun, Shyue Ping Ong, Maria K. Y. Chan

Abstract: X-ray absorption spectroscopy (XAS) is a commonly employed technique for characterizing functional materials. In particular, x-ray absorption near edge spectra (XANES) encode local coordination and electronic information, and machine learning approaches to extract this information are of significant interest. To date, most ML approaches for XANES have primarily focused on using the raw spectral intensities as input, overlooking the potential benefits of incorporating spectral transformations and dimensionality reduction techniques into ML predictions. In this work, we focused on systematically comparing the impact of different featurization methods on the performance of ML models for XAS analysis. We evaluated the classification and regression capabilities of these models on computed datasets and validated their performance on previously unseen experimental datasets. Our analysis revealed an intriguing discovery: the cumulative distribution function (CDF) feature achieves both high prediction accuracy and exceptional transferability. This remarkably robust performance can be attributed to its tolerance to horizontal shifts in spectra, which is crucial when validating models using experimental data. While this work exclusively focuses on XANES analysis, we anticipate that the methodology presented here will hold promise as a versatile asset to the broader spectroscopy community.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2203.10349 [pdf]

doi 10.1063/5.0083877

Machine learning for impurity charge-state transition levels in semiconductors from elemental properties using multi-fidelity datasets

Authors: Maciej P. Polak, Ryan Jacobs, Arun Mannodi-Kanakkithodi, Maria K. Y. Chan, Dane Morgan

Abstract: Quantifying charge-state transition energy levels of impurities in semiconductors is critical to understanding and engineering their optoelectronic properties for applications ranging from solar photovoltaics to infrared lasers. While these transition levels can be measured and calculated accurately, such efforts are time-consuming and more rapid prediction methods would be beneficial. Here, we significantly reduce the time typically required to predict impurity transition levels using multi-fidelity datasets and a machine learning approach employing features based on elemental properties and impurity positions. We use transition levels obtained from low-fidelity (i.e., local-density approximation or generalized gradient approximation) density functional theory (DFT) calculations, corrected using a recently proposed modified band alignment scheme, which well-approximates transition levels from high-fidelity DFT (i.e., hybrid HSE06). The model fit to the large multi-fidelity database shows improved accuracy compared to the models trained on the more limited high-fidelity values. Crucially, in our approach, when using the multi-fidelity data, high-fidelity values are not required for model training, significantly reducing the computational cost required for training the model. Our machine learning model of transition levels has a root mean squared (mean absolute) error of 0.36 (0.27) eV vs high-fidelity hybrid functional values when averaged over 14 semiconductor systems from the II-VI and III-V families. As a guide for use on other systems, we assessed the model on simulated data to show the expected accuracy level as a function of bandgap for new materials of interest. Finally, we use the model to predict a complete space of impurity charge-state transition levels in all zinc blende III-V and II-VI systems.

Submitted 19 March, 2022; originally announced March 2022.

Journal ref: J. Chem. Phys. 156, 114110 (2022)

arXiv:1908.05585 [pdf, other]

Defect Physics of Pseudo-cubic Mixed Halide Lead Perovskites from First Principles

Authors: Arun Mannodi-Kanakkithodi, Ji-Sang Park, Alex B. F. Martinson, Maria K. Y. Chan

Abstract: Owing to the increasing popularity of lead-based hybrid perovskites for photovoltaic (PV) applications, it is crucial to understand their defect physics and its influence on their optoelectronic properties. In this work, we simulate various point defects in pseudo-cubic structures of mixed iodide-bromide and bromide-chloride methylammonium lead perovskites with the general formula MAPbI_{3-y}Br_{y} or MAPbBr_{3-y}Cl_{y} (where y is between 0 and 3), and use first principles based density functional theory computations to study their relative formation energies and charge transition levels. We identify vacancy defects and Pb on MA anti-site defect as the lowest energy native defects in each perovskite. We observe that while the low energy defects in all MAPbI_{3-y}Br_{y} systems only create shallow transition levels, the Br or Cl vacancy defects in the Cl-containing perovskites have low energy and form deep levels which become deeper for higher Cl content. Further, we study extrinsic substitution by different elements at the Pb site in MAPbBr_{3}, MAPbCl_{3} and the 50-50 mixed halide perovskite, MAPbBr_{1.5}Cl_{1.5}, and identify some transition metals that create lower energy defects than the dominant intrinsic defects and also create mid-gap charge transition levels.

Submitted 15 August, 2019; originally announced August 2019.

Comments: 10 pages, 5 figures, 1 supplementary information file

arXiv:1905.03928 [pdf]

A Deep Learning Model for Atomic Structures Prediction Using X-ray Absorption Spectroscopic Data

Authors: Liang Li, Mindren Lu, Maria K. Y. Chan

Abstract: A deep neural network (DNN) model consisting of two hidden layers was proposed for predicting the immediate environments of specific atoms based on X-ray absorption near-edge spectra (XANES). The output layer of the DNN can be adjusted to form a classifier or regressor, to predict the local and overall coordination environments, respectively. Using Li3FeO3.5 as a model system, it was demonstrated that the prediction accuracy of the DNN classifier is higher than 98%, and the predictions of the DNN regressor also showed notable agreement with the ground truth. Therefore, despite its simplicity, this DNN architecture can be expected to be generally capable of predicting the structural properties of various systems. Fine tuning of the hyperparameters, bias-variance tradeoff, and strategies to enrich the versatility of the model were also discussed.

Submitted 10 May, 2019; originally announced May 2019.

Comments: 11 pages, 4 figures

arXiv:1812.00326 [pdf, other]

Comparing optimization strategies for force field parameterization

Authors: Fatih G. Sen, Badri Narayanan, Jeffrey Larson, Alper Kinaci, Kiran Sasikumar, Michael J. Davis, Stefan M. Wild, Stephen K. Gray, Subramanian K. R. S. Sankaranarayanan, Maria K. Y. Chan

Abstract: Classical molecular dynamics (MD) simulations enable modeling of materials and examination of microscopic details that are not accessible experimentally. The predictive capability of MD relies on the force field (FF) used to describe interatomic interactions. FF parameters are typically determined to reproduce selected material properties computed from density functional theory (DFT) and/or measured experimentally. A common practice in parameterizing FFs is to use least-squares local minimization algorithms. Genetic algorithms (GAs) have also been demonstrated as a viable global optimization approach, even for complex FFs. However, an understanding of the relative effectiveness and efficiency of different optimization techniques for the determination of FF parameters is still lacking. In this work, we evaluate various FF parameter optimization schemes, using as an example a training data set calculated from DFT for different polymorphs of IrO$_2$. The Morse functional form is chosen for the pairwise interactions and the optimization of the parameters against the training data is carried out using (1) multi-start local optimization algorithms: Simplex, Levenberg-Marquardt, and POUNDERS, (2) single-objective GA, and (3) multi-objective GA. Using random search as a baseline, we compare the algorithms in terms of reaching the lowest error, and number of function evaluations. We also compare the effectiveness of different approaches for FF parameterization using a test data set with known ground truth (i.e., generated from a specific Morse FF). We find that the performance of optimization approaches differs when using the Test data vs. the DFT data. Overall, this study provides insight for selecting a suitable optimization method for FF parameterization, which in turn can enable more accurate prediction of material properties and chemical phenomena.

Submitted 1 December, 2018; originally announced December 2018.

arXiv:1607.04188 [pdf]

Methodology of Parameterization of Molecular Mechanics Force Field From Quantum Chemistry Calculations using Genetic Algorithm: A case study of methanol

Authors: Ying Li, Hui Li, Maria K. Y. Chan, Subramanian Sankaranarayanan, Benoît Roux

Abstract: In molecular dynamics (MD) simulation, the force field determines the capability of an individual model in capturing physical and chemical properties. The method for generating proper parameters of the force field form is the key component for computational research in chemistry, biochemistry, and condensed-phase physics. Our study showed the feasibility of predicting experimental condensed phase properties (i.e., density and heat of vaporization) of methanol through a problem-specific force field from only quantum chemistry information. To acquire satisfactory parameter sets of the force field, the genetic algorithm (GA) is the main optimization method. For electrostatic potential energy, we optimized the electrostatic parameters of methanol using the GA method, which leads to low deviations between the quantum mechanics (QM) calculations and the GA optimized parameters. We optimized the van der Waals (vdW) parameters using both GA and guided GA methods by calibrating interaction energy of various methanol homo-clusters, such as nonamers, undecamers, or tridecamers. Excellent agreement between the training dataset from QM calculations (i.e., MP2) and GA optimized parameters can be achieved. However, only the guided GA method, which eliminates the overestimation of interaction energy from MP2 calculations in the optimization process, provides proper vdW parameters for MD simulation to get the condensed phase properties (i.e., density and heat of vaporization) of methanol. Throughout the whole optimization process, the experimental values were not involved in the objective functions, but were only used for the purpose of justifying models (i.e., nonamers, undecamers, or tridecamers) and validating methods (i.e., GA or guided GA). Our method shows the possibility of developing descriptive polarizable force fields using only QM calculations.

Submitted 15 July, 2016; v1 submitted 14 July, 2016; originally announced July 2016.

Comments: not submitted to anywhere else by July 2016

Showing 1–6 of 6 results for author: Chan, M K Y