
Knowledge-Based Machine Learning (supported in part by grants from the National Science Foundation and the National Institutes of Health)
 

Scientific applications of machine learning increasingly need to incorporate domain knowledge to overcome the limitations of otherwise purely data-driven ML models. Such knowledge includes physics; constraints (e.g., symmetries, invariances, equivariances, measurement uncertainties); process descriptions (e.g., differential equations, algebraic functional forms); and knowledge graphs (e.g., the periodic table of elements). Existing knowledge-informed ML (KIML) methods fall into the following categories: (i) physics-informed ML (PIML) methods that incorporate physics into the ML model, e.g., through the neural network architecture or the loss function; (ii) methods that incorporate domain knowledge as constraints or as terms in the objective function; (iii) geometric learning methods that incorporate invariances and equivariances into learned representations; and (iv) Bayesian models that incorporate knowledge into priors. Our recent work has resulted in:

  • Methods that exploit knowledge of the periodic table to adapt machine-learned interatomic potentials to new material structures composed of elements that do not appear in the training data.
  • Knowledge-based extensions of the state-of-the-art physics-based reactive force field pioneered by van Duin with additional parameterized functional forms, yielding significant improvements over state-of-the-art methods in both accuracy and speed. This approach could make high-accuracy, potential-based reactive MD simulations far more accessible to a broader range of researchers.
Work in progress aims to develop generalizable approaches to informing data-driven machine learning methods with knowledge of various forms, and to evaluate the resulting methods on scientific applications, e.g., the construction of universal interatomic potentials and thermodynamic equations for materials discovery, design, and synthesis.
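
As a rough illustration of category (ii) above, where knowledge enters as a term in the objective function, the sketch below fits a small polynomial model to noisy data while penalizing violations of a known symmetry. Everything in it is an assumption made for illustration and is not taken from the group's methods: the quadratic model, the even-symmetry constraint f(x) = f(-x), the synthetic data, and the names `design` and `fit_with_symmetry_penalty` are all hypothetical.

```python
import numpy as np


def design(x):
    """Polynomial features [1, x, x^2] for a 1-D input array (illustrative model)."""
    return np.stack([np.ones_like(x), x, x**2], axis=1)


def fit_with_symmetry_penalty(x, y, x_colloc, lam):
    """Least-squares fit with a knowledge-based penalty term.

    Minimizes  ||Phi(x) w - y||^2 + lam * ||(Phi(xc) - Phi(-xc)) w||^2,
    where the second term penalizes violations of the assumed symmetry
    f(x) = f(-x) at the collocation points xc.
    """
    a_data = design(x)
    # Rows encoding f(xc) - f(-xc), scaled so their squared norm carries weight lam.
    a_sym = np.sqrt(lam) * (design(x_colloc) - design(-x_colloc))
    a = np.vstack([a_data, a_sym])
    b = np.concatenate([y, np.zeros(len(x_colloc))])
    w, *_ = np.linalg.lstsq(a, b, rcond=None)
    return w


# Synthetic data (assumption): an even function observed with noise on x > 0 only.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=20)
y = 1.0 + 2.0 * x**2 + 0.1 * rng.normal(size=x.size)
x_colloc = np.linspace(0.1, 1.0, 10)

w_plain = fit_with_symmetry_penalty(x, y, x_colloc, lam=0.0)     # purely data-driven
w_informed = fit_with_symmetry_penalty(x, y, x_colloc, lam=1e4)  # symmetry-informed

print("data-driven coefficients:      ", w_plain)
print("symmetry-informed coefficients:", w_informed)
```

With the penalty switched on, the coefficient of the odd term x is driven toward zero even though the data cover only positive x, so the fitted model respects the assumed symmetry outside the sampled region. The same pattern, a data term plus a weighted knowledge term, carries over to other models and other kinds of constraints, though the appropriate penalty and weight depend on the application.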