Research Overview
The broad research foci of the Artificial Intelligence Research Laboratory include:
- Machine Learning and Data Science: Statistical, information theoretic, linguistic and structural, relational, neural network, and decision theoretic approaches to machine learning. Federated machine learning. Continual learning. Physics-based machine learning. Multi-modal and multi-fidelity learning; Causal learning. Learning predictive models from tabular, relational, text, image, genomic, time series, longitudinal, spatio-temporal and network data; Bias and fairness in machine learning; Explainable and interpretable machine learning.
- Causal AI: Estimating causal effects from observational data – including tabular, relational, and temporal data; Learning causal models from data; Generalizability of causal effects across settings; Causal explanations.
- AI for Science: Computational models of scientific discovery; Discovery informatics infrastructure to integrate data, hypothesis, and knowledge-based inference, predictive modeling, experimentation, simulation, and hypothesis testing to provide an orderly formal framework and exploratory apparatus for science; Applications in life sciences, health sciences, materials sciences.
- Knowledge Representation and Inference: Logical, probabilistic, relational, structural, syntactic (grammatical), decision-theoretic, neural, epistemic knowledge representation; federated data and knowledge bases, federated query answering, reasoning, knowledge sharing; federated services – service composition, substitution, and adaptation; epistemic description logics; secrecy-preserving query answering; representing and reasoning about single and multi-stakeholder qualitative preferences; representing and reasoning about actions; representing and reasoning about causes and their effects.
- Information Integration: Probabilistic, network based, relational, logical, knowledge-based, and representation learning approaches to information integration across abstractions, modalities, and scales.
- Bioinformatics and Computational Molecular and Systems Biology: Computational analyses and prediction of macromolecular sequence-structure-function-interaction-expression relationships, identification of sequence and structural correlates of protein-protein, protein-RNA, and protein-DNA interactions, protein sub-cellular localization, automated protein structure and function annotation, modeling and inference of genetic regulatory networks from gene expression (microarray, proteomics) data, modeling and inference of signal transduction and metabolic pathways, comparative analysis of biological networks (network alignment), integrative analysis of molecular interaction networks and macro-molecular interfaces.
- Computational Materials Design, Discovery, and Synthesis: Foundation models for materials discovery; optimizing material synthesis using machine learning; learning of universal interatomic potentials and thermodynamic equations; inverse materials design; materials property prediction; multi-modal materials characterization; closed-loop integration of data, knowledge, simulations, experiments, and human-AI interactions for materials design, discovery, and synthesis.
- Predictive and Causal Modeling of Health Risks and Health Outcomes: Machine learning methods for predictive modeling of health risks from longitudinal, multi-modal, clinical (electronic health records), socio-demographic and environmental data; Estimating causal effects from clinical data; Biomedical image analysis; Biomedical text analysis.
- Applied Informatics: Applied machine learning and applied causal inference in cognitive and brain sciences, infrastructure management, education, and related topics.
- Other topics of interest: Biological Computation, Sensory systems and behavior evolution, Language evolution, Mimetic evolution; Computational Semiotics – Origins and use of signs, emergence of semantics; Computational organization theory; Computational neuroscience; Computational models of creativity; Computational models of discovery; Computational argumentation theory.
Motivating Questions
Our research spans both foundations of AI and applications of AI and is motivated by questions of scientific or societal importance, such as the following:
- How can AI systems augment human intellect to dramatically accelerate science?
- How can we efficiently build robust and explainable predictive models from data?
- How can we elicit causal relations from multiple disparate observational and experimental studies?
- How can we ensure that AI systems are fair, explainable, and accountable?
- How can we build predictive models from federated data?
- How can we build predictive models from multi-modal (multi-view) data?
- What are the information requirements of learning in specific settings?
- How can we learn language syntax (grammars) and semantics?
- How can we build predictive models from high-dimensional longitudinal data?
- How can we construct computational abstractions of scientific artifacts and scientific domains?
- How can we efficiently represent and reason about preferences?
- How can we query and reason with federated data and knowledge bases?
- How can we assemble, adapt, and execute complex services from component services?
- How can we answer queries against knowledge bases without revealing secrets?
- How can we build systems for data integration and analysis that comply with data access and use policies?
- How can we learn to predict health risks and outcomes from multi-modal longitudinal health data?
- How can we predict macromolecular interactions, interfaces, and complexes?
- How can we construct, compare, and analyze models of molecular networks?
- How can we model, construct, compare, and analyze brain networks from data?
- How can we build systems that learn continually from observations and interactions?
- How can we reduce the energy footprint of AI and machine learning?
- How can we build robust intelligent agents that incorporate multiple facets of intelligence?
- Deep learning algorithms for representation learning, classification, regression, and multi-modal information fusion.
- Scalable algorithms for building predictive models from large, distributed, semantically disparate data (big data).
- Algorithms for constructing predictive models from sequence, image, text, multi-relational, graph-structured data.
- New approaches to selective sharing of knowledge across federated knowledge bases, and secrecy-preserving query answering.
- Theoretically sound yet practically useful approaches to functional and non-functional specification driven composition of complex services from components.
- Expressive languages for representing, and model checking approaches to reasoning with, qualitative preferences.
- Algorithms for eliciting causal effects from disparate sources of observational and experimental data.
- Scalable algorithms and software for comparative analyses of large biomolecular networks.
- Machine learning approaches to analysis and prediction of macromolecular interactions and interfaces with applications in the analysis and prediction of protein-protein, protein-DNA, and protein-RNA interactions and interfaces, B-cell and T-cell epitopes.
Current Research Foci
Some of the current research foci of the lab include:
- Computational abstractions of scientific artifacts, scientific domains, and scientific processes.
- Design of optimal human-AI teams (including organizational structures and processes) to accelerate science.
- Design and analysis of algorithms for predictive modeling from very large, high dimensional, richly structured, multi-modal, longitudinal data.
- Development of theoretically sound and practically useful characterizations of and algorithms for continual learning.
- Elucidation of causal relationships from disparate experimental and observational studies and from relational, temporal, and temporal-relational data.
- Design and analyses of accountable, explainable, and fair AI systems.
- Analysis and prediction of macromolecular interactions; elucidation of complex biological pathways, e.g., those involved in immune response, development, and disease.
- Predictive and causal modeling of individual and population health outcomes from behavioral, biomedical, clinical, environmental, and socio-demographic data.
- Predictive and causal modeling of behavioral and cognitive systems in naturalistic (real-world) settings.
- Predictive modeling of material properties from composition and adaptive design of materials.
- Causal effect estimation from federated data.
- Development and applications of quantum machine learning algorithms.
The laboratory's research is funded in part by grants from the National Science Foundation, the National Institutes of Health, the US Department of Agriculture, the US Department of Defense, and Pennsylvania State University.
Additional information about current projects in the Artificial Intelligence Research Laboratory can be found on the projects and publications pages.