
Research Overview


The broad research foci of the Artificial Intelligence Research Laboratory include:

  • Machine Learning, Data Science and Big Data Analytics: Statistical, information theoretic, linguistic and structural approaches to machine learning; Learning and refinement of Bayesian networks, causal networks, decision networks, neural networks, support vector machines, kernel classifiers, multi-relational models, and language models (n-grams, grammars, automata); Learning classifiers from attribute value taxonomies and partially specified data; Learning attribute value taxonomies from data; Learning classifiers from sequential and spatial data; Learning relationships from multi-modal data (e.g., text, images); Learning classifiers from distributed data, multiple instance data, multiple-instance multiple-class data, networked data, multi-relational data, linked open data (RDF), and semantically heterogeneous data; Incremental learning, ensemble methods, multi-agent learning, and curriculum-based learning; Selected topics in computational learning theory.
  • Bioinformatics, Computational Molecular Biology, and Computational Systems Biology: Data-driven discovery of macromolecular sequence-structure-function-interaction-expression relationships, identification of sequence and structural correlates of protein-protein, protein-RNA, and protein-DNA interactions, protein sub-cellular localization, automated protein structure and function annotation, modeling and inference of genetic regulatory networks from gene expression (micro-array, proteomics) data, modeling and inference of signal transduction and metabolic pathways, comparative analysis of biological networks (network alignment), and integrative analysis of molecular interaction networks and macro-molecular interfaces.
  • Discovery Informatics: Computational models of scientific discovery; Informatics infrastructure to integrate data, hypotheses, and knowledge-based inference, predictive modeling, experimentation, simulation, and hypothesis testing to provide an orderly formal framework and exploratory apparatus for science.
  • Knowledge Representation: Probabilistic, grammatical, network-based, relational, logical, and epistemic knowledge representation; knowledge-based, network-based, and probabilistic approaches to information integration; description logics; federated databases (statistical queries against federated databases); knowledge bases (federated reasoning, selective knowledge sharing); services (service composition, substitution, and adaptation); epistemic description logics; secrecy-preserving query answering; representing and reasoning about qualitative preferences; representing and reasoning about causality.
  • Applied Informatics and Applied Data Sciences: Applications of artificial intelligence, machine learning, and big data analytics to problems in bioinformatics, biomedical and health informatics, brain informatics, security informatics, social informatics, and materials informatics.
  • Other Topics of Interest: Biological Computation; Evolutionary, Cellular and Neural Computation; Complex Adaptive Systems; Sensory systems and behavior evolution; Language evolution; Mimetic evolution; Computational Semiotics (origins and use of signs, emergence of semantics); Computational organization theory; Computational Neuroscience; Computational models of creativity; Computational models of discovery.

Current Research Foci

Some of the current research foci of the lab include:

  • Computational abstractions of scientific artifacts (e.g., data, knowledge, hypotheses), universes of scientific discourse (e.g., biology), and scientific processes (e.g., hypothesis generation, predictive modeling, experimentation, simulation, hypothesis testing, and argumentation); cognitive tools that augment and extend human intellect; and human-machine cyberinfrastructure (including organizational structures and processes) to accelerate science;
  • Design and analysis of algorithms for predictive modeling from very large, high dimensional, richly structured, multi-modal, longitudinal data;
  • Elucidation of causal relationships from disparate experimental and observational studies;
  • Elucidation of causal relationships from relational, temporal, and temporal-relational data;
  • Design and analysis of accountable, explainable, and fair AI systems;
  • Analysis and prediction of macromolecular interactions, and elucidation of complex biological pathways, e.g., those involved in immune response, development, and disease;
  • Predictive and causal modeling of individual and population health outcomes from behavioral, biomedical, clinical, environmental, and socio-demographic data;
  • Predictive and causal modeling of behavioral and cognitive systems in naturalistic settings;
  • Modeling the structure, activity, and function of brain networks from fMRI and other types of data;
  • Predictive modeling of material properties from composition and adaptive design of materials;
  • Algorithmic fairness criteria and their operationalization.

Recent Research Results

Some of the lab's most recent work has focused on:
  • Scalable algorithms for building predictive models from large, distributed, semantically disparate data (big data), including, more recently, linked open data;
  • Algorithms for constructing predictive models from sequence, image, text, multi-relational, and graph-structured data;
  • New approaches to selective sharing of knowledge across autonomous knowledge bases (including knowledge base federation, secrecy-preserving query answering);
  • Theoretically sound yet practically useful approaches to composing complex services from components, driven by functional and non-functional specifications;
  • Expressive languages for representing, and model checking approaches to reasoning with, qualitative preferences;
  • Algorithms for eliciting causal effects from disparate sources of observational and experimental data;
  • Scalable algorithms and software for comparative analyses of large biomolecular networks; and
  • Machine learning approaches to analysis and prediction of macromolecular interactions and interfaces (including, in particular, the first algorithm for partner-specific prediction of protein-protein interface sites and state-of-the-art sequence-based protein-RNA interface predictors) that have resulted in several widely used web servers for analysis and prediction of protein-protein, protein-DNA, and protein-RNA interactions and interfaces, and of B-cell and T-cell epitopes.

The laboratory's research is funded in part by grants from the National Science Foundation, the National Institutes of Health, the US Department of Agriculture, the US Department of Defense, and Pennsylvania State University.

Additional information about current projects in the Artificial Intelligence Research Laboratory can be found on the projects and publications pages.