
Research Projects

I participate in a broad range of interdisciplinary research projects, developing integrative and longitudinal algorithms, frameworks, and data analyses. Active research projects include:

Health Informatics

Wearable mHealth devices are used in a variety of healthcare scenarios, including tracking sleep and physical activity. mHealth technology is a promising direction for transforming preventive and medical care. Unlocking its potential requires addressing several data-related challenges, including scalability, heterogeneity, noise, and privacy. Using polysomnography (PSG) and accelerometer-recorded sleep data collected in Buxton's lab, our goal is to develop reliable machine learning models for sleep scoring and sleep parameter estimation.

Main research directions:

  1. How to develop reliable, cost-effective models for sleep/wake classification and sleep parameter estimation using unlabeled accelerometer data?
  2. How to develop personalized models for sleep/wake classification and sleep parameter estimation?
  3. How to integrate data from multiple accelerometer sensors for tracking sleep quality?
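
The first direction above, scoring sleep from unlabeled accelerometer data, can be illustrated with a minimal sketch. It assumes per-epoch activity counts as input, clusters them into two groups with 1-D k-means, and treats the low-activity cluster as sleep. The function names, the one-minute epoch length, and the use of plain k-means are illustrative assumptions; this is not the method from the publications listed here, and it assumes a non-empty recording that spans the time in bed.

```python
from statistics import mean


def two_means_1d(values, iterations=50):
    """Split 1-D values into two clusters with Lloyd's algorithm (k-means, k=2).

    Returns one boolean per value: True if it falls in the high-valued cluster.
    """
    low, high = min(values), max(values)  # initial centroids: the two extremes
    labels = [False] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer centroid (ties go to the low cluster).
        labels = [abs(v - high) < abs(v - low) for v in values]
        low_group = [v for v, is_high in zip(values, labels) if not is_high]
        high_group = [v for v, is_high in zip(values, labels) if is_high]
        if not high_group:  # all values identical: nothing to separate
            break
        new_low, new_high = mean(low_group), mean(high_group)
        if (new_low, new_high) == (low, high):  # centroids stopped moving
            break
        low, high = new_low, new_high
    return labels


def score_sleep(activity_counts, epoch_minutes=1.0):
    """Label each epoch sleep/wake and derive two simple sleep parameters.

    Assumption: the low-activity cluster corresponds to sleep.
    """
    is_wake = two_means_1d(activity_counts)
    sleep_epochs = sum(1 for wake in is_wake if not wake)
    return {
        "labels": ["wake" if wake else "sleep" for wake in is_wake],
        "total_sleep_minutes": sleep_epochs * epoch_minutes,
        # Fraction of recorded epochs scored as sleep.
        "sleep_efficiency": sleep_epochs / len(activity_counts),
    }
```

Calling `score_sleep` on a list of activity counts returns the per-epoch labels along with total sleep time and sleep efficiency. No PSG labels are needed for training, which is the appeal of the unsupervised setting; PSG would still be needed to validate the resulting scores.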

Selected publications:

Khademi A, El-Manzalawy Y, Buxton O, Honavar V (2018) Toward Personalized Actigraphy Sleep/Wake Classification. IEEE International Conference on Biomedical and Health Informatics (in press).

El-Manzalawy Y, Buxton O, Honavar V (2017) Sleep/wake state prediction and sleep parameter estimation using unsupervised classification via clustering. IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2017). pp. 718-723.

Translational Bioinformatics

Large-scale collaborative precision medicine initiatives (e.g., The Cancer Genome Atlas (TCGA)) are yielding rich multi-omics data. Integrative analyses of these data could help realize the potential of precision medicine in cancer prevention, diagnosis, and treatment by substantially improving our understanding of the underlying mechanisms and enabling the discovery of novel biomarkers for different types of cancer. However, such analyses present a number of challenges, including the heterogeneity of data types and the high dimensionality of omics data.

This project addresses the challenge of integrating heterogeneous high-dimensional data (e.g., multi-omics data) in order to develop testable hypotheses and reliable predictive models for deciphering phenotype-genotype relationships in complex diseases such as cancer. Our methodology is based on multi-view learning, representation learning, and graph mining approaches.
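
One common way to cope with high-dimensional features is greedy min-redundancy, max-relevance (mRMR) selection: repeatedly pick the feature most related to the outcome and least related to the features already chosen. The sketch below is a generic, single-view illustration under stated assumptions: absolute Pearson correlation stands in for the relevance and redundancy measures, and the names `pearson` and `mrmr_select` are hypothetical. It is not the multi-view method from the publications listed here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mrmr_select(features, target, k):
    """Greedily select up to k feature names from a {name: values} dict.

    Score = relevance to the target minus mean redundancy with the features
    selected so far (both measured as absolute Pearson correlation, which is
    an assumption of this sketch).
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = list(features)

    def score(name):
        if not selected:  # first pick: relevance only
            return relevance[name]
        redundancy = sum(
            abs(pearson(features[name], features[chosen])) for chosen in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Given a dictionary that maps feature names to per-sample values and a list of outcome values, `mrmr_select(features, target, k)` returns the chosen names in selection order. A near-duplicate of an already selected feature is penalized by the redundancy term, so a less relevant but complementary feature can be chosen ahead of it.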

Selected publications:

El-Manzalawy Y, Hsieh T-Y, Shivakumar M, Kim D, Honavar V (2017) Min-Redundancy and Max-Relevance Multi-view Feature Selection for Predicting Ovarian Cancer Survival using Multi-omics Data. Presented at the 7th Annual Translational Bioinformatics Conference.

El-Manzalawy Y (2018) CCA based multi-view feature selection for multi-omics data integration. IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) (Accepted).


Metagenome-wide analysis studies provide a unique set of microbial features for discovering biomarkers of associated diseases, as well as for studying the diversity and dynamics of microbial communities under different conditions. In general, such studies rely on a single major approach (i.e., statistical, predictive, or comparative network analysis) and generate taxonomy profiles using only one tool.

Main research directions:

  1. Developing integrative machine learning and comparative network approaches for metagenome-wide analysis studies of the human microbiota and associated diseases (e.g., diabetes, obesity, and cardiovascular disease).
  2. Developing dynamic network analysis algorithms and tools for studying the temporal variation of microbial communities in a variety of environments.
  3. Developing models and analyses that integrate multi-view (multi-modal) longitudinal data, such as the data collected by the Integrative Human Microbiome Project (iHMP), which is creating integrated data sets of microbiome and host functional properties (e.g., omics data).
  4. Developing more sophisticated methodologies for cross-study and/or cross-disease meta-analysis of the rapidly increasing amounts of publicly available metagenomics data.

Selected publications:

Abbas M, El-Manzalawy Y (2017) Predictive and Comparative Network Analysis of the Gut Microbiota in Type 2 Diabetes. Proceedings of the 8th ACM Conference on Bioinformatics, Computational Biology and Health Informatics: ACM. pp. 313-320.

Past Research Projects