Genome-wide association studies have become a popular tool for identifying the genetic loci responsible for increased disease susceptibility by examining genetic and phenotypic variation across a large number of individuals. Many complex disease syndromes arise from the interplay of a large number of genomic variations that perturb disease-related genes in the context of a regulatory network. Patient cohorts are now routinely surveyed for a large number of traits, such as hundreds of clinical phenotypes and genome-wide expression profiles of thousands of genes, which raises new computational challenges in identifying genetic variations associated simultaneously with multiple correlated traits. In this talk, I will present algorithms that go beyond the traditional approach of examining the correlation between a single genetic marker and a single trait. Our algorithms build on a sparse regression method in statistics and are able to discover genetic variants that perturb modules of correlated molecular and clinical phenotypes during genome-phenome association mapping. Our approach is significantly better at detecting associations when genetic markers synergistically influence a group of traits.
Speaker Biography
Seyoung Kim is currently a project scientist in the Machine Learning Department at Carnegie Mellon University. Her work as a postdoctoral fellow and project scientist at Carnegie Mellon University has included developing machine-learning algorithms for disease association mapping. She received her Ph.D. in computer science from the University of California, Irvine, in 2007. During her Ph.D., she worked on statistical machine learning methods for problems in the biomedical domain.