Tae Hyun Hwang

Contact Information
Quantitative Biomedical Research Center, Department of Clinical Science
The University of Texas Southwestern Medical Center
2201 Inwood Ave
phone: 214-648-4127
http://www.taehyunlab.org
e-mail: [email protected]

Research Interests
Computational Biology, Machine Learning, Data Mining, and Statistical Methods

Education
University of Minnesota Twin Cities, USA, Mar. 2011
Ph.D., Department of Computer Science and Engineering
• Thesis: Network-based Learning Algorithms for Understanding Human Disease
• Committee:
Rui Kuang, Ph.D., Assistant Professor, Department of Computer Science and Engineering at University of Minnesota
Vipin Kumar, Ph.D., William Norris Professor and Head, Department of Computer Science and Engineering at University of Minnesota
Chad L. Myers, Ph.D., Assistant Professor, Department of Computer Science and Engineering at University of Minnesota
Dennis Wigle, M.D., Ph.D., Assistant Professor, Mayo Clinic Cancer Center and Division of Thoracic Surgery at Mayo Clinic

Inha University, Korea, Aug. 2005
M.S. (transferred), Department of Computer Science and Engineering
• Advisor: Prof. Gun-Sik Jo

Inha University, Korea, Feb. 2004
B.E., Department of Computer Science and Engineering

Research Experience
Quantitative Biomedical Research Center, Department of Clinical Science, University of Texas Southwestern Medical Center, USA
Assistant Professor, Jan. 2013 – current

Masonic Cancer Center, University of Minnesota Twin Cities, USA
Principal Investigator, Research Scientist, Mar. 2011 – Jan. 2013
• Developing computational methods to integrate diverse high-throughput genomic, genetic, phenotypic, and clinical data, and custom pipelines to analyze next-generation sequencing data (e.g., DNA sequencing, RNA sequencing, and ChIP-seq), to discover the underlying molecular mechanisms of cancers.

Genentech, Inc., USA
Research Intern, Jun. 2010 – Sept.
2010
Advisor: Jinfeng Liu, Ph.D., Department of Bioinformatics and Computational Biology
• Developed and implemented a network-based integrative analysis for discovering gene modules (e.g., pathways, subnetworks, or gene sets) that are dysregulated by copy number alterations, either commonly or in a cancer-type-specific manner, across human cancers.

University of Minnesota Twin Cities, USA
Research Assistant, Jun. 2007 – present
Advisor: Prof. Rui Kuang, Department of Computer Science and Engineering
• Developed and implemented network-based learning algorithms to integrate gene expression, copy number, and protein interaction networks for cancer outcome prediction and biomarker discovery.
• Developed and implemented a network-based learning algorithm to prioritize potential disease candidate genes using integrated human phenome-genome interactome networks. The algorithm can also be applied to predict gene functions and new targets of a drug (or a combination of drugs).

Inha University, Korea
Research Assistant, Mar. 2004 – Jun. 2005
Advisor: Prof. Gun-Sik Jo, Department of Computer Science and Engineering
• Developed and implemented web-based tools for mining information on the semantic web.

Research Grants
1. Patient Stratification and Pathway Discovery using Genomic Data Integration (pending)
Samsung Advanced Institute of Technology
Role: Principal Investigator
The goal of this project is to develop novel computational tools to predict which patients may or may not respond to drugs, and to identify pathways related to drug response, in order to create personalized, more effective treatment strategies.
2. AR Gene Structure Alterations and Prostate Cancer Progression
American Cancer Society ($600,000), 01/01/2012∼12/31/2015.
PI: Scott Dehm
The goal of this project is to study cell- and xenograft-based models of prostate cancer progression to understand the mechanisms by which alternatively spliced, truncated AR isoforms are synthesized, translocate to the nucleus, bind DNA, activate transcription, and mediate resistance to AR-targeted therapies in castration-resistant prostate cancer (CRPCa).
3. Genetic Background and the Angiogenic Phenotype in Cancer
AKC Canine Health Foundation ($230,424), 01/01/2010∼12/31/2012.
PI: Jaime Modiano
The major goals of this project are to confirm that heritable traits (breed) have a distinct and discrete influence on gene expression signatures of canine hemangiosarcoma, and that these traits lead these tumors to respond differently to angiogenic and pro-inflammatory signals.
4. Genomic Signatures of Colorectal Cancer
Masonic Cancer Center ($180,000), 03/01/2012∼02/28/2014.
PI: David Largaespada
This proposal enlists a multi-center team at the University of Minnesota and the VUMC in Amsterdam to identify markers of colorectal cancer based on somatic DNA mutations and specific chromosomal alterations for use in diagnosis and therapy.
5. The Role of AR Gene Rearrangements in Prostate Cancer Progression
R01, NIH
Role: Co-Investigator
This project is designed to study AR gene rearrangements in prostate cancer progression using next-generation sequencing analysis.
6. Efficient Algorithms and Database Architecture for Big Bio Data (pending)
The Small and Medium Business Administration (Korea Government Agency) ($400,000)
Role: Co-Investigator
This project is designed to develop algorithms and design infrastructure for the discovery of biomarkers for clinical use from Big Bio data.
7. Comparative Assessment of the Etiology and Clonal Diversity of Non-Hodgkin Lymphoma (submitted)
Minnesota Partnership for Biotechnology and Medical Genomics
PI: Jaime Modiano
This project is designed to develop and optimize infrastructure for comparative multispecies systems approaches to study non-Hodgkin lymphoma.
8.
Tumor-Microenvironment Interactions in Osteosarcoma Progression (submitted)
Department of Defense
PI: Jaime Modiano
This project is designed to understand how bidirectional interactions between osteosarcoma and the host tumor microenvironment contribute to aggressive biological behavior and metastasis in this pediatric tumor.

Peer-reviewed publications
1. Yingming Li, Siu Chiu Chan, Lucas Brand, TaeHyun Hwang, Kevin Silverstein, and Scott Dehm, “Androgen receptor splice variants mediate enzalutamide resistance in castration-resistant prostate cancer cell lines”, Cancer Research, November 2012. Impact factor: 7.856
2. TaeHyun Hwang, Maoqiang Xie, Gowtham Atluri, Sanjoy Dey, Vipin Kumar, Changjin Hong, and Rui Kuang, “Co-clustering Phenome-genome for Phenotype Classification and Disease Gene Discovery”, Nucleic Acids Research, June 2012; doi: 10.1093/nar/gks615. Impact factor: 8.026
3. Yingming Li*, TaeHyun Hwang*, LeAnn Oseth, Betsy Hirsch, Robert Vessella, Kenny Beckman, Kevin Silverstein, and Scott Dehm, “AR intragenic deletions linked to androgen receptor splice variant expression and activity in models of prostate cancer progression”, Oncogene, Jan. 2012; doi:10.1038/onc.2011.637. Impact factor: 7.414 (*Joint first authors)
4. Young-Mi Kim, Matthew Stone, Tae Hyun Hwang, Yeon-Gil Kim, Timothy J. Griffin, and Do-Hyung Kim, “SH3BP4 is a negative regulator of amino acid−Rag GTPase−mTORC1”, In Press, Molecular Cell, doi:10.1016/j.molcel.2012.04.007. Impact factor: 14.194
5. Maoqiang Xie, TaeHyun Hwang, and Rui Kuang, “Reconstructing Disease Phenome-genome Association by Bi Random Walk”, Bioinformatics, 2012, doi:10.1093/bioinformatics/bts067. Impact factor: 4.877
6. Seung-Jun Kim, TaeHyun Hwang, and Georgios B.
Giannakis, “Sparse Robust Matrix Tri-factorization with Application to Cancer Genomics”, In Press, International Workshop on Cognitive Information Processing (CIP), 2012
7. Maoqiang Xie, TaeHyun Hwang, and Rui Kuang, “Prioritizing Disease Genes by Bi Random Walk”, In Press, PAKDD 2012
8. TaeHyun Hwang, Gowtham Atluri, Rui Kuang, Timothy Starr, Peter Haverty, Zemin Zhang, and Jinfeng Liu, “Large-scale Integrative Network-based Analysis Identifies Common Pathways Disrupted by Copy Number Alterations across Cancers”, Under review
9. TaeHyun Hwang, Wei Zhang, Maoqiang Xie, Jinfeng Liu, and Rui Kuang, “Inferring Disease and Gene Set Associations with Rank Coherence in Networks”, Bioinformatics, 2011 1;27(19):2692-9. Epub 2011 Aug 8. Impact factor: 4.926
10. TaeHyun Hwang and Rui Kuang, “A Heterogeneous Label Propagation Algorithm for Disease Gene Discovery”, SIAM International Conference on Data Mining (SDM), April 2010 (full paper, full presentation, acceptance rate 23.36%)
11. Ze Tian*, TaeHyun Hwang*, and Rui Kuang, “A Hypergraph-based Learning Algorithm for Classifying Gene Expression and arrayCGH Data with Prior Knowledge”, Bioinformatics, Vol. 25, No. 21, pages 2831-2838, 2009. Impact factor: 4.328 (*Joint first authors)
12. TaeHyun Hwang*, Ze Tian*, Jean-Pierre Kocher, and Rui Kuang, “Learning on Weighted Hypergraphs for Integrating Protein Interactions and Gene Expressions”, IEEE International Conference on Data Mining (ICDM), December 2008 (full paper, full presentation, acceptance rate 9.7%) (*Joint first authors)
13. TaeHyun Hwang, Hugues Sicotte, Ze Tian, Dennis Wigle, Jean-Pierre Kocher, Vipin Kumar, and Rui Kuang, “Robust and Efficient Identification of Biomarkers by Classifying Features on Graphs”, Bioinformatics, Vol. 24, No. 18, pages 2023-2029, 2008. Impact factor: 5.039
14. Ze Tian, TaeHyun Hwang, and Rui Kuang, “A Hypergraph-based Learning Algorithm for Classifying arrayCGH Data with Spatial Prior”, Proc.
of IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS), 2009
15. TaeHyun Hwang and Rui Kuang, “A Comparative Study of Breast Cancer Microarray Gene Expression Profiles using Label Propagation”, Proceedings of the Workshop on Data Mining for Biomedical Informatics, held in conjunction with SIAM International Conference on Data Mining (SDM), April 2008
16. Jaehoon Jeong, TaeHyun Hwang, Tian He, and David Du, “MCTA: Target Tracking Algorithm based on Minimal Contour in Wireless Sensor Networks”, IEEE Infocom 2007 Minisymposia (INFOCOM), August 2007 (short paper, total acceptance rate 25%)

Technical Reports
1. TaeHyun Hwang and Rui Kuang, “Prioritizing Disease Genes in a Heterogeneous Network”, Technical Report UMN-CS-09-022, Sept. 2009
2. TaeHyun Hwang, Hugues Sicotte, Ze Tian, Dennis Wigle, Jean-Pierre Kocher, Vipin Kumar, and Rui Kuang, “Identifying Clinical and Genetic Markers of Human Disease by Classifying Features on Graphs”, Technical Report UMN-CS-07-021, September 2007
3. Jaehoon Jeong, TaeHyun Hwang, Tian He, and David Du, “Target Tracking Algorithm based on Minimal Contour in Wireless Sensor Networks”, Technical Report UMN-CS-07-002, June 2007

In preparation
1. TaeHyun Hwang and Kevin Silverstein, “Challenges of integrating omics data from cancer databases” (Invited paper), Personalized Medicine
2. TaeHyun Hwang, Gaurav Pandey, and Rui Kuang, “An Integrated Network-based Approach to Infer Gene Set and GO Function Associations”, In preparation

Talks
1. TaeHyun Hwang, “Cancer genome analyses to personalize treatment: from microarray to next-gen sequencing data”, Harvard Medical School, University of Texas MD Anderson Cancer Center, University of Texas Southwestern Medical Center, Memorial Sloan Kettering Cancer Center, April and May 2012 (Invited talk)
2. TaeHyun Hwang, “Cancer genome analyses to personalize treatment: from microarray to next-gen sequencing data”, Mayo Clinic, Feb.
2012 (Invited talk)
3. TaeHyun Hwang, “Network-based methods to integrate gene expression, copy number, and protein interaction network for biomarker discovery and cancer outcome prediction”, IBM T. J. Watson Research Center, Pfizer Oncology, Novartis Institute for Biomedical Research (Oncology), Institute for Systems Biology, Masonic Cancer Center at University of Minnesota Twin Cities, Sept. 2010 (Invited talk)
4. TaeHyun Hwang, “Integrated human phenome-genome interactome networks for disease-gene association discovery: a machine learning approach”, Harvard Medical School, Sept. 2010 (Invited talk)
5. TaeHyun Hwang, “Integrative analysis of copy number alterations with protein interaction networks in cancers”, Genentech Inc., August 2010
6. TaeHyun Hwang, “Integrating heterogeneous networks for biomarker discovery and disease outcome predictions”, Columbia University, August 2010 (Invited talk)
7. TaeHyun Hwang, “Robust and efficient identification of biomarker using clinical and biological data integration”, Roche, August 2010 (Invited talk)
8. TaeHyun Hwang, “Mining genetic determinants of human disease: a machine learning approach with network perspective”, University of California San Diego (UCSD), July 2010 (Invited talk)
9. TaeHyun Hwang and Rui Kuang, “A Heterogeneous Label Propagation Algorithm for Disease Gene Discovery”, SIAM International Conference on Data Mining (SDM), April 2010 (full paper, full presentation, acceptance rate 23.36%)
10. TaeHyun Hwang, “A Hypergraph-based Learning Algorithm for Classifying Gene Expression and arrayCGH data with Prior Knowledge”, Inha University, Korea, Oct. 2009 (Invited talk)
11. TaeHyun Hwang, Ze Tian, Jean-Pierre Kocher, and Rui Kuang, “Learning on Weighted Hypergraphs to Integrate Protein Interactions and Gene Expressions for Cancer Outcome Prediction”, Proc. of the 8th IEEE International Conference on Data Mining (ICDM), Dec.
2008 (full paper, full presentation, acceptance rate 9.7%)
12. TaeHyun Hwang, Hugues Sicotte, Dennis A. Wigle, Jean-Pierre Kocher, Vipin Kumar, and Rui Kuang, “Identifying Clinical and Genetic Markers of Human Disease by Classifying Features on Graphs”, US-Korea Conference on Science, Technology, and Entrepreneurship, August 14-17, 2008
13. TaeHyun Hwang and Rui Kuang, “A Comparative Study of Breast Cancer Microarray Gene Expression Profiles using Label Propagation”, Proc. of the Workshop on Data Mining for Biomedical Informatics, held in conjunction with SIAM International Conference on Data Mining (SDM), 2008

Posters
1. TaeHyun Hwang, Gowtham Atluri, Rui Kuang, Timothy Starr, Peter Haverty, Zemin Zhang, and Jinfeng Liu, “NetPathID: an integrative network-based analysis to identify pathways disrupted by copy number alterations across cancers”, Systems Biology of Diversity in Cancer symposium at MSKCC, September 16, 2011
2. TaeHyun Hwang, Ze Tian, Michael Steinbach, Vipin Kumar, Rui Kuang, Jean-Pierre Kocher, Hugues Sicotte, Dennis Wigle, and Rich Mushlin, “Mining Cancer Biomarkers from Heterogeneous Genomic Data by Graph-based Learning”, Biomedical Informatics and Computational Biology (BICB) Research Symposium organized by IBM, Mayo Clinic, Hormel Institute, and Univ. of Minnesota, January 16, 2009
3. TaeHyun Hwang, Hugues Sicotte, Dennis A. Wigle, Jean-Pierre Kocher, Vipin Kumar, and Rui Kuang, “Identifying Clinical and Genetic Markers of Human Disease by Classifying Features on Graphs”, US-Korea Conference on Science, Technology, and Entrepreneurship, August 14-17, 2008
4. TaeHyun Hwang, Ze Tian, Michael Steinbach, Vipin Kumar, Rui Kuang, Jean-Pierre Kocher, Hugues Sicotte, Dennis Wigle, and Rich Mushlin, “Mining Genetic Determinants of Human Disease”, Biomedical Informatics and Computational Biology (BICB) Research Symposium organized by IBM, Mayo Clinic, Hormel Institute, and Univ. of Minnesota, June 20, 2008

Honours and Awards
1.
SDM 2010 Doctoral Forum Award and Student Travel Grant, National Science Foundation (NSF), 2010
2. First Prize, 2009 Korean Computer Scientists and Engineers Association in America (KOCSEA) / Moon Jung Chung Scholarship / Poster competition, 2009
3. Nominated by the Computer Science Department for the Interdisciplinary Doctoral Fellowship, University of Minnesota Twin Cities, 2009
4. ICDM 2008 Student Travel Grant, National Science Foundation (NSF), 2008
5. UKC 2007 Student Travel Grant, Korean-American Scientists and Engineers Association, 2008
6. Research Paper Competition (4th place), Inha University, 2005
7. Merit-based Scholarship for Graduate Study, Inha University, 2004, 2005
8. Academic Excellence Fellowship Award, Inha University, 2001, 2003, 2004, 2005

Teaching Experience
University of Minnesota Twin Cities, USA
Teaching Assistant, Jan. 2006 – Jun. 2010

CSci 5461 Functional Genomics, Systems Biology and Bioinformatics (Spring 2010): This was an interdisciplinary graduate level course. Topics covered the analysis of gene expression data, proteomic data, and interaction data, with a special focus on how they can be used to understand and infer biological networks. Enrolled students had different backgrounds, including biology, biochemistry, and computer science. I worked closely with my advisor in this course, helping plan the course schedule, consulting on lecture topics, and planning homework and the course project.

CSci 3003 Introduction to Computing in Biology (Spring 2008): This was an interdisciplinary undergraduate level course. The course provided students with the fundamental computational skills needed to carry out basic tasks common in modern biology research. I helped the instructor design homework and course topics, and led a weekly lab section with around 30 students.

CSci 1901 Structure of Computing 1 (Spring 2007): This was an undergraduate level course.
In this large introductory programming class (approximately 120 students), I was responsible for leading lab sections, preparing lab materials, helping the instructor prepare homework materials, and grading homework and exams. I also gave lectures in this course when the instructor was unable to attend class.

CSci 5801 Software Engineering (Fall 2006): This was a graduate level course with a large enrollment. My duties were grading homework and exams and helping students with questions about coursework and lectures.

CSci 4131 Internet Programming (Spring 2006): This was a senior undergraduate level course with a large enrollment. My duties were grading homework and exams and holding office hours to answer questions about coursework and lectures.

CSci 1902 Structure of Computing 2 (Spring 2006): This was an interdisciplinary undergraduate level course with a large enrollment. I was responsible for helping the instructor prepare homework materials, grading homework and exams, and holding office hours to answer questions about coursework and lectures. I also led labs and gave tutorials on an Integrated Development Environment (Eclipse) and the Java programming language.

Inha University, Korea
Teaching Assistant, Mar. 2004 – Jun. 2005

IN308 Java Programming (Spring 2005): This was an undergraduate level course. I mainly led labs and gave tutorials on the Java programming language.

IN212 Data Structure (Spring 2005): This was an undergraduate level course with a large enrollment. I graded homework and helped the instructor design homework and course projects.

IN201 C, C++ Programming (Fall 2004): This was an undergraduate level course with a large enrollment. I graded homework, led labs, and gave tutorials on the C and C++ programming languages.

University of Minnesota Twin Cities, USA
Guest Lecture
CSci 8002 (Graduate course) Introduction to Computer Science Research (Fall 2009)
CSci 1901 Structure of Computing 1 (Spring 2007)

Major Project
1.
AR Gene Structure Alterations and Prostate Cancer Progression (Mar. 2011 ∼ current)
Collaborators: Prof. Scott Dehm (PI, Masonic Cancer Center, University of Minnesota), Prof. Robert Vessella (Dept. of Urology and Microbiology, School of Medicine, University of Washington), and Dr. Kevin Silverstein (Bioinformatics core, Masonic Cancer Center, University of Minnesota).
Summary: This project aims to understand the mechanisms by which cancer cells become resistant to targeted drugs. Androgen depletion therapy (ADT) is the primary systemic treatment for patients with locally advanced or metastatic prostate cancer. ADT inhibits the activity of the androgen receptor (AR) transcription factor, and aberrant mechanisms of AR reactivation during ADT invariably undermine clinical response, leading to the development of castration-resistant prostate cancer. Understanding the mechanisms underlying splicing changes and increased synthesis of truncated AR variants could lead to more effective use of ADT and better management of patients with castration-resistant prostate cancer. In collaboration with the Masonic Cancer Center at the University of Minnesota and the University of Washington, we developed computational frameworks to investigate the mechanisms of disrupted splicing in the castration-resistant prostate cancer CWR-R1 cell line using high-throughput paired-end DNA-seq and RNA-seq, resulting in the discovery of a novel 48 kb deletion in AR intron 1. These data illustrate that loss of over a quarter of the AR gene can provide a selective advantage to prostate cancer cells under conditions of ADT, and provide further evidence that structural alterations in the AR gene may underlie splicing alterations in castration-resistant prostate cancer. Moving forward, we will develop computational tools to reconstruct the AR genome as well as the whole genome in castration-resistant prostate cancer patients. Part of the results of this work has been accepted for publication in Oncogene.
2.
Discovery of Pathways Disrupted by Copy Number Alterations across Cancers (Jun. 2010 ∼ Sept. 2010)
Advisor: Dr. Jinfeng Liu (Dept. of Bioinformatics and Computational Biology, Genentech Inc.)
Collaborators: Dr. Zemin Zhang, Dr. Peter Haverty (Dept. of Bioinformatics and Computational Biology, Genentech Inc.), Prof. Timothy Starr (Masonic Cancer Center, University of Minnesota), Prof. Rui Kuang, Prof. Vipin Kumar (Dept. of CSE, University of Minnesota).
Summary: This project aims to analyze disrupted pathways across human cancers (e.g., which pathways are disrupted in particular cancer types, and which are commonly disrupted across many types of human cancers) to discover novel cancer-related pathways and patient subgroups and to develop novel therapeutic targets. In collaboration with Genentech and the Masonic Cancer Center at the University of Minnesota, we developed an algorithm called NetPathID (NETwork based method for PATHway IDentification) to discover pathways disrupted by copy number alterations across cancers. We applied our approach to a data set of 2172 cancer patients across 16 different types of cancers, and found a set of pathways commonly disrupted across cancers, along with a set of pathways disrupted in specific cancer type(s). We found that these pathways are not likely to have been discovered by conventional overrepresentation-based and pathway-based methods, and could reveal potentially novel cancer biology. The results of this work have been submitted to PLoS Computational Biology.
3. Disease Gene Discovery using Integrated Human Phenome-Genome Interactome Networks (Sept. 2009 ∼ Aug. 2010)
Advisor: Prof. Rui Kuang
Collaborators: Dr. Jinfeng Liu (Dept. of Bioinformatics and Computational Biology, Genentech Inc.).
Summary: This project aims to develop a novel computational approach to improve disease gene prioritization.
Disease gene prioritization is the task of ranking the candidate genes underlying each disease phenotype so that they can be prioritized for further experimental validation. Integrative analysis of human phenome-genome-interactome data is essential to understand the underlying genetic cause of disease phenotypes and to reveal genetic associations among phenotypes. We developed a novel network-based method to predict disease-causative genes from integrated human phenome-genome-interactome data. In experiments with OMIM disease-gene associations, we demonstrated that our method achieved the best overall performance for prioritizing disease genes and discovered novel candidate disease genes for complex human diseases. These findings can help to identify genetic determinants that strongly influence disease phenotypes, and provide unique insights into disease mechanisms and potential therapeutic targets. This work has been published in SDM 2010.
4. Cancer Outcome Prediction and Biomarker Discovery (Sept. 2007 ∼ Aug. 2009)
Advisor: Prof. Rui Kuang
Collaborators: Prof. Vipin Kumar (Dept. of CSE, University of Minnesota), Prof. Dennis Wigle (Mayo Clinic Cancer Center), Dr. Jean-Pierre Kocher, Dr. Hugues Sicotte (Bioinformatics core, Mayo Clinic College of Medicine), and Dr. Rich Mushlin (IBM T. J. Watson).
Summary: This project centers on discovering reproducible cancer biomarkers from multiple independent microarray gene expression datasets. Although the non-replicability is partly caused by differences in the microarray platforms and experimental techniques used to generate the high-throughput data, cluster structures or modularity among the genes, such as co-expression, can be used to mitigate that discrepancy.
In this large collaborative project, which involved several researchers from Mayo Clinic and IBM, we developed a novel graph-based semi-supervised algorithm for discovering discriminative cancer biomarkers and predicting cancer outcomes by learning on bipartite graphs. In experiments with three large-scale breast cancer datasets, we demonstrated that our proposed model can effectively identify highly reproducible cancer-relevant biomarkers and achieve the best overall performance for cancer outcome prediction. The results of this work were published in Bioinformatics in 2008.
5. Genomic Data Integration for Cancer Outcome Prediction and Biomarker Discovery (Sept. 2008 ∼ Aug. 2009)
Advisor: Prof. Rui Kuang
Collaborators: Prof. Dennis Wigle (Mayo Clinic Cancer Center), Dr. Jean-Pierre Kocher (Bioinformatics core, Mayo Clinic College of Medicine).
Summary: This project aims to develop machine learning approaches to integrate genomic data with prior knowledge derived from related experiments and biological databases. In collaboration with Mayo Clinic, we developed a hypergraph-based semi-supervised learning algorithm and a variant of it to integrate genomic data with prior knowledge (e.g., microarray gene expression or copy number variation data with a protein-protein interaction network) for cancer outcome prediction and biomarker discovery. Experimental results on breast cancer, bladder cancer, and melanoma datasets show that our proposed method achieved significantly improved cancer outcome prediction compared with SVMs and other baselines utilizing the same prior knowledge, and identified several cancer-related subnetworks and CNV regions, both of which contain known oncogenes and tumor suppressor genes. The results have been published in ICDM 2008 and in Bioinformatics in 2009.
Services
Reviewer for Bioinformatics, WIREs Systems Biology, BMC Bioinformatics, KDD, ICDM, SDM, PSB, ICMLA, BIBE, BIBM, BioKDD, and GENSIPS

Professional Activities and Affiliations
Member of IEEE, ACM, and ISCB

Programming
C, C++, Matlab, R, Linux shell scripting, Perl, Python, LaTeX 2ε, SQL, and Java

Referees
Rui Kuang, Ph.D.
Assistant Professor
Dept. of Computer Science and Engineering, University of Minnesota Twin Cities, MN 55454 USA
phone: (612)624-7820
e-mail: [email protected]

Vipin Kumar, Ph.D.
William Norris Professor and Head
Dept. of Computer Science and Engineering, University of Minnesota Twin Cities, MN 55454 USA
phone: (612)625-0726
e-mail: [email protected]

Chad L. Myers, Ph.D.
Assistant Professor
Dept. of Computer Science and Engineering, University of Minnesota Twin Cities, MN 55454 USA
phone: (612)624-8036
e-mail: [email protected]

Jinfeng Liu, Ph.D.
Scientist
Dept. of Bioinformatics and Computational Biology, Genentech Inc., South San Francisco, CA, USA
e-mail: [email protected]

Kevin A. T. Silverstein, Ph.D.
Coordinator, Senior Research Scientist
Bioinformatics Group, Masonic Cancer Center, University of Minnesota Twin Cities, MN 55454 USA
phone: (612)625-0292
e-mail: [email protected]

Scott Dehm, Ph.D.
Assistant Professor
Masonic Cancer Center and Department of Laboratory Medicine and Pathology, University of Minnesota Twin Cities, MN 55454 USA
phone: (612)625-1504
e-mail: [email protected]