Tae Hyun Hwang
Contact
Information
Quantitative Biomedical Research Center, Department of Clinical Science
phone: 214-648-4127
2201 Inwood Ave
http://www.taehyunlab.org
The University of Texas Southwestern Medical Center e-mail: [email protected]
Research
Interests
Computational Biology, Machine Learning, Data Mining, and Statistical Methods
Education
University of Minnesota Twin Cities, USA
Mar. 2011
Ph.D., Department of Computer Science and Engineering
• Thesis: Network-based Learning Algorithms for Understanding Human Disease
• Committee:
Rui Kuang, Ph.D., Assistant Professor, Department of Computer Science and Engineering at
University of Minnesota
Vipin Kumar, Ph.D., William Norris Professor and Head, Department of Computer Science and
Engineering at University of Minnesota
Chad L. Myers, Ph.D., Assistant Professor, Department of Computer Science and Engineering at
University of Minnesota
Dennis Wigle, M.D., Ph.D., Assistant Professor, Mayo Clinic Cancer Center and Division of
Thoracic Surgery at Mayo Clinic
Inha University, KOREA
Aug. 2005
M.S. (transferred), Department of Computer Science and Engineering
• Advisor: Prof. Gun-Sik Jo
Inha University, KOREA
Feb. 2004
B.E., Department of Computer Science and Engineering
Research
Experience
Quantitative Biomedical Research Center, Department of Clinical Science, University
of Texas Southwestern Medical Center, USA
Assistant Professor
Jan. 2013 – current
Masonic Cancer Center, University of Minnesota Twin Cities, USA
Principal Investigator, Research Scientist
Mar. 2011 – Jan. 2013
• Developing computational methods to integrate diverse high-throughput genomic, genetic, phenotypic, and clinical data, and custom pipelines to analyze next-generation sequencing data (e.g.
DNA-seq, RNA-seq, and ChIP-seq), to discover the underlying molecular mechanisms
of cancers.
Genentech, Inc., USA
Research Intern
Jun. 2010 – Sept. 2010
Advisor: Jinfeng Liu, Ph.D., Department of Bioinformatics and Computational Biology
• Developed and implemented a network-based integrative analysis for discovering gene modules (e.g.
pathways, subnetworks, or gene sets) that are dysregulated by copy number alterations, either
commonly across human cancers or in a cancer-type-specific manner.
University of Minnesota Twin Cities, USA
Research Assistant
Jun. 2007 – present
Advisor: Prof. Rui Kuang, Department of Computer Science and Engineering
• Developed and implemented network-based learning algorithms to integrate gene expression, copy
number, and protein interaction networks for cancer outcome prediction and biomarker discovery.
• Developed and implemented a network-based learning algorithm to prioritize potential disease candidate genes using integrated human phenome-genome interactome networks. The algorithm can
also be applied to predict gene functions and new targets of a drug (or a combination
of drugs).
Inha University, KOREA
Research Assistant
Mar. 2004 – Jun. 2005
Advisor: Prof. Gun-Sik Jo, Department of Computer Science and Engineering
• Developed and implemented web-based tools for mining information on the semantic web.
Research
Grants
1. Patient Stratification and Pathway Discovery using Genomic Data Integration (pending)
Samsung Advanced Institute of Technology Role: Principal Investigator
The goal of this project is to develop novel computational tools that predict which patients may
or may not respond to drugs, and that identify pathways related to drug response, in order to create
personalized, more effective treatment strategies.
2. AR Gene Structure Alterations and Prostate Cancer Progression
American Cancer Society ($600,000),
01/01/2012∼12/31/2015.
PI: Scott Dehm
The goal of this project is to study cell- and xenograft-based models of prostate cancer progression to
understand the mechanisms by which alternatively-spliced, truncated AR isoforms are synthesized,
translocate to the nucleus, bind DNA, activate transcription, and mediate resistance to AR-targeted
therapies in CRPCa.
3. Genetic Background and the Angiogenic Phenotype in Cancer
AKC Canine Health Foundation ($230,424),
01/01/2010∼12/31/2012.
PI: Jaime Modiano
The major goals of this project are to confirm that heritable traits (breed) have a distinct and discrete
influence on gene expression signatures of canine hemangiosarcoma, and that these traits lead these
tumors to respond differently to angiogenic and pro-inflammatory signals.
4. Genomic Signatures of Colorectal Cancer
Masonic Cancer Center ($180,000)
03/01/2012∼02/28/2014.
PI: David Largaespada
This proposal enlists a multicenter team at the University of Minnesota and the VUMC in Amsterdam
to identify markers of colorectal cancer based on somatic DNA mutations and specific chromosomal
alterations for use in diagnosis and therapy.
5. The role of AR gene rearrangements in prostate cancer progression
R01 NIH
Role: Co-Investigator
This project is designed to study AR gene rearrangements in prostate cancer progression using next-generation sequencing analysis.
6. Efficient Algorithms and Database Architecture for Big Bio Data (pending)
The Small and Medium Business Administration (Korea Government Agency) ($400,000)
Role: Co-Investigator
This project is designed to develop algorithms and design infrastructures for the discovery of biomarkers for clinical use from Big Bio data.
7. Comparative Assessment of the Etiology and Clonal Diversity of Non-Hodgkin Lymphoma (submitted)
Mn Partnership for Biotechnology and Medical Genomics
PI: Jaime Modiano
This project is designed to develop and optimize infrastructure for comparative multispecies systems
approaches to study non-Hodgkin lymphoma.
8. Tumor-Microenvironment Interactions in Osteosarcoma Progression (submitted)
Department of Defense
PI: Jaime Modiano
This project is designed to understand how bidirectional interactions between osteosarcoma and the
host tumor microenvironment contribute to aggressive biological behavior and metastasis in this
pediatric tumor.
Peer-reviewed
publications
1. Yingming Li, Siu Chiu Chan, Lucas Brand, TaeHyun Hwang, Kevin Silverstein, and Scott
Dehm, “Androgen receptor splice variants mediate enzalutamide resistance in castration-resistant
prostate cancer cell lines”, Cancer Research, November 2012. Impact factor: 7.856
2. TaeHyun Hwang, Maoqiang Xie, Gowtham Atluri, Sanjoy Dey, Vipin Kumar, Changjin Hong,
and Rui Kuang, “Co-clustering Phenome-genome for Phenotype Classification and Disease Gene Discovery”, Nucleic Acids Research, June 2012; doi:10.1093/nar/gks615. Impact factor: 8.026
3. Yingming Li*, TaeHyun Hwang*, LeAnn Oseth, Betsy Hirsch, Robert Vessella, Kenny Beckman, Kevin Silverstein, and Scott Dehm, “AR intragenic deletions linked to androgen receptor splice
variant expression and activity in models of prostate cancer progression”, Oncogene, Jan. 2012;
doi:10.1038/onc.2011.637. Impact factor: 7.414
(*Joint first authors)
4. Young-Mi Kim, Matthew Stone, Tae Hyun Hwang, Yeon-Gil Kim, Timothy J. Griffin, and Do-Hyung Kim, “SH3BP4 is a negative regulator of amino acid−Rag GTPase−mTORC1”, In Press,
Molecular Cell, doi:10.1016/j.molcel.2012.04.007. Impact factor: 14.194
5. Maoqiang Xie, TaeHyun Hwang, and Rui Kuang, “Reconstructing Disease Phenome-genome Association by Bi Random Walk”, Bioinformatics, 2012; doi:10.1093/bioinformatics/bts067. Impact
factor: 4.877
6. Seung-Jun Kim, TaeHyun Hwang, and Georgios B. Giannakis, “Sparse Robust Matrix Tri-factorization
with Application to Cancer Genomics”, In Press, International Workshop on Cognitive Information Processing (CIP) 2012
7. Maoqiang Xie, TaeHyun Hwang, and Rui Kuang, “Prioritizing Disease Genes by Bi Random
walk”, In Press, PAKDD 2012
8. TaeHyun Hwang, Gowtham Atluri, Rui Kuang, Timothy Starr, Peter Haverty, Zemin Zhang, and
Jinfeng Liu, “Large-scale Integrative Network-based analysis Identifies Common Pathways Disrupted
by Copy Number Alterations across Cancers”, Under review
9. TaeHyun Hwang, Wei Zhang, Maoqiang Xie, Jinfeng Liu, and Rui Kuang, “Inferring Disease and
Gene Set Associations with Rank Coherence in Networks”, Bioinformatics, 2011; 27(19):2692-9.
Epub 2011 Aug 8. Impact factor: 4.926
10. TaeHyun Hwang and Rui Kuang, “A Heterogeneous Label Propagation Algorithm for Disease
Gene Discovery”, SIAM International Conference on Data Mining (SDM), April 2010 (full paper,
full presentation, acceptance rate 23.36%)
11. Ze Tian*, TaeHyun Hwang*, and Rui Kuang, “A Hypergraph-based Learning Algorithm for
Classifying Gene Expression and arrayCGH data with Prior Knowledge”, Bioinformatics, Vol. 25,
No. 21, pages 2831-2838, 2009. Impact factor: 4.328
(*Joint first authors)
12. TaeHyun Hwang*, Ze Tian*, Jean-Pierre Kocher, and Rui Kuang, “Learning on Weighted Hypergraphs for Integrating Protein Interactions and Gene Expressions”, IEEE International Conference
on Data Mining (ICDM), December 2008 (full paper, full presentation, acceptance rate 9.7%)
(*Joint first authors)
13. TaeHyun Hwang, Hugues Sicotte, Ze Tian, Dennis Wigle, Jean-Pierre Kocher, Vipin Kumar, and
Rui Kuang, “Robust and Efficient Identification of Biomarkers by Classifying Features on Graphs”,
Bioinformatics, Vol. 24, No. 18, pages 2023-2029, 2008. Impact factor: 5.039
14. Ze Tian, TaeHyun Hwang, and Rui Kuang, “A Hypergraph-based Learning Algorithm for Classifying arrayCGH Data with Spatial Prior”, Proc. of IEEE International Workshop on Genomic
Signal Processing and Statistics (GENSIPS), 2009
15. TaeHyun Hwang and Rui Kuang, “A Comparative Study of Breast Cancer Microarray Gene
Expression Profiles using Label Propagation”, Proceedings of the Workshop on Data Mining for
Biomedical Informatics, held in conjunction with SIAM International Conference on Data Mining
(SDM), April 2008
16. Jaehoon Jeong, TaeHyun Hwang, Tian He, and David Du, “MCTA: Target Tracking Algorithm
based on Minimal Contour in Wireless Sensor Networks”, IEEE Infocom 2007 Minisymposia (INFOCOM), August 2007 (short paper, total acceptance rate 25%)
Technical Reports
1. TaeHyun Hwang and Rui Kuang, “Prioritizing Disease Genes in a Heterogeneous Network”,
Technical Report UMN-CS-09-022, Sept. 2009
2. TaeHyun Hwang, Hugues Sicotte, Ze Tian, Dennis Wigle, Jean-Pierre Kocher, Vipin Kumar, and
Rui Kuang, “Identifying Clinical and Genetic Markers of Human Disease by Classifying Features on
Graphs”, Technical Report UMN-CS-07-021, September 2007
3. Jaehoon Jeong, TaeHyun Hwang, Tian He, and David Du, “Target Tracking Algorithm based on
Minimal Contour in Wireless Sensor Networks”, Technical Report UMN-CS-07-002, June 2007
In preparation
1. TaeHyun Hwang and Kevin Silverstein, “Challenges of integrating omics data from cancer databases”,
(Invited paper), Personalized Medicine
2. TaeHyun Hwang, Gaurav Pandey, and Rui Kuang, “An Integrated Network-based approach to
Infer Gene Set and GO function Associations”, In preparation
Talks
1. TaeHyun Hwang, “Cancer genome analyses to personalize treatment: from microarray to next-gen sequencing data”, Harvard Medical School, University of Texas MD Anderson Cancer
Center, University of Texas Southwestern Medical Center, Memorial Sloan Kettering
Cancer Center, April and May 2012 (Invited talk)
2. TaeHyun Hwang, “Cancer genome analyses to personalize treatment: from microarray to next-gen sequencing data”, Mayo Clinic, Feb. 2012 (Invited talk)
3. TaeHyun Hwang, “Network-based methods to integrate gene expression, copy number, and protein interaction network for biomarker discovery and cancer outcome prediction”, IBM T.J. Watson
Research Center, Pfizer Oncology, Novartis Institute for Biomedical Research (Oncology), Institute for Systems Biology, Masonic Cancer Center at University of Minnesota
Twin Cities, Sept. 2010 (Invited talk)
4. TaeHyun Hwang, “Integrated human phenome-genome interactome networks for disease-gene association discovery: a machine learning approach”, Harvard Medical School, Sept. 2010 (Invited
talk)
5. TaeHyun Hwang, “Integrative analysis of copy number alterations with protein interaction networks in cancers”, Genentech Inc., August 2010
6. TaeHyun Hwang, “Integrating heterogeneous networks for biomarker discovery and disease outcome predictions”, Columbia University, August 2010 (Invited talk)
7. TaeHyun Hwang, “Robust and efficient identification of biomarker using clinical and biological
data integration”, Roche, August 2010 (Invited talk)
8. TaeHyun Hwang, “Mining genetic determinants of human disease: a machine learning approach
with network perspective”, University of California San Diego (UCSD), July 2010 (Invited talk)
9. TaeHyun Hwang and Rui Kuang, “A Heterogeneous Label Propagation Algorithm for Disease
Gene Discovery”, SIAM International Conference on Data Mining (SDM), April 2010 (full paper,
full presentation, acceptance rate 23.36%)
10. TaeHyun Hwang, “A Hypergraph-based Learning Algorithm for Classifying Gene Expression
and arrayCGH data with Prior Knowledge”, Inha University, Korea, Oct. 2009 (Invited talk)
11. TaeHyun Hwang, Ze Tian, Jean-Pierre Kocher, and Rui Kuang, “Learning on Weighted Hypergraphs to Integrate Protein Interactions and Gene Expressions for Cancer Outcome Prediction”,
Proc. of the 8th IEEE International Conference on Data Mining (ICDM), Dec. 2008 (full paper,
full presentation, acceptance rate 9.7%)
12. TaeHyun Hwang, Hugues Sicotte, Dennis A. Wigle, Jean-Pierre Kocher, Vipin Kumar, and
Rui Kuang, “Identifying Clinical and Genetic Markers of Human disease by Classifying Features on
Graphs”, US-Korea Conference on Science, Technology, and Entrepreneurship, August 14-17, 2008
13. TaeHyun Hwang and Rui Kuang, “A Comparative Study of Breast Cancer Microarray Gene
Expression Profiles using Label Propagation”, Proc. of the Workshop on Data Mining for Biomedical
Informatics, held in conjunction with SIAM International Conference on Data Mining (SDM), 2008
Posters
1. TaeHyun Hwang, Gowtham Atluri, Rui Kuang, Timothy Starr, Peter Haverty, Zemin Zhang,
and Jinfeng Liu, “NetPathID: an integrative network-based analysis to identify pathways disrupted
by copy number alterations across cancers”, Systems Biology of Diversity in Cancer symposium at
MSKCC, September 16, 2011
2. TaeHyun Hwang, Ze Tian, Michael Steinbach, Vipin Kumar, Rui Kuang, Jean-Pierre Kocher,
Hugues Sicotte, Dennis Wigle, and Rich Mushlin, “Mining Cancer Biomarkers from Heterogeneous Genomic Data by Graph-based Learning”, Biomedical Informatics and Computational Biology (BICB)
Research Symposium organized by IBM, Mayo Clinic, Hormel Institute, and Univ. of Minnesota,
January 16, 2009
3. TaeHyun Hwang, Hugues Sicotte, Dennis A. Wigle, Jean-Pierre Kocher, Vipin Kumar, and
Rui Kuang, “Identifying Clinical and Genetic Markers of Human disease by Classifying Features
on Graphs”, US-Korea Conference on Science, Technology, and Entrepreneurship, August 14-17, 2008
4. TaeHyun Hwang, Ze Tian, Michael Steinbach, Vipin Kumar, Rui Kuang, Jean-Pierre Kocher,
Hugues Sicotte, Dennis Wigle, and Rich Mushlin, “Mining Genetic Determinants of Human Disease”,
Biomedical Informatics and Computational Biology (BICB) Research Symposium organized by IBM,
Mayo Clinic, Hormel Institute, and Univ. of Minnesota, June 20, 2008
Honours and
Awards
1. SDM 2010 Doctoral Forum Award and Student Travel Grant, National Science Foundation (NSF),
2010
2. First Prize for the 2009 Korean Computer Scientists and Engineers Association in America (KOCSEA) /Moon Jung Chung Scholarship/Poster competition, 2009
3. Nominated by Computer Science Department for Interdisciplinary Doctoral Fellowship, University
of Minnesota Twin Cities, 2009
4. ICDM 2008 Student Travel Grant, National Science Foundation (NSF), 2008
5. UKC 2007 Student Travel Grant, Korean-American Scientists and Engineers Association, 2008
6. Research Paper Competition (4th place), Inha University, 2005
7. Merit-based Scholarship for Graduate Study, Inha University, 2004, 2005
8. Academic Excellence Fellowship Award, Inha University, 2001, 2003, 2004, 2005
Teaching
Experience
University of Minnesota Twin Cities, USA
Teaching Assistant
Jan. 2006 – Jun. 2010
CSci 5461 Functional Genomics, Systems Biology and Bioinformatics (Spring 2010): This was an interdisciplinary graduate-level course. Topics covered the analysis of gene expression data, proteomic data, and interaction data, with a special focus on how they can be used to understand and infer biological networks. Enrolled students had different backgrounds, including biology, biochemistry, and computer science. I worked closely with my advisor in this course, helping plan the course schedule, consulting on lecture topics, and planning homework and the course project.
CSci 3003 Introduction to Computing in Biology (Spring 2008): This was an interdisciplinary undergraduate-level course. The course provided students with the fundamental computational skills needed to carry out basic tasks common in modern biology research. I helped my instructor design homework and course topics, and led a weekly lab section with around 30 students.
CSci 1901 Structure of Computing 1 (Spring 2007): This was an undergraduate-level course. In this large introductory programming class (approximately 120 students), I was responsible for leading lab sections, preparing lab materials, helping the instructor prepare homework materials, and grading homework and exams. I also gave lectures when the instructor was unable to attend class.
CSci 5801 Software Engineering (Fall 2006): This was a graduate-level course with a large enrollment. My duties were grading homework and exams, and helping students with questions about coursework and lectures.
CSci 4131 Internet Programming (Spring 2006): This was a senior undergraduate-level course with a large enrollment. My duties were grading homework and exams, and holding office hours to answer questions about coursework and lectures.
CSci 1902 Structure of Computing 2 (Spring 2006): This was an interdisciplinary undergraduate-level course with a large enrollment. I was responsible for helping the instructor prepare homework materials, grading homework and exams, and holding office hours to answer questions about coursework and lectures. I also led labs and gave tutorials on an Integrated Development Environment (Eclipse) and the Java programming language.
Inha University, Korea
Teaching Assistant
Mar. 2004 – Jun. 2005
IN308 Java Programming (Spring 2005): This was an undergraduate-level course. I mainly led labs and gave tutorials on the Java programming language.
IN212 Data Structure (Spring 2005): This was an undergraduate-level course with a large enrollment. I graded homework and helped my instructor design homework and course projects.
IN201 C, C++ programming (Fall 2004): This was an undergraduate-level course with a large enrollment. I graded homework, led labs, and gave tutorials on the C and C++ programming languages.
University of Minnesota Twin Cities, USA
Guest Lecture
CSci 8002 (Graduate course) Introduction to Computer Science Research (Fall 2009)
CSci 1901 Structure of Computing 1 (Spring 2007)
Major Projects
1. AR Gene Structure Alterations and Prostate Cancer Progression (Mar 2011 ∼ current)
Collaborators: Prof. Scott Dehm (PI, Masonic Cancer Center, University of Minnesota), Prof. Robert
Vessella (Dept. of Urology and Microbiology, School of Medicine, University of Washington), and Dr.
Kevin Silverstein (Bioinformatics core, Masonic Cancer Center, University of Minnesota).
Summary: This project aims to understand the mechanisms by which cancer cells become resistant to targeted
drugs. Androgen depletion therapy (ADT) is the primary systemic treatment for patients with
locally advanced or metastatic prostate cancer. ADT inhibits activity of the androgen receptor (AR)
transcription factor, and aberrant mechanisms of AR reactivation during ADT invariably undermine
clinical response, leading to the development of castration-resistant prostate cancer. Understanding
the mechanisms underlying splicing changes and increased synthesis of truncated AR variants could
lead to more effective use of ADT and management of patients with castration-resistant prostate
cancer. In collaboration with the Masonic Cancer Center at the University of Minnesota and the University
of Washington, we developed computational frameworks to investigate the mechanisms of disrupted
splicing in the castration-resistant prostate cancer CWR-R1 cell line using high-throughput paired-end DNA-seq and RNA-seq, resulting in the discovery of a novel 48kb deletion in AR intron 1. These
data illustrate that loss of over a quarter of the AR gene can provide a selective advantage to prostate
cancer cells under conditions of ADT, and provide further evidence that structural alterations in the
AR gene may underlie splicing alterations in castration-resistant prostate cancer. Moving forward,
we will develop computational tools to reconstruct the AR gene as well as the whole genome in castration-resistant prostate cancer patients. Part of the results of this work has been accepted for publication in
Oncogene.
2. Discovery of Pathways Disrupted by Copy Number Alterations across Cancers (Jun. 2010 ∼ Sept.
2010)
Advisor: Dr. Jinfeng Liu (Dept. of Bioinformatics and Computational Biology, Genentech Inc.)
Collaborators: Dr. Zemin Zhang, Dr. Peter Haverty (Dept. of Bioinformatics and Computational
Biology, Genentech Inc.), Prof. Timothy Starr (Masonic Cancer Center, University of Minnesota),
Prof. Rui Kuang, Prof. Vipin Kumar (Dept. of CSE, University of Minnesota).
Summary: This project aims to analyze disrupted pathways across human cancers (e.g., which
pathways are disrupted in particular cancer types, and which are commonly disrupted across many
types of human cancers) in order to discover novel cancer-related pathways and patient subgroups and to
develop novel therapeutic targets. In collaboration with Genentech and the Masonic Cancer Center at the
University of Minnesota, we developed an algorithm called NetPathID (NETwork based method for
PATHway IDentification) to discover pathways disrupted by copy number alterations across cancers.
We applied our approach to a data set of 2172 cancer patients across 16 different types of cancers, and
found a set of pathways commonly disrupted across cancers, along with a set of pathways disrupted
in one or more specific types of cancer. We found that these pathways are unlikely to be discovered
by conventional overrepresentation-based and pathway-based methods, and could reveal potentially
novel cancer biology. The results of this work have been submitted to PLoS Computational Biology.
3. Disease Gene Discovery using Integrated Human Phenome-Genome Interactome Networks (Sept.
2009 ∼ Aug. 2010)
Advisor: Prof. Rui Kuang
Collaborators: Dr. Jinfeng Liu (Dept. of Bioinformatics and Computational Biology, Genentech
Inc.).
Summary: This project aims to develop a novel computational approach to improve disease gene
prioritization. Disease gene prioritization is the task of ranking the candidate genes underlying each disease phenotype so that the most promising can be selected for further experimental validation. Integrative
analysis of human phenome-genome-interactome data is essential to understand the underlying genetic cause of disease phenotypes, and to reveal genetic associations among phenotypes. We developed
a novel network-based method that predicts disease-causative genes by integrating human phenome-genome-interactome data. In experiments with OMIM disease-gene associations, we demonstrated
that our method achieved the best overall performance for prioritizing disease genes and discovered
novel candidate disease genes for complex human diseases. These findings can help to identify genetic determinants that strongly influence disease phenotypes, and provide unique insights into disease
mechanisms and potential therapeutic targets. This work has been published in SDM 2010.
4. Cancer Outcome Prediction and Biomarker Discovery (Sept. 2007 ∼ Aug. 2009)
Advisor: Prof. Rui Kuang
Collaborators: Prof. Vipin Kumar (Dept. of CSE, University of Minnesota), Prof. Dennis Wigle
(Mayo Clinic Cancer Center), Dr. Jean-Pierre Kocher, Dr. Hugues Sicotte (Bioinformatics core,
Mayo Clinic College of Medicine), and Dr. Rich Mushlin (IBM T.J. Watson).
Summary: This project centers on discovering reproducible cancer biomarkers from multiple independent microarray gene expression datasets. Although the lack of reproducibility is partly caused
by differences in the microarray platforms and the experimental techniques used to generate the
high-throughput data, cluster structures or modularity among the genes, such as co-expression, can be
used to mitigate that discrepancy. In this large collaborative project, which involved several researchers
from Mayo Clinic and IBM, we developed a novel graph-based semi-supervised algorithm for discovery
of discriminative cancer biomarkers and cancer outcome prediction by learning on bipartite graphs.
In experiments with three large-scale breast cancer datasets, we demonstrated that our proposed model
can effectively identify highly reproducible cancer-relevant biomarkers and achieve the best overall
performance for cancer outcome prediction. The results have been published in
Bioinformatics (2008).
5. Genomic Data Integration for Cancer Outcome Prediction and Biomarker Discovery (Sept. 2008
∼ Aug. 2009)
Advisor: Prof. Rui Kuang
Collaborators: Prof. Dennis Wigle (Mayo Clinic Cancer Center), Dr.
Jean-Pierre Kocher (Bioinformatics core, Mayo Clinic College of Medicine).
Summary: This project aims to develop machine learning approaches to integrate genomic data with
prior knowledge derived from related experiments and biological databases. In collaboration with
Mayo Clinic, we developed a hypergraph-based semi-supervised learning algorithm, and a variant of it,
to integrate genomic data with prior knowledge (e.g. microarray gene expression or copy number
variation data with protein-protein interaction networks) for cancer outcome prediction and biomarker
discovery. Experimental results on breast cancer, bladder cancer, and melanoma datasets show that our
proposed method achieved significantly improved cancer outcome prediction compared with SVMs
and the other baselines utilizing the same prior knowledge, and identified several cancer-related
subnetworks and CNV regions, both of which contain known oncogenes and tumor suppressor genes.
The results have been published in ICDM 2008 and Bioinformatics (2009).
Services
Professional Activities and Affiliations
Reviewer for Bioinformatics, WIREs System Biology, BMC Bioinformatics, KDD, ICDM, SDM,
PSB, ICMLA, BIBE, BIBM, BioKDD, and GENSIPS. Member of IEEE, ACM, and ISCB.
Programming
C, C++, Matlab, R, Linux shell scripting, Perl, Python, LaTeX 2ε, SQL, and Java.
Referees
Rui Kuang, Ph.D.
Assistant Professor
Dept. of Computer Science and Engineering,
University of Minnesota Twin Cities, MN 55454 USA
phone: (612)624-7820
e-mail: [email protected]
Vipin Kumar, Ph.D.
William Norris Professor and Head
Dept. of Computer Science and Engineering,
University of Minnesota Twin Cities, MN 55454 USA
phone: (612)625-0726
e-mail: [email protected]
Chad L. Myers, Ph.D.
Assistant Professor
Dept. of Computer Science and Engineering,
University of Minnesota Twin Cities, MN 55454 USA
phone: (612)624-8036
e-mail: [email protected]
Jinfeng Liu, Ph.D.
Scientist
Dept. of Bioinformatics and Computational Biology,
Genentech Inc., South San Francisco CA USA
e-mail: [email protected]
Kevin A. T. Silverstein, Ph.D.
Coordinator, Senior Research Scientist
Bioinformatics Group, Masonic Cancer Center
University of Minnesota Twin Cities, MN 55454 USA
phone: (612)625-0292
e-mail: [email protected]
Scott Dehm, Ph.D.
Assistant Professor
Masonic Cancer Center and Department of Laboratory Medicine and Pathology
University of Minnesota Twin Cities, MN 55454 USA
phone: (612)625-1504
e-mail: [email protected]