Project Summary
The last decade has witnessed explosive growth in scientific discoveries in the life sciences, where high-performance
and data-intensive information technology (IT) capabilities play an increasingly critical role. The Human Genome
Project is one of the most successful examples of this growth. Such discoveries are fueling the pace of innovations
that have the potential to significantly transform medicine and health-care by improving quality of care, developing
new therapies, and increasing cost-effectiveness. A fundamental shift is occurring in how research is carried out in
life sciences, whereby a team of multidisciplinary researchers, including information technologists, is needed to
achieve significant discoveries. So far, even such multidisciplinary research has mainly focused on what we call “a
single domain” of a research problem within life sciences. For example, gene sequencing, analysis of DNA
microarray data, and protein structure prediction can all be considered single domain problems, which form specific
components useful in understanding complex phenomena in life sciences. The proposed research is to take a
significant leap forward by developing and integrating an array of IT capabilities to address the important,
multifaceted life sciences application of genetic medicine in its full complexity.
Merit: The overarching goal of the proposed project is to develop innovative, high-performance IT solutions for
Genetic Medicine. The goal of genetic medicine is to understand the genetic influence on susceptibility to and
progression of diseases, and on drug responses, by collecting and evaluating the entire spectrum of data, including
genotyping of multiple individuals across multiple genes (or, in an extreme case, all the sequence variability
between the individuals’ genomes), gene expression data, phenotype data, clinical data, drug response and toxicity
data, family history, and environmental conditions. This will be achieved through the development and integrated
application of various information technologies, including high-performance computing, scientific data
management, scalable algorithms, data analysis and mining, and large, heterogeneous data warehouses capable of
incorporating the different types of data mentioned above. In particular, our goal in this project is to
develop the methodologies, structures, and algorithms necessary to effectively address topics that involve
information from multiple domains including genotype, RNA expression, protein production and modification, and
multiple sources that describe phenotype, including patient medical and pharmaceutical histories. By allowing
cross-associations between these different domains, a more complete picture of the relationship between genotype and
phenotype can be teased out of a genetically heterogeneous population. Specifically, we will develop and integrate
an infrastructure for collection and description of heterogeneous data in scalable data warehouses, develop
disk-resident data structures and associated algorithms for effective storage and processing of each type of data, develop
scalable algorithms for analysis of data, design data mining algorithms and software for finding patterns and rules
from data, and develop high-performance parallel computing versions of these to enable high throughput computing.
The project will be carried out in collaboration with Northwestern Memorial Hospital, Children's Memorial Hospital, and
Evanston Northwestern Healthcare, three leading hospitals that together are responsible for the healthcare of more
than a million people in the greater Chicago area. The PI team includes IT faculty with expertise in computational
biology, high-performance parallel computing, scalable algorithms, data warehousing, data mining, and scientific
data management, together with researchers from genetic medicine and life sciences, and bioinformatics researchers
and practitioners from the above-mentioned hospitals. The project will be based on real data gathered from a number
of consenting patients with a broad spectrum of disease histories.
Broader Impact: The proposed research will have the following broader impacts. It will lead to novel approaches
to the individualized application of drugs and therapies and to the prevention of diseases, a problem of national importance.
Furthermore, it could dramatically reduce side effects and increase the efficacy of drugs by stratifying patients
based on genotypic predictors. Adverse side effects are a documented problem in medicine, resulting in further
illnesses and deaths [1] and delaying “correct” treatment. This project will have significant impact on IT in two ways:
through the development of high-performance data warehousing, mining and algorithmic techniques for each
individual type of life sciences data, and through the development and effective application of integrating strategies
to mine such varied data in the computationally and data-intensive application of Genetic Medicine. This project
also strengthens interdisciplinary research by bringing information technology to the emerging field of Genetic
Medicine and bringing IT-based research in the life sciences to the practice of medicine and healthcare. All faculty PIs have
a record of working with women and minority students and postdoctoral associates, and this project will involve a
significant number of them. Furthermore, the PIs have developed interdisciplinary degree programs in bioinformatics
and computational biology at Northwestern and Iowa State and will incorporate results from this research into those
programs. They are also engaged in outreach and other educational activities for area schools and health-care
providers. Such outreach and educational activities will be a specific goal of this project.
[1] Adverse drug reactions may rank as the fifth leading cause of death in the USA (from: L. Mancinelli, “Pharmacogenomics: The Promise of
Personalized Medicine,” AAPS PharmSci 2000;2(1)).