
Project Summary

The last decade has witnessed explosive growth in scientific discoveries in the life sciences, where high-performance and data-intensive information technology (IT) capabilities play an increasingly critical role. The Human Genome Project is one of the most successful examples of this growth. Such discoveries are fueling the pace of innovations that have the potential to significantly transform medicine and health care by improving quality of care, developing new therapies, and increasing cost-effectiveness. A fundamental shift is occurring in how research is carried out in the life sciences, whereby a team of multidisciplinary researchers, including information technologists, is needed to achieve significant discoveries. So far, even such multidisciplinary research has mainly focused on what we call “a single domain” of a research problem within the life sciences. For example, gene sequencing, analysis of DNA microarray data, and protein structure prediction can all be considered single-domain problems, which form specific components useful in understanding complex phenomena in the life sciences. The proposed research takes a significant leap forward by developing and integrating an array of IT capabilities to address the important, multifaceted life sciences application of genetic medicine in its full complexity.

Merit: The overarching goal of the proposed project is to develop innovative, high-performance IT solutions for Genetic Medicine. The goal of genetic medicine is to understand the genetic influence on disease susceptibility, disease progression, and drug response by collecting and evaluating the entire spectrum of data, including genotyping of multiple individuals across multiple genes (or, in an extreme case, all the sequence variability between the individuals’ genomes), gene expression data, phenotype data, clinical data, drug response and toxicity data, family history, and environmental conditions.
This will be achieved through the development and effective integrated application of various information technologies, including high-performance computing, scientific data management, scalable algorithms, data analysis and mining, and large, heterogeneous data warehouses capable of incorporating the different types of data mentioned above. In particular, our goal in this project is to develop the methodologies, structures, and algorithms necessary to effectively address topics that involve information from multiple domains, including genotype, RNA expression, protein production and modification, and multiple sources that describe phenotype, including patient medical and pharmaceutical histories. By allowing cross-associations between these different domains, a more complete picture of the relationship between genotype and phenotype can be teased out of a genetically heterogeneous population. Specifically, we will develop and integrate an infrastructure for the collection and description of heterogeneous data in scalable data warehouses; develop disk-resident data structures and associated algorithms for effective storage and processing of each type of data; develop scalable algorithms for data analysis; design data mining algorithms and software for finding patterns and rules in data; and develop high-performance parallel computing versions of these to enable high-throughput computing. The project will be carried out in collaboration with Northwestern Memorial Hospital, Children’s Memorial Hospital, and Evanston Northwestern Healthcare, three leading hospitals that together are responsible for the health care of more than a million people in the greater Chicago area.
The PI team includes IT faculty with expertise in computational biology, high-performance parallel computing, scalable algorithms, data warehousing, data mining, and scientific data management, together with researchers from genetic medicine and the life sciences, and bioinformatics researchers and practitioners from the above-mentioned hospitals. The project will be based on real data gathered from a number of consenting patients with a broad spectrum of disease histories.

Broader Impact: The proposed research will have the following broader impacts. It will lead to novel approaches to the individualized application of drugs and therapies and to the prevention of diseases, a problem of national importance. Furthermore, it could dramatically reduce side effects and increase the efficacy of drugs by stratifying patients based on genotypic predictors. Adverse side effects are a documented problem in medicine, resulting in further illnesses and deaths [1] and delaying “correct” treatment. This project will have a significant impact on IT in two ways: through the development of high-performance data warehousing, mining, and algorithmic techniques for each individual type of life sciences data, and through the development and effective application of integrating strategies to mine such varied data in the computationally intensive and data-intensive application of Genetic Medicine. This project also strengthens interdisciplinary research by bringing information technology to the emerging field of Genetic Medicine and by bringing IT-based research in the life sciences to the practice of medicine and health care. All faculty PIs have a record of working with women and minority students and postdoctoral associates, and this project will involve a significant number of them. Furthermore, the PIs have developed interdisciplinary degree programs in bioinformatics and computational biology at Northwestern and Iowa State and will incorporate results from this research into those programs.
They are also engaged in outreach and other educational activities directed at area schools and health-care providers. Such outreach and educational activities will be a specific goal of this project.

[1] Adverse drug reactions may rank as the fifth leading cause of death in the USA (from: L. Mancinelly, “Pharmacogenomics: The Promise of Personalized Medicine,” AAPS PharmSci 2000;2(1)).
