Towards an Understanding of Oxidative Stress Resistance in Plants: Expresso and Chips
Applications of Functional Genomics and Bioinformatics

Overview
• Environmental stress and reactive oxygen species (ROS)
• Plant responses to ROS
• Stress on a chip: current results
• Expresso
– Managing expression experiments
– Analyzing expression data
– Reaching conclusions
• Some future directions: collaborating with CIP on resistance mechanisms in Andean root and tuber crop species

The Paradox of Aerobiosis
• Oxygen is essential, yet also potentially toxic.
• Aerobic cells maintain themselves against the constant danger of production of reactive oxygen species (ROS).
• ROS can act as mutagens, cause lipid peroxidation, and denature proteins.

ROS Arise Throughout the Cell
[Figure: cell diagram. Stress labels: pathogens, wounding, chilling, ozone; drought, salinity (ROS subcellular sites unclear); paraquat, high light + chilling, sulfur dioxide. Compartment labels: cell wall, mitochondrion, cytosol, nucleus, chloroplast. Other labels: antioxidant genes, gene expression, post-transcriptional effects.]

Cellular Redox Homeostasis
• Maintained enzymatically
• Glutathione, ascorbate (soluble)
– Alpha-tocopherol, carotenoids (membrane)
• Antioxidant pools increase with stress.
• Protein methionine sulfoxidation is an additional antioxidant reservoir.
• Molecular chaperones (heat shock proteins) act as a repair mechanism.

ROS Arise as a Result of Exposure to:
• Ozone
• Sulfur dioxide
• High light
• Paraquat
• Extremes of temperature
• Salinity
• Drought

Plant-Environment Interactions
• Several defense systems that respond to environmental stress are known.
• Their relative importance is not known.
• Mechanistic details are not known.
Redox sensing may be involved.

A Basis for Cellular Responses to ROS
[Figure: signaling scheme. Panel labels: Thiol Redox Control; Stress Defense; Redox Regulation of Gene Expression. Pathway labels: environmental stress; prooxidants (ROS): O2 radical, H2O2, NO radical; membrane receptors (oxylipins); protein kinases and phosphoprotein phosphatases; transcription factors (redox-sensitive?); gene expression; cellular response: defense processes, repair processes, adaptation. Cellular defense response, antioxidants: Trx-(SH)2/Trx-S2, 2 GSH/GSSG, Grx-(SH)2/Grx-S2, Met/MetO, Asc/DHAsc.]

ROS Scavenging in Plastids
[Figure: ROS scavenging in plastids]

Stress Resistance: Short-Term "Emergency"
• Accumulated evidence suggests that successful resistance to stress imposition consists in the mobilization of cellular defense machinery. (Short-term exposures to oxidative stress conditions in a number of crop species, and cultivars within species.)
• Activation of defense genes, such as SOD and glutathione reductase
• Stimulation of antioxidant biosynthetic pathways, such as glutathione

Differential Response of Plastid SOD to Sulfur Dioxide in Two Cultivars of Pea
Exposure to sulfur dioxide in resistant (Progress) and sensitive (Nugget) cultivars of pea resulted in increases in plastid Cu-Zn SOD mRNA and protein only in the resistant cultivar. The kinetics of the increase correlate with recovery of photosynthesis in cv. Progress.

Stress Resistance: Long-Term Adaptation to Harsh Environmental Conditions
Fewer data are available than for emergency responses. But is there overlap with emergency processes? Candidates include:
• Low temperature: glutathione-associated processes, cryoprotective proteins and oligosaccharides
• High temperature: heat shock proteins
• Drought: water channel proteins (aquaporins), dehydrins

Season-Specific Isoforms of Glutathione Reductase in Spruce
Winter-specific and summer-specific isoforms of glutathione reductase exist in red spruce. The appearance of the winter-specific form correlates with the onset of hardening.
[Figures: Glutathione Reductase Genes (GR1); Glutathione Reductase Genes (GR2)]

Candidate Resistance Mechanisms
• In the past, candidate mechanisms were examined known gene by known gene, process by process.
• Microarray technology
– Simultaneous examination of groups of candidate genes and associated interactions
– Possible discovery of new defense mechanisms

[Figure: two-sample microarray hybridization. Labels: Treatment, Control, Mix, Hybridization, Spots 1, 2, 3 (sequences affixed to slide), Detection, Relative Abundance.]

Iterative Strategy for Detection of Genetic Interactions Using Microarrays
[Figure: cycle around "Genetic Regulatory Networks". 1: Detection of gene expression effects on microarrays. 2: Characterize gene function. 3: Identify mutants. 4: Test mutant phenotypes.]

Long Term Goal
• Precedent I: Plants adapt to adverse environmental conditions via a global cellular response involving changes in the expression patterns of numerous genes.
• Precedent II: To study these changes, the Expresso team uses bioinformatics and experimental techniques.
• Long term goal: To identify and improve emergency and long term adaptational stress response mechanisms in crop species.

Expresso: A Problem Solving Environment for Microarray Experiment Design and Analysis
• Integration of design and procedures
• Integration of image analysis tools and statistical analysis
• Connections to web databases and sequence alignment tools
• The software Aleph was used for inductive logic programming (ILP).

Who's Who
• Plant Biology (Virginia Tech)
– Ruth Alscher: Plant Stress
– Boris Chevone: Plant Stress
– Dawei Chen: Molecular Biology, Bioinformatics
• Forest Biotechnology (North Carolina State Univ.)
– Ron Sederoff
– Ross Whetten
– Len van Zyl
– Y-H. Sun
• Computer Science (Virginia Tech)
– Lenwood Heath (CS): Algorithms
– Naren Ramakrishnan (CS): Data Mining, Problem Solving Environments
– Craig Struble, Vincent Jouenne (CS): Image Analysis
• Statistics (Virginia Tech)
– Ina Hoeschele (DS): Statistical Genetics
– Keying Ye (STAT): Bayesian Statistics

Expresso People
Ron Sederoff, Lenny Heath, Craig Struble, Ruth Alscher, Ross Whetten, Keying Ye, Boris Chevone, Y-H. Sun, Len van Zyl, Dawei Chen, Vincent Jouenne, Naren Ramakrishnan

The 1999 Experiment: A Measure of Long Term Adaptation to Drought Stress
• Loblolly pine seedlings (two unrelated genotypes, "C" and "D") were subjected to mild or severe drought stress for four (mild) or three (severe) cycles.
– Mild stress: needles dried down to -10 bars; little effect on growth, new flushes as in control trees.
– Severe stress: needles dried down to -17 bars; growth retardation, fewer new flushes compared to controls.
• Harvest RNA at the end of the growing season; determine patterns of gene expression on DNA microarrays.
• With algorithms incorporated into Expresso, identify genes and groups of genes involved in stress responses.

Scenarios for Effects of Specific Stresses on Gene Expression
[Figure: scenarios for effects of specific stresses on gene expression]

Hypotheses
• There is a group of genes whose expression confers resistance to drought stress.
• Expression of this group of genes is lower under severe than under mild stress.
• Individual members of gene families show distinct responses to drought stress.

Selection of cDNAs for Arrays
• 384 ESTs (xylem, shoot tip cDNAs of loblolly) were chosen on the basis of function and grouped into categories.
• Major emphasis was on processes known to be stress responsive.
• In cases where more than one EST had similar BLAST hits, all ESTs were used.
Categories within Protective and Protected Processes
[Figure: category hierarchy. Labels: Environmental Change; Protective Processes; Protected Processes; Gene Expression; Signal Transduction; Protease-associated; ROS and Stress; Nucleus; Cell Wall Related; Trafficking; Phenylpropanoid Pathway; Development; Secretion; Cells; Cytoskeleton; Tissues; Plant Growth Regulation; Chloroplast Associated; Metabolism; Carbon Metabolism; Respiration and Nucleic Acids; Mitochondrion.]

A Note about Categories
Categories are not mutually exclusive; a gene may be assigned to more than one category. For example, heat shock proteins have been grouped under these different categories and subcategories:
– Abiotic stress - heat
– Gene expression - post-translational processing - chaperones
– Abiotic stress - chaperones

Categories within "Protective Processes"
[Figure: category tree under Protective Processes. Labels, grouped as far as the layout can be recovered:
Stress: abiotic; biotic; drought (dehydrins, aquaporins); heat (heat shock proteins, chaperones); non-plant; xenobiotics (GSTs); chaperones; "isoflavone reductases".
Antioxidant Processes: NADPH/ascorbate/glutathione scavenging pathway; cytosolic ascorbate peroxidase; superoxide dismutase-Fe; superoxide dismutase-Cu-Zn; glutathione reductase.
Cell Wall Related: sucrose metabolism; cellulose; arabinogalactan proteins; extensins and proline-rich proteins; hemicellulose; pectins; xylose; other cell wall proteins.
Phenylpropanoid Pathway: lignin biosynthesis; isoflavone reductases; phenylalanine ammonia-lyases; S-adenosylmethionine decarboxylases; glycine hydroxymethyltransferases; 4-coumarate-CoA ligases; CCoAOMTs; cinnamyl-alcohol dehydrogenase.]

Quality Control
• Positive: LP-3, a loblolly pine gene known to respond positively to drought stress, was included.
• LP-3 was positive in the moist versus mild comparison, and unchanged in the moist versus severe comparison.
• Negative: Four clones of human genes used as negative controls in the Arabidopsis Functional Genomics project were included. The clones did not respond.
Categories that contained positives in genotypes C and D (Control versus Mild)
Data from two slides (4 arrays) for C and two slides (4 arrays) for D were collected.
[Figure: category tree under ROS and Stress and Protective Processes. Labels, grouped as far as the layout can be recovered:
Stress: abiotic; biotic; drought (dehydrins, aquaporins); heat (heat shock proteins); non-plant; xenobiotics (GSTs); chaperones; "isoflavone reductases".
Antioxidant Processes: NADPH/ascorbate/glutathione scavenging pathway; cytosolic ascorbate peroxidase; superoxide dismutase-Fe; superoxide dismutase-Cu-Zn; glutathione reductase.
Cell Wall Related: sucrose metabolism; cellulose; extensins, arabinogalactan, and proline-rich proteins; hemicellulose; pectins; xylose; other cell wall proteins.
Phenylpropanoid Pathway: lignin biosynthesis; isoflavone reductases; phenylalanine ammonia-lyase; S-adenosylmethionine decarboxylase; glycine hydroxymethyltransferase; 4-coumarate-CoA ligase; CCoAOMT; cinnamyl-alcohol dehydrogenase.]

Hypotheses versus Results
• Among the genes responding to mild stress, there exists a population of genes whose expression confers resistance.
– Genes in 69 categories responded positively to mild stress in Genotypes C and D (the positive response was not observed in the severe stress condition in Genotype D).
• There is evidence for a response to drought among genes associated with other stresses.
– Isoflavone reductase homologs and GSTs responded positively to mild drought stress.
– These categories were previously documented to respond to biotic stress and xenobiotics, respectively.

Relationships among HSP Homologs
In control versus mild stress, HSP 100, 70, and 23 responded in C and D; HSP 80s did not respond in either C or D.
Candidate Categories: Long-term Adaptation to Drought Stress
• Include
– Aquaporins
– Dehydrins
– Heat shock proteins/chaperones
• Exclude
– Isoflavone reductases

Design of Microarrays
• Clones on the drought-stress microarrays were replicated and randomly placed
• Experiment involved 384 archived pine ESTs
• Organized into 4 microtitre source plates after PCR
• Pipetted into 8 sets of 4 microtitre plates each
• Each set a different random arrangement of 384 ESTs
• Printed type A microarrays from first 4 sets
• Printed type B microarrays from second 4 sets
• Each array has 4 randomly placed replicates of each EST
• Each control versus stress comparison was done on 4 arrays: A and B; flip dyes; A and B
• Total of 16 replicates of each EST in each comparison (4 arrays x 4 replicates per array)

Spot and Clone Analysis
• Image analysis: gridding, spot identification, intensity and background calculation, normalization
• Statistics:
– Fold or ratio estimation
– Combining replicates
• Higher-level analysis:
– Clustering methods
– Inductive logic programming (ILP)

Image Analysis
Microarray Suite:
• Manual gridding
• Extract two intensities for each spot
• Compute ratios
• Compute calibrated ratios
Our tools use the logarithm of the calibrated ratios.

Computational and Statistical Analysis
• The multiple (typically 16) log calibrated ratios for a replicated clone do NOT follow a normal distribution.
• We assume a zero-centered distribution for log ratios, so that when expression is unchanged each log ratio is positive with probability 0.5.
• The number of positive (or negative) log ratios then follows a binomial distribution with parameters 16 and 0.5.
• A clone with 12 or more positive log ratios is up-expressed with a probability of 0.96.
• We classify each EST response as one of
– Up-regulated
– Down-regulated
– No clear change
• This classification provides results sufficient for the use of inductive logic programming (ILP).

Related Statistical Results
• Chen et al. (J. Biomed. Optics 2, 1997, 364-374)
– Assume a normal distribution and normalize ratios
– No replicates
– Estimate a confidence interval for ratios that applies to each spot
• Lee et al. (PNAS 97, August 29, 2000, 9834-9) emphasize the need for replication
• Black and Doerge (PNAS, to appear)
– Investigate distributional assumptions of log-normal and gamma distributions on intensities
– Determine the number of replicates needed for a particular confidence level under each distribution
– Assume normalization has been done and location-dependent error has been eliminated.

Further Analysis: Inductive Logic Programming
• ILP is a data mining algorithm expressly designed for inferring relationships.
• By expressing relationships as rules, it provides new information and resultant testable hypotheses.
• ILP groups related data and chooses in favor of relationships having short descriptions.
• ILP can also flexibly incorporate a priori biological knowledge (e.g., categories and alternate classifications).

Rule Inference in ILP
• Infers rules relating gene expression levels to categories, both within a probe pair and across probe pairs, without explicit direction
• Example rule:
[Rule 142] [Pos cover = 69 Neg cover = 3]
level(A,moist_vs_severe,not positive) :- level(A,moist_vs_mild,positive).
• Interpretation: "If the moist versus mild stress comparison was positive for some clone named A, it was negative or unchanged in the moist versus severe comparison for A, with a confidence of 95.8%."

More Rules We Obtained
• [Rule 6] level(A,moist_vs_mild,positive) :- category(A, transport_protein).
• level(A,mild_vs_severe,negative) :- category(A, transport_protein).
• [Rule 13] level(A,moist_vs_mild,positive) :- category(A, heat).
• [Rule 17] level(A,moist_vs_mild,positive) :- category(A, cellwallrelated).
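
The sign-count classification and the confidence quoted for Rule 142 can be checked with a few lines of Python. This is a minimal sketch, not the Expresso implementation: the function names (upper_tail, classify, rule_confidence) and the use of the same threshold of 12 for down-regulation are assumptions made for illustration. It reads the 0.96 figure as one minus the binomial probability of seeing 12 or more positive log ratios out of 16 when expression is unchanged, and it reads the rule confidence as positive cover divided by total cover.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def classify(log_ratios, threshold=12):
    """Classify one clone from the signs of its replicate log ratios.

    Assumption: the threshold quoted for up-regulation (12 of 16)
    is applied symmetrically to negative log ratios.
    """
    positives = sum(1 for r in log_ratios if r > 0)
    negatives = sum(1 for r in log_ratios if r < 0)
    if positives >= threshold:
        return "up-regulated"
    if negatives >= threshold:
        return "down-regulated"
    return "no clear change"


def rule_confidence(pos_cover, neg_cover):
    """Fraction of covered examples that agree with the rule."""
    return pos_cover / (pos_cover + neg_cover)


# Chance of 12 or more positive signs out of 16 when nothing has changed.
tail = upper_tail(16, 12)
print(round(tail, 4))        # 0.0384
print(round(1 - tail, 4))    # 0.9616

# Hypothetical clone: 13 positive and 3 negative log ratios.
print(classify([0.4] * 13 + [-0.2] * 3))   # up-regulated

# Rule 142: Pos cover = 69, Neg cover = 3.
print(round(100 * rule_confidence(69, 3), 1))   # 95.8
```

Under this reading, the threshold of 12 corresponds to a one-sided tail probability of about 0.038, and the cover counts of Rule 142 give 69/72, which matches the 95.8% stated in the interpretation.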
ILP Subsumes Two Forms of Reasoning
• Unsupervised learning
– "Find clusters of genes that have similar/consistent expression patterns"
• Supervised learning
– "Given several patterns of gene expression for two conditions, give an equation that distinguishes the patterns for each condition"
• Hybrid reasoning
– "Is there a relationship between genes in a given functional category and genes in a particular expression cluster?"
– ILP mines this information in a single step

Current Status of Expresso
• Completely automated and integrated
– Statistical analysis
– Data mining
– Experiment capture in MEL
• Current work: integrating
– Image processing
– Querying by semi-structured views
– Expresso-assisted experiment composition

Future Directions
Next Generation Stress Chips
1. Time course, short and long term, to capture gene expression events underlying "emergency" and adaptive events following drought stress imposition. (Use all available ESTs for candidate stress resistance genes.)
2. Generate cDNA library from stressed seedlings.
3. Initiate modeling of kinetics of drought stress responses.

Gene Expression Events Associated with Extreme Environmental Conditions
• Hypothesis 1: Specific genes that confer ability to adapt to extreme conditions are expressed in Andean potato varieties and in other root and tuber crops of the region.
• Hypothesis 2: The adaptive genes act either individually or in co-adaptive groups.
• Hypothesis 3: The adaptive genes are also expressed in temperate zone varieties or they are specific to extreme conditions.
• Proposed approach: Use of microarrays as a tool for discovery of these adaptive genes.

Successful Emergency Responses Versus Adaptation to Diurnal Variation
• Cultivar differences with respect to degree/rapidity of gene expression
• Cultivar differences with respect to rate of synthesis of antioxidant(s)
• Our question: Are the genes that respond in the short term the same ones that confer stress resistance diurnally?
Relevant Data
• Similarities among orthologous genes are sufficiently close that cross-hybridization occurs on microarrays between species (R. R. Sederoff, personal communication).
• Although the above confounds treatment responses shown by individual members of multi-gene families, it allows use of a chip based on one species in interspecific hybridizations.

Collaborating with CIP
Suggested Strategy I
• Identify patterns of gene expression in S. tuberosum associated with successful adaptation to stress (temperature extremes, with or without drought?)
• Construct two or more cDNA libraries from adapted potato
• Design "potato stress chips" including the stress cDNA library and known stress-responsive genes of S. tuberosum (SolGenes resource)

Suggested Strategy II: An Experimental Approach
• Identify variety-specific diurnal gene expression patterns in Andean potato varieties using potato stress chip.
• Construct arrays of cDNAs from Andean varieties. Compare mRNA populations isolated during adaptation of temperate zone potatoes with RNA obtained from Andean varieties.

Iterative Strategy for Detection of Genetic Interactions Using Microarrays
[Figure: cycle around "Genetic Regulatory Networks". 1: Detection of gene expression effects on microarrays. 2: Characterize gene function. 3: Identify mutants. 4: Test mutant phenotypes.]

Iterative Strategy for Detection of Genetic Interactions Using Microarrays and CS Expertise
[Figure: cycle around "Genetic Regulatory Networks". 1: Detection of stress-mediated gene expression effects on microarrays. 2: Computational tools to infer interaction among genes, pathways. 3: Test inferences with mutants/varying conditions. 4: Revised / new tools and experiments.]