Towards an Understanding of
Oxidative Stress Resistance
in Plants: Expresso and
Chips
Applications of Functional
Genomics and Bioinformatics
Overview
• Environmental stress and reactive oxygen
species (ROS)
• Plant responses to ROS
• Stress on a chip — current results
• Expresso
– Managing expression experiments
– Analyzing expression data
– Reaching conclusions
• Some future directions — Collaborating with
CIP on resistance mechanisms in Andean root
and tuber crop species
The Paradox of Aerobiosis
• Oxygen is essential, yet also potentially
toxic.
• Aerobic cells maintain themselves
against the constant danger of producing
reactive oxygen species (ROS).
• ROS can act as mutagens, cause lipid
peroxidation, and denature proteins.
ROS Arise Throughout the Cell
[Figure: sites of ROS generation in the plant cell. Labels: Wounding, Chilling, Pathogens, Ozone; Cell Wall; Mitochondrion; Cytosol; Drought, Salinity (ROS subcellular sites unclear); Chloroplast; Paraquat, High Light + Chilling, Sulfur Dioxide; Nucleus; Antioxidant genes; Gene Expression; Post-transcriptional Effects.]
Cellular Redox Homeostasis
• Maintained enzymatically and by antioxidant pools:
– Glutathione, ascorbate (soluble)
– Alpha-tocopherol, carotenoids (membrane)
• Antioxidant pools increase with stress.
• Protein methionine sulfoxidation is an
additional antioxidant reservoir.
• Molecular chaperones (heat shock proteins) act
as repair mechanism.
ROS Arise as a Result of
Exposure to:
• Ozone
• Sulfur dioxide
• High light
• Paraquat
• Extremes of temperature
• Salinity
• Drought
Plant-Environment
Interactions
• Several defense systems that respond to
environmental stress are known.
• Their relative importance is not known.
• Mechanistic details are not known. Redox
sensing may be involved.
A Basis for Cellular Responses to
ROS
[Figure: signaling scheme. Panel labels: Thiol Redox Control; Stress; Defense; Redox Regulation of Gene Expression.]
Environmental Stress
• Prooxidants (ROS):
– O2·
– H2O2
– NO·
• Membrane receptors (oxylipins)
• Protein kinases; phosphoprotein phosphatases
• Transcription factors (redox-sensitive?)
• Gene expression
• Cellular response:
– Defense processes
– Repair processes
– Adaptation
Cellular Defense Response
• Antioxidants:
– Trx-(SH)2/Trx-S2
– 2 GSH/GSSG
– Grx-(SH)2/Grx-S2
– Met/MetO
– Asc/DHAsc
ROS Scavenging in Plastids
Stress Resistance — Short
Term “Emergency”
• Accumulated evidence suggests that successful
resistance to imposed stress consists of mobilizing
the cellular defense machinery. (The evidence comes
from short-term exposures to oxidative stress
conditions in a number of crop species, and
cultivars within species.)
• Activation of defense genes, such as SOD,
glutathione reductase
• Stimulation of antioxidant biosynthetic
pathways, such as glutathione
Differential Response of Plastid SOD to
Sulfur Dioxide in Two Cultivars of Pea
Exposure to sulfur dioxide in resistant (Progress) and sensitive
(Nugget) cultivars of pea resulted in increases in plastid Cu-Zn
SOD mRNA and protein only in the resistant cultivar. The kinetics of
the increase correlate with the recovery of photosynthesis in cv. Progress.
Stress Resistance — Long Term
Adaptation to Harsh Environmental
Conditions
Less data available than for emergency responses. But
overlap with emergency processes?
Candidates include:
• Low temperatures: glutathione-associated processes,
cryoprotective proteins and oligosaccharides
• High temperature: heat shock proteins
• Drought: water channel proteins (aquaporins),
dehydrins
Season-Specific Isoforms of
Glutathione Reductase in Spruce
Winter and summer specific isoforms of glutathione
reductase exist in red spruce. The appearance of the
winter specific form correlates with the onset of
hardening.
Glutathione Reductase Genes (GR1)
Glutathione Reductase Genes (GR2)
Candidate Resistance
Mechanisms
• In the past, candidate mechanisms were
examined known gene by known gene,
process by process.
• Microarray Technology
– Simultaneous examination of groups of
candidate genes and associated interactions
– Possible discovery of new defense
mechanisms
[Figure: microarray schematic. Labels: Treatment; Control; Mix; Hybridization; Detection; Relative Abundance; Spots 1, 2, 3 (sequences affixed to slide).]
Iterative strategy for detection of genetic
interactions using microarrays
[Figure: cycle centered on Genetic Regulatory Networks.]
1. Detection of gene expression effects on microarrays
2. Characterize gene function
3. Identify mutants
4. Test mutant phenotypes
Long Term Goal
• Precedent I: Plants adapt to adverse
environmental conditions via a global
cellular response involving changes in the
expression patterns of numerous genes.
• Precedent II: To study these changes, the
Expresso team uses bioinformatics and
experimental techniques.
• Long term goal: To identify and improve
emergency and long term adaptational
stress response mechanisms in crop species.
Expresso: A Problem Solving
Environment for Microarray
Experiment Design and Analysis
• Integration of design and procedures
• Integration of image analysis tools and
statistical analysis
• Connections to web databases and sequence
alignment tools
• The software Aleph was used for inductive logic
programming (ILP).
Who’s Who
Plant Biology (Virginia Tech)
• Ruth Alscher: Plant Stress
• Boris Chevone: Plant Stress
• Dawei Chen: Molecular Biology, Bioinformatics
Forest Biotechnology (North Carolina State Univ.)
• Ron Sederoff, Ross Whetten, Len van Zyl, Y-H. Sun
Computer Science (Virginia Tech)
• Lenwood Heath (CS): Algorithms
• Naren Ramakrishnan (CS): Data Mining, Problem Solving Environments
• Craig Struble, Vincent Jouenne (CS): Image Analysis
Statistics (Virginia Tech)
• Ina Hoeschele (DS): Statistical Genetics
• Keying Ye (STAT): Bayesian Statistics
Expresso
People
Ron Sederoff
Lenny Heath
Craig Struble
Ruth Alscher
Ross Whetten
Keying Ye
Boris Chevone
Y-H. Sun
Len van Zyl
Dawei Chen
Vincent Jouenne
Naren Ramakrishnan
The 1999 Experiment: A Measure
of Long Term Adaptation to
Drought Stress
• Loblolly pine seedlings (two unrelated genotypes “C” and
“D”) were subjected to mild or severe drought stress for four
(mild) or three (severe) cycles.
– Mild stress: needles dried down to –10 bars; little effect
on growth, new flushes as in control trees.
– Severe stress: needles dried down to –17 bars; growth
retardation, fewer new flushes compared to controls.
• Harvest RNA at the end of growing season, determine
patterns of gene expression on DNA microarrays.
• With algorithms incorporated into Expresso, identify genes
and groups of genes involved in stress responses.
Scenarios for Effects of Specific
Stresses on Gene Expression
Hypotheses
• There is a group of genes whose expression
confers resistance to drought stress.
• Expression of this group of genes is lower under
severe than under mild stress.
• Individual members of gene families show
distinct responses to drought stress.
Selection of cDNAs for Arrays
• 384 ESTs (xylem, shoot tip cDNAs of loblolly)
were chosen on the basis of function and
grouped into categories.
• Major emphasis was on processes known to be
stress responsive.
• In cases where more than one EST had similar
BLAST hits, all ESTs were used.
Categories within Protective
and Protected Processes
[Figure: functional categories arranged around two groups, Protective Processes and Protected Processes. Labels: Gene Expression; Signal Transduction; Protease-associated; ROS and Stress; Environmental Change; Nucleus; Cell Wall Related; Trafficking; Phenylpropanoid Pathway; Development; Secretion; Cells; Cytoskeleton; Tissues; Plant Growth Regulation; Chloroplast Associated; Metabolism; Carbon Metabolism; Respiration and Nucleic Acids; Mitochondrion.]
A Note about Categories
Categories are not mutually exclusive; a gene
may be assigned to more than one category. For
example, heat shock proteins have been
grouped under these different categories and
subcategories:
– Abiotic stress – heat
– Gene expression – post-translational
processing – chaperones
– Abiotic stress – chaperones
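Because categories overlap, the clone-to-category relation is many-to-many. The following is a minimal sketch of one way to hold such assignments so they can be queried in both directions. The clone names and the category path strings are hypothetical placeholders, not identifiers from the Expresso database.

```python
from collections import defaultdict

# Hypothetical clone-to-category assignments; a clone may appear several times.
assignments = [
    ("hsp70_homolog", "abiotic stress / heat"),
    ("hsp70_homolog", "gene expression / post-translational processing / chaperones"),
    ("hsp70_homolog", "abiotic stress / chaperones"),
    ("aquaporin_homolog", "abiotic stress / drought"),
]


def build_indexes(pairs):
    """Index (clone, category) pairs both by clone and by category."""
    by_clone = defaultdict(set)
    by_category = defaultdict(set)
    for clone, category in pairs:
        by_clone[clone].add(category)
        by_category[category].add(clone)
    return by_clone, by_category


clone_index, category_index = build_indexes(assignments)
```

Looking up `clone_index["hsp70_homolog"]` returns all three categories, so a later analysis step can count the same clone under each of them.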
Categories within “Protective Processes”
[Figure: category tree. Labels, grouped as far as recoverable:]
• Stress
– Abiotic: Drought (dehydrins, aquaporins); Heat (heat shock proteins, chaperones); Xenobiotics (GSTs); Chaperones
– Biotic: “Isoflavone reductases”
• Non-Plant
• Antioxidant Processes
– NADPH/ascorbate/glutathione scavenging pathway: cytosolic ascorbate peroxidase, superoxide dismutase-Fe, superoxide dismutase-Cu-Zn, glutathione reductase
• Cell Wall Related
– Sucrose metabolism, cellulose, arabinogalactan proteins, extensins and proline rich proteins, hemicellulose, pectins, xylose, other cell wall proteins
• Phenylpropanoid Pathway
– Lignin biosynthesis: isoflavone reductases, phenylalanine ammonia-lyases, S-adenosylmethionine decarboxylases, glycine hydroxymethyltransferases, 4-coumarate-CoA ligases, CCoAOMTs, cinnamyl-alcohol dehydrogenase
Quality Control
• Positive: LP-3, a loblolly pine gene known to
respond positively to drought stress, was included.
• LP-3 was positive in the moist versus mild
comparison, and unchanged in the moist versus
severe comparison.
• Negative: Four clones of human genes used as
negative controls in the Arabidopsis Functional
Genomics project were included. The clones did
not respond.
Categories that contained positives in genotypes C and D
(Control versus Mild)
Data from two slides (4 arrays) for C and two slides (4 arrays)
for D were collected.
[Figure: the “Protective Processes” category tree: Stress (abiotic, biotic), Antioxidant Processes, Cell Wall Related, Phenylpropanoid Pathway.]
Hypotheses versus Results
• Among the genes responding to mild stress, there
exists a population of genes whose expression confers
resistance.
– Genes in 69 categories responded positively to mild stress in
Genotypes C and D (the positive response was not observed
in the severe stress condition in Genotype D).
• There is evidence for a response to drought among
genes associated with other stresses.
– Isoflavone reductase homologs and GSTs responded
positively to mild drought stress.
– These categories are previously documented to respond to
biotic stress and xenobiotics, respectively.
Relationships among HSP Homologs
In control versus mild stress,
HSP 100, 70, and 23 responded in C and D;
HSP 80s did not respond in either C or D.
Candidate Categories — Long-term
Adaptation to Drought Stress
• Include
– Aquaporins
– Dehydrins
– Heat shock proteins/chaperones
• Exclude
– Isoflavone reductases
Design of Microarrays
• Clones on the drought-stress microarrays were replicated and
randomly placed
• Experiment involved 384 archived pine ESTs
• Organized into 4 microtitre source plates after PCR
• Pipetted into 8 sets of 4 microtitre plates each
• Each set a different random arrangement of 384 ESTs
• Printed type A microarrays from first 4 sets
• Printed type B microarrays from second 4 sets
• Each array has 4 randomly placed replicates of each EST
• Each control versus stress comparison was done on 4 arrays — A
and B; flip dyes; A and B
• Total of 16 replicates of each EST in each comparison
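The replicate arithmetic above can be checked with a short simulation. This is only a sketch of the layout logic under stated assumptions: ESTs are represented by integer indices, each plate set is one independent shuffle, and the fixed seed and function names are illustrative choices rather than part of the original protocol.

```python
import random
from collections import Counter

N_ESTS = 384  # archived pine ESTs in the experiment


def random_sets(n_sets, n_ests, rng):
    """Return n_sets independent random orderings of EST indices 0..n_ests-1."""
    sets = []
    for _ in range(n_sets):
        order = list(range(n_ests))
        rng.shuffle(order)
        sets.append(order)
    return sets


def build_array(plate_sets):
    """Concatenate plate sets into one array layout, one spot per well."""
    return [est for plate_set in plate_sets for est in plate_set]


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
all_sets = random_sets(8, N_ESTS, rng)
array_a = build_array(all_sets[:4])  # type A: first 4 sets
array_b = build_array(all_sets[4:])  # type B: second 4 sets

# One comparison uses A and B, then A and B again with the dyes flipped.
comparison = [array_a, array_b, array_a, array_b]

replicates = Counter()
for layout in comparison:
    replicates.update(layout)
```

Each array holds 4 x 384 = 1536 spots, and every EST index ends up with a count of 16 in `replicates`, matching the total stated on the slide.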
Spot and Clone Analysis
• Image Analysis: gridding, spot identification,
intensity and background calculation,
normalization
• Statistics:
– Fold or ratio estimation
– Combining replicates
• Higher-level Analysis:
– Clustering methods
– Inductive logic programming (ILP)
Image Analysis
Microarray Suite:
• Manual gridding
• Extract two intensities for each spot
• Compute ratios
• Compute calibrated ratios
Our tools use the logarithm of the calibrated ratios
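The slides do not say how Microarray Suite calibrates ratios, so the sketch below substitutes a simple stand-in: dividing every ratio by the array's median ratio before taking the logarithm. The median centering, the base-2 logarithm, and the function name are all assumptions made for illustration.

```python
import math
from statistics import median


def log_calibrated_ratios(treated, control):
    """Return log2 ratios after dividing by the array's median ratio.

    Median centering is an assumed stand-in for the calibration step,
    not the method used by Microarray Suite.
    """
    ratios = [t / c for t, c in zip(treated, control)]
    center = median(ratios)
    return [math.log2(r / center) for r in ratios]


# Hypothetical background-corrected intensities for three spots.
logs = log_calibrated_ratios([200.0, 100.0, 50.0], [100.0, 100.0, 100.0])
```

On a log scale, a two-fold increase and a two-fold decrease are symmetric around zero, which is what makes the sign-based analysis of replicates possible.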
Computational and Statistical Analysis
• The multiple (typically 16) log calibrated ratios for a
replicated clone do NOT follow a normal distribution.
• We assume a zero-centered distribution for log ratios.
• The number of positive (or negative) log ratios follows a
binomial distribution with parameters 16 and 0.5.
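• Under this null, 12 or more positive log ratios out of 16 occur
with probability about 0.04, so a clone with 12 or more positive
log ratios is called up-expressed with a confidence of 0.96.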
• We classify each EST response as one of
– Up-regulated
– Down-regulated
– No clear change
• This three-way classification provides sufficient input for
inductive logic programming (ILP).
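The 0.96 figure follows directly from the binomial tail. The sketch below reproduces it and applies the three-way classification. Using the same threshold of 12 for down-regulation is an assumption (the slide states it only for up-expression), and the function names are illustrative.

```python
from math import comb


def upper_tail(k, n=16, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def classify(log_ratios, threshold=12):
    """Classify a clone from the signs of its replicate log ratios."""
    positives = sum(1 for x in log_ratios if x > 0)
    negatives = sum(1 for x in log_ratios if x < 0)
    if positives >= threshold:
        return "up-regulated"
    if negatives >= threshold:  # symmetric threshold is an assumption
        return "down-regulated"
    return "no clear change"


tail = upper_tail(12)  # (1820 + 560 + 120 + 16 + 1) / 65536
confidence = 1 - tail  # about 0.96
```

Because only the signs of the log ratios are used, the test does not depend on the replicates being normally distributed.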
Related Statistical Results
• Chen et al. (J. Biomed. Optics 2, 1997, 364-374)
– Assume a normal distribution and normalize ratios
– No replicates
– Estimate a confidence interval for ratios that applies to
each spot
• Lee et al. (PNAS 97, August 29, 2000, 9834-9) emphasize need
for replication
• Black and Doerge (PNAS, to appear)
– Investigate distributional assumptions of log-normal and
gamma distributions on intensities
– Determine the number of replicates needed for a
particular confidence level under each distribution
– Assume normalization has been done and location-dependent error has been eliminated.
Further Analysis:
Inductive Logic Programming
• ILP is a data mining algorithm expressly designed
for inferring relationships.
• By expressing relationships as rules, it provides new
information and resultant testable hypotheses.
• ILP groups related data and chooses in favor of
relationships having short descriptions.
• ILP can also flexibly incorporate a priori biological
knowledge (e.g., categories and alternate
classifications).
Rule Inference in ILP
• Infers rules relating gene expression levels to categories, both
within a probe pair and across probe pairs, without explicit
direction
• Example Rule:
[Rule 142] [Pos cover = 69 Neg cover = 3]
level(A,moist_vs_severe,not positive) :-
    level(A,moist_vs_mild,positive).
• Interpretation:
“If the moist versus mild stress comparison was positive for
some clone named A, it was negative or unchanged in the
moist versus severe comparison for A, with a confidence of
95.8%.”
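The quoted confidence is consistent with the rule's coverage counts: 69 clones satisfy the rule and 3 contradict it. A minimal sketch of that arithmetic follows, assuming confidence is defined as positive cover divided by total cover; the function name is illustrative.

```python
def rule_confidence(pos_cover, neg_cover):
    """Return the fraction of covered examples that agree with the rule."""
    return pos_cover / (pos_cover + neg_cover)


# Coverage reported for Rule 142.
confidence = rule_confidence(69, 3)  # 69 / 72
percent = round(confidence * 100, 1)
```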
More Rules We Obtained
• [Rule 6]
level(A,moist_vs_mild,positive) :-
    category(A, transport_protein).
level(A,mild_vs_severe,negative) :-
    category(A, transport_protein).
• [Rule 13]
level(A,moist_vs_mild,positive) :-
    category(A, heat).
• [Rule 17]
level(A,moist_vs_mild,positive) :-
    category(A, cellwallrelated).
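Rules of this shape can be scored by counting how many clones in a category do or do not show the stated level. The sketch below does that counting on toy data. It is not how Aleph searches for rules, and the clone names and levels are hypothetical, not results from the experiment.

```python
# Hypothetical toy data: clone -> (categories, level in moist_vs_mild).
clones = {
    "c1": ({"heat"}, "positive"),
    "c2": ({"heat", "chaperones"}, "positive"),
    "c3": ({"cellwallrelated"}, "positive"),
    "c4": ({"cellwallrelated"}, "unchanged"),
    "c5": ({"transport_protein"}, "positive"),
}


def cover(category, level="positive"):
    """Count clones in `category` that show `level` (pos) or do not (neg)."""
    pos = neg = 0
    for categories, observed in clones.values():
        if category in categories:
            if observed == level:
                pos += 1
            else:
                neg += 1
    return pos, neg


heat_cover = cover("heat")
wall_cover = cover("cellwallrelated")
```

A rule with high positive cover and low negative cover, like the toy `heat` case here, is the kind an ILP system would retain as a testable hypothesis.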
ILP Subsumes Two Forms of
Reasoning
• Unsupervised learning
– “Find clusters of genes that have similar/consistent
expression patterns”
• Supervised learning
– “Given several patterns of gene expression for two
conditions, give an equation that distinguishes the
patterns for each condition”
• Hybrid reasoning
– “Is there a relationship between genes in a given
functional category and genes in a particular
expression cluster?”
– ILP mines this information in a single step
Current Status of Expresso
• Completely automated and integrated
– Statistical analysis
– Data mining
– Experiment capture in MEL
• Current Work: Integrating
– Image processing
– Querying by semi-structured views
– Expresso-assisted experiment composition
Future Directions
Next Generation Stress Chips
1. Time course, short and long term, to capture
gene expression events underlying
“emergency” and adaptive events following
drought stress imposition.
(Use all available ESTs for candidate stress
resistance genes.)
2. Generate cDNA library from stressed seedlings.
3. Initiate modeling of kinetics of drought stress
responses.
Gene Expression Events Associated with
Extreme Environmental Conditions
• Hypothesis 1: Specific genes that confer ability to
adapt to extreme conditions are expressed in Andean
potato varieties and in other root and tuber crops of
the region.
• Hypothesis 2: The adaptive genes act either
individually or in co-adaptive groups.
• Hypothesis 3: The adaptive genes are also expressed
in temperate zone varieties or they are specific to
extreme conditions.
• Proposed approach: Use of microarrays as a tool for
discovery of these adaptive genes.
Successful Emergency Responses Versus
Adaptation to Diurnal Variation
• Cultivar differences with respect to
degree/rapidity of gene expression
• Cultivar differences with respect to rate
of synthesis of antioxidant(s)
• Our question: Are the genes that respond
in the short term the same ones that
confer stress resistance diurnally?
Relevant Data
• Similarities among orthologous genes are
sufficiently close that cross-hybridization occurs on
microarrays between species (R. R. Sederoff,
personal communication).
• Although the above confounds treatment responses
shown by individual members of multi-gene families,
it allows use of a chip based on one species in interspecific hybridizations.
Collaborating with CIP
Suggested Strategy I
• Identify patterns of gene expression in S.
tuberosum associated with successful adaptation to
stress (temperature extremes, with or without
drought?)
• Construct two or more cDNA libraries from
adapted potato
• Design “potato stress chips” including the stress
cDNA library and known stress-responsive genes of
S. tuberosum (SolGenes resource)
Suggested Strategy II
An Experimental Approach
• Identify variety-specific diurnal gene
expression patterns in Andean potato
varieties using potato stress chip.
• Construct arrays of cDNAs from
Andean varieties. Compare mRNA
populations isolated during adaptation
of temperate zone potatoes with RNA
obtained from Andean varieties.
Iterative strategy for detection of genetic
interactions using microarrays
[Figure: cycle centered on Genetic Regulatory Networks.]
1. Detection of gene expression effects on microarrays
2. Characterize gene function
3. Identify mutants
4. Test mutant phenotypes
Iterative strategy for detection of
genetic interactions using microarrays
and CS expertise
[Figure: cycle centered on Genetic Regulatory Networks.]
1. Detection of stress-mediated gene expression effects on microarrays
2. Computational tools to infer interaction among genes, pathways
3. Test inferences with mutants/varying conditions
4. Revised / new tools and experiments