Gene expression
Guy Nimrod
Microarrays
• Microarray technology aims to
measure the gene expression
profile of a cell.
• This is done by
measuring the mRNA
levels of different genes
in the cell.
• The method can be
applied to thousands of
genes, and to complete
genomes, simultaneously.
DNA chips
• DNA chips are arrays of different DNA
fragments attached at specific locations
on glass slides at very high density.
• Fragments at each specific location are
usually designed as complementary to
part of the mRNA (or its cDNA) of a
certain gene.
• The use of the DNA chips is based on
hybridization between the fragments
attached to the glass and the mRNA (or
its cDNA) from the query organism cells.
The method
[Figure, panels A and B: reverse transcription, then hybridization (actual strand ~25 b)]
Disadvantages:
• mRNA levels do not necessarily reflect the levels
of the proteins.
– Different proteins have different half-lives.
– Regulation at the protein level.
• Potential noise, e.g.:
– Imperfect hybridization and paralogs.
– Alternative splicing.
• Measurements are relative to a control
specimen.
Applications
• Analysis and characterization of:
– The cell’s response to different conditions.
– Cell cycle-regulated genes.
– Differences in expression profile between tissues
of the organism.
– Sources of diseases and their implications.
– Mode of action of drugs.
Data analysis
• A basis for organizing gene expression
data is to group together genes with similar
patterns of expression.
• Define similarity between expression profiles, e.g.:
– Euclidean distance
– Correlation coefficient
(The data are usually log-transformed.)
• Cluster the data. This can be done by
supervised or unsupervised methods.
[Figure: expression matrix, axes: Genes, Experiments]
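A minimal sketch of the two similarity measures named above, applied to log-transformed ratios. The function names and the intensity values are made up for illustration, and base 2 is an assumption (the slide only says the data are log-transformed).

```python
import math

def log2_ratios(sample, control):
    """Log2 of sample/control intensity for each experiment (assumed base 2)."""
    return [math.log2(s / c) for s, c in zip(sample, control)]

def euclidean_distance(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical intensities for two genes over three experiments.
gene_a = log2_ratios([200, 400, 800], [100, 100, 100])    # [1.0, 2.0, 3.0]
gene_b = log2_ratios([400, 800, 1600], [100, 100, 100])   # [2.0, 3.0, 4.0]

print(euclidean_distance(gene_a, gene_b))   # about 1.73
print(pearson_correlation(gene_a, gene_b))  # 1.0 up to rounding
```

The two genes differ in magnitude but have the same shape of response, so the correlation coefficient treats them as identical while the Euclidean distance does not. Which behaviour is wanted depends on the question being asked.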
Hierarchical clustering
• Compute the distance between each pair of
genes. Each gene is considered a node
with a weight of one unit.
• Find the most similar pair of nodes and join
them into one node whose expression profile is
the average of the two. The weight of the new
node is the sum of the weights of its
components.
• Compute the distances from the new node to
all other nodes in the list. (Discard the nodes
that compose the new one.)
• Repeat until all nodes are joined into a single tree.
• There are 2^(n-1) linear orderings consistent with
the structure of the tree. The ordering is
usually chosen according to some weight function
(e.g., time of maximal induction).
[Figure: clustered expression matrix, axes: Genes, Experiments]
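The joining procedure of hierarchical clustering can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the studies: Euclidean distance is assumed as the similarity measure, the "average" of two nodes is assumed to be weighted by the node weights, and the gene names and values are hypothetical.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def hierarchical_cluster(profiles):
    """Join the closest pair of nodes repeatedly; return the tree as nested tuples.

    profiles: dict mapping gene name -> list of expression values.
    """
    # Each node is (tree, profile, weight); every gene starts with weight 1.
    nodes = [(name, list(p), 1) for name, p in profiles.items()]
    while len(nodes) > 1:
        # Find the most similar (closest) pair of nodes.
        i, j = min(
            ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))),
            key=lambda ab: euclidean(nodes[ab[0]][1], nodes[ab[1]][1]),
        )
        (tree_i, prof_i, w_i), (tree_j, prof_j, w_j) = nodes[i], nodes[j]
        # Assumption: the joined profile is the weight-weighted average.
        merged_profile = [
            (w_i * a + w_j * b) / (w_i + w_j) for a, b in zip(prof_i, prof_j)
        ]
        merged = ((tree_i, tree_j), merged_profile, w_i + w_j)
        # Discard the two component nodes and add the joined node.
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][0]

# Hypothetical expression profiles for four genes in two experiments.
tree = hierarchical_cluster({
    "g1": [0.0, 0.0],
    "g2": [0.0, 1.0],
    "g3": [5.0, 5.0],
    "g4": [5.0, 6.0],
})
print(tree)  # (('g1', 'g2'), ('g3', 'g4'))
```

Recomputing all pairwise distances at every step keeps the sketch short; for thousands of genes a cached distance matrix would be the more practical choice.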
K-means
Objective: divide the objects into K clusters such that some metric (e.g.,
variance) relative to the centroids of the clusters is minimized.
Example of a simple version of K-means (assume K=3):
1. Place K points into the space. These points represent the initial group
centroids.
2. Assign each object to the group that has the closest centroid.
3. Recalculate the positions of the K centroids.
4. Repeat steps 2 and 3 until the centroids no longer move (or change by less
than a certain cutoff).
• K must be chosen in advance.
• A global minimum is not guaranteed (because the assignments are discrete, the
result is not even necessarily a local minimum). The result depends on the starting point.
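The four steps above can be sketched as follows. This is a simple illustration with K=3, assuming Euclidean distance and caller-supplied initial centroids; the function name, the `tol` cutoff and the data points are hypothetical.

```python
import math

def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Simple K-means; `centroids` holds the K initial centroids (step 1)."""
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        # Step 2: assign each object to the group with the closest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Step 3: recalculate the position of each centroid.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        # Step 4: stop once no centroid moves by more than the cutoff.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids

# Hypothetical data: three well-separated pairs of points, K=3.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels, centroids = kmeans(points, centroids=[(0, 0), (5, 5), (10, 0)])
print(labels)     # [0, 0, 1, 1, 2, 2]
print(centroids)  # [[0.0, 0.5], [5.0, 5.5], [10.0, 0.5]]
```

Because the outcome depends on the starting centroids, a common practice is to run the procedure from several starting points and keep the result with the lowest within-cluster variance.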
Example: K-means
91 clusters, 91 centroids (Gasch et al., 2002)
2. Amino-acid biosynthesis.
7. Genes induced as part of the environmental stress response.
14. Mitochondrial protein synthesis.
39. Genes involved in nitrogen utilization.
45. Oxidative phosphorylation and respiration components.
53. Specific amino-acid transporters.
67. Glycolysis genes.
72. Secretion, protein synthesis, and membrane synthesis genes.
73. Genes repressed as part of the environmental stress response.
80. Amino-acid biosynthesis genes.
86. Histone genes.
Response of yeast cells to
environmental changes*
• Cells require specific internal
conditions for optimal growth.
• Unicellular organisms such as
yeast (S. cerevisiae) have
evolved mechanisms for
adapting to drastic
environmental changes.
• The following research
explores the genome-wide
expression pattern of yeast
in response to diverse
environmental transitions.
* Gasch et al., 2000
Methods
• Yeast:
– Unicellular organism that requires rapid recovery
and adjustment to new surroundings.
– ‘Whole genome’ microarrays are available, each
containing ~6200 known/predicted genes.
– One of the most researched organisms, with
many annotated genes.
Methods
• The expression pattern of the genes was
examined in response to a variety of extreme
environments, e.g.:
– Heat shock
– Amino-acid starvation
– Nitrogen depletion
– Hyper-osmotic shock
– Progression into stationary phase
• Expression was measured relative to an unstressed
culture or to the beginning of the experiment.
Results:
Hierarchical clustering
• Two major clusters
(F&P) showed
reciprocal but nearly
identical profiles.
• These ~900 genes (15%)
responded to almost all of
the stress conditions
examined: the environmental
stress response (ESR).
• Other clusters
contain genes that
respond to specific
extreme conditions.
The Environmental
Stress Response (ESR)
• ~600 repressed genes
– Growth-related processes
– Nucleotide biosynthesis
– Ribosomal genes
• These genes seem to
be coregulated, and
promoter analysis
revealed two novel,
conserved motifs
upstream of the genes.
The Environmental
Stress Response (ESR)
• ~300 induced genes (60%
uncharacterized)
– Carbohydrate metabolism
– Detoxification of reactive
oxygen
– DNA damage repair
– Metabolite transport
– Intracellular signaling
• Many of these genes have
previously been proposed to
function in protecting the cell
from stress.
Regulation of the genes induced in
the ESR
• A set of ~50 genes induced by a variety of stress
conditions through a stress response element
(STRE) was previously known. The STRE is recognized
by the transcription factors Msn2p and Msn4p.
• Half of those genes are induced in the identified
ESR.
• Sub-clusters within the induced ESR genes
suggest differences in the regulation of those
genes.
Genes dependent on Msn2/Msn4p
[Figure: panels for H2O2 and heat shock]
• A: Partially dependent on
Msn2/Msn4p in response to
both stresses.
• B: Largely dependent on
Msn2/Msn4p in response to
both stresses.
• C: Dependent on
Msn2/Msn4p in response to
heat shock.
• A substantial fraction of the
ESR genes was unaffected by
overexpression or deletion of
Msn2/Msn4p.
Course of the reaction
• ESR genes responded
immediately with large
changes.
• Over time, however, a new
steady state of transcript
levels is reached that differs
only slightly from the
initial steady state.
– Maintaining new levels?
– Partial recovery from the
stress?
• The duration and amplitude of
the transient changes varied
with the magnitude of the
environmental change.
Isozymes
• Isozymes are enzymes
with similar structure
that catalyze the same
reaction.
• Analysis showed
differential expression of
some isozymes.
– Different properties of the
isozymes (localization,
affinity, substrate specificity,
etc.)
– Divergence of regulation
[Figure labels: 74% identity, 78% identity]
Reciprocal metabolic roles
• Among the genes induced in the ESR were
many whose products play reciprocal metabolic
roles.
– E.g., enzymes that synthesize glycogen and its
precursors, as well as catabolic enzymes that
degrade glycogen.
• The activity of many of these enzymes is
controlled at the post-translational level.
• Inducing the enzymes for both directions enhances the
cell’s ability to rapidly manage osmotic instability
and energy reserves.
What triggers ESR?
• Hypothesis: the ESR is initiated
in response to any extreme
change in the cell’s
environment.
• 25°C to 37°C: massive and
transient changes in ESR
expression.
• 37°C to 25°C: reciprocal
response, a simple transition
to the gene expression
program characteristic of
steady-state growth at
25°C.
• The ESR seems to respond
specifically to changes that
increase environmental stress.
Identification of cell cycle-regulated
genes in Yeast*
• Cell cycle- The sequence of events
from one division of a cell to the
next.
• Cyclins- proteins that control the cell
cycle.
• CDKs- cyclin-dependent protein
kinases.
• G1 - growth and preparation of the
chromosomes for replication.
• S - synthesis of DNA
• G2 - preparation for mitosis.
• M - mitosis
• G0 - cell leaves the cell cycle,
temporarily or permanently.
* Spellman et al., 1998
Methods:
• In an untreated culture, the cells are at
various stages of the cell cycle.
• The experiments tracked cell cultures
synchronized by three different methods
(another set of experiments was taken from Cho et
al., 1998).
– Applying different, independent methods was essential
to reduce artifacts characteristic of any single
method.
– Cultures were considered synchronized for the
2-3 cycles following synchronization.
• An unsynchronized culture was used as a control.
Extracting cell cycle-regulated genes
• Two factors were used to score the periodicity of each
gene:
– A measure of the gene's periodicity compared with the
period of the yeast cell cycle (~80 min).
– A measure of the correlation between the gene and each of
five different profiles, each representing a gene known to be
expressed at a certain stage.
• The 800 genes with the highest combined score were
chosen. This cutoff was set to:
– Maximize the number of known cell cycle-regulated genes in the
list (95 of the 104 known at that time).
– Minimize false positives (~3% false positives were measured in
random data).
– It remains somewhat arbitrary.
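A rough sketch of how the two scoring factors could be computed for a single gene. This is not the scoring used by Spellman et al.: the Fourier-style projection, the use of the best correlation, and combining the two by a plain sum are all assumptions made for illustration, as are the function names, the sampling times and the reference profiles.

```python
import math

def fourier_score(times, values, period=80.0):
    """Strength of the profile's component at the given period (minutes)."""
    mean = sum(values) / len(values)
    omega = 2 * math.pi / period
    a = sum((v - mean) * math.sin(omega * t) for t, v in zip(times, values))
    b = sum((v - mean) * math.cos(omega * t) for t, v in zip(times, values))
    return math.hypot(a, b) / len(values)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def combined_score(times, values, reference_profiles, period=80.0):
    """Hypothetical combination: periodicity plus best correlation to a reference."""
    best_corr = max(pearson(values, ref) for ref in reference_profiles)
    return fourier_score(times, values, period) + best_corr

# Hypothetical time course: 16 samples, 10 minutes apart (two 80-minute cycles).
times = [10.0 * k for k in range(16)]
references = [
    [math.sin(2 * math.pi * t / 80.0) for t in times],  # stand-in stage profile
    [math.cos(2 * math.pi * t / 80.0) for t in times],  # stand-in stage profile
]
periodic_gene = [math.sin(2 * math.pi * t / 80.0) for t in times]
trend_gene = [t / 150.0 for t in times]  # steady increase, not periodic

print(combined_score(times, periodic_gene, references))
print(combined_score(times, trend_gene, references))
```

Ranking all genes by such a score and keeping the top 800 would mirror the selection described above; the cutoff itself has to be chosen by other criteria, such as recovery of known cell cycle genes.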
Example of periodic and nonperiodic genes
Results:
• The periodic genes were ordered by the
time at which they reach peak
expression.
– G1: 300 genes
– S: 71 genes
– G2: 121 genes
– M: 195 genes
– M/G1: 113 genes
• Most genes are involved in cell cycle control, DNA
replication, DNA repair, budding,
nuclear division, glycosylation,
mitosis, etc.
• Many genes needed for replication
and repair reach peak expression
just before they are needed.
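Ordering genes by the time of peak expression is a simple sort. A minimal sketch, with hypothetical gene names, time points and values:

```python
def order_by_peak_time(times, profiles):
    """Return gene names sorted by the time at which expression peaks."""
    def peak_time(name):
        values = profiles[name]
        return times[values.index(max(values))]
    return sorted(profiles, key=peak_time)

# Hypothetical profiles sampled at four time points (minutes).
times = [0, 10, 20, 30]
profiles = {
    "late": [0.0, 1.0, 2.0, 9.0],
    "early": [9.0, 1.0, 0.0, 0.0],
    "mid": [0.0, 8.0, 3.0, 1.0],
}
print(order_by_peak_time(times, profiles))  # ['early', 'mid', 'late']
```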
Hierarchical clustering
• Many known groups of genes were clustered together.
• The histone genes formed the tightest cluster, with a
very high peak in S phase.
• The histones have three known modes of regulation:
– Repressing elements
– Activating transcription
– Destabilization of the mRNA
• Cdc28 seems to cause some artifacts in the
expression pattern here.
Gene expression
• Microarray technology, together with bioinformatics
methods and the sequencing of complete genomes,
provides a revolutionary new view of processes in
the cell.
• The studies presented here demonstrate the
ability to:
– Discover genes with a certain pattern of regulation.
– Suggest functions for unannotated genes.
– Refine the characterization of regulatory elements.
– Propose new regulatory elements.
– Better understand pathways in the cell.
– And many others…