Chapter 22: Genomics II
• Functional genomics: studying genes in groups, with respect to the cell, tissue, signaling pathway, or organism
• Proteomics: understanding the interplay among many different proteins (cellular processes and organism-level traits)
• Bioinformatics: using computers, math, and statistics to understand genome and proteome information (record, store, analyze, predict)
Microarrays for studying gene expression or resequencing
Figure 22.1: A portion of a DNA microarray (spots labeled A to F)
• Start with a mixture of 3 different types of mRNA.
• Add reverse transcriptase, poly-dT primers that anneal to the mRNAs, and fluorescent nucleotides. Note: Only 1 complementary cDNA strand is made.
• The result is fluorescently labeled cDNA that is complementary to the mRNA.
• Hybridize cDNAs to the microarray.
• View with a laser scanner.
Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display
Modern-day “Southerns” and “Northerns”: microarray analysis
Two distinct forms of diffuse large B-cell lymphoma are shown by the expression pattern: GC B-like DLBCL (orange) and activated B-like DLBCL (blue). The GC B-like form has significantly better overall survival.
Alizadeh et al. 2000, Nature 403, 503-511 (3 February 2000)
Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling
Observation/problem
• Diffuse large B-cell lymphoma (DLBCL), the most common subtype of non-Hodgkin's lymphoma, is clinically heterogeneous: 40% of patients respond well to current therapy and have prolonged survival, whereas the remainder succumb to the disease.
Hypothesis
• Variability in natural history reflects unrecognized molecular heterogeneity in the tumours.
Experiment
• DNA microarrays were used for a systematic characterization of gene expression in B-cell malignancies.
Results
• Diversity in gene expression among the tumours of DLBCL patients (reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour).
• Identified two molecularly distinct forms of DLBCL which had gene expression patterns indicative of different stages of B-cell differentiation.
– One type expressed genes characteristic of germinal centre B cells ('germinal centre B-like DLBCL');
– the second type expressed genes normally induced during in vitro activation of peripheral blood B cells ('activated B-like DLBCL').
• Patients with germinal centre B-like DLBCL had a significantly better overall survival than those with activated B-like DLBCL.
Conclusion
• Molecular classification of tumours on the basis of gene expression can thus identify previously undetected and clinically significant subtypes of cancer.
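Grouping tumours by expression pattern can be sketched in code. The slides do not say which algorithm was used, so the example below assumes one common choice: average-linkage hierarchical clustering on a distance of 1 minus the Pearson correlation. The sample names and expression values are hypothetical toy data, not values from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def cluster(profiles, n_groups=2):
    """Average-linkage agglomerative clustering; distance = 1 - correlation."""
    groups = [[name] for name in profiles]

    def distance(g1, g2):
        pairs = [(a, b) for a in g1 for b in g2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    # Repeatedly merge the two closest groups until n_groups remain.
    while len(groups) > n_groups:
        candidates = [(i, j) for i in range(len(groups))
                      for j in range(i + 1, len(groups))]
        i, j = min(candidates, key=lambda ij: distance(groups[ij[0]], groups[ij[1]]))
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return sorted(sorted(g) for g in groups)

# Hypothetical log-ratio expression values for 4 genes in 4 tumour samples.
samples = {
    "tumour_1": [2.1, 1.8, -1.5, -2.0],
    "tumour_2": [1.9, 2.2, -1.7, -1.6],
    "tumour_3": [-1.8, -2.1, 1.6, 2.0],
    "tumour_4": [-2.0, -1.7, 1.9, 1.5],
}
print(cluster(samples))
```

With these toy values the samples split into two groups whose profiles are mirror images of each other, which is the kind of pattern the orange and blue classes in the figure represent.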
Chromatin Immunoprecipitation Assay (ChIP)
Question: Which DNA sequences bind to my protein of interest?
Figure 22.2
• Add formaldehyde to crosslink protein to DNA. Lyse the cells. Sonicate DNA into small pieces.
• Add antibodies that recognize the protein of interest. The antibodies are bound to heavy beads. After the antibodies bind to the protein of interest, the sample is subjected to centrifugation.
• Collect complexes in pellet. Add chemical that breaks the crosslinks to remove the protein.
• Known candidates: Conduct PCR using primers to a known DNA region. If PCR amplifies the DNA, the protein was bound to the DNA region recognized by the primers.
• Unknown candidates: Ligate DNA linkers to the ends of the DNA. Conduct PCR using primers that are complementary to the linkers. Incorporate fluorescently labeled nucleotides during PCR. Denature DNA and hybridize to a microarray (see Figure 22.1).
Why is the proteome so large?
(a) Alternative splicing
[Figure: a pre-mRNA with exons 1 to 6 is alternatively spliced into mRNAs containing different combinations of exons; each mRNA is translated into a different protein.]
(b) Posttranslational covalent modification
• Irreversible modifications: proteolytic processing; disulfide bond formation (SH + SH → S–S); attachment of prosthetic groups (heme group), sugars, or lipids (phospholipid)
• Reversible modifications: phosphorylation (phosphate group, PO₄²⁻); acetylation (acetyl group, COCH₃); methylation (methyl group, CH₃)
Techniques to study the proteome: 2D gel analysis (Brooker, Fig 22.4)
• Lyse a sample of cells and load the resulting mixture of proteins onto an isoelectric focusing gel (pH 4.0 to pH 10.0).
• Proteins migrate until they reach the pH where their net charge is 0. At this point, a single band could contain 2 or more different proteins.
• Lay the tube gel onto an SDS-polyacrylamide gel and separate proteins according to their molecular mass (200 kDa to 10 kDa).
Techniques to study the proteome: Mass spectrometry (Brooker, Fig 22.5)
• Start with a purified protein (N-terminus to C-terminus).
• Digest protein into small fragments using a protease.
• Determine the mass of these fragments with a first spectrometer.
[Spectrum: abundance vs. mass/charge, 0 to 4000; the fragment of interest is at 1652 daltons.]
Tandem mass spectrometry to sequence peptides (Brooker, Fig 22.5, cont.)
• Analyze this fragment with a second spectrometer. The peptide is fragmented from one end.
[Spectrum: abundance vs. mass/charge; values shown: 900, 1008, 1114, 1201, 1315, 1428, 1565, 1652, 1800. Sequence read from the peaks: –Asn–Ser–Asn–Leu–His–Ser–]
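The sequence in the tandem spectrum is read from the mass differences between neighbouring peaks, since each step of the fragment ladder differs by one amino acid residue. A minimal sketch of that reasoning follows. It uses the peak values 1114 to 1652 listed in the figure and nominal (integer) residue masses; the function name and the small lookup table are assumptions made for illustration, and a real analysis would use a full table of accurate masses.

```python
# Nominal residue masses in daltons for the amino acids needed here.
# Leu and Ile have the same mass, so mass alone cannot distinguish them.
RESIDUE_MASS = {87: "Ser", 113: "Leu/Ile", 114: "Asn", 137: "His"}

def read_ladder(peaks, tolerance=0.5):
    """Translate gaps between consecutive fragment peaks into residues."""
    ordered = sorted(peaks)
    residues = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        gap = heavier - lighter
        # Pick the closest known residue mass, then check it is close enough.
        best = min(RESIDUE_MASS, key=lambda mass: abs(gap - mass))
        residues.append(RESIDUE_MASS[best] if abs(gap - best) <= tolerance else "?")
    return residues

peaks = [1114, 1201, 1315, 1428, 1565, 1652]
print("-".join(read_ladder(peaks)))
```

The gaps are 87, 114, 113, 137, and 87 daltons, which correspond to Ser, Asn, Leu (or Ile), His, and Ser, matching the end of the sequence shown on the slide.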
Example of a DNA sequence as stored in a genetic database
Numbers represent the base number in the sequence file
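Before a program can analyze such a record, the base numbers and spacing have to be stripped out. The sketch below assumes a GenBank-style layout, in which each line starts with the number of its first base followed by groups of 10 bases; the slide does not show the exact format, and the example sequence is hypothetical.

```python
def parse_numbered_sequence(text):
    """Strip base numbers and spacing from a numbered sequence listing."""
    bases = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # The first field is the number of the first base on the line.
        if fields[0].isdigit():
            fields = fields[1:]
        bases.extend(fields)
    return "".join(bases).upper()

# Hypothetical two-line record: 40 bases on line 1, 20 on line 2.
example = """
        1 atgaccatga ttacggattc actggccgtc gttttacaac
       41 gtcgtgactg ggaaaaccct
"""
sequence = parse_numbered_sequence(example)
print(len(sequence), sequence[:10])
```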
A bioinformatics program may ask:
• Does the sequence contain a gene?
• Which nucleotides are the functional sites (e.g., promoters, exons, introns, termination sequence)?
• Does the sequence encode a protein (i.e., does it have an open reading frame [ORF])?
• What is the secondary structure of its RNA or associated amino acid sequence?
• Is the sequence homologous to any other known sequences?
• What is the evolutionary relationship between two or more sequences?
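As an illustration of the ORF question above, here is a minimal sketch that scans the three forward reading frames for a start codon (ATG) followed in frame by a stop codon. The function name, the `min_codons` parameter, and the test sequence are assumptions made for this example; real gene finders also scan the reverse strand and use statistical models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ORF on the forward strand.

    start is the index of the A in ATG; end is the index just past the
    stop codon. min_codons counts both the start and the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs

# Hypothetical sequence: ATG AAA TTT GGG TAA embedded at position 2.
print(find_orfs("CCATGAAATTTGGGTAACC"))
```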
[Figure: A secondary structural model for E. coli 16S rRNA, with the 5′ and 3′ ends marked (Brooker, Fig 22.7)]
Sequence matches between E. coli and K. pneumoniae
• DNA sequences of the lacY gene
– ~78% of the bases are a perfect match
• In this case, the two sequences are similar because the genes are homologous to each other
– They have been derived from the same ancestral gene
– Refer to Figure 22.6
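A figure such as "78% of the bases are a perfect match" is a percent identity computed over an alignment. The sketch below shows one simple way to compute it, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function name and the short sequences are hypothetical and are not the real lacY sequences; alignment programs differ in how they count gapped columns.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns in which both sequences have the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical aligned fragments: 8 of 10 columns match.
print(percent_identity("ATGCATGCAT", "ATGCTTGCAA"))
```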
Example output from a computer alignment program (and comparison to real-world data)
[Figure: alignment showing an interesting cancer mutation pattern in the mitochondrial ND6 protein; row labels include Human Pa Ca, Mouse Lu Ca, Human LHON, and Human Thy Ca.]
Sequence homology used to “hang” human cancer mutations on the bovine crystal structure of Cytochrome B
Chen and Uberto 2014
Federal Genetic Databases
National Center for Biotechnology Information
www.ncbi.nlm.nih.gov/
U.S. government-funded national resource for molecular biology information.
BLAST programs identify sequences with homology or similarity (Table 22.5)
Origin of orthologous genes (Figure 22.6)
• An ancestral organism carries the ancestral lacY gene.
• Evolutionary separation of 2 (or more) distinct species: E. coli and K. pneumoniae, each with its own lacY gene.
• Accumulation of random mutations in the 2 genes.
[Figure: Brooker, Fig 8.7. An ancestral globin gene was duplicated, giving rise to myoglobin (Mb), which is better at binding and storing oxygen in muscle cells, and to the hemoglobins (α chains and β chains), which are better at binding and transporting oxygen via red blood cells. Genes shown: Mb, ζ, ψζ, ψα2, ψα1, α2, α1, φ, ε, γG, γA, ψβ, δ, β. Time axis: millions of years ago, 0 to 1,000.]
Orthologs, paralogs, homologs (like Brooker Fig 8.7)
• All the globin genes have homology to each other
• α-like genes are paralogs of each other
• β-like genes are paralogs of each other
• α-1 in mice and α-1 in humans are orthologs
From Thompson and Thompson, Genetics in Medicine, 6th ed.