Genome Size, Repetitive Sequences, and Genes

Primer, Friday 10am, Beckman B-302
Ex. 1 is coming.
http://cs273a.stanford.edu
[Bejerano Fall10/11]
1
Lecture 4
Our place in the tree of life
Genome Size
Genome Content:
Repetitive Sequences
Genes
2
Our Place in the Tree of Life
 you are here
[Human Molecular Genetics, 3rd Edition]
3
Metazoans (multi-cellular organisms)
 you are here
[Human Molecular Genetics, 3rd Edition]
4
Vertebrates
Stickleback, Lizard, Opossum
 you are here
[Human Molecular Genetics, 3rd Edition]
5
INTERSPECIES VARIATION IN GENOME SIZE
WITHIN VARIOUS GROUPS OF ORGANISMS
6
Figure from Ryan Gregory (2005)
Meet Your Genome Continues
[Human Molecular Genetics, 3rd Edition]
7
Repeats / Mobile Elements ("selfish DNA")
Human Genome: 3*10^9 letters
1.5% known function
>50% junk
9
[Adapted from Lunter]
10
TE composition and assortment vary among eukaryotic genomes
[Figure: stacked bar chart, 0-100%, showing the share of DNA transposons, LTR retrotransposons, and non-LTR retrotransposons. Feschotte & Pritham 2006]
[Bejerano Fall09/10]
13
Assembly Challenges
20
Inferring Phylogeny Using Repeats
[Nishihara et al, 2006]
21
Functional elements from Mobile Elements
Co-option event, probably due to favorable genomic context
[Yass is a small town in New South Wales, Australia.]
[Bejerano et al., Nature 2006]
22
The amount of TE DNA correlates positively with genome size
[Figure: bar chart, y-axis in Mb (0-3000), showing genomic DNA, TE DNA, and protein-coding DNA. Feschotte & Pritham 2006]
[Bejerano Fall09/10]
23
The proportion of protein-coding genes decreases with genome size,
while the proportion of TEs increases with genome size
[Figure series: TEs; protein-coding genes]
24
Gregory, Nat Rev Genet 2005
Genome Size Variability
1pg = 978 Mb
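The conversion on this slide can be applied directly when genome sizes are reported as C-values in picograms. The sketch below is a minimal illustration; the function names are hypothetical, and the only fact taken from the slide is the factor 1 pg = 978 Mb.

```python
# Conversion factor stated on the slide: 1 pg of DNA corresponds to 978 Mb.
MB_PER_PG = 978


def pg_to_mb(c_value_pg):
    """Convert a DNA mass in picograms to a size in megabases."""
    return c_value_pg * MB_PER_PG


def mb_to_pg(size_mb):
    """Convert a size in megabases to a DNA mass in picograms."""
    return size_mb / MB_PER_PG


if __name__ == "__main__":
    # 3.5 pg is an illustrative input, not a value from the lecture.
    print(pg_to_mb(3.5))  # 3423.0
```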
25
Simple Repeats
•Every possible motif of mono-, di-, tri-, and tetranucleotide repeats is vastly overrepresented in the human genome.
•These are called microsatellites. Longer repeating units are called minisatellites, and the really long ones are called satellites.
•Highly polymorphic in the human population.
•Highly heterozygous in a single individual.
•As a result, microsatellites are used in paternity testing, forensics, and the inference of demographic processes.
•There is no clear definition of how many repetitions make a simple repeat, nor of how imperfect the different copies can be.
•Highly variable between genomes: e.g., using the same search criteria, the mouse & rat genomes have 2-3 times more microsatellites than the human genome. They're also longer in mouse & rat.
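Because there is no agreed definition of a simple repeat, any search has to fix its own criteria. The sketch below is one possible choice, not the criteria used in the lecture: it reports only perfect tandem repeats of 1-4 bp units with at least `min_copies` copies. The function name, the default threshold of 5, and the decision to ignore imperfect copies are all assumptions made for illustration.

```python
import re


def find_microsatellites(seq, max_unit=4, min_copies=5):
    """Find perfect tandem repeats with unit length 1..max_unit.

    Returns a list of (start, unit, copies) tuples. The thresholds are
    assumed search criteria; changing them changes what counts as a repeat.
    """
    # The lazy quantifier tries the shortest unit first, so "AAAAAA" is
    # reported as unit "A" rather than "AA" or "AAA".
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    print(find_microsatellites("GGTCACACACACATT"))  # [(3, 'CA', 5)]
```

Running the same function with the same thresholds on sequences from different genomes is the kind of like-for-like comparison the last bullet refers to.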
26
Restriction enzymes recognize and make a cut within specific palindromic sequences, known as restriction sites, in the DNA. A restriction site is usually a 4- or 6-base-pair sequence.
blunt end
sticky end
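"Palindromic" here means that the site reads the same on both strands, i.e. the sequence equals its own reverse complement. A minimal sketch of that check follows; the helper names are hypothetical, and GAATTC (the EcoRI recognition site) is used only as a familiar example.

```python
# Maps each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(site):
    """True if the site equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


if __name__ == "__main__":
    print(is_palindromic_site("GAATTC"))   # True
    print(is_palindromic_site("GATTACA"))  # False
```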
30
DNA Fingerprint Basics
DNA fragments of different size will be produced
by a restriction enzyme that cuts at the points
shown by the arrows.
31
DNA fragments are then separated based on
size using gel electrophoresis.
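The two steps above (cutting at restriction sites, then separating fragments by size) can be sketched in a few lines. This is an illustration under stated assumptions, not a lab protocol: the function names are hypothetical, the sequence is treated as linear and single-stranded, and the default site and cut position (GAATTC, cut after the first G, as for EcoRI) are chosen as an example.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the number of bases into the site where the cut falls
    (1 means between G and AATTC). Returns the fragments in order.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def gel_order(fragments):
    """Fragment lengths from largest to smallest.

    On a gel, smaller fragments migrate farther, so this lists bands
    starting from the one closest to the loading well.
    """
    return sorted((len(f) for f in fragments), reverse=True)


if __name__ == "__main__":
    sample = "AA" + "GAATTC" + "CCCC" + "GAATTC" + "T"
    pieces = digest(sample)
    print(pieces)             # ['AAG', 'AATTCCCCCG', 'AATTCT']
    print(gel_order(pieces))  # [10, 6, 3]
```

Two individuals whose sequences differ in the number or position of sites would yield different length lists, which is what a fingerprint compares.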
32
DNA Fingerprinting can be used in
paternity testing or murder cases.
33
From an evolutionary point of view transposons and simple
repeats are very different.
Different instances of the same transposon share common
ancestry (but not necessarily a direct common progenitor).
Different instances of the same simple repeat most often
do not.
35
The Gene-ome makes up < 2% of the H.G. (human genome)
[Human Molecular Genetics, 3rd Edition]
36
Gene Structure
Signal – a string of DNA recognized by the cellular machinery
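Start and stop codons are among the simplest examples of such signals. As a rough illustration of signal-based scanning, the sketch below looks for open reading frames on the forward strand: an ATG followed in the same frame by a stop codon (TAA, TAG, or TGA in the standard genetic code). The function name and the `min_codons` threshold are assumptions, and the sketch ignores introns, the reverse strand, and every other signal a real gene finder would use.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) index pairs of forward-strand ORFs.

    An ORF runs from the first ATG in a frame to the next in-frame stop
    codon; `end` is exclusive and includes the stop. `min_codons` counts
    codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGGG"))  # [(2, 11)]
```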
37
Gene Processing
Eukaryotic Gene Structure
38
Gene Finding – The Practice
Challenge:
“The genes, the whole genes, and nothing but the genes”
Problems:
spliced ESTs → legitimate gene isoform?
predicting gene isoforms
tissue/condition-specific genes / gene isoforms
single exon genes
pseudogenes
Practice:
39
Evolution of Gene Finding Tools
[Timeline figure with year marks 1982, 1996, 1997, 2000, 2001, 2002, 2004.
Approach labels: intrinsic, extrinsic, hybrid; ab initio, alignment-based, comparative genomics.
Evidence labels: DNA; cDNA, protein; protein.
Model labels: HMM-based, pair-HMM, phylo-HMM.
Tools shown: Genie, Genscan, ExoFish, GenieEST, Procrustes, GenieESTHOM, Informant, Rosetta, Twinscan, Slam, Siepel-Haussler, DoubleScan, Jojic-Haussler, etc.]
40
The Human Gene Set
[HGC, 2001]
41
[Celera, 2001]
42
wrong!
43
Signal Transduction
44
Ancient Origins of Important Gene Families
45

Multigene families due to:
•Single gene duplication
•Segment duplication: tandem duplication or duplication transposition
•Horizontal gene transfer
•Genome-wide doubling event
[Figure: gene-order diagrams with genes labeled a-g]
46
Horizontal Gene Transfer
47
Horizontal Gene Transfer in the H.G.
…
[HGC, 2001]
48
Or is it?
[Kurland et al., 2003]
49
HGT between fish & their parasites
50
Retroposed Genes and Pseudogenes
51