Mouse Repeats
The mouse genome, like its human counterpart, contains a large number of interspersed
repeat sequences. These repeats represent a lower percentage of the mouse genome than of
the human genome (37.5% vs. 46%), but this difference may be due to a higher mutation rate
in mouse that obscures some of the older repeat sequences (Waterston, Lindblad-Toh et al.
2002).
LINEs
The mouse genome contains a set of long interspersed elements (LINEs) that are related in
sequence to the elements of the same name in the human genome (Burton, Loeb et al. 1986).
Mouse L1s are classified into three subfamilies (A, V and F) based on the three different
types of promoter at the 5’ end (Padgett, Hutchison et al. 1988; Mears and Hutchison
2001). The A and F mouse L1 subfamilies are considered to be the youngest and are
currently active (Mears and Hutchison 2001), similar to the human L1H subfamily (Smit,
Toth et al. 1995). LINEs represent approximately 19% of the mouse genome (Waterston,
Lindblad-Toh et al. 2002).
SINEs
Unlike the human genome, which contains only one type of SINE, the mouse genome
contains four distinct SINE families: B1, B2, ID and B4.
The B1 family of repeat elements is closely related to human Alus; like the Alus, B1s
are derived from the ancestral 7SL RNA gene (Krayev, Kramerov et al. 1980; Ullu and
Tschudi 1984). Unlike the Alus, which are dimers, B1s are monomers approximately 140
nucleotides in length, with an internal 29-nucleotide duplication (Labuda, Sinnett et al.
1991). The Alu 5’ monomer (FLAM) shares significant homology with certain proto-B1 (pB1)
sequences (Quentin 1994). B1s, like Alus, are believed to be propagated by a reverse
transcriptase encoded by the L1 family of LINEs (Schmid 1998). Also like Alus, B1s
appear to be preferred methylation targets (Schmid 1998) and to be preferentially
distributed in GC-rich regions (Waterston, Lindblad-Toh et al. 2002). The distribution
of B1s in the mouse genome correlates more strongly with the Alu content of the
orthologous regions of the human genome than with the local GC density (Waterston,
Lindblad-Toh et al. 2002). This suggests that genomic features that are correlated with,
but distinct from, GC content may determine Alu/B1 distribution (Waterston, Lindblad-Toh
et al. 2002).
Like the B1 element, B2 is transcribed from an RNA polymerase III promoter. Unlike
B1, it shares significant homology at the 5’ end with particular tRNAs (usually tRNA-Lys
or tRNA-Gly). Because of this homology, B2s are believed to be derived from tRNAs or
their genes (Krayev, Markusheva et al. 1982). The B2s are longer than B1s: in Ensembl
release 31, B1s have an average length of 116 nucleotides and B2s have an average
length of 159 nucleotides (Table 1). However, because B2s are fewer in number than
B1s, the two families occupy approximately the same percentage of the genome.
The ID family is another tRNA-related family of repeat elements. It is believed to be
derived from the neuronally expressed BC1 gene (Kim and Deininger 1996) and could be a
progenitor of the more recently derived B2 family. In mouse, the short ID elements are
few in number; however, they have a major presence in the rat genome (Kim and Deininger
1996).
The B4 repeat element appears to be the result of a fusion of an ID element at the 5’ end
and a B1 element at the 3’ end (Serdobova and Kramerov 1998). Indeed, the ID elements
may be degenerate remnants or mislabeled versions of B4 SINEs (Table 1)
(Waterston, Lindblad-Toh et al. 2002).
Family         Number of     Total           Average element   Component of
               elements      nucleotides     length            genome
B1               498,420      57,952,140     116               2.2%
B2               335,517      53,343,425     159               2.0%
ID                42,200       2,900,608      69               0.1%
B4               329,838      48,399,444     147               1.8%
Total SINEs    1,313,530     174,821,424     133               6.7%
Human Alu      1,160,797     303,885,572     262               10.1%
Table 1: Mouse SINEs
Computed from Ensembl release 31
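The average element lengths in Table 1 can be checked from the other two columns. The
following is a minimal sketch, assuming the average is simply total nucleotides divided
by the number of elements and rounded to the nearest integer; the names TABLE_1 and
average_length are illustrative and are not taken from Ensembl.

```python
# Rows transcribed from Table 1: (family, number of elements, total nucleotides).
TABLE_1 = [
    ("B1", 498_420, 57_952_140),
    ("B2", 335_517, 53_343_425),
    ("ID", 42_200, 2_900_608),
    ("B4", 329_838, 48_399_444),
    ("Total SINEs", 1_313_530, 174_821_424),
    ("Human Alu", 1_160_797, 303_885_572),
]


def average_length(total_nucleotides, n_elements):
    """Mean element length in nucleotides (assumed definition: total / count)."""
    return total_nucleotides / n_elements


# Round to the nearest nucleotide, as the table appears to do.
averages = {name: round(average_length(nt, n)) for name, n, nt in TABLE_1}

for name, avg in averages.items():
    print(f"{name:12s} {avg:4d}")
```

Under this assumption, the rounded values agree with the "Average element length" column
for every row.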
LTR
The LTR elements in mouse are endogenous retroviruses (ERVs), derived from
vertebrate-specific retroviruses. There are three recognized classes, I to III. The
oldest, class III, is believed to predate the human-mouse separation. Some class III LTR
elements are still active in mouse but not in human (Smit 1993). Class II elements are
considerably more common in mouse than in human, while class I elements appear more
frequently in human than in mouse. The reason for this difference is not clear
(Waterston, Lindblad-Toh et al. 2002). Overall, LTR elements represent approximately 10%
of the mouse genome (Waterston, Lindblad-Toh et al. 2002).
DNA Transposons
DNA transposons depend on frequent horizontal transfers utilizing foreign vectors such
as viruses and other intracellular parasites (Smit 1996; Smit 1999). These elements are
considerably less prevalent in the mouse genome than in the human genome. One proposed
reason for this difference is that the mouse germ line is less susceptible than the
human germ line to infiltration by horizontal transfer agents (Waterston, Lindblad-Toh
et al. 2002). Alternatively, since all DNA transposons were deposited early in
evolutionary history, it is possible that in mouse these elements have degenerated
beyond recognition. In total, DNA transposons represent approximately 1% of the mouse
genome (Waterston, Lindblad-Toh et al. 2002).
Simple Repeats
Like the human genome, the mouse genome contains a large number of simple sequence
repeats (SSRs), the near-identical tandem repeats of 1 to 500 nucleotides. At the short
end (i.e., up to 5 nucleotides), the mouse genome contains two to three times more of
these sequences than the human genome. For the longer variety (over 20 nucleotides), the
difference between mouse and human is even greater. This suggests that the greater
abundance of SSRs in mouse reflects differences in both initiation and extension
(Waterston, Lindblad-Toh et al. 2002). One proposed explanation is that humans have a
higher number of point mutations per generation, which disrupt the exactness of the SSRs
(Waterston, Lindblad-Toh et al. 2002). However, a higher rate of point mutation would
also be seen in other sequences, including single-copy genes, and the modern mouse is
the product of a greater number of generations than the modern human. Simple repeats
represent approximately 2.3% of the mouse genome (Waterston, Lindblad-Toh et al. 2002).
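To make the notion of a tandem repeat concrete, the following is a minimal sketch of how
perfect short SSRs could be located in a DNA string. It is an illustration only, not the
method used in the cited analysis: it finds exact repeats rather than near-identical
ones, and the function name find_perfect_ssrs, the unit-length limit of 5 and the
minimum of 4 copies are all assumptions chosen for the example.

```python
import re


def find_perfect_ssrs(seq, max_unit=5, min_copies=4):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    max_unit and min_copies are illustrative thresholds, not values
    taken from the cited analysis.
    """
    # Lazy quantifier: try the shortest repeat unit first, then require
    # at least (min_copies - 1) further exact copies of that unit.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# A made-up sequence with an (AC)5 run and an (AT)4 run.
example = "TTGACACACACACGGATATATATC"
print(find_perfect_ssrs(example))  # [(3, 'AC', 5), (15, 'AT', 4)]
```

Positions are 0-based. Because only exact copies are matched, a single point mutation
inside a repeat splits or shortens the reported run, which is the kind of disruption the
point-mutation proposal above refers to.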
References
Burton, F. H., D. D. Loeb, et al. (1986). "Conservation throughout mammalia and
extensive protein-encoding capacity of the highly repeated DNA long interspersed
sequence one." J Mol Biol 187(2): 291-304.
Kim, J. and P. L. Deininger (1996). "Recent amplification of rat ID sequences." J Mol
Biol 261(3): 322-7.
Krayev, A. S., D. A. Kramerov, et al. (1980). "The nucleotide sequence of the ubiquitous
repetitive DNA sequence B1 complementary to the most abundant class of mouse
fold-back RNA." Nucleic Acids Res 8(6): 1201-15.
Krayev, A. S., T. V. Markusheva, et al. (1982). "Ubiquitous transposon-like repeats B1
and B2 of the mouse genome: B2 sequencing." Nucleic Acids Res 10(23): 7461-75.
Labuda, D., D. Sinnett, et al. (1991). "Evolution of mouse B1 repeats: 7SL RNA folding
pattern conserved." J Mol Evol 32(5): 405-14.
Mears, M. L. and C. A. Hutchison, 3rd (2001). "The evolution of modern lineages of
mouse L1 elements." J Mol Evol 52(1): 51-62.
Padgett, R. W., C. A. Hutchison, 3rd, et al. (1988). "The F-type 5' motif of mouse L1
elements: a major class of L1 termini similar to the A-type in organization but
unrelated in sequence." Nucleic Acids Res 16(2): 739-49.
Quentin, Y. (1994). "A master sequence related to a free left Alu monomer (FLAM) at
the origin of the B1 family in rodent genomes." Nucleic Acids Res 22(12): 2222-7.
Schmid, C. W. (1998). "Does SINE evolution preclude Alu function?" Nucleic Acids Res
26(20): 4541-50.
Serdobova, I. M. and D. A. Kramerov (1998). "Short retroposons of the B2 superfamily:
evolution and application for the study of rodent phylogeny." J Mol Evol 46(2):
202-14.
Smit, A. F. (1993). "Identification of a new, abundant superfamily of mammalian
LTR-transposons." Nucleic Acids Res 21(8): 1863-72.
Smit, A. F. (1996). "The origin of interspersed repeats in the human genome." Curr Opin
Genet Dev 6(6): 743-8.
Smit, A. F. (1999). "Interspersed repeats and other mementos of transposable elements in
mammalian genomes." Curr Opin Genet Dev 9(6): 657-63.
Smit, A. F., G. Toth, et al. (1995). "Ancestral, mammalian-wide subfamilies of LINE-1
repetitive sequences." J Mol Biol 246(3): 401-17.
Ullu, E. and C. Tschudi (1984). "Alu sequences are processed 7SL RNA genes." Nature
312(5990): 171-2.
Waterston, R. H., K. Lindblad-Toh, et al. (2002). "Initial sequencing and comparative
analysis of the mouse genome." Nature 420(6915): 520-62.