Draft Guidelines for TE Classification and Submission to TEfam
1. Background
TEfam is a relational database for the submission, retrieval, and analysis of transposable
elements (TEs). Our main goals are to provide an outlet for information from systematic and
genome-scale characterization of TEs and to provide a user-friendly interface for biologists to
take full advantage of the compiled information.
Within any genome, a TE consists of multiple copies generated by transposition events. Here we
call the collection of these copies an Element. Although these copies are also referred to as a
family of TEs in the context of genome annotation, we reserve the word “family” for its more
conventional use (e.g., Tc1 family, mariner family, etc.), which is a group of related TEs in the
same or diverse organisms that usually share conserved amino acid sequences in their
transposases or reverse transcriptases. TE annotation combines the identification and
classification of TEs, and no clear-cut standard exists. This document is designed to
provide a general guideline for the classification, naming, and submission of TEs to TEfam at the
level of an Element. It is not designed to deal with annotating individual copies of an Element in
a genome assembly. Either a consensus or a representative of an Element may be submitted to
TEfam.
2. Naming convention for TEfam
2.1. How does one classify TEs?
There is no absolute answer to this question. We provide a scheme for your consideration, which
is detailed in section 4, General Classification. The current classification scheme reflects the
consensus of the laboratories involved in annotation of mosquito TEs. A broader advisory
committee may be necessary for establishing standard TE nomenclature if TEfam is to be widely
used in other organisms. The classification includes Class, Subclass, Family (equivalent to Clade
for Class I RNA-mediated TEs), Subfamily (equivalent to lineages for Class I RNA-mediated
TEs), and Element. See sections 3 and 4 for examples and details. In the current TEfam structure,
only a TEfam administrator can add or delete Family names. Please email [email protected] to
request addition or deletion of Family names. Subfamily names are at the discretion of the
submitter. However, we recommend using established subfamily/lineage names (e.g., mariner
subfamilies and lineages in Ty3/gypsy) whenever possible. The systematic name of an Element
is derived from the name of its TE Family, not the Subfamily.
2.2. How does one define an Element?
This is another difficult issue on which the TE community lacks consensus. The definition has to
take into account the evolutionary dynamics of the TE in question and, sometimes, consistency
with analyses from other species (if comparison is desired).
2.3. Previously named Element and synonyms.
If one uses a new systematic name for a TE that has already been named in the literature, one
should provide its “old” name as a synonym.
2.4. Naming an Element.
Names should contain only letters of the English alphabet, Arabic numerals, and underscores, which
ensures that they are easily identified by a wide variety of automated searches.
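The character rule above can be checked mechanically before submission. The following is a minimal Python sketch, assuming that "English alphabet" means the ASCII letters A-Z and a-z. The function name `is_valid_te_name` is hypothetical and is not part of TEfam.

```python
import re

# Assumption: "English alphabet" means ASCII letters only.
# Digits are restricted to 0-9 and the only other allowed character is "_".
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_te_name(name: str) -> bool:
    """Return True if name uses only ASCII letters, Arabic numerals, and underscores."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Print each candidate name together with the result of the check.
    for candidate in ("Jockey_Ele1", "m8bp_Ele1", "Ty1/copia Ele1", ""):
        print(repr(candidate), is_valid_te_name(candidate))
```

An empty string is rejected because the pattern requires at least one character.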
2.5. Species.
Currently, the species name is selected from one of the two mosquito species Aedes aegypti and
Anopheles gambiae.
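As an illustration of how the points in sections 2.3 and 2.5 fit together, the sketch below models one submission as a small Python record that stores synonyms and rejects species other than the two listed. The class `ElementRecord` and its field names are hypothetical; they are assumptions made for this example and do not describe the actual TEfam schema.

```python
from dataclasses import dataclass, field
from typing import List

# The two mosquito species named in section 2.5.
ALLOWED_SPECIES = ("Aedes aegypti", "Anopheles gambiae")


@dataclass
class ElementRecord:
    """Hypothetical container for one Element submission (not TEfam's real schema)."""

    name: str      # systematic name, e.g. "Jockey_Ele1"
    species: str   # must be one of ALLOWED_SPECIES
    synonyms: List[str] = field(default_factory=list)  # previously published names

    def __post_init__(self) -> None:
        # Reject any species outside the currently supported list.
        if self.species not in ALLOWED_SPECIES:
            raise ValueError(f"unsupported species: {self.species!r}")
```

For example, `ElementRecord("Jockey_Ele1", "Aedes aegypti", ["OldName"])` constructs a record (where "OldName" is a placeholder synonym), while any other species string raises `ValueError`.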
3. Examples
Jockey_Ele1 represents Element 1 of the Jockey family of non-LTR retrotransposons.
Instead of simply using Jockey1 as the Element name, we recommend using Ele to specify an
element within an established TE family. This avoids confusion with some established
family names. For example, Tc1 is an established family; it is not an Element within the Tc
family. More examples are given in the following table. The classification scheme is described in
section 4.
MADE-UP examples

Element        Synonym    Class  Subclass                  Family (Clade)  Subfamily (Lineage)
Ty1copia_Ele1  Mosqcopia  I      LTR retrotransposon       Ty1copia        -
Jockey_Ele1    -          I      Non-LTR retrotransposon   Jockey          -
L1_Ele1        -          I      Non-LTR retrotransposon   L1              -
Tc1_Ele1       Topi       II     Cut and paste transposon  Tc1             -
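The Family_EleN pattern used in these examples can be built and parsed programmatically. The sketch below is a minimal Python illustration, assuming that the family part of a name contains no underscore and that the element number is a positive integer. The helper names `make_element_name` and `parse_element_name` are hypothetical.

```python
import re

# Names of the form <Family>_Ele<number>, e.g. "Jockey_Ele1".
# Assumption: the family part itself contains no underscore.
_SYSTEMATIC = re.compile(r"(?P<family>[A-Za-z0-9]+)_Ele(?P<number>[0-9]+)")


def make_element_name(family: str, number: int) -> str:
    """Build a systematic Element name from a family name and an element number."""
    if number < 1:
        raise ValueError("element number must be a positive integer")
    return f"{family}_Ele{number}"


def parse_element_name(name: str):
    """Split a systematic name into (family, number); raise ValueError otherwise."""
    match = _SYSTEMATIC.fullmatch(name)
    if match is None:
        raise ValueError(f"not a systematic Element name: {name!r}")
    return match.group("family"), int(match.group("number"))
```

With this convention, "Tc1_Ele1" parses to the family Tc1 and element number 1, whereas a bare name such as "Jockey1" is rejected because it lacks the Ele separator.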
4. General Classification.
Class I, RNA-mediated TEs
    Subclass, LTR retrotransposons
        Family (Clades or Groups)
            Subfamily (or lineage, optional)
                Element
    Subclass, Non-LTR retrotransposons
        Family (Clades)
            Subfamily (optional)
                Element
    Subclass, Short interspersed nuclear elements (SINEs)
        Family
            Subfamily (optional)
                Element
Class II, DNA-mediated TEs
    Subclass, "Cut and paste" DNA transposons
        P family
        hAT family
        Tc1 family
        mariner family
        ITmD37E family
        ITmD37D family
        DD41D family
        pogo family
        piggyBac family
        PIF family
        Transib family
        Merlin family
    Subclass, Miniature inverted-repeat transposable elements (MITEs)
        Family (e.g., m8bp, MITEs with 8 bp TSDs)
            Element (e.g., m8bp_Ele1)
    Subclass, Helitrons
        Family
            Element
Penelope-like elements
Notes:
1. TSDs, target site duplications.
2. The classification of MITEs is based on their TSDs because it may be difficult to identify
corresponding DNA transposons in many cases.
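The outline in section 4 can be sketched as a nested mapping, which makes it easy to look up where a listed family sits. This is a minimal Python illustration only: the names `CLASSIFICATION`, `find_family`, and `mite_family_name` are hypothetical, and the MITE naming pattern is an assumption generalized from the single example m8bp.

```python
# Nested mapping: class -> subclass -> list of family names.
# Only families enumerated in section 4 are listed; subclasses whose families
# are not enumerated there are left empty. Penelope-like elements are omitted
# because the outline does not assign them to a class or subclass.
CLASSIFICATION = {
    "Class I, RNA-mediated TEs": {
        "LTR retrotransposons": [],
        "Non-LTR retrotransposons": [],
        "Short interspersed nuclear elements (SINEs)": [],
    },
    "Class II, DNA-mediated TEs": {
        '"Cut and paste" DNA transposons': [
            "P", "hAT", "Tc1", "mariner", "ITmD37E", "ITmD37D",
            "DD41D", "pogo", "piggyBac", "PIF", "Transib", "Merlin",
        ],
        "Miniature inverted-repeat transposable elements (MITEs)": [],
        "Helitrons": [],
    },
}


def find_family(family: str):
    """Return (class, subclass) for a listed family, or None if it is not listed."""
    for te_class, subclasses in CLASSIFICATION.items():
        for subclass, families in subclasses.items():
            if family in families:
                return te_class, subclass
    return None


def mite_family_name(tsd_length_bp: int) -> str:
    """Build a MITE family name from the TSD length (assumed pattern: m<length>bp)."""
    if tsd_length_bp < 1:
        raise ValueError("TSD length must be a positive number of base pairs")
    return f"m{tsd_length_bp}bp"
```

A list per subclass keeps the sketch close to the outline; Subfamily and Element levels would need a deeper structure and are left out here.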