Draft Guidelines for TE Classification and Submission to TEfam

1. Background

TEfam is a relational database for the submission, retrieval, and analysis of transposable elements (TEs). Our main goals are to provide an outlet for information from systematic and genome-scale characterization of TEs and to provide a user-friendly interface that lets biologists take full advantage of the compiled information.

Within any genome, a TE consists of multiple copies generated by transposition events. Here we call the collection of these copies an Element. Although these copies are also referred to as a family of TEs in the context of genome annotation, we reserve the word “family” for its more conventional use (e.g., Tc1 family, mariner family, etc.): a group of related TEs, in the same or diverse organisms, that usually share conserved amino acid sequences in their transposases or reverse transcriptases.

TE annotation combines the identification and the classification of TEs, and no clear-cut standard exists. This document provides a general guideline for the classification, naming, and submission of TEs to TEfam at the level of an Element. It is not designed to deal with annotating individual copies of an Element in a genome assembly. Either a consensus or a representative of an Element may be submitted to TEfam.

2. Naming convention for TEfam

2.1. How does one classify TEs?

There is no absolute answer to this question. We provide a scheme for your consideration, which is detailed in section 4, General Classification. The current classification scheme reflects the consensus of the laboratories involved in annotation of mosquito TEs. A broader advisory committee may be necessary for establishing standard TE nomenclature if TEfam is to be widely used for other organisms. The classification includes Class, Subclass, Family (equivalent to Clade for Class I RNA-mediated TEs), Subfamily (equivalent to Lineage for Class I RNA-mediated TEs), and Element.
See sections 3 and 4 for examples and details.

In the current TEfam structure, only a TEfam administrator can add or delete Family names. Please email [email protected] to request addition or deletion of Family names. Subfamily names are at the discretion of the submitter. However, we recommend using established subfamily/lineage names (e.g., mariner subfamilies and lineages in Ty3/gypsy) whenever possible. The systematic name of an Element is derived from the name of its TE Family, not the Subfamily.

2.2. How does one define an Element?

This is another difficult issue that lacks consensus in the TE community. The definition has to take into account the evolutionary dynamics of the TE in question and, sometimes, consistency with analyses from other species (if comparison is desired).

2.3. Previously named Elements and synonyms

If one uses a new systematic name for a TE that has already been named in the literature, one should provide its “old” name as a synonym.

2.4. Naming an Element

Names should contain only letters of the English alphabet, Arabic numerals, and underscores, which ensures that they are easily identified by a wide variety of automated searches.

2.5. Species

Currently, the species name is selected from one of two mosquito species, Aedes aegypti and Anopheles gambiae.

3. Examples

Jockey_Ele1 represents Element 1 of the Jockey family of non-LTR retrotransposons. Instead of simply using Jockey1 as the Element name, we recommend using Ele to specify an element within an established TE family. This avoids confusion with some established family names. For example, Tc1 is an established family; it is not an Element within the Tc family. Please see more examples in the following table. The classification scheme is described in section 4.
MADE-UP examples

Element       | Synonym   | Class    | Subclass                 | Family (Clade) | Subfamily (Lineage)
--------------|-----------|----------|--------------------------|----------------|--------------------
Ty1copia_Ele1 | Mosqcopia | Class I  | LTR retrotransposon      | Ty1copia       |
Jockey_Ele1   |           | Class I  | Non-LTR retrotransposon  | Jockey         |
L1_Ele1       |           | Class I  | Non-LTR retrotransposon  | L1             |
Tc1_Ele1      | Topi      | Class II | Cut and paste transposon | Tc1            |

4. General Classification

Class I, RNA-mediated TEs
    Subclass, LTR retrotransposons
        Family (Clades or Groups)
            Subfamily (or lineage, optional)
                Element
    Subclass, Non-LTR retrotransposons
        Family (Clades)
            Subfamily (optional)
                Element
    Subclass, Short interspersed nuclear elements (SINEs)
        Family
            Subfamily (optional)
                Element
Class II, DNA-mediated TEs
    Subclass, "Cut and paste" DNA transposons
        P
        hAT family
        Tc1 family
        mariner family
        ITmD37E family
        ITmD37D family
        DD41D family
        pogo family
        piggyBac family
        PIF family
        Transib family
        Merlin family
    Subclass, Miniature inverted-repeat transposable elements (MITEs)
        Family (e.g., m8bp, MITEs with 8-bp TSDs)
            Element (e.g., m8bp_Ele1)
    Subclass, Helitrons
        Family
            Element
Penelope-like elements

Notes:
1. TSDs, target site duplications.
2. The classification of MITEs is based on their TSDs because it may be difficult to identify corresponding DNA transposons in many cases.
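
The naming rules in sections 2.4 and 3, together with the MITE family convention in section 4, can be checked mechanically before submission. The following Python sketch is only an illustration under stated assumptions: the function names (is_valid_name, element_name, mite_family_name) are hypothetical and are not part of TEfam, and the requirement that Element numbers start at 1 is inferred from the examples, not stated in the guidelines.

```python
import re

# Section 2.4: names may contain only English letters, Arabic numerals,
# and underscores. An explicit ASCII class is used because str.isalnum()
# would also accept non-English letters.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(name: str) -> bool:
    """Return True if the name uses only the characters allowed in section 2.4."""
    return NAME_PATTERN.fullmatch(name) is not None


def element_name(family: str, number: int) -> str:
    """Build a systematic Element name from its Family, e.g. Jockey_Ele1.

    Assumption (not stated in the guidelines): numbering starts at 1.
    """
    if number < 1:
        raise ValueError("Element number must be 1 or greater")
    name = f"{family}_Ele{number}"
    if not is_valid_name(name):
        raise ValueError(f"Name contains disallowed characters: {name!r}")
    return name


def mite_family_name(tsd_length: int) -> str:
    """Build a MITE Family name from the TSD length in bp, e.g. m8bp."""
    if tsd_length < 1:
        raise ValueError("TSD length must be 1 or greater")
    return f"m{tsd_length}bp"


if __name__ == "__main__":
    print(element_name("Jockey", 1))
    print(element_name(mite_family_name(8), 1))
    print(is_valid_name("Ty3/gypsy"))
```

Deriving the Element name from the Family name, not from the Subfamily, follows section 2.1. A name such as Ty3/gypsy fails the check because of the slash, which is consistent with the table writing Ty1copia without a separator.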