Sequence-based manual annotation, as carried out at TIGR

Genome sequence:
-Find coding genes, giving predicted protein-coding genes.
-RNA finding (tRNAscan, Rfam, homology searches), giving predicted RNA genes.

For each predicted protein-coding gene:
-Collect any literature for the gene product, and evaluate the evidence presented in each paper.
-Run sequence-based searches on the translation: BLAST-type pairwise alignments; HMM searches (Pfam, TIGRFAM, etc.); InterPro; TMHMM; SignalP; TargetP; COGs; paralogous families; and more.
-Evaluate the evidence from those searches.

Get candidate GO terms:
-from match proteins
-from matching families/domains/motifs
-from EC number mapping, InterPro2GO, other mappings, etc.

Search for GO terms if no candidates present themselves:
-GO search/browse tool AmiGO
-many other tools (e.g. Manatee, QuickGO, etc.)

Evaluate GO terms:
Check that the quality of the evidence supports the candidate GO terms at a particular level of specificity. Read the literature relevant to the experimental characterization of any match proteins used as evidence. Check that any GO terms assigned to the match protein are correct. Check the GO trees and definitions to make sure the term makes sense for your organism. When the evidence is sequence similarity to a single protein, it is generally safer to make function GO annotations than process annotations. See the IGC chart for more on process annotations based on sequence.

Evaluation of evidence

pairwise alignments:
Visually inspect the alignments. Look for conserved active sites, and look for (generally) at least 35% identity across the full lengths of both proteins. If a match is not full length, check whether there are recognized functional domains in the region where the match occurs. Decide how much information can be transferred from the match protein to the query. To assert that the query has exactly the same function as the match protein, the match protein must be experimentally characterized. If there is any doubt about the specificity of the function, back up to a more general level of annotation.
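The pairwise-alignment checks above can be written down as a small decision function. This is only a sketch under stated assumptions, not TIGR's actual tooling: the `PairwiseHit` record, the `transfer_level` function, and the three result labels are hypothetical names, and the 0.90 coverage cutoff is an assumed stand-in for "full length", which the guideline does not quantify. Only the 35% identity figure comes from the text.

```python
from dataclasses import dataclass


@dataclass
class PairwiseHit:
    """One query-vs-match alignment, already parsed (hypothetical record)."""
    percent_identity: float       # identity over the aligned region, 0-100
    query_coverage: float         # fraction of the query covered, 0-1
    match_coverage: float         # fraction of the match protein covered, 0-1
    match_is_characterized: bool  # match protein is experimentally characterized
    active_sites_conserved: bool  # result of the curator's visual inspection


MIN_IDENTITY = 35.0  # from the guideline: "(generally) at least 35% identity"
MIN_COVERAGE = 0.90  # ASSUMPTION: the guideline gives no number for "full length"


def transfer_level(hit: PairwiseHit) -> str:
    """Suggest how much annotation to transfer from the match to the query."""
    full_length = (hit.query_coverage >= MIN_COVERAGE
                   and hit.match_coverage >= MIN_COVERAGE)
    if not full_length:
        # Partial match: look for recognized domains in the matched region.
        return "check_domains"
    if hit.percent_identity < MIN_IDENTITY or not hit.active_sites_conserved:
        # Doubt about specificity: back up to a more general term.
        return "general"
    if not hit.match_is_characterized:
        # Exact function can only be asserted from a characterized match.
        return "general"
    return "specific"
```

For example, `transfer_level(PairwiseHit(52.0, 0.98, 0.97, True, True))` returns `"specific"`, while the same hit against an uncharacterized match protein returns `"general"`. The result is a prompt for the curator, who still makes the final call after reading the literature.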
family/domain based evidence:
Review the search results (InterPro, HMM) and look at how specific the family in question is. Can a specific function be assigned based on membership in the family, or is the family broad in functional scope? If it is broad, can a general function such as “kinase” or “oxidoreductase” be given? If not, can a name be given based on family membership even if the function is unknown?

motif predictors:
Consider what the presence of membrane spans, signal peptides, etc. tells you about the protein in light of the information coming from the other search results. Is it all consistent? Does it add up to a particular cellular location or function? If all you have is a motif, you may still be able to make some annotations (e.g. “integral membrane protein”, based on, for example, multiple TMHMM regions).
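The family and motif questions above form a fallback chain, which the following sketch makes explicit. It assumes the search results have already been parsed and judged by a curator. The function names, their parameters, and the "<family> family protein" naming pattern are hypothetical illustrations, not a documented TIGR convention. Reading "multiple TMHMM regions" as two or more predicted helices is also an assumption.

```python
from typing import Optional


def name_from_family(family_name: str,
                     specific_function: Optional[str],
                     general_function: Optional[str]) -> str:
    """Pick the most specific name the family evidence supports.

    specific_function: set only if family membership implies one function.
    general_function: a broad class such as "kinase" or "oxidoreductase".
    """
    if specific_function:
        return specific_function
    if general_function:
        return general_function
    # Function unknown: fall back to a name based on family membership.
    # ASSUMPTION: this naming pattern is illustrative only.
    return f"{family_name} family protein"


def name_from_motifs(tm_helix_count: int) -> Optional[str]:
    """Suggest an annotation when motif predictions are the only evidence."""
    # ASSUMPTION: "multiple TMHMM regions" is read as two or more helices.
    if tm_helix_count >= 2:
        return "integral membrane protein"
    return None
```

With a broad family, `name_from_family("ExampleFam", None, "kinase")` returns `"kinase"`. With no function at all it returns `"ExampleFam family protein"`. A protein with six predicted helices and no other evidence gets `"integral membrane protein"` from `name_from_motifs(6)`.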