Animal Science Publications, Animal Science, 2014

Standard Genetic Nomenclature

Zhiliang Hu, Iowa State University, [email protected]
James M. Reecy, Iowa State University, [email protected]
Fiona M. McCarthy, University of Arizona
Carissa A. Park, Iowa State University, [email protected]

Follow this and additional works at: http://lib.dr.iastate.edu/ans_pubs
Part of the Agriculture Commons, Animal Sciences Commons, and the Genetics Commons

The complete bibliographic information for this item can be found at http://lib.dr.iastate.edu/ans_pubs/171. For information on how to cite this item, please visit http://lib.dr.iastate.edu/howtocite.html.

This Book Chapter is brought to you for free and open access by the Animal Science at Iowa State University Digital Repository. It has been accepted for inclusion in Animal Science Publications by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected].

24 Standard Genetic Nomenclature

Zhi-Liang Hu,1 James M. Reecy,1 Fiona McCarthy2 and Carissa A. Park1
1Iowa State University, Ames, Iowa, USA; 2University of Arizona, Tucson, Arizona, USA

Introduction
Locus and Gene Names and Symbols
  Locus name and symbol
  Allele name and symbol
  Genotype terminology
  Gene annotations and the gene ontology (GO)
Trait and Phenotype Terminology
  Traits
  Super-traits
  Trait hierarchy and ontology
  Current status of research
  Trait and phenotype nomenclature
Future Prospects
Acknowledgements
References

Introduction

Genetics includes the study of genotypes and phenotypes, the mechanisms of genetic control between them, and information transfer between generations. Genetic terms describe processes, genes and traits with which genetic phenomena are examined and described. While the genetic terminologies are extensively discussed in this book and elsewhere, the standardization of their names has been an ongoing process.
Therefore, this chapter concentrates only on the issues involved in the standardization of gene and trait terminologies. Readers may wish to refer to online resources (see Table 24.1 for URLs) for lists of the glossaries currently in use.

©CAB International 2015. The Genetics of Cattle, 2nd Edn (eds D.J. Garrick and A. Ruvinsky)

A standardized genetic nomenclature is vital for unambiguous concept description, efficient genetic data management and effective communication, not only among scientists but also among those who are involved in cattle production and genetic improvement. This issue has become even more critical in the post-genomics era because of the rapid accumulation of large quantities of genetic and phenotypic data and the requirement for data management and computational analysis, which increases the need for precise definition and interpretation of gene and trait terms. For example, the Myostatin (MSTN) gene is known as Growth and Differentiation Factor 8 (GDF8 or GDF-8) in some literature and is also referred to as the 'muscle hypertrophy' or 'double-muscling' locus in cattle. While the interchangeable use of all these names in the literature can cause confusion, it gets more complicated when one considers paralogous gene duplications across species, which led Rodgers et al. (2007) to propose MSTN-1 and MSTN-2. Unfortunately, this naming scheme does not follow the Human Genome Organization (HUGO) Gene Nomenclature Committee (HGNC) guidelines, which indicate that these paralogues should be named MSTN1 and MSTN2, respectively. In terms of traits, an example that would benefit from consistent nomenclature is the longissimus dorsi muscle area, which is also referred to as the loin eye area (LEA), loin muscle area (LMA), meat area (MLD), ribeye area (REA), etc. Each of these is known to certain researchers as their default name for the trait.
Complexity is further increased by variation in anatomic locations, physiological stages and methods used to measure a given trait. This may seem manageable at first, but once one starts to compare data across different laboratories, publications or species, it quickly becomes very confusing.

The 'standard genetic nomenclature' recommendations made by the Committee on Genetic Nomenclature of Sheep and Goats (COGNOSAG) in the 1980s and 1990s initially covered sheep and goats and were later extended to cattle (Broad et al., 1999). Dolling (1999) summarized these efforts and abstracted guidelines for practical use. In 2009, an international meeting to discuss coordination of gene names across vertebrate species was held in Cambridge, UK (Bruford, 2010). While we may hesitate to dictate how genetic terms are defined, adopting a standardized genetic nomenclature system enables researchers to more easily manage and compare their data, both within and across species. The emergence of the use of ontologies in biological research has contributed a new way to effectively organize biological data and facilitate analysis of large datasets. Adopting standardized nomenclature will further enable researchers to unambiguously organize and manage their data. When genomic information must be transferred across species to perpetuate genetic discoveries, the role of a standardized genetic nomenclature becomes even more important. The goal of this chapter is to clearly state guidelines for nomenclature, with the hope that they will facilitate comparison of results between experiments and, most importantly, prevent confusion.

Locus and Gene Names and Symbols

Locus name and symbol

The following guidelines for cattle gene nomenclature are adapted and abbreviated from the HUGO Gene Nomenclature Committee (HGNC; see Table 24.1 for URL). A gene is defined as 'a DNA segment that contributes to phenotype/function.
In the absence of demonstrated function a gene may be characterized by sequence, transcription or homology.' A locus is not synonymous with a gene. It is defined as 'a point in the genome, identified by a marker, which can be mapped by some means. A locus could be an anonymous non-coding DNA segment or a cytogenetic feature.' A single gene may have numerous loci within it (each may be defined by different markers).

A gene name should be short and specific, and convey the character or function of the gene. Gene names should be written using American spelling and contain only Latin letters or a combination of Latin letters and Arabic numerals. A gene symbol should start with the same letter as the gene name. The gene symbol should consist of upper-case Latin letters and possibly Arabic numerals. Gene symbols must be unique.

A locus name should be in capitalized Latin letters or a combination of Latin letters and Arabic numerals. A locus symbol should consist of as few Latin letters as possible or a combination of Latin letters and Arabic numerals. The characters of a symbol should always be capital Latin characters and should begin with the initial letter of the name of the locus. If the locus name is two or more words, then the initial letters of each word should be used in the locus symbol. Gene and locus names and symbols should be printed in italics whenever possible; otherwise they should be underlined.

When assigning cattle gene nomenclature, the gene name and symbol should be assigned based on existing HGNC nomenclature when 1:1 human:bovine orthology is well established. Recognized members of gene families should be named following existing naming schemes. Initial efforts to provide information about genes predicted during the cattle genome sequencing project resulted in the assignment of standardized names for 5757 cattle genes based on human gene nomenclature (Bovine Genome Sequencing and Analysis Consortium, 2009).
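The symbol rules above lend themselves to a mechanical check. The following is a minimal sketch, assuming the rules are applied exactly as worded in this section; the function names `check_gene_symbol` and `duplicate_symbols` are hypothetical and are not part of any HGNC tool. It does not test orthology or family naming, which require curation.

```python
import re
from collections import Counter

# Upper-case Latin letters and Arabic numerals only, as stated in the text.
# Requiring a leading letter is an assumption that follows from the rule that
# the symbol starts with the same letter as the gene name.
SYMBOL_PATTERN = re.compile(r"[A-Z][A-Z0-9]*")


def check_gene_symbol(name, symbol):
    """Return a list of guideline problems found for one name/symbol pair."""
    problems = []
    if not SYMBOL_PATTERN.fullmatch(symbol):
        problems.append("symbol must use only upper-case Latin letters and Arabic numerals")
    if name[:1].upper() != symbol[:1].upper():
        problems.append("symbol should start with the same letter as the gene name")
    return problems


def duplicate_symbols(symbols):
    """Return the symbols that occur more than once (symbols must be unique)."""
    counts = Counter(symbols)
    return sorted(s for s, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(check_gene_symbol("myostatin", "MSTN"))    # no problems expected
    print(check_gene_symbol("myostatin", "MSTN-1"))  # hyphen is not allowed
    print(duplicate_symbols(["MSTN", "GDF8", "MSTN"]))
```

A check of this kind only catches formatting problems; whether a well-formed symbol is the approved one still has to be confirmed against HGNC.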
There are two categories of novel cattle genes: (i) novel genes predicted by bioinformatic gene prediction programs; and (ii) novel genes that have been studied prior to the completion of the cattle genome. In addition, it is anticipated that, in the future, additional novel genes will be identified by RNA-sequencing experiments. In cases where no strict 1:1 human orthologue exists that has been assigned nomenclature, the NCBI LOC# or Ensembl ID should be used as a temporary gene symbol for predicted genes with no known function. In order to assign a symbol/name to novel genes, they will need to be manually curated and assigned a unique symbol/name following these guidelines.

Allele name and symbol

These guidelines for allele nomenclature are adapted from Dolling (1999) and mouse genome nomenclature guidelines (see Table 24.1 for URL), consistent with HGNC guidelines. Alleles do not have to be named, but should be assigned symbols. An allele symbol should always be written following the locus symbol. It can consist of Latin letters or a combination of Latin letters and Arabic numerals. An allele name should be as brief as possible, and should convey the variation associated with the allele. If a new allele is similar to one that has already been named, it should be named according to the breed, geographic location or population of origin. If new alleles are to be named for a recognized locus, they should conform to nomenclature established for that locus. The first letter of the allele name should be lower case. The allele name and symbol may be identical for a locus detected by biochemical, serological or nucleotide methods. The HGNC guideline recommends that 'allele designation should be written on the same line as gene symbol separated by an asterisk e.g. PGM1*1, the allele is printed as *1'. The wild-type allele can be denoted with a + (e.g. MSTN+).
Neither + nor - symbols should be used in alleles detected by biochemical, serological or nucleotide methods. Null alleles should be designated by the number zero. A single nucleotide polymorphism (SNP) allele should be designated based on its dbSNP_id, followed by a hyphen and the specific nucleotide (e.g. MSTNrs1234567-T). If the SNP occurs outside of an identified gene, the SNP locus can be designated using the dbSNP_id as the locus symbol, followed by a hyphen and the nucleotide allelic variants as in rs1234567-T. The allele name and symbol should be printed in italics whenever possible; otherwise they should be underlined.

Genotype terminology

The genotype of an individual should be shown by printing the relevant locus and allele symbols for the two homologous chromosomes concerned, separated by a slash, e.g. MSTNrs1234567-T/rs1234567-C. Unlinked loci should be separated by a semicolon, e.g. CD11RsaI-2400/2200; ESRPvuII-5700/4200. Linked loci should be separated by a space or dash and listed in linkage order (e.g. POU1F1A/G-STCHC/G-PRSS7A/T), or in alphabetical order if the linkage order is not known. For X-linked loci, the hemizygous case should have a /Y following the locus and allele symbol, e.g. AR-Eco57I-1094/Y. Likewise, Y-linked loci should be designated by /X following the locus and allele symbol.

Gene annotations and the gene ontology (GO)

Advances in genomic technologies require that researchers be able to functionally analyse large, high-throughput datasets to gain insight into the complex systems they are studying. By using the same nomenclature and procedures to describe gene function, gene components can be consistently linked to function in a way that facilitates effective computational analysis and promotes comparative genomics. In 1998, the GO Consortium was formed to standardize functional annotation in the form of gene ontologies that can be used across all eukaryotes (Gene Ontology Consortium, 2000).
This effort not only provided a standard method for functional annotation but also promoted data sharing and enabled modelling of functional genomics datasets.

The GO consists of three separate ontologies: Biological Process, Cellular Component, and Molecular Function. Genes or gene products are associated with GO terms that represent gene attributes. A GO term is defined with a term name, a unique identifier and a definition (preferably indicating which of the three sub-ontologies it belongs to, information about its relationships to other GO terms and cited sources). GO terms may also have synonyms, database cross-references and comments to provide more detailed information. A unique GO identifier consists of the prefix 'GO' followed by a colon and six to eight numerical digits, e.g. GO:0000016. It serves as a key to reference GO terms in a GO database. An example of a GO term is shown in Fig. 24.1.

Standard GO annotations are maintained by the GO Consortium (see Table 24.1 for URL), which provides updates of quality-checked data for public access. The GO annotations are used by secondary source databases like Entrez Gene (see Table 24.1 for URL; Sayers et al., 2012) and UniProt (UniProt Consortium, 2010), genome browsers like Ensembl (see Table 24.1 for URL; Flicek, 2013), and analysis tools like DAVID (see Table 24.1 for URL; Huang, 2009), among other publicly accessible resources and tools. A growing number of model organism and livestock animal species (including bovine) databases and working groups contribute annotation sets to the GO repository (McCarthy, 2007; Reese, 2010). GO annotations are created by capturing the gene product information (database, database accession, name and symbol, type of gene product and species taxon), its associated GO term, GO sub-ontology and evidence for the assertion with references.
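To make the structure of a GO term concrete, the sketch below reads the term from Fig. 24.1 as tag-value lines and checks its identifier. It is an illustration only: the stanza text is transcribed from the figure, the helper `parse_stanza` is hypothetical, and the identifier pattern follows the wording of this chapter (six to eight digits), although the example identifier itself has seven.

```python
import re

# 'GO', a colon, then six to eight digits, as described in the text above.
GO_ID = re.compile(r"GO:\d{6,8}")

# Term transcribed from Fig. 24.1 as "tag: value" lines.
STANZA = '''\
id: GO:0000016
name: lactase activity
namespace: molecular_function
def: "Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose." [EC:3.2.1.108]
synonym: "lactase-phlorizin hydrolase activity" BROAD [EC:3.2.1.108]
synonym: "lactose galactohydrolase activity" EXACT [EC:3.2.1.108]
xref: EC:3.2.1.108
xref: MetaCyc:LACTASE-RXN
xref: Reactome:20536
is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds
'''


def parse_stanza(text):
    """Collect tag-value lines into a dict of lists (tags may repeat)."""
    term = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first ': ' only, so colons inside values are kept.
        tag, _, value = line.partition(": ")
        term.setdefault(tag, []).append(value)
    return term


term = parse_stanza(STANZA)
go_id = term["id"][0]
# The parent identifier precedes the ' ! ' comment on the is_a line.
parent_id = term["is_a"][0].split(" ! ")[0]

if __name__ == "__main__":
    print(go_id, bool(GO_ID.fullmatch(go_id)))
    print(term["namespace"][0], "parent:", parent_id)
```

Keeping repeated tags such as synonym and xref in lists preserves every entry of the term, which matters when synonyms are later used to match differently named records.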
The current practice for bovine GO annotation is to provide names and symbols based upon a combination of NCBI Entrez Gene and UniProtKB names. In instances where there is no suitable gene symbol, database accessions are used. Continued efforts are made to improve the accuracy of the bovine GO annotations by transferring GO annotations from better annotated proteins in human and mouse based on Ensembl orthology. As of September 2012, GO annotation for bovine (McCarthy, 2007) comprises 306,746 annotation entries for 41,637 gene products; 86.7% of these annotations are computationally derived (AgBase: see Table 24.1 for URL). To contribute annotations to the GO, or for a complete list of bovine GO data, users are encouraged to contact either the GO Consortium or AgBase at their respective websites.

    id: GO:0000016
    name: lactase activity
    namespace: molecular_function
    def: "Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose." [EC:3.2.1.108]
    synonym: "lactase-phlorizin hydrolase activity" BROAD [EC:3.2.1.108]
    synonym: "lactose galactohydrolase activity" EXACT [EC:3.2.1.108]
    xref: EC:3.2.1.108
    xref: MetaCyc:LACTASE-RXN
    xref: Reactome:20536
    is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds

Fig. 24.1. An example of a GO term. (For further information, see Table 24.1 for GO website URL.)

Trait and Phenotype Terminology

Cattle traits are conventionally named based on performance (e.g. body weight), physiological parameters (e.g. blood cholesterol level), anatomic locations/dissections (e.g. loin muscle area), physical-chemical properties (e.g. milk protein content), livelihood soundness (e.g. immune capacity) and exterior appearance (e.g. coat colour), etc. As such, there is a good chance a trait will be named differently by different people, even within a species community. Furthermore, traits have been studied across many species, which adds additional complexity to their naming.
The study of traits may also involve the study of underlying genes and markers, environments and management protocols that contribute to the manifestation of a trait. The factors that contribute to the naming of a trait are therefore multi-dimensional. As the amount of trait information associated with a gene or chromosomal region is growing exponentially, we cannot overemphasize the need for a standard nomenclature that researchers can use to communicate as consistently and unambiguously as possible, with the aid of bioinformatics tools.

Traits

Cattle trait terms can be found ubiquitously throughout journal articles, farm reports and daily communications among scientists and cattle industry personnel. A trait term can be created by anyone, and each person may have a slightly different definition for any given term. As such, hundreds of thousands of terms can be found in the literature with various naming conventions used. Previously, there was no central repository where the uniqueness of a trait term could be maintained and checked, until two relatively recent database development efforts emerged: the Online Mendelian Inheritance in Animals (OMIA) database and the Animal QTL database (QTLdb). OMIA (see Table 24.1 for URL) was initiated in 1978. To date, it contains >400 cattle trait variations and/or abnormalities from cattle genetic research publications (Nicholas, Chapter 5). The Animal QTLdb (see Table 24.1 for URL) has a collection of 470 cattle traits, including measurement method variations (Hu et al., 2013), of which 407 traits have at least one QTL. Curators at both OMIA and Animal QTLdb made efforts to make each database entry unique in terms of the names and their representations. Expanded from the QTLdb development, an Animal Trait Ontology (ATO) project at Iowa State University (see Table 24.1 for URL) has been launched to standardize traits for livestock species including cattle.
Its initial purpose was to help with organization and management of trait information through the use of a controlled vocabulary to facilitate comparison of QTL results and standardize trait data annotation and retrieval (Hu et al., 2005, 2007). It was soon introduced to the community (Hughes et al., 2008).

Super-traits

Compared to standard gene nomenclature, trait name standardization is far more complex, not only because the same trait can be named differently (e.g. 'loin eye area' versus 'ribeye area'), but also because many factors contribute to how a trait is defined under various circumstances. For example, Fig. 24.2 shows a list of 10 'backfat thickness' variations, each of which is defined by its specific measurement method, measuring time and specific anatomic location, which may contribute to trait comparison difficulties and increase the potential for confusion.

    By methods:
      Backfat thickness (average backfat) by ultrasound
      Backfat thickness (average backfat) by ruler
    By locations:
      Backfat thickness at the 7th rib
      Backfat thickness at the 12th rib
      Backfat thickness at the 12th-13th rib
      Backfat thickness at the 13th rib
    By time:
      Backfat thickness measured at 1-3 days postpartum
      Backfat thickness measured at 40-42 days postpartum
      Backfat thickness measured at 90-92 days postpartum
      Backfat thickness measured at 130-150 days postpartum

Fig. 24.2. An example of the trait name variations by different modifiers such as measurement methods, time and sampling locations. This variation can easily add difficulties for accurate and unambiguous trait comparisons.

One attempt to simplify the comparisons was by introduction of the concept of 'trait types' or 'super-traits'. Hu et al. (2005) described trait type as a general physical or chemical property of, or the processes that lead to, or types of measurements that result in, an observation (phenotype). The 'trait types'
or 'super-traits' were initially used to serve as a general concept for a trait, regardless of possible variations in trait names based on measurement times, locations or methods. As the ATO project progressed, the factors in the methods of trait measurements, such as point in time or time span, anatomic locations, instruments, etc., were classified as 'trait modifiers', because they do not constitute a component of a trait, but only affect the way a trait is described. Therefore, the 'super-trait' may only be employed to categorize variations in how a trait is defined or named. For example, 'rib eye area', 'rib-eye area', 'rib muscle area', 'longissimus dorsi muscle area', 'longissimus muscle area', 'loin eye area', 'loin muscle area', etc. can be unified as 'longissimus dorsi muscle area (LMA)'. 'Backfat', 'backfat depth', 'backfat thickness', 'backfat above muscle dorsi', 'backfat intercept', 'backfat linear', etc. may all simply be referred to as 'subcutaneous fat thickness'.

Trait hierarchy and ontology

In order to compare QTL across experiments, the Cattle QTLdb uses a trait hierarchy (Fig. 24.3) to provide a framework for organizing the traits and easily locating them (Hu et al., 2013). This approach simplifies the procedures by which traits are defined, linked and compared. Subsequently, a computer program could be implemented to automatically process the database searches, so that when a user queries for a trait by keywords, the database can gather and retrieve related trait names and their associated QTL, put them together and present them to the user in real time. However, people of different disciplines may see the need for a different trait hierarchy, which may better capture the subtleties required in their field. For example, for body weight gained over a period of time (e.g.
average daily gain, ADG), a farmer considers it a production trait, a nutritionist may see it as an indicator for feed conversion efficiency and a veterinarian may find it a health status parameter. Similarly, blood cholesterol levels may be used to predict meat quality by beef producers, and may also be used as a parameter to predict coronary heart disease by those who use cattle as an animal model for human heart disease research. Therefore, a simple hierarchy may be helpful to reduce the complexity in some cases, although it may not be adequate in all cases. In addition, due to the existence of multiple overlapping hierarchies for cattle traits, the management of such data may introduce one more dimension of complexity to the ontology structure.

Fig. 24.3. A simple cattle trait class hierarchy used in the Animal QTLdb for users to browse for traits of interest. (See Table 24.1 for URL.) Classes shown under 'Cattle traits': Disease susceptibility; General health parameters; Mastitis; Organ disorder; Parasite load; Parasite resistance; Carcass characteristics; Meat quality; Milk composition - fat; Milk composition - other; Milk composition - protein; Milk processing trait; Milk yield; Production traits; Energy efficiency; Feed conversion; Feed intake; Growth; Life history traits; Lifetime production; Reproduction traits; Fertility; General; Semen quality; Behavioural; Conformation; Pigmentation.

Ontologies are controlled vocabularies used to describe objects and relationships between them in a formal manner. In an ontology, the Directed Acyclic Graph (DAG), a mathematical graphic modelling method, is used to solve data management problems with complex hierarchical structures. For example, the trait 'marbling' may belong to the 'meat quality', 'adipose trait' or 'muscular system physiology' hierarchies. Computer tools have been developed and are freely available to manage such ontology data with DAG structures. The two most popular tools that are likely to be useful to the cattle genetics community
are AmiGO and OBO-Edit (Gene Ontology Tools, see Table 24.1 for URL). AmiGO is an ontology browser adapted to the ATO database, which allows users to share and view trait data stored in ATO with any web browser on the internet. OBO-Edit is a Java-based ontology data editor that can be used by anyone to edit ontology term definitions and relationships, and to export data in Open Biological/Biomedical Ontologies (OBO) format to share data.

Current status of research

The ATO has been a successful project since its development from the QTLdb several years ago. Recently, the developers of ATO have begun working with Mouse Genome Informatics, the Rat Genome Database, European Animal Disease Genomics Network of Excellence (EADGENE) and the French National Institute for Agricultural Research (INRA) to incorporate the Mammalian Phenotype Ontology (MPO) and the ATO into a unified Vertebrate Trait (VT) Ontology (Park et al., 2013; see Table 24.1 for URL). To reach a proper granularity level of the trait ontology, Product Trait (PT) Ontology (see Table 24.1 for URL) and Clinical Measurement Ontology (CMO; Shimoyama et al., 2012; see Table 24.1 for URL) were introduced. By reuse of existing ontologies and integration of production-specific livestock traits, researchers at INRA have also launched an Animal Trait Ontology for Livestock (ATOL) site, containing over 1000 traits including those of cattle (Golik et al., 2012).

Current efforts have been aimed at enhancing the ability to standardize trait nomenclature within and across species. For example, a disease such as mastitis in dairy cattle may have been considered a 'trait' in classical animal genetic studies. In fact, in terms of concept specifications, it is not a characteristic cattle trait observable in the general population, but rather an abnormal manifestation in some cattle (in fact, resistance to mastitis is a trait).
In addition, a trait name may have variations because it is 'modified' by measurement time or method (Fig. 24.2), but the names actually represent the same trait. The separation of diseases from traits reflects the efforts toward a well-defined and standardized trait nomenclature. Standardization of the trait nomenclature will undoubtedly help the cattle genomics community make meaningful trait comparisons, as well as facilitate the transfer of genomics information from some well-studied species. The challenge of using ontologies to standardize and manage trait nomenclature is not only a technical issue, but a community issue, in the sense that it has to be commonly recognized, mutually agreed upon, and widely shared.

Trait and phenotype nomenclature

Until an international committee issues rules for trait and phenotype nomenclature, a good practice with wide acceptability is to follow the 'norm' in published materials. Listed in Table 24.1 are some of the best trait reference resources available to date (see table footnote for details). Since this has been an active research area in recent years, it is highly recommended that users check multiple databases for the best and most up-to-date information.

Phenotype is the actual manifestation of observable traits. A phenotype is a trait observed in an individual. It usually consists of a trait with characteristic features (e.g. twinning), variations that can be described (e.g. black spots on the body) or qualities that can be measured (e.g. birth weight of 30 kg). Since there are so many variations as to how a phenotype can be 'observed' (often such observation is made indirectly with instruments or through tests) and obtained, a technical guide for recording each trait might be ideal. Often a description of comments for a phenotype record may be necessary to correctly understand and use the data.
For example, when blood samples are taken, the number of hours the animal is fasted might be an important co-factor for the measurement of blood cholesterol concentration.

When a phenotype is a reflection of a certain genotype, the phenotype symbol should be the same as the genotype symbol. The difference is that the characters should not be underlined or in italics, and they should be written with a space between locus characters and allele characters, instead of an asterisk. Square brackets [ ] may also be used.

In classical genetics, phenotypes were sometimes used to denote Mendelian genotypes. This was done using an abbreviation of the trait, post-fixed with a plus (+) or minus (-) sign to represent 'presence' or 'absence' of certain trait features. For example, halothane-negative was denoted as 'Hal-', and halothane-positive as 'Hal+'. A phenotype denotation can also be used to represent genetic haplotypes, such that 'K88ab+, ac+, ad-' are written together as an entire denotation. Likewise, numbers or letters may be used to denote alleles when polymorphisms are observed, for example, ApoB1/2, ApoB2/3, etc. (Note the difference from recording genotypes, where italics or asterisks are required.)

Future Prospects

The Gene Ontology and Mammalian Phenotype Ontology are already playing a role in robust annotation of mammalian genes and phenotypes in the context of mutations, quantitative trait loci, etc. (Smith et al., 2005). Undoubtedly, a standardized cattle genetic nomenclature will more effectively facilitate efficient cattle genome annotation and transfer of knowledge from information-rich species such as human and mouse, and make it possible for new bioinformatics tools to easily streamline data management and genetic analysis. Meanwhile, it is worth noting that the term 'phene' for 'trait' has been used more frequently in the scientific literature in recent years.
It is interesting that, in terms of etymological lineage, 'phene' is to 'phenotype' and 'phenome' as 'gene' is to 'genotype' and 'genome' (Wikipedia, 2012), where 'phene' is an equivalent term for 'trait'. However, Dr Frank Nicholas from the University of Sydney has used the term 'phene' in OMIA in a slightly different but more concise context, namely 'phene is to gene as phenotype is to genotype', where 'phene' refers to a set of phenotypes that correspond to a set of genotypes determined by a gene. This is practically very useful in light of the future structured genetic terminology standardization in the genomics era.

Several genome databases, such as ArkDB, Animal QTLdb, Bovine Genome Database, Ensembl and NCBI GeneDB, have played a role in the usage of commonly accepted gene/trait notations. Undoubtedly, existing and new genome databases and tools will further develop and evolve. As such, a standardized genetic nomenclature in cattle will become crucial for information sharing and comparisons between different research groups, across experiments and even across species.
Recently the Animal Genetics journal has updated its Author Guidelines, insisting that proper gene nomenclature be followed: 'All gene names and symbols should be italicized throughout the text, table and figures'; 'Locus symbols used in Animal Genetics publications must be confirmed with HGNC' and 'nonhuman gene names should be checked against NCBI's Entrez Gene database'. This is a good move towards educating the community on the proper use of standardized genetic nomenclatures. Active development and use of a standardized genetic nomenclature will help to improve data quality and reusability, and facilitate data comparisons between experiments, laboratories and even species.

Table 24.1. Internet URL addresses for the web resources used in this chapter and cattle trait glossary information.

Data source                              URL
AgBASE                                   http://www.agbase.msstate.edu/cgi-bin/information/Cow.pl
Animal QTLdb                             http://www.animalgenome.org/QTLdb
Animal Trait Ontology project            http://www.animalgenome.org/bioinfo/projects/ATO/
ATOL                                     http://www.atol-ontology.com
Cattle trait hierarchy                   http://www.animalgenome.org/QTLdb/export/cattle_traits
CMO project                              http://bioportal.bioontology.org/ontologies/CMO (BioPortal)
                                         http://www.animalgenome.org/bioinfo/projects/cmo
DAVID                                    http://david.abcc.ncifcrf.gov
Ensembl                                  http://www.ensembl.org
Entrez Gene                              http://www.ncbi.nlm.nih.gov/gene
Gene Ontology Tools                      http://neurolex.org/wiki/Category:Resource:Gene_Ontology_Tools
Genetic glossaries                       http://www.animalgenome.org/genetics_glossaries
GO Consortium                            http://www.geneontology.org
GO structure                             http://www.geneontology.org/GO.ontology.structure.shtml
HGNC guidelines                          http://www.genenames.org/guidelines.html
Mouse genome nomenclature guidelines     http://www.informatics.jax.org/mgihome/nomen/gene.shtml
OMIA                                     http://omia.angis.org.au/
PT project                               http://www.animalgenome.org/bioinfo/projects/pt
UniProt                                  http://www.uniprot.org
VT project                               http://bioportal.bioontology.org/ontologies/VT (BioPortal)
                                         http://www.animalgenome.org/bioinfo/projects/vt

ATOL, Animal Trait Ontology for Livestock is aimed at defining livestock traits, with a focus on the main types of animal production in line with societal priorities.
CMO, Clinical Measurement Ontology is designed to be used to standardize morphological and physiological measurement records generated from clinical and model organism research and health programmes.
OMIA, Online Mendelian Inheritance in Animals is a comprehensive collection of phenotypic information on heritable animal traits and genes in a comparative context, relating traits to genes where possible.
PT, Product Trait Ontology is a controlled vocabulary for the description of traits (measurable or observable characteristics) pertaining to products produced by or obtained from the body of an agricultural animal or bird maintained for use and profit.
QTLdb, Animal QTLdb is a database to house all QTL data for all livestock species.
VT, Vertebrate Trait Ontology is a controlled vocabulary for the description of traits (measurable or observable characteristics) pertaining to the morphology, physiology or development of vertebrate organisms.

Acknowledgements

The authors wish to thank Dr Frank Nicholas from the University of Sydney for useful discussions, inputs and kind review of the draft.

References

Bovine Genome Sequencing and Analysis Consortium et al. (2009) The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science 324, 522-528.
Broad, T.E., Dolling, C.H.S., Lauvergne, J.J. and Millar, P. (1999) Revised COGNOSAG guidelines for gene nomenclature in ruminants 1998. Genetics, Selection, Evolution 31, 263-268.
Bruford, E.A. (2010) Highlights of the 'Gene Nomenclature Across Species' meeting. Human Genomics 4, 213-217.
Dolling, C.H.S. (1999) Standardized genetic nomenclature for cattle. In: Fries, R. and Ruvinsky, A.
(eds) The Genetics of Cattle. CAB International, Wallingford, UK, pp. 657-666.
Flicek, P., Ahmed, I., Amode, M.R., Barrell, D., Beal, K., Brent, S., Carvalho-Silva, D., Clapham, P., Coates, G., Fairley, S. et al. (2013) Ensembl 2013. Nucleic Acids Research 41 (Database issue), D48-D55.
Gene Ontology Consortium (2000) Gene Ontology: tool for the unification of biology. Nature Genetics 25, 25-29.
Golik, W., Dameron, O., Bugeon, J., Fatet, A., Hue, I., Hurtaud, C., Reichstadt, M., Meunier-Salaun, M.C., Vernet, J., Joret, L. et al. (2012) ATOL: the multi-species livestock trait ontology. 6th International Conference on Metadata and Semantic Research (MTSR'12), Cadiz, Spain, 28-30 November.
Hu, Z.-L., Dracheva, S., Jang, W.-H., Maglott, D., Bastiaansen, J., Rothschild, M.F. and Reecy, J.M. (2005) A QTL resource and comparison tool for pigs: PigQTLDB. Mammalian Genome 16, 792-800.
Hu, Z.-L., Fritz, E.R. and Reecy, J.M. (2007) AnimalQTLdb: a livestock QTL database tool set for positional QTL information mining and beyond. Nucleic Acids Research 35 (Database issue), D604-D609.
Hu, Z.-L., Park, C.A., Wu, X.-L. and Reecy, J.M. (2013) Animal QTLdb: an improved database tool for livestock animal QTL/association data dissemination in the post-genome era. Nucleic Acids Research 41 (Database issue), D871-D879.
Huang, D.W., Sherman, B.T. and Lempicki, R.A. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 4, 44-57.
Hughes, L.M., Bao, J., Hu, Z.-L., Honavar, V.G. and Reecy, J.M. (2008) Animal Trait Ontology (ATO): the importance and usefulness of a unified trait vocabulary for animal species. Journal of Animal Science 86, 1485-1491.
McCarthy, F.M., Bridges, S.M., Wang, N., Magee, G.B., Williams, W.P., Luthe, D.S. and Burgess, S.C. (2007) AgBase: a unified resource for functional analysis in agriculture. Nucleic Acids Research 35 (Database issue), D599-D603.
Park, C.A., Bello, S.M., Smith, C.L., Hu, Z.-L., Munzenmaier, D.H., Nigam, R., Smith, J.R., Shimoyama, M., Eppig, J.T. and Reecy, J.M. (2013) The Vertebrate Trait Ontology: a controlled vocabulary for the annotation of trait data across species. Journal of Biomedical Semantics 4, 13.
Reese, J.T., Childers, C.P., Sundaram, J.P., Dickens, C.M., Childs, K.L., Vile, D.C. and Elsik, C.G. (2010) Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome. BMC Genomics 11, 645.
Rodgers, B.D., Roalson, E.H., Weber, G.M., Roberts, S.B. and Goetz, F.W. (2007) A proposed nomenclature consensus for the myostatin gene family. American Journal of Physiology - Endocrinology and Metabolism 292, E371-E372.
Sayers, E.W., Barrett, T., Benson, D.A., Bolton, E., Bryant, S.H., Canese, K., Chetvernin, V., Church, D.M., Dicuccio, M., Federhen, S. et al. (2012) Database resources of the National Center for Biotechnology Information. Nucleic Acids Research 40 (Database issue), D13-D25.
Shimoyama, M., Nigam, R., McIntosh, L.S., Nagarajan, R., Rice, T., Rao, D.C. and Dwinell, M.R. (2012) Three ontologies to define phenotype measurement data. Frontiers in Genetics 3, 87.
Smith, C.L., Goldsmith, C.A. and Eppig, J.T. (2005) The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information. Genome Biology 6, R7.
UniProt Consortium (2010) The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Research 38 (Database issue), D142-D148.
Wikipedia (2012) Phene. Available at: http://en.wikipedia.org/wiki/Phene (accessed 30 March 2013).