Future Plans for the HGNC
Elspeth Bruford

Funding Sources
• Applied to NHGRI for renewal of current U41 funding
  Submitted in cycle III (25.09.16); expect score Feb/March, advisory council May
  Current end 30.06.17; will apply for no-cost extension
• Will be applying to Wellcome Trust Biomedical Resources fund (current end 31.08.17)
  Preliminary application due 13.01.17
  Full application due 03.04.17
• Should we consider applying to anywhere else?

Future Funded Aims (2017-2022)
1. Continue naming of human protein-coding genes, pseudogenes & RNA genes: largely maintenance for protein coding genes, more focus on RNAs
2. Continue reassignment of uninformative symbols based on functional data, bearing in mind clinical aspect
3. Coordinate gene naming across vertebrates: increase in automation and species
4. Assign gene names within complex families across vertebrate species (olfactory receptors, cytochrome P450s), including new families: GSTs, UGTs, and ?? zinc fingers, histones, immunoglobulins… ?

Resource Project
• Aim 1: Naming novel protein coding loci
  Focus on novel protein coding genes reported in the literature, annotated by GENCODE, and novel genes annotated on new alternative haplotypes.
• Aim 2: Naming pseudogenes
  Focus on transcribed and unprocessed pseudogenes, as well as segregating/polymorphic pseudogenes and unitary pseudogenes.
• Aim 3: Naming long non-coding RNA genes
  Name long non-coding RNA genes based on genomic location, or published (or prepublication) functional data. Prioritize published loci, and those annotated by GENCODE and RefSeq.
• Aim 4: Naming small non-coding RNA genes
  Name microRNAs, transfer RNAs, small nucleolar RNAs and ribosomal RNAs, and investigate naming piRNA genes; create a "miscellaneous non-coding RNA" category for non-specific bioinformatically predicted genomic loci.
• Aim 5: Reassigning placeholder symbols based on novel data
  Seek new functional data to enable updates for placeholder symbols, collaborating with EuropePMC, using bioinformatics tools and identifying new GO annotations.
• Aim 6: Improving human gene names for transferral to other species
  Update human gene names to remove superfluous information and punctuation, aim to unify gene and protein names, and avoid using human phenotypes if possible, following community consultation.
• Aim 7: Naming genes in other vertebrate species
  Further automate naming of orthologs utilising a subset of HCOP data and the conversion rules formulated for chimp, initially using dog, cow and rhesus macaque, and improve tools for manual curation.
• Aim 8: Examining complex homology in chimp
  Manually curate chimp gene naming for cases where 2 or fewer of the orthology resources agree.
• Aim 9: Naming CYP genes across vertebrates
  Continue to name CYP genes in multiple vertebrate species and investigate novel CYP mammalian subfamilies.
• Aim 10: Naming OR genes across vertebrates
  Expand naming to non-mammalian vertebrate OR repertoires, initially looking at Xenopus, Anolis, zebrafish, chicken and zebra finch.
• Aim 11: Increasing gene family resources
  Curate more human genes into family sets based on shared characteristics, in consultation with specialist advisors when appropriate; continue to collaborate with FlyBase about their 'Gene Groups'.
• Aim 12: Naming in other complex gene families
  Manually curate gene families with complicated orthology relationships across vertebrate species, develop new synteny and BLAST filtering tools, begin with UGT and GST families.
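
The triage rule in Aim 8 (manual curation when 2 or fewer of the orthology resources agree) could be sketched roughly as below. This is only an illustration under stated assumptions: the function name, the resource labels and the input shape are hypothetical, and the real HCOP-based pipeline and chimp conversion rules are not described in these slides.

```python
from collections import Counter

def triage_ortholog(predictions, min_agree=3):
    """Decide whether an ortholog assignment can be automated.

    predictions: dict mapping an orthology resource label (hypothetical
    labels, not the actual HCOP sources) to the human gene symbol it
    predicts, or None when the resource makes no call.
    Returns ("auto", symbol) when at least min_agree resources agree,
    otherwise ("manual", best_candidate_or_None).
    """
    votes = Counter(sym for sym in predictions.values() if sym)
    if not votes:
        return ("manual", None)
    symbol, count = votes.most_common(1)[0]
    if count >= min_agree:
        return ("auto", symbol)
    # 2 or fewer resources agree: leave the case for a curator
    return ("manual", symbol)


if __name__ == "__main__":
    example = {"resource_a": "BRCA2", "resource_b": "BRCA2",
               "resource_c": "BRCA2", "resource_d": None}
    print(triage_ortholog(example))
```

The threshold is kept as a parameter because the slides only state the cut-off used for chimp; whether the same value suits dog, cow or macaque is an open question here.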

Resource Informatics
• Aim 1: Updating internal HGNC curation tools
  Reimplement internal tools as AngularJS web applications and migrate to a virtual machine.
• Aim 2: Updating internal HGNC QC tools
  Expand tools, including "end of day" sanity check; rewrite internal sequence search and alignment tool using EMBL-EBI RESTful web services.
• Aim 3: Collaborating with EuropePMC
  To notify us of publications relating to placeholder symbols, and journals to target.
• Aim 4: Maintaining and updating HCOP
  Expand with addition of new species, initially sheep, gorilla and S. pombe; investigate further orthology sources.
• Aim 5: Maintaining and updating the VGNC database and pipeline
  Expand to include data from other species, beginning with cow, dog & macaque; increase utility by incorporating more external cross references; expand the set of tools and views available on the website.
• Aim 6: Updating internal VGNC curation and QC tools
  Create AngularJS web applications for curating individual gene symbols & gene families; synteny tool for curating orthologs in multiple vertebrate species in a single process.
• Aim 7: Updating the HGNC database and release pipeline
  Move from PostgreSQL schema to fully normalised MySQL schema; reimplement update pipeline to streamline the processes and utilise extensive compute farm at EMBL-EBI.
• Aim 8: Soliciting user input
  Encourage feedback via our websites; utilise data from annual survey, "contact us" form, web statistics and from panel of users; continue to attend and participate in a range of conferences and workshops.

Management, Dissemination & Training
• Aim 1: Organizational structure and staff responsibilities
  Elspeth will continue managing 4 FTE curators at EMBL-EBI and University of Cambridge, supported by 2 informatics staff at EMBL-EBI, augmented remotely by 4 complex gene family experts and a programmer.
• Aim 2: Scientific Advisory Board
  Continue to receive key advice from the SAB, with yearly face-to-face meeting.
• Aim 3: HGNC website backend and frontend redesign
  HGNC website backend replaced with a single server; frontend rewritten using AngularJS, Jekyll and HTML5.
• Aim 4: Maintaining and updating searches & download facilities
  Continue to support existing facilities; expand BioMart to include gene family data, both BioMart & REST to include VGNC data.
• Aim 5: Maintaining and updating the VGNC website
  Initial efforts will focus on methods and tools for downloading VGNC data, along with gene family data displays.
• Aim 6: Training
  Continue attending major genomics conferences; plan to produce more online tutorials and start an HGNC blog.

1. Transposable elements
2. Pseudogenes
   • What classes: transcribed? unprocessed? unitary? published? by parent gene?
3. Symbols converted to dates
   DEC1, deleted in esophageal cancer 1 > ??
   MARC1-2, mitochondrial amidoxime reducing component 1-2 > MTARC1-2?
   MARCH1-10, membrane associated ring-CH-type finger 1-10 > MARF1-10?
   SEPT1-14, septin 1-14 > SEPTIN1-14?
4. Create tool for simplified queries on recent updates to dataset
5. Alliance of Genomic Resources
6. HGNCmine
7. HCOP - other species
8. HCOP - more export options
9. Classifying Gene Families
   Different ways to classify:
   Homology
   Domain/motif
   Complex
   Shared function/pathway/phenotype
   Combinations of these…
10. Blogging and other social media

Computing Complex Gene Families
• Olfactory receptors: Doron
• Cytochrome P450s: Jed & David
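
For the "Symbols converted to dates" item above, a rough check for date-like symbols might look like the following sketch. It rests on an assumption: that spreadsheet software treats a month name, or a prefix of one with at least three letters, followed by a day number as a date. Actual behaviour varies by application and locale, and the function name is hypothetical rather than part of any HGNC tool.

```python
import re

# Full English month names; "sept" is also a prefix of "september",
# so SEPT-style symbols are covered by the prefix test below.
_MONTH_NAMES = (
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
)

def looks_like_date(symbol):
    """Heuristic: True if a spreadsheet might read the symbol as a date.

    Assumes a symbol is at risk when it is 3+ letters that begin a month
    name, an optional hyphen, then a day number from 1 to 31.
    """
    m = re.fullmatch(r"([A-Za-z]{3,9})-?(\d{1,2})", symbol)
    if m is None:
        return False
    word, day = m.group(1).lower(), int(m.group(2))
    is_month = any(name.startswith(word) for name in _MONTH_NAMES)
    return is_month and 1 <= day <= 31


if __name__ == "__main__":
    # Symbols taken from the list above, plus the proposed replacements
    for sym in ["DEC1", "MARC1", "MARCH10", "SEPT14",
                "MTARC1", "MARF1", "SEPTIN1"]:
        print(sym, looks_like_date(sym))
```

Under this heuristic the current symbols listed in item 3 are flagged, while the candidate replacements (MTARC, MARF, SEPTIN) are not, which is the property a renaming would need.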