Annotation
EPP 245/298 Statistical Analysis of Laboratory Data
November 10, 2004

Annotation

• Given that one has found one or more genes that are differentially expressed, there are a number of useful things to know:
– What is the putative function?
– What pathways are known to contain this gene?
– What other proteins interact with the given protein?
– etc.

Two-color array example

> alldata[1,]
 [1]  473  888  170 1137   86  290  109  226  370  659  359  484  102  293  174
[16]  324  196  638  102  293
> geneID[1,]
       Name ID
1 NM_006182 discoidin domain receptor family, member

http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml

Affy Example

> library(annaffy)
Loading required package: GO
Loading required package: KEGG
Loading required package: annotate
> probeids <- geneNames(eset.rma)[allp1adj < .05]
> symbols <- aafSymbol(probeids,"hgu95av2")
Loading required package: hgu95av2
> symbols[[1]]
An object of class "aafSymbol"
[1] "DDR1"
> getText(symbols[[1]])
[1] "DDR1"
> gos <- aafGO(probeids,"hgu95av2")
> gos[[1]]
An object of class "aafGO"
[[1]]
An object of class "aafGOItem"
@id   "GO:0005524"
@name "ATP binding"
@type "Molecular Function"
@evid "IEA"
[[2]]
An object of class "aafGOItem"
@id   "GO:0007155"
@name "cell adhesion"
@type "Biological Process"
@evid "IEA"
[[3]]
An object of class "aafGOItem"
@id   "GO:0007155"
@name "cell adhesion"
@type "Biological Process"
@evid "TAS"
[[4]]
An object of class "aafGOItem"
@id   "GO:0005887"
@name "integral to plasma membrane"
@type "Cellular Component"
@evid "TAS"
[[5]]
An object of class "aafGOItem"
@id   "GO:0016020"
@name "membrane"
@type "Cellular Component"
@evid "IEA"
[[6]]
An object of class "aafGOItem"
@id   "GO:0006468"
@name "protein amino acid phosphorylation"
@type "Biological Process"
@evid "IEA"
[[7]]
An object of class "aafGOItem"
@id   "GO:0004674"
@name "protein serine/threonine kinase activity"
@type "Molecular Function"
@evid "IEA"
[[8]]
An object of class "aafGOItem"
@id   "GO:0004872"
@name "receptor activity"
@type "Molecular Function"
@evid "IEA"
[[9]]
An object of class "aafGOItem"
@id   "GO:0016740"
@name "transferase activity"
@type "Molecular Function"
@evid "IEA"
[[10]]
An object of class "aafGOItem"
@id   "GO:0004714"
@name "transmembrane receptor protein tyrosine kinase activity"
@type "Molecular Function"
@evid "IEA"
[[11]]
An object of class "aafGOItem"
@id   "GO:0004714"
@name "transmembrane receptor protein tyrosine kinase activity"
@type "Molecular Function"
@evid "TAS"
[[12]]
An object of class "aafGOItem"
@id   "GO:0007169"
@name "transmembrane receptor protein tyrosine kinase signaling pathway"
@type "Biological Process"
@evid "IEA"

GO Evidence Codes

• IEA = inferred from electronic annotation (e.g., BLAST). Uncurated.
• TAS = traceable author statement (i.e., someone said so).
• IDA = inferred from direct assay
• IEP = inferred from expression pattern
• IGI = inferred from genetic interaction
• IMP = inferred from mutant phenotype
• IPI = inferred from physical interaction
• ISS = inferred from sequence similarity
• NAS = non-traceable author statement
• ND = no biological data available
• NR = not recorded

Online Access

> gbs <- aafGenBank(probeids,"hgu95av2")
> getURL(gbs[[1]])
[1] "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=nucleotide&term=U48705%5BACCN%5D&doptcmdl=GenBank"
> lls <- aafLocusLink(probeids,"hgu95av2")
> getURL(lls[[1]])
[1] "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=780"

Abstracts

> pmids <- aafPubMed(probeids,"hgu95av2")
> pmids[[1]]
An object of class "aafPubMed"
 [1] 15111304 14764702 14500648 12935821 12477932  9659899  8977099  8796349  8682498  8622863  8390675  8302582
[13]  8226977  7848919  7789998  7774938
> getURL(pmids[[1]])
[1] "http://www.ncbi.nih.gov/entrez/query.fcgi?tool=bioconductor&cmd=Retrieve&db=PubMed&list_uids=15111304%2c14764702%2c14500648%2c12935821%2c12477932%2c9659899%2c8977099%2c8796349%2c8682498%2c8622863%2c8390675%2c8302582%2c8226977%2c7848919%2c7789998%2c7774938"
> browseURL(getURL(lls[[1]]))
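
Filtering by evidence code

The GO evidence codes listed earlier can be used to separate curated from uncurated annotations. The following is a minimal sketch in base R, not part of the annaffy session above: the data frame `go` is a hand-transcribed copy of the `gos[[1]]` output for the first probe, and treating only IEA as uncurated is an assumption made for illustration.

```r
# GO annotations for the first probe, transcribed by hand from the
# gos[[1]] output shown earlier (not extracted from the annaffy objects).
go <- data.frame(
  id   = c("GO:0005524", "GO:0007155", "GO:0007155", "GO:0005887",
           "GO:0016020", "GO:0006468", "GO:0004674", "GO:0004872",
           "GO:0016740", "GO:0004714", "GO:0004714", "GO:0007169"),
  type = c("Molecular Function", "Biological Process", "Biological Process",
           "Cellular Component", "Cellular Component", "Biological Process",
           "Molecular Function", "Molecular Function", "Molecular Function",
           "Molecular Function", "Molecular Function", "Biological Process"),
  evid = c("IEA", "IEA", "TAS", "TAS", "IEA", "IEA",
           "IEA", "IEA", "IEA", "IEA", "TAS", "IEA"),
  stringsAsFactors = FALSE
)

# Evidence codes treated as uncurated (an assumption for this sketch).
uncurated <- c("IEA")

# Keep only annotations whose evidence code is not in the uncurated set.
curated <- go[!(go$evid %in% uncurated), ]

# Tally annotations per evidence code.
evid_counts <- table(go$evid)

print(curated)
print(evid_counts)
```

For this probe, three of the twelve annotations survive the filter, all of them TAS; the other nine are IEA.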