1 Introduction

The same analysis as in the paper presented at BIOCOMP2011 has been performed. For this analysis the RefSeq database has been updated, and the RefSeq identifiers with the prefixes XM_, XP_ and NP_ are ignored in the analysis of cross-hybridization. This study therefore updates the data and shows the impact of the XM_ RefSeq accessions in comparison with the publication. Additionally, the new RefSeq accessions lead to updated genes, which are represented by new gi accession numbers. For the CROP gene, the official gene name was also changed to LUC7L3. For the DUSP2 gene no matching Affymetrix probes were found; it is therefore omitted from the results.

2 Datasets

For the detection of putative cross-hybridizations by sequence alignment, the sequences of all Affymetrix probes (only the PM probes; the MM probes are discarded) are aligned against the RefSeq database using blastn. The gene annotations from RefSeq were downloaded on October 4th, 2010 from ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot/human.rna.fna.gz and ftp://ftp.ncbi.nih.gov/refseq/M_musculus/mRNA_Prot/mouse.rna.fna.gz.

3 Classification

                              Affymetrix Microarrays
                              HGUPlus   HGUA     HGUB     MG430
total number of probes        604258    247954   249502   496468
probes with a BLAST hit       350537    210587   125766   316551
unambiguous probes            321272    185381   102726   269228
ambiguous probes              53511     25206    23040    47323
cross-hybridizing fraction    15340     9790     3849     38880

The probes were aligned against the RefSeq database. A large fraction did not have an identical 25-mer hit. The remaining probes are classified as either unambiguous or ambiguous. A probe is ambiguous if it hits the wrong strand or if it cross-hybridizes. The number of cross-hybridizing probes is shown additionally.
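The classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the table: the tuple layout of the hits, the function name classify_probes, the expected_strand parameter and the rule "perfect hits on more than one gene means cross-hybridizing" are all hypothetical, as are the accession and gene names in the example.

```python
from collections import defaultdict

PROBE_LENGTH = 25  # length of an Affymetrix PM probe in nucleotides


def classify_probes(hits, transcript_to_gene, expected_strand="plus"):
    """Label every probe that has at least one identical 25-mer hit.

    hits: iterable of (probe_id, transcript_id, percent_identity,
          alignment_length, strand) tuples, e.g. parsed from tabular
          blastn output (this layout is an assumption).
    transcript_to_gene: dict mapping a RefSeq accession to a gene symbol.
    Probes without a perfect hit do not appear in the result.
    """
    genes_hit = defaultdict(set)
    wrong_strand = set()
    for probe_id, transcript_id, identity, length, strand in hits:
        # Keep only identical full-length hits.
        if identity < 100.0 or length != PROBE_LENGTH:
            continue
        genes_hit[probe_id].add(transcript_to_gene[transcript_id])
        if strand != expected_strand:
            wrong_strand.add(probe_id)

    labels = {}
    for probe_id, genes in genes_hit.items():
        if len(genes) > 1:
            labels[probe_id] = "cross_hybridizing"  # counted as ambiguous
        elif probe_id in wrong_strand:
            labels[probe_id] = "wrong_strand"       # counted as ambiguous
        else:
            labels[probe_id] = "unambiguous"
    return labels


# Hypothetical example data.
example_hits = [
    ("p1", "NM_A", 100.0, 25, "plus"),
    ("p2", "NM_A", 100.0, 25, "plus"),
    ("p2", "NM_B", 100.0, 25, "plus"),
    ("p3", "NM_C", 100.0, 25, "minus"),
    ("p4", "NM_A", 96.0, 25, "plus"),
]
example_genes = {"NM_A": "GENE1", "NM_B": "GENE2", "NM_C": "GENE3"}
print(classify_probes(example_hits, example_genes))
```

Treating several transcripts of one gene as a single target, as done here through transcript_to_gene, is one possible design choice; a stricter variant could count every additional transcript as a separate hit.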
4 Size of probesets

                              Affymetrix Microarrays
                              HGUPlus   HGUA     HGUB     MG430
former number of probesets    18800     12400    6500     16400
number of probesets           19381     12553    7484     16769
number of covered genes       19381     12553    7484     16769

5 Correlation of Probesets with Etanercept and MAQC dataset

Gene     Probeset              PCC ETC   PCC ETC   PCC MAQC   Number of   Probeset
                               (RMA)     (MAS5)    (RMA)      ambiguous   size
                                                              probes
TNF      A: 207113_s_at        0.88      0.89      N/A        0           11
         D: NM_000594_at       0.88      0.89      N/A        0           11
         F: GC06P031652        0.88      0.89      N/A        0           11
         H: gi 25952110        0.88      0.89      N/A        0           11
IL1B     A: 205067_at          0.95      0.95      0.37       0           11
         A: 39402_at           0.95      0.96      0.82       0           16
         D: NM_000576_at       0.96      0.96      0.70       0           27
         F: GC02M113303        0.96      0.96      0.75       0           27
         H: gi 27894305        0.96      0.96      0.76       0           27
IL6      A: 205207_at          0.70      0.72      0.85       0           11
         D: NM_000600_at       0.68      0.73      0.80       0           11
         F: GC07P022732        0.69      0.72      0.81       0           11
         H: gi 224831235       0.68      0.73      0.80       0           11
IL8      A: 202859_x_at        0.88      0.89      0.90       2           11
         A: 211506_s_at        0.86      0.87      0.98       0           11
         D: NM_000584_at       0.88      0.92      0.96       2           22
         F: GC04P074845        0.88      0.92      0.96       2           22
         H: gi 28610153        0.89      0.92      0.96       0           20
IL1RN    A: 212657_s_at        0.75      0.77      N/A        0           11
         A: 212659_s_at        0.77      0.71      N/A        0           11
         A: 216243_s_at        0.75      0.69      N/A        0           11
         D: NM_173841_at       0.80      0.80      N/A        0           33
         D: NM_000577_at       0.80      0.80      N/A        0           33
         D: NM_173842_at       0.80      0.80      N/A        0           33
         D: NM_173843_at       0.84      0.76      N/A        9           42
         F: GC02P113591        0.83      0.76      N/A        11          44
         H: gi 296010861       0.80      0.80      N/A        0           33
ICAM1    A: 202637_s_at        0.63      0.64      0.97       0           11
         A: 202638_s_at        0.62      0.68      0.98       2           11
         A: 215485_s_at        0.71      0.62      0.94       0           11
         D: NM_000201_at       0.71      0.71      0.99       2           33
         F: GC19P010247        0.70      0.71      0.99       2           33
         H: gi 16746619        0.71      0.71      0.99       0           31
SOD2     A: 215223_s_at        0.15      0.13      N/A        1           11
         A: 216841_s_at        0.18      0.24      N/A        9           11
         D: NM_001024466_at    0.16      0.21      N/A        0           12
         D: NM_000636_at       0.19      0.23      N/A        10          10
         D: NM_001024465_at    0.16      0.20      N/A        1           13
         F: GC06M160020        0.24      0.27      N/A        11          33
         H: gi 67782308        0.16      0.26      N/A        0           12
TRAF1    A: 205599_at          0.60      0.55      0.91       0           11
         D: NM_005658_at       0.62      0.56      0.85       0           11
         F: GC09M122704        0.61      0.54      0.82       0           11
         H: gi 300193046       0.61      0.55      0.87       0           11
ZFP36    A: 201531_at          0.84      0.79      N/A        0           11
         D: NM_003407_at       0.84      0.80      N/A        0           11
         F: GC19P044589        0.83      0.79      N/A        0           11
         H: gi 141802261       0.84      0.79      N/A        0           11
PTGS2    A: 204748_at          0.91      0.94      0.97       0           11
         D: NM_000963_at       0.91      0.94      0.98       0           11
         F: GC01M184907        0.91      0.94      0.98       0           11
         H: gi 223941909       0.91      0.94      0.98       0           11
TNFAIP3  A: 202643_s_at        0.78      0.80      0.98       0           11
         A: 202644_s_at        0.87      0.85      0.99       0           11
         D: NM_006290_at       0.83      0.84      0.99       0           22
         F: GC06P138230        0.82      0.83      0.99       0           22
         H: gi 26051241        0.82      0.84      0.99       0           22
ADM      A: 202912_at          0.80      0.84      0.92       0           11
         D: NM_001124_at       0.81      0.84      0.92       0           11
         F: GC11P010283        0.81      0.84      0.92       0           11
         H: gi 4501944         0.81      0.84      0.92       0           11
LUC7L3   A: 203804_s_at        0.44      0.58      N/A        3           11
         A: 208835_s_at        0.43      0.34      N/A        0           11
         A: 220044_x_at        0.43      0.46      N/A        0           11
         D: NM_016424_at       0.49      0.54      N/A        2           32
         D: NM_006107_at       0.49      0.49      N/A        0           30
         F: GC17P046151_at     0.47      0.52      N/A        3           33
         H: gi 256542312       0.49      0.49      N/A        0           30
NFκBIA   A: 201502_s_at        0.81      0.68      N/A        2           11
         D: NM_020529_at       0.82      0.69      N/A        2           11
         F: GC14M034940_at     0.82      0.67      N/A        2           11
         H: gi 168693660       0.82      0.69      N/A        0           9
JUNB     A: 201473_at          0.44      0.39      0.94       0           11
         D: NM_002229_at       0.45      0.40      0.95       0           11
         F: GC19P012763_at     0.45      0.39      0.95       0           11
         H: gi 44921611        0.44      0.39      0.94       0           11
Ø        all Affymetrix        0.69      0.68      0.89       0.79        11.20
         all Dai               0.63      0.67      0.90       1.13        20.48
         all Ferrari           0.73      0.72      0.91       1.06        20.79
         Hummert               0.72      0.67      0.91       0.00        18.64
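Assuming that PCC stands for the Pearson correlation coefficient, each correlation entry in the table could be computed roughly as follows. The sketch is generic: the function name pearson and the two example series are hypothetical, and the text does not state which reference measurements the probeset expression values were correlated with.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Hypothetical values: probeset signal vs. a reference measurement.
probeset_signal = [7.1, 8.4, 9.0, 10.2, 11.5]
reference_signal = [1.0, 2.1, 2.9, 4.2, 5.0]
print(round(pearson(probeset_signal, reference_signal), 2))
```

An entry of N/A in the table would then correspond to a gene for which no reference series is available in the respective dataset, which the sketch does not model.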