Misha Kapushesky
Expression Profiler: Next Generation
http://www.ebi.ac.uk/expressionprofiler
November 28, 2003

Background
• MIAMExpress – MIAME-compliant microarray data submission tool
• ArrayExpress – Public repository for microarray data, aimed at storing well-annotated data in accordance with MGED recommendations
• Expression Profiler – Microarray exploratory data analysis and management platform

Overall Infrastructure
[Diagram labels: EBI; Submissions; Local MIAMExpress Installations; MIAMExpress (www); MAGE-ML; Queries; ArrayExpress (www); MAGE-ML data pipelines; Array Manufacturers; LIMS; Data analysis; Expression Profiler (www); Data Analysis software; Microarray software; External Bioinformatics databases; Other Microarray databases; ArrayExpress]

Data Representation
• In MAGE/ArrayExpress
– BioAssays (hybridizations, data transformations)
– DesignElements (spots, genes)
– QuantitationTypes (signal intensity, ratio, etc.)
• In Expression Profiler
– spots × measurements

ArrayExpress: Expression Data Matrices
[Figure labels: BioAssays (hybridizations, data transformations); DesignElements (spots); QuantitationTypes (signal intensity, ratio, etc.); select BioAssays; select QuantitationTypes; BioAssayData1 and BioAssayData2, each a matrix of DesignElements by (QT, BA) pairs]

EP:NG – Design Goals
• Maintain present functionality
• Create an extensible framework
• Distinct functional components
• Uniform look-and-feel
• Flexible collaborative environment

EP:NG Framework
• How it works…
– RDBMS-driven (users, interface, metadata)
– Component-based web app platform
– XML/XSLT HTML rendering
– Available as Web Service
• Written in…
– XML/XSLT – transformations, component chaining
– Perl – for inter-component “glue”
– C, C++, R, etc. – integrated 3rd party algorithms

Architecture Overview
• XML component descriptions
• Chainable, extensible components
Web Interface (Services/UI/etc.)
[Architecture diagram labels: Request/Response; EP Components (EPC XML); XSLT Processor; EPC Rendering XSL; External Services Access; EP Database; ArrayExpress; Internal and 3rd party Components; R (S-PLUS); EP Filesystem]

EPC XML Components
• SECTIONS
  • Inputs
    – Grouped into subsections
    – Input names, type, validation type
    – Used for rendering
    – Dynamic data: external service access, EP DB/file system access, etc.
  • Outputs
    – Output name, type, validation type
    – Output format (e.g. regular expression)
• ACTION
  – URL (or other definition)
  – EPC target IDs
Transformed with XSL
• Accessing EP DB for definition of dynamic input elements

Available Components
• Data Selection
• Data Transformation
– Raw intensities, Log2(ratio)
– Average row identifiers
– Missing value imputation: via KNNimpute*
– Data transposing
• Hierarchical Clustering + K-groups Clustering
• Clustering Comparison: via R**
• Ordination (COA, PCA): via R (ade4)***
• Between Group Analysis: via R (ade4)****

* Troyanskaya et al., Bioinformatics. 2001 Jun;17(6):520-5
** Developed in our group (Aurora Torrente)
*** Ade-4: http://pbil.univ-lyon1.fr/ADE-4
**** Culhane et al., Bioinformatics. 2002 Dec;18(12):1600-8

Components Coming Soon
• Two-way clustering
• Iterative Signature Algorithm (Naama Barkai, Jan Ihmels)
• Gene ordering (Karlis Freivalds)
• Bioconductor integration (UCL)
• Normalization methods – LOWESS, ...
• Statistical analysis methods – ANOVA, SAM, ...
• More! (contributors?)

EP:NG is an open source project – if you are interested in contributing, testing or just discussing ideas, let us know!

EP:NG Platform Features
• User + data management
– Multiple folders, data sharing, collaborative features
• Analysis History
– Analysis steps export/re-application
• Visualization Framework
– Interactivity: gene searching, cluster/gene group selection + tagging, zooming, etc.
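
The ideas on the preceding slides (a data matrix of DesignElements by (QT, BA) pairs, chainable transformation components such as Log2(ratio) and row-identifier averaging, and an analysis history that can be exported and re-applied) can be illustrated with a minimal Python sketch. This is not EP:NG code, which is written in XML/XSLT and Perl. Every name here (`COMPONENTS`, `select_quantitation_type`, `run_chain`, the sample spots and hybridizations) is hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

Column = Tuple[str, str]        # hypothetical (BioAssay, QuantitationType) pair
Row = Tuple[str, List[float]]   # hypothetical (DesignElement identifier, values)


def select_quantitation_type(columns: List[Column], rows: List[Row], qt: str):
    """Keep only the columns whose quantitation type equals `qt`."""
    keep = [i for i, (_, q) in enumerate(columns) if q == qt]
    bioassays = [columns[i][0] for i in keep]
    return bioassays, [(name, [values[i] for i in keep]) for name, values in rows]


def log2_ratio(rows: List[Row]) -> List[Row]:
    """Replace every ratio by its base-2 logarithm."""
    return [(name, [math.log2(v) for v in values]) for name, values in rows]


def average_row_identifiers(rows: List[Row]) -> List[Row]:
    """Average, column by column, all rows that share an identifier."""
    grouped: Dict[str, List[List[float]]] = {}
    for name, values in rows:
        grouped.setdefault(name, []).append(values)
    return [(name, [sum(col) / len(col) for col in zip(*group)])
            for name, group in grouped.items()]


# Assumed registry of chainable components, keyed by name so that a list of
# names can serve as an exportable analysis history.
COMPONENTS: Dict[str, Callable[[List[Row]], List[Row]]] = {
    "log2_ratio": log2_ratio,
    "average_row_identifiers": average_row_identifiers,
}


def run_chain(rows: List[Row], step_names: List[str]):
    """Apply the named components in order and record what was applied."""
    history: List[str] = []
    for step in step_names:
        rows = COMPONENTS[step](rows)
        history.append(step)
    return rows, history


# Made-up sample data: two hybridizations, two quantitation types each.
columns = [("hyb1", "ratio"), ("hyb1", "signal"),
           ("hyb2", "ratio"), ("hyb2", "signal")]
rows = [("geneA", [2.0, 1200.0, 0.5, 300.0]),
        ("geneA", [8.0, 1000.0, 0.5, 500.0]),
        ("geneB", [4.0, 800.0, 1.0, 950.0])]

bioassays, ratios = select_quantitation_type(columns, rows, "ratio")
result, history = run_chain(ratios, ["log2_ratio", "average_row_identifiers"])
print(bioassays)  # ['hyb1', 'hyb2']
print(result)     # [('geneA', [2.0, -1.0]), ('geneB', [2.0, 0.0])]
print(history)    # ['log2_ratio', 'average_row_identifiers']
```

Because the history is just a list of component names, re-applying an analysis amounts to calling `run_chain(other_rows, history)` on another data set. A real framework would also need to store each step's parameters, which this sketch omits.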

Acknowledgements
Original EP Development:
• Jaak Vilo (Tartu)
• Patrick Kemmeren (Utrecht)
• Misha Kapushesky
Clustering Comparison:
• Aurora Torrente
• Christine Körner (Leipzig)
PCA/COA/BGA:
• Aedín Culhane (Cork)
EP:NG Framework Development:
• Patrick Kemmeren (Utrecht)
• Misha Kapushesky
Visualization Components (under development):
• Steffen Durinck (Leuven)
Gene Ordering:
• Karlis Freivalds (Riga)
Normalization (under development):
• Tom Bogaert (Leuven)
Discussions:
• EBI Microarray Informatics Team
• Contributors from the open source community