Expression Profiler: Next Generation
Misha Kapushesky
http://www.ebi.ac.uk/expressionprofiler
November 28, 2003
Background
• MIAMExpress
– MIAME-compliant microarray data submission tool
• ArrayExpress
– Public repository for microarray data, aimed at storing well-annotated data in accordance with MGED recommendations
• Expression Profiler
– Microarray exploratory data analysis and management platform
Overall Infrastructure
[Diagram: data flow around the EBI. Submissions arrive through MIAMExpress (www, or local MIAMExpress installations) as MAGE-ML, and through MAGE-ML data pipelines from array manufacturers, LIMS and microarray software. Queries go to ArrayExpress over the www. Data analysis is done in Expression Profiler over the www. Also shown: data analysis software, external bioinformatics databases and other microarray databases.]
ArrayExpress Data Representation
[Diagram: the same data shown two ways. In MAGE/ArrayExpress: three axes, BioAssays (hybridizations, data transformations), DesignElements (spots, genes) and QuantitationTypes (signal intensity, ratio etc.). In Expression Profiler: spots and measurements.]
ArrayExpress: Expression Data Matrices
[Diagram: building an expression data matrix. BioAssays (hybridizations, data transformations) and QuantitationTypes (signal intensity, ratio etc.) are selected from BioAssayData1 and BioAssayData2; the resulting matrix is indexed by DesignElements (spots) and (QT,BA) pairs.]
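The selection step on this slide can be sketched as flattening a three-dimensional data cube into a two-dimensional matrix. The Python below is a minimal illustration only: the nested-dictionary cube layout, the function name `select_matrix`, and all identifiers and values are assumptions made for the example, not the ArrayExpress data model or API.

```python
from typing import Dict, List, Tuple

# Hypothetical cube layout: cube[bioassay][quantitation_type][design_element] -> value
Cube = Dict[str, Dict[str, Dict[str, float]]]


def select_matrix(
    cube: Cube,
    bioassays: List[str],
    quantitation_types: List[str],
    design_elements: List[str],
) -> Tuple[List[Tuple[str, str]], List[List[float]]]:
    """Return (columns, rows): one column per (QT, BA) pair, one row per DesignElement."""
    columns = [(qt, ba) for ba in bioassays for qt in quantitation_types]
    rows = [[cube[ba][qt][de] for (qt, ba) in columns] for de in design_elements]
    return columns, rows


# Illustrative values only.
cube: Cube = {
    "hyb1": {
        "signal": {"spot1": 120.0, "spot2": 80.0},
        "ratio": {"spot1": 1.5, "spot2": 0.5},
    },
    "hyb2": {
        "signal": {"spot1": 200.0, "spot2": 50.0},
        "ratio": {"spot1": 2.0, "spot2": 0.25},
    },
}

# Keep both hybridizations but only the "ratio" quantitation type.
columns, matrix = select_matrix(cube, ["hyb1", "hyb2"], ["ratio"], ["spot1", "spot2"])
```

Here `columns` lists the selected (QT, BA) pairs and each row of `matrix` holds the values for one spot across those pairs.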
EP:NG – Design Goals
• Maintain present functionality
• Create an extensible framework
• Distinct functional components
• Uniform look-and-feel
• Flexible collaborative environment
EP:NG Framework
• How it works…
– RDBMS-driven (users, interface, metadata)
– Component-based web app platform
– XML/XSLT → HTML rendering
– Available as Web Service
• Written in…
– XML/XSLT – transformations, component chaining
– Perl – inter-component “glue”
– C, C++, R, etc. – integrated 3rd party algorithms
Architecture Overview
• XML component descriptions
• Chainable, extensible components
[Diagram: a Web Interface (Services/UI/etc.) exchanges requests and responses with a chain of EP Components, each described by EPC XML. An XSLT Processor applies the EPC Rendering XSL. Also shown: access to External Services, the EP Database, ArrayExpress, internal and 3rd party components, R (S-PLUS) and the EP Filesystem.]
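The idea of chainable components can be sketched as follows. This is a minimal Python illustration, not EP:NG code (which the slides describe as XML/XSLT plus Perl): the `Component` class, `run_chain`, and the two toy components are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Component:
    """Hypothetical stand-in for an EP component with declared inputs and outputs."""
    name: str
    inputs: List[str]    # names read from the shared context
    outputs: List[str]   # names written back to the shared context
    run: Callable[[Dict[str, object]], Dict[str, object]]


def run_chain(components: List[Component], context: Dict[str, object]) -> Dict[str, object]:
    """Run components in order; each one's outputs become available to the next."""
    context = dict(context)  # do not mutate the caller's dictionary
    for component in components:
        missing = [key for key in component.inputs if key not in context]
        if missing:
            raise KeyError(f"{component.name} is missing inputs: {missing}")
        result = component.run({key: context[key] for key in component.inputs})
        context.update({key: result[key] for key in component.outputs})
    return context


# Two toy components: keep the first two rows, then transpose.
select = Component("select", ["matrix"], ["matrix"],
                   lambda d: {"matrix": d["matrix"][:2]})
transpose = Component("transpose", ["matrix"], ["matrix"],
                      lambda d: {"matrix": [list(col) for col in zip(*d["matrix"])]})

result = run_chain([select, transpose], {"matrix": [[1, 2], [3, 4], [5, 6]]})
```

Declaring inputs and outputs by name lets the chain check, before a component runs, that an earlier step has produced what it needs.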
EPC XML Components
• SECTIONS
– Inputs: grouped into subsections; input names, type, validation type; used for rendering
– Dynamic data: external service access; EP DB/file system access, etc.
– Outputs: output name, type, validation type; output format (e.g. regular expression)
• ACTION
– URL (or other definition)
– EPC target IDs
• Transformed with XSL
– Accessing EP DB for definition of dynamic input elements
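To make the component description more concrete, here is a small Python sketch that parses a made-up component file and validates user input against it. The slides do not show the EPC schema, so every element name, attribute name, URL and ID in `EPC_XML` is an assumption invented for illustration; only the general shape (inputs with validation, outputs with a format, an action with a URL and target IDs) follows the slide.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical component description; NOT the real EPC schema.
EPC_XML = r"""
<component id="log_transform">
  <sections>
    <inputs>
      <group name="parameters">
        <input name="base" type="text" validation="^[0-9]+$"/>
      </group>
    </inputs>
    <outputs>
      <output name="matrix" type="file" format="^.+\.tsv$"/>
    </outputs>
  </sections>
  <action url="/cgi-bin/log_transform" targets="hierarchical_clustering"/>
</component>
"""


def parse_component(xml_text: str) -> dict:
    """Extract input validation patterns, output formats and the action."""
    root = ET.fromstring(xml_text.strip())
    action = root.find("action")
    return {
        "id": root.get("id"),
        "inputs": {i.get("name"): i.get("validation") for i in root.iter("input")},
        "outputs": {o.get("name"): o.get("format") for o in root.iter("output")},
        "url": action.get("url"),
        "targets": action.get("targets").split(","),
    }


def validate(inputs: dict, values: dict) -> list:
    """Return the names of inputs whose submitted value fails its pattern."""
    return [
        name for name, pattern in inputs.items()
        if not re.fullmatch(pattern, values.get(name, ""))
    ]


spec = parse_component(EPC_XML)
errors_ok = validate(spec["inputs"], {"base": "2"})
errors_bad = validate(spec["inputs"], {"base": "two"})
```

In this sketch the same description could drive both form rendering and input checking, which is the role the slide assigns to the Inputs section.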
Available Components
• Data Selection
• Data Transformation
– Raw intensities → Log2(ratio)
– Average row identifiers
– Missing value imputation: via KNNimpute*
– Data transposing
• Hierarchical Clustering + K-groups Clustering
• Clustering Comparison: via R**
• Ordination (COA, PCA): via R (ade4)***
• Between Group Analysis: via R (ade4)****
* Troyanskaya et al., Bioinformatics. 2001 Jun;17(6):520-5
** Developed in our group (Aurora Torrente)
*** ADE-4: http://pbil.univ-lyon1.fr/ADE-4
**** Culhane et al., Bioinformatics. 2002 Dec;18(12):1600-8
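Three of the data transformations listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the EP:NG implementation: the function names are invented, `None` is assumed to mark a missing value, and `knn_impute` is a simplified unweighted nearest-neighbour fill rather than the KNNimpute program of Troyanskaya et al.

```python
import math


def log2_ratio(sample, reference):
    """Element-wise log2(sample / reference); None where a value is missing or non-positive."""
    return [
        [
            math.log2(s / r)
            if s is not None and r is not None and s > 0 and r > 0
            else None
            for s, r in zip(s_row, r_row)
        ]
        for s_row, r_row in zip(sample, reference)
    ]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


def knn_impute(matrix, k=2):
    """Fill each missing value with the mean of that column in the k nearest rows."""
    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(matrix)
                if n != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d != math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


ratios = log2_ratio([[4.0, 1.0]], [[1.0, 2.0]])
flipped = transpose([[1, 2, 3], [4, 5, 6]])
imputed = knn_impute([[1.0, 2.0, None], [1.0, 2.0, 3.0], [1.0, 2.0, 5.0], [9.0, 9.0, 9.0]])
```

In the last call the first row's missing value is filled from the two rows that match it on the observed columns, so the distant fourth row does not contribute.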
Components Coming Soon
• Two-way clustering
• Iterative Signature Algorithm (Naama Barkai, Jan Ihmels)
• Gene ordering (Karlis Freivalds)
• Bioconductor integration (UCL)
• Normalization methods
– LOWESS, ...
• Statistical analysis methods
– ANOVA, SAM, ...
• More! (contributors?)
EP:NG is an open source project – if you are interested in
contributing, testing or just discussing ideas, let us know!
EP:NG Platform Features
• User + data management
– Multiple folders, data sharing, collaborative features
• Analysis History
– Analysis steps export/re-application
• Visualization Framework
– Interactivity: gene searching, cluster/gene group selection + tagging, zooming, etc.
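One way to picture the export and re-application of analysis steps is a recorded list of step names and parameters that can be replayed on new data. The Python below is a hedged sketch of that idea only: the `History` class, the `REGISTRY` of steps, and the JSON export format are all assumptions for illustration and say nothing about how EP:NG stores its history.

```python
import json

# Hypothetical registry mapping step names to functions on a matrix.
REGISTRY = {
    "transpose": lambda m: [list(col) for col in zip(*m)],
    "scale": lambda m, factor: [[v * factor for v in row] for row in m],
}


class History:
    """Records each applied step so the sequence can be exported and replayed."""

    def __init__(self):
        self.steps = []

    def apply(self, data, name, **params):
        self.steps.append({"step": name, "params": params})
        return REGISTRY[name](data, **params)

    def export(self) -> str:
        return json.dumps(self.steps)


def replay(exported: str, data):
    """Re-apply an exported sequence of steps to a different dataset."""
    for step in json.loads(exported):
        data = REGISTRY[step["step"]](data, **step["params"])
    return data


history = History()
out = history.apply([[1, 2], [3, 4]], "transpose")
out = history.apply(out, "scale", factor=2)

# The same two steps applied to another matrix.
replayed = replay(history.export(), [[1, 0], [0, 1]])
```

Storing names and parameters rather than results keeps the exported history small and lets it be applied to data the original analysis never saw.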
Acknowledgements
Original EP Development:
• Jaak Vilo (Tartu)
• Patrick Kemmeren (Utrecht)
• Misha Kapushesky
EP:NG Framework Development:
• Patrick Kemmeren (Utrecht)
• Misha Kapushesky
Visualization Components (under development):
• Steffen Durinck (Leuven)
Normalization (under development):
• Tom Bogaert (Leuven)
Clustering Comparison:
• Aurora Torrente
• Christine Körner (Leipzig)
PCA/COA/BGA:
• Aedín Culhane (Cork)
Gene Ordering:
• Karlis Freivalds (Riga)
Discussions:
• EBI Microarray Informatics Team
• Contributors from the open source community