MapMan Help
1 Short Introduction
1.1 View data included in the package
1.2 Display options
1.3 Print
1.4 Save
1.5 Search Functions
1.6 Statistics (Wilcoxon Rank Sum Test)
2 View your own Arabidopsis 22K Affymetrix data
2.1 Data format
2.2 Create an experiment folder
2.3 Add data files to your new folder
2.4 Visualise your data
3 Use MapMan to visualise any gene expression, metabolite or other data
3.1 Creation of a mapping file
3.2 Create your own pathway (Annotation of an image file)
3.3 Histogram frequency chart
4 Structure of MapMan
5 Available Pathways Overview
6 Contact information
1 Short introduction
MapMan is a user-driven tool that displays large datasets (e.g. from gene expression experiments) on
diagrams of metabolic pathways or other processes.
You can use the provided diagrams, generate your own, and let the software display your own and other
people's data on them.
MapMan needs three different types of files. They are located in three different folders
inside the directory of example data.
(i) data files: contain experimental results (.xls or .txt file format)
(ii) pathways: diagrams (.bmp file format) which serve as context
(iii) mapping files: connect gene/metabolite/protein identifiers to BINs ("functional categories"), which are
displayed (.xls or .txt file format)
1.1 View data included in the package
The MapMan download provides example data.
After starting the MapMan software you will find data files in the "Experiments" folder, pathway image files
in the "Pathways" folder (see table 1 for an overview of the available predefined pathways) and mapping files
in the "Mappings" folder of the selection directory on the left.
You can view the included datasets in the context of different metabolic pathways:
(i) Double-click an image file from the folder "Pathways", e.g. "Metabolism_overview" or "Glycolysis".
Choose a mapping file from the pop-up box and click "OK".
For "Metabolism_overview" and all other maps showing biological processes, choose "AFFY2005" (this is most
often the right choice).
Hint: Different images need different mapping files (table 1).
Response_ images need a Response_ mapping file. (In fact, if you don't care about the statistics, you can
combine both kinds of files.)
(ii) A single click activates the experiment files from the folder "Experiment:DiurnalCycle" (or another
experiment folder) one by one.
All data files from one experiment can now be viewed in sequence. Once a file has been loaded for the first
time, a mouse click calls it up again in a fraction of a second.
Each gene is symbolised by a box; the gene expression level is colour-encoded
(red = downregulation, blue = upregulation).
Moving the mouse over an individual box calls up the gene annotation beneath the mouse pointer,
while a click copies the information to the text window below the pathway.
Right-clicking an individual box brings up further options, such as opening a web browser (link out) with
additional information about the particular spot from the GABI website (http://gabi.rzpd.de), which also
leads to further information resources.
There you can get UniGene information and information on how well a spot reflects a given transcript.
Number | Filename | Description/Keywords | Visualization | Data | Standard/Response
1 | photosynthesis | Light reaction, Calvin cycle, photorespiration | P | T | S
2 | cell functions overview | cellular functions | H | T | S
3 | Cell Wall precursors | NDP sugar pathways (used for the cell wall) | P | T | S
4 | Cellular response overview | stresses, redox, development, cell cycle and division | P | T | S
5 | Glycolysis | glycolysis | P | T | S
6 | Large enzyme families | Large enzyme families like oxidases, GDSL lipases, etc. | P | T | S
7 | Lignin | Monolignol pathway starting from phenylalanine | P | T | S
8 | metabolism overview | overview of metabolic reactions | P | T | S
9 | mitochondrial e-transport | mitochondrial overview including transporters | P | T | S
10 | N-metabolism | Nitrogen metabolism | P | T | S
11 | RNA-Protein Synthesis | Protein synthesis, targeting and degradation as well as RNA processing | P/H | T | S
12 | Sucrose Starch | Sucrose and starch degradation and synthesis | P | T | S
13 | Transcription Regulation | Potential transcription factors and regulators of transcription | P | T | S
14 | Transport Overview | different transporters | P | T | S
15 | Proteasome | Ubiquitin dependent protein degradation pathway | P | T | S
16 | Regulation overview | TFs, protein modification and degradation, hormone regulation, receptor kinases, G proteins, MAPKs etc. | P | T | S
17 | TCA | TCA cycle including mitochondrial genes | P | T | S
18 | Glycolysis-TCA | Combination of glycolysis, TCA, and mitochondrial genes | P | T | S
19 | Secondary metabolism | Secondary metabolism like flavonols, chalcones, lignins, etc. | P | T | S
20 | Pentose phosphate | Pentose phosphate pathway, Warburg way, shunt | P | T | S
21 | Metabolites | most metabolites that can be measured; for a conversion of your metabolite name to the canonical MM one see: http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/tools/gmd_conv.html | P | M | S
22 | C_TCA | only the TCA cycle including some metabolites | P | T/M | S
23 | Sulfate assimilation | Sulfate assimilation | P | T | S
24 | C_Lignin | Monolignol pathways (slightly different layout than Lignin, which shows transcripts only) | P | T/M | S
25 | Response_nutrients | Response to starvation and readdition of nutrients: phosphate, nitrate, sulphate, carbon | P | T | R
26 | Response_stress | Response to abiotic stresses from AtGenExpress (roots) | P | T | R
Visualization types: P(oints), H(istogram)
Data types: T(ranscripts), M(etabolites), P(roteins)
Table 1: Available image, mapping and experiment files.
Note: Some of the mapping files included in this package might be encrypted. To receive the original files, please contact:
Mark Stitt ([email protected])
1.2 Display options
In default mode, each individual gene (protein, metabolite) is symbolised by a small square box in which the
expression (concentration) level is colour-encoded.
In gene expression experiments, up-regulated genes are shown in blue, while down-regulated genes
are shown in red. [Colour coding will be extended with planned additions]
The display options can be changed:
(i) Select "Options" from the "Pathway" menu.
(ii) Adjust the following settings:
Scaling: Changes the colour intensity of your data points. Decrease the value for more colour intensity,
increase it for less. The default value is 3.0.
Datasize: Shrinks or enlarges the boxes. Medium size (M) is the default.
Background Colour: Changes the colour of the image background.
Visualization Type: Changes the visualization mode. "Default" uses the visualization specified for each
data area; the other modes override the mode specified for each data area.
Marked Visualization Type: Changes the visualization for marked elements. You will only notice a change
for elements which have been marked via the search function (1.5)!
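The effect of the "Scaling" value can be pictured with a small Python sketch. This is only an illustration, not MapMan's actual colour routine: the linear fade from white, the clipping at the scaling value and the function name expression_colour are all assumptions made for the example.

```python
def expression_colour(value, scaling=3.0):
    """Map a log ratio to an (R, G, B) tuple: white at 0, blue for up, red for down.

    Assumption: colour saturates linearly and is clipped once |value| >= scaling.
    """
    if scaling <= 0:
        raise ValueError("scaling must be positive")
    intensity = min(abs(value) / scaling, 1.0)
    fade = int(255 * (1.0 - intensity))  # 255 = white, 0 = fully saturated
    if value > 0:
        return (fade, fade, 255)  # up-regulated: blue
    if value < 0:
        return (255, fade, fade)  # down-regulated: red
    return (255, 255, 255)        # no change: white


# A smaller scaling value gives a more intense colour for the same data point.
print(expression_colour(1.0, scaling=2.0))
print(expression_colour(1.0, scaling=4.0))
```

Under these assumptions a value of 3.0 or more is drawn fully saturated at the default scaling of 3.0, which matches the description that lowering the value increases the colour intensity.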
1.3 Print
Print your image-data-file:
Select "Print" from the "Pathway" menu.
1.4 Save
You can save the result of an experiment depicted on a pathway as an image-data-file:
Select "Save as Image" from the "Pathway" menu. The image can be saved as a .jpg or .png file.
After selecting a file type, enter the file name including the matching file extension (.jpg or .png).
1.5 Search Functions
The search function lets you search the descriptions of all genes/metabolites/proteins which are
present on a selected pathway.
Type your search term into the search field.
Press the binocular button right beside the search field to start.
The number of matches found is displayed in the "Log" tab (bottom pane).
Additionally, all spots matched by the search are displayed according to the display option selected
in the menu "Pathway->Options->MarkedVisualizationType".
(Experienced users can make use of regular expression, a.k.a. regex, syntax.)
Press the recycle button to reset the marking on the pathway diagram.
All matches are visually highlighted in their individual boxes.
The highlighting can be changed in the options menu by selecting another option under
"MarkedVisualizationType" (see paragraph 1.2). The default display is "3D Rectangles". In the example shown,
the option "greying" was chosen, which greys out all items that do not match the search term. Other
available options are "Inner Rectangles" and "Triangles".
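To give an idea of what a regular expression search over the descriptions can look like, here is a minimal Python sketch. It does not reproduce MapMan's search code: the function name search_descriptions, the case-insensitive matching and the use of Python's regex dialect are assumptions, and the example descriptions are borrowed from table 2.

```python
import re


def search_descriptions(descriptions, pattern):
    """Return the identifiers whose description matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)  # case-insensitivity is an assumption
    return [ident for ident, text in descriptions.items() if regex.search(text)]


spots = {
    "EST_1B05": "photosystem II G protein",
    "EST_2A08": "glycine hydroxymethyltransferase (EC 2.1.2.1)",
    "EST_11H01": "subunit of rubisco",
    "EST_7K12": "fructose-bisphosphate aldolase",
}

# One search can cover several alternatives at once.
marked = search_descriptions(spots, r"photosystem|rubisco")
print(len(marked), "matches:", marked)
```

The returned identifiers correspond to the spots that would be marked on the pathway, and their count to the number reported in the "Log" tab.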
1.6 Statistics (Wilcoxon Rank Sum Test)
The statistics in MapMan are based on the Wilcoxon rank sum test, which is used to identify BINs whose
expression profile behaves differently from that of all the other BINs.
The test is done automatically every time a new experiment is loaded.
For each BIN the results (BIN, Elements, Probability) are displayed in the Statistics panel beside the Log
panel in the bottom pane below the pathway diagram.
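The idea of testing one BIN against all remaining BINs can be sketched in Python as follows. This is a hedged illustration, not MapMan's implementation: the help text does not say whether an exact test or an approximation is used, so the sketch assumes a two-sided test with the normal approximation (tie-corrected, no continuity correction). The names rank_sum_p and bin_statistics, the interpretation of "Elements" as the number of values in the BIN, and the example values are all made up for the example.

```python
import math


def rank_sum_p(in_bin, others):
    """Two-sided Wilcoxon rank sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(in_bin), len(others)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    # Pool both groups; group 0 is the BIN under test, group 1 is everything else.
    pooled = sorted((v, g) for g, vals in enumerate((in_bin, others)) for v in vals)
    n = n1 + n2
    rank_sum_bin = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # tied values share the average of ranks i+1..j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_bin += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n + 1) / 2.0
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (rank_sum_bin - mean) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def bin_statistics(values_by_bin):
    """Return (BIN, Elements, Probability) for each BIN tested against all others."""
    results = []
    for bin_code, values in values_by_bin.items():
        rest = [v for other, vals in values_by_bin.items() if other != bin_code for v in vals]
        results.append((bin_code, len(values), rank_sum_p(values, rest)))
    return results


example = {
    "1.1": [2.1, 1.8, 2.5, 1.9],
    "2.1": [0.1, -0.2, 0.0, 0.3],
    "3": [-0.1, 0.2, -0.3, 0.15],
}
for bin_code, elements, probability in bin_statistics(example):
    print(bin_code, elements, round(probability, 4))
```

In this made-up example, BIN 1.1 contains the four largest values of the whole dataset and therefore receives a small probability, while the other two BINs do not stand out from the rest.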
2 View your own Arabidopsis 22K Affymetrix data
Users of MapMan can visualize their own Arabidopsis 22K Affymetrix microarray experiments using the
mapping and pathway image files included in the MapMan package.
New files with experimental data have to be in a specific format (2.1) and are loaded into MapMan in two
steps.
The first step is to create a new experiment subdirectory folder (2.2) and the second step is to upload data
files into this folder (2.3).
You are able to create as many experiment folders as you like and reference data files belonging to your
experiment.
2.1 Data format
MapMan expects data as either an Excel file or a tab-delimited text file. Files might have one of the
following structures. (Values should typically be in the range of -10 to 10, but may be higher or lower.)

Single assay file format:
IDENTIFIER   VALUE
EST_1A01     -1,50
EST_1C05     1,90
EST_1G03     X¹
EST_1F08     -1,80

Multiple assay file format:
agi          exp3a    exp3b    exp4a    exp4b
at1g6635     -1,50    -1,20    -1,70    -1,10
at2g6678     1,90     1,50     X¹       1,40
at2g2431     1,90     1,50     1,40     1,40
at4g1786     1,20     -1,50    0,40     0,30

¹Values of 'X' mean absent, which will be displayed as empty squares.
The "IDENTIFIER" column contains the unique EST or oligo sequence identifier, for example the
Affymetrix identifiers represented on your filter/array.
The "VALUE" column gives the measured expression value for each identifier.
A negative/positive value represents downward/upward regulation.
We recommend using logarithms of measured expression ratios between two treatment conditions in an
experiment (e.g. log2 expression values representing fold change).
Missing or absent values are marked by a capital "X".
Alternatively, MapMan can load files containing multiple assays at once.
In that case MapMan expects the probe identifier in the first column; all following columns contain the
individual experiment values.
The tables can contain a header with the name of the experiment, but this is not necessary.
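If you want to check a tab-delimited data file before loading it, the format described above can be read with a few lines of Python. This is a sketch under assumptions, not part of MapMan: the function names parse_value and read_assays are made up, and the sketch simply treats a capital "X" as missing and accepts both decimal points and decimal commas.

```python
import csv
import io


def parse_value(cell):
    """Convert one cell to a float; a capital 'X' marks a missing value."""
    cell = cell.strip()
    if cell == "X":
        return None
    return float(cell.replace(",", "."))  # accept decimal comma or decimal point


def read_assays(handle, has_header=True):
    """Read identifiers (first column) and their values (all following columns)."""
    rows = csv.reader(handle, delimiter="\t")
    names = next(rows)[1:] if has_header else None
    data = {}
    for row in rows:
        if not row:
            continue  # skip empty lines
        data[row[0]] = [parse_value(cell) for cell in row[1:]]
    return names, data


example = (
    "agi\texp3a\texp3b\texp4a\texp4b\n"
    "at1g6635\t-1,50\t-1,20\t-1,70\t-1,10\n"
    "at2g6678\t1,90\t1,50\tX\t1,40\n"
)
names, data = read_assays(io.StringIO(example))
print(names)
print(data["at2g6678"])
```

The same function reads a single assay file, which is just the special case of one value column.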
2.2 Create an experiment (folder)
Experimental data can be organized in folders, and data files can then be added to those folders.
Alternatively, folders from the file system can be selected and the data files they contain will be added
automatically.
(i) Right-click any experiment folder in the folder structure in the left pane and select "new sub folder"
to generate a new experiment folder in MapMan.
(ii) An option box appears. Select "by name" to add files individually.
(Advanced users can optionally reference directories by selecting existing folders to load all data files
within a given directory.)
(iii) The new experiment folder will show up in the folder structure.
2.3 Add data files to your new folder
(i) Right-click the newly created experiment folder and select "add data file" from the menu.
(ii) Choose your data files one by one.
(Tab-delimited text files are much faster to load than Excel files! You can export your data from Excel
as tab-delimited text files.)
(iii) A dialog opens in which you can configure your data file.
You can deselect data columns you are not going to use, which speeds up loading of the data and prevents
errors from unreadable data.
MapMan usually recognizes the format of your data automatically. (Check that the number format matches:
decimal point or decimal comma.)
Moreover, you can force MapMan to treat the first row as a header or not by checking or unchecking the
check box "first row contains header".
(iv) All configured data files are now listed within your experiment folder.
2.4 Visualise your data
Once you have imported your data files into MapMan, you can display these data sets on an image
(diagram). Please follow the instructions outlined in paragraph 1.
3 Use MapMan to visualise any gene expression, metabolite or other data
In general, MapMan can be used to display any data on user-defined images. Data can be gene expression
data, metabolite or protein concentrations, enzyme activities etc.
Image files (.png, .bmp ?) can be metabolic pathways, cellular processes, regulatory networks etc.
Before MapMan can be used in this way, two steps are necessary:
(i) creation of a mapping file (3.1),
(ii) annotation of new image files (3.2).
3.1 Creation of a custom mapping file
The mapping file organizes your genes, metabolites, enzymes etc. into discrete classes in a hierarchical way.
The mapping file has to be in MS Excel, tab-delimited ".txt" or ".xml" format.
Define the first five columns as "BINCODE", "NAME", "IDENTIFIER", "DESCRIPTION" and "TYPE".
The "BINCODE" column contains the identifier for all your main classes (1, 2, 3, 4, ...), subclasses (1.1,
1.2, ..., 2.1, 2.2, ...), sub-subclasses (1.1.1, 1.1.2, ..., 1.2.1, 1.2.2, ...) and so on.
The dot between classes and their subclasses is important.
The BINCODE is used to annotate the image files (3.2).
The "NAME" column contains the names of each class (e.g. Photosynthesis) and subclass (e.g.
Photosynthesis.lightreaction). Again, a dot separates classes and subclasses.
The "IDENTIFIER" column lists the identifier of a gene, metabolite, enzyme etc. These identifiers have to
match the identifiers in your data file.
The "DESCRIPTION" column contains a user-defined description of the gene, metabolite, enzyme etc.
There is no length limit.
The "TYPE" column specifies whether the item is a transcript (T), metabolite (M), enzyme (E) or protein (P).
Hint: For backward compatibility, you can currently leave out the "TYPE" column.
BINCODE | NAME | IDENTIFIER | DESCRIPTION | TYPE
1 | Photosynthesis | | |
1.1 | Photosynthesis.lightreaction | Chlorophyll a | Approximate Chlorophyll content with assay XY | M
1.1 | Photosynthesis.lightreaction | EST_1B05 | photosystem II G protein | T
1.2 | Photosynthesis.photorespiration | EST_2A08 | glycine hydroxymethyltransferase (EC 2.1.2.1) | T
1.3 | Photosynthesis.calvin cycle | EST_11H01 | subunit of rubisco | T
1.3 | Photosynthesis.calvin cycle | EST_7K12 | fructose-bisphosphate aldolase | T
2 | major CHO metabolism | | |
2.1 | major CHO metabolism.synthesis | | |
2.1.1 | major CHO metabolism.synthesis.sucrose | EST_5F03 | sucrose-phosphate synthase | T
2.1.2 | major CHO metabolism.synthesis.starch | EST_3A09 | putative ADP-glucose pyrophosphorylase large subunit |
Table 2: Example of an .xls mapping file.
Note: Some of the mapping files included in the package might be encrypted.
Please contact Mark Stitt ([email protected]) to get one of the original mapping files.
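The hierarchical BINCODE scheme can be explored with a short Python sketch that reads a tab-delimited mapping file and collects all identifiers below a given BIN. This is an illustration only: the function names load_mapping and identifiers_in are made up, the rows are a shortened version of table 2, and treating the identifiers of sub-BINs as members of their parent BIN is an assumption about how the hierarchy is meant to be read.

```python
import csv
import io

MAPPING_TEXT = (
    "BINCODE\tNAME\tIDENTIFIER\tDESCRIPTION\tTYPE\n"
    "1\tPhotosynthesis\t\t\t\n"
    "1.1\tPhotosynthesis.lightreaction\tEST_1B05\tphotosystem II G protein\tT\n"
    "1.3\tPhotosynthesis.calvin cycle\tEST_11H01\tsubunit of rubisco\tT\n"
    "2.1.1\tmajor CHO metabolism.synthesis.sucrose\tEST_5F03\tsucrose-phosphate synthase\tT\n"
)


def load_mapping(handle):
    """Return a dict that maps each BINCODE to the identifiers listed for it."""
    bins = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        members = bins.setdefault(row["BINCODE"], [])
        if row["IDENTIFIER"]:  # class rows without an identifier only define the BIN
            members.append(row["IDENTIFIER"])
    return bins


def identifiers_in(bins, bincode):
    """Collect the identifiers of a BIN and of all its sub-BINs."""
    prefix = bincode + "."  # the dot keeps BIN 1 from matching BIN 11
    return [
        ident
        for code, idents in bins.items()
        if code == bincode or code.startswith(prefix)
        for ident in idents
    ]


bins = load_mapping(io.StringIO(MAPPING_TEXT))
print(identifiers_in(bins, "1"))
print(identifiers_in(bins, "2"))
```

Comparing the collected identifiers with the identifiers in your data file is a quick way to find entries that would not be displayed because the two files do not match.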
3.1.1 Add mappings to the "Mappings" folder
Once a new mapping file has been created, it can be loaded into MapMan:
(i) Right-click "Mappings" and then select "Add mapping file".
(ii) A box appears. Select "from file" to add files individually, or "download" to add mapping files
from the MapManStore server, which has updated mappings.
(iii) Select the appropriate mapping file: your newly created .txt, .xls or .xml file from the source folder.
(iv) The mapping name is shown in the "Mappings" tree structure without the file extension (.txt or .xls).
(v) Double-clicking the mapping file displays it, and you can browse through the file.
Clicking a (sub)class or identifier shows the information given in the "DESCRIPTION" field of your
mapping file as well as the "TYPE" of your spot.
3.2 Create your own pathway (Annotation of an image file)
(i) Select "Add pathway" from the "File" menu and select an image file from your directory. The new file
will appear in the "Pathways" folder in the left pane.
(ii) To annotate the new image, right-click the image where you want to place your annotation and select
"add" from the "Annotation" menu.
(iii) A dialog box opens in which you are asked to type in the numerical identifier of the
BINs/sub-BINs for which data should be placed, in accordance with your mapping file. The "Block Format"
can be set: type in xa or ya (a = 1...n). This format determines the arrangement of the boxes (if "default"
or "points" is chosen as "Visualization Type"), e.g. Block Format x20 or Block Format y16.
Choose "histogram" as "Visualization Type" to view the data in a histogram frequency chart (3.3).
(iv) You can specify what kind of data you want to show. Currently ImageAnnotator supports four different
kinds of data points, which can be nested. However, you need a mapping file that supports these
different data types.
Annotated areas are marked by a dot. To adjust the position exactly, drag the dot with the mouse
(hold down the left mouse button on an annotation dot and move the mouse), or click the annotation dot
and then hold down the Alt key and use the arrow keys. Add as many annotation points to the image as
you want.
All annotation points can be changed or deleted:
(i) To change the annotation text or to switch to the histogram frequency chart, right-click the point
you want to change and select "Edit" from the menu. Type in your changes.
(ii) To delete an annotation, right-click the point and select "Delete".
3.3 Histogram frequency chart
The genes (enzymes, proteins) in a selected group can be treated as a population, and their collective
response displayed as a frequency histogram. Genes (enzymes, proteins) that change by less than a filter
value (e.g. between -0.33 and 0.33 on a log2 scale) are grouped into the central white bar. Genes that
increase are displayed as a series of blue bars on the right-hand side (corresponding on this scale to
changes of 0.33-0.99, 0.99-1.66, 1.66-2.33, 2.33-3.00 and >3.0, respectively), and genes that decrease are
shown by a similar set of red bars on the left-hand side.
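The grouping into bars can be sketched in Python with the example thresholds given above. This is a hedged illustration rather than MapMan's code: the names bar_index and histogram are made up, and the bar a value falls into when it lies exactly on a threshold (here the outer one) is an assumption.

```python
from bisect import bisect_right

# Thresholds from the example above (log2 scale); mirrored for decreases.
UP_EDGES = [0.33, 0.99, 1.66, 2.33, 3.0]


def bar_index(value):
    """Return 0 for the central white bar, 1..5 for blue bars, -1..-5 for red bars."""
    step = bisect_right(UP_EDGES, abs(value))  # number of thresholds reached
    return step if value >= 0 else -step


def histogram(values):
    """Count how many values fall into each of the eleven bars."""
    counts = {index: 0 for index in range(-5, 6)}
    for value in values:
        counts[bar_index(value)] += 1
    return counts


counts = histogram([0.1, -0.2, 0.5, 1.2, -1.2, 3.5])
print(counts)
```

In this example the two small changes end up in the central white bar, three values fall into inner blue or red bars, and the value 3.5 is counted in the outermost blue bar.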
4 Structure of MapMan
MapMan relies on SCAVENGER modules to build mapping files. These SCAVENGER modules group
metabolite, transcript and other data into the BINs. The SCAVENGER modules are completely independent of
the ImageAnnotator module and vice versa. The ImageAnnotator module uses mapping files from the
SCAVENGER modules or user-built mapping files as its data source. It then paints the experimental
data onto maps (images with accompanying XML files) according to the hierarchical structure of the
mapping files.
6 Contact information
Axel Nagel, [email protected] (application design, software development)
GABI Primary Database
RZPD Deutsches Ressourcenzentrum für Genomforschung GmbH
Heubnerweg 6
D-14059 Berlin
Bjoern Usadel, [email protected] (annotation computing, pathways)
Max Planck Institute of Molecular Plant Physiology
Am Muehlenberg 1
D-14476 Golm