Microarray Data Analysis Using BASE
Danny Park
MGH Microarray Core
March 15, 2004
You’ve got data!
What was I asking? – remember your experimental design
How do I analyze the data?
– How do I find interesting stuff? – learn some analysis tools
– How do I trust the results? – statistics is key
What was I asking?
Typically: “which genes changed expression levels when I did ____”
Common ____:
– Binary conditions: knock out, treatment, etc
– Continuous scales: time courses, levels of treatment, etc
– Unordered discrete scales: multiple types of treatment or mutations
This tutorial’s focus: binary experiments
How do I analyze the data?
BASE – BioArray Software Environment
– Data storage and distribution
– Simple filtering, normalization, averaging, and statistics
– Export/Download results to other tools
MS Excel
TIGR Multi Experiment Viewer (TMEV)
This tutorial’s focus: using BASE

Today’s Presentation
Demonstrate the most basic analysis techniques
Using our most frequently used software (BASE)
For the most common kind of experiments

Work Flow
RNA → (QC & label) → Labeled cDNA → (hybridize) → Slides → (scan, segment) → Images & data files → (upload) → BASE → (analysis) → Researcher
The Most Common Experiment
Two-sample comparison w/N replicates
– KO vs. WT
– Treated vs. untreated
– Diseased vs. normal
– Etc
Question of interest: which genes are (most) differentially expressed?
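Experimental Design – naïve
[Figure: design diagram for samples A and B. From Gary Churchill, Jackson Labs]
Experimental Design – tech repl
[Figure: design diagram for samples A and B with technical replicates. From Gary Churchill, Jackson Labs]
Experimental Design – bio repl
[Figure: design diagram with levels Treatment, Biological Replicate, Technical Replicate, Dye, and Array for samples A and B. From Gary Churchill, Jackson Labs]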
The Most Common Analysis
Filter out bad spots
Adjust low intensities
Normalize – correct for non-linearities and dye inconsistencies
Filter out dim spots
Calculate average fold ratios and p-values per gene
Rank, sort, filter, squint, sift data
Export to other software

BASE @ MGH
BASE is a microarray data storage and analysis package
BASE resides on our web server
– Data is stored at our facility
– Computation is performed on our machines
All you need is a web browser
– https://base.mgh.harvard.edu/
– A Microarray Core technician will provide you with a username, password, and experiment name
BASE – Login page
BASE – Logged in
BASE – Sidebar
Reporters
Array LIMS
Biomaterials
Hybridizations
Analyze Data
Users
BASE – My Account
Change your password
and access defaults
Find your experiment
Experiment view: Four Tabs
Group slide data together
Select the slides that measure the same thing. Later in analysis, they will be averaged together. In this experiment, all ten slides are replicates, so there is only one grouping.
Give your data set a descriptive name to distinguish it from other slide groupings. In this Myd88 knockout experiment, there is only one grouping, so a generic name is fine.
Analysis: Begin
Analysis: Filter Setup
“Bad” spots are marked with a negative Flag value.
Oligos are annotated with species codes, but control spots are not. Set species to your two-letter code of choice (Mm, Hs, Dr, Pa, etc)
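The two filter criteria above can be pictured outside BASE as a simple row filter. The following is a minimal sketch only: the record layout and the field names (reporter, flag, species) are hypothetical and are not BASE's actual schema or filter syntax.

```python
# Hypothetical spot records; field names are illustrative, not BASE's schema.
spots = [
    {"reporter": "oligo_001", "flag": 0, "species": "Mm"},
    {"reporter": "oligo_002", "flag": -50, "species": "Mm"},  # flagged as bad
    {"reporter": "control_01", "flag": 0, "species": ""},     # control: no species code
    {"reporter": "oligo_003", "flag": 0, "species": "Hs"},
]


def quality_filter(spots, species_code):
    """Keep spots with a non-negative flag and the requested species code."""
    return [
        s for s in spots
        if s["flag"] >= 0 and s["species"] == species_code
    ]


quality = quality_filter(spots, "Mm")
print([s["reporter"] for s in quality])
```

Because control spots carry no species code, filtering on the species code removes them along with oligos from other species.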
Naming the filter and the child data set is essential to reducing confusion later.
Analysis: Filter Run
Analysis: Quality Data
Analysis: Unfiltered Data
Analysis: Filter Parameters
Analysis: Limit-Int Setup
Analysis: Check job status
“All done” indicates
the job is complete.
Analysis: Limit-Int Output
Analysis: Change data set name
Change the name of
this set to “Intensity
limited Data”
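The slides do not spell out what the intensity-limiting step does internally. One plausible reading of "adjust low intensities" is that channel values below a floor are raised to that floor, so that very dim spots do not produce extreme ratios. The sketch below assumes that behavior; the function name and the floor value of 20 are hypothetical, not BASE settings.

```python
def limit_intensity(value, floor=20.0):
    """Raise an intensity to the floor if it falls below it (assumed behavior)."""
    return max(value, floor)


# (red, green) intensity pairs for three hypothetical spots
channels = [(5.0, 300.0), (150.0, 12.0), (800.0, 640.0)]
limited = [(limit_intensity(r), limit_intensity(g)) for r, g in channels]
print(limited)
```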
Analysis: LOWESS Setup
Analysis: Check job status
Analysis: LOWESS Output
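For readers who want to see the idea behind LOWESS normalization, here is a simplified sketch. It is not the BASE plugin: it uses a plain local linear fit with tricube weights and no robustness iterations, and the function names and the frac parameter are chosen here for illustration. For each spot, M = log2(red/green) and A = the mean of log2(red) and log2(green); the fitted trend of M against A is subtracted so that intensity-dependent dye bias is removed.

```python
import math


def lowess_fit(x, y, frac=0.5):
    """Local linear fit with tricube weights; returns the fitted y at each x."""
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))  # neighbours used per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize(red, green, frac=0.5):
    """Return (A, normalized M) after subtracting the local trend of M on A."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    trend = lowess_fit(a, m, frac)
    return a, [mi - ti for mi, ti in zip(m, trend)]


# Hypothetical spots where red is uniformly twice green: a pure dye bias.
green = [100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0]
red = [2 * g for g in green]
a_values, m_normalized = normalize(red, green)
print(m_normalized)
```

In this toy case every spot has the same raw log ratio, so the whole signal is treated as bias and the normalized values come out at about zero. Real data keep the gene-specific deviations from the trend.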
Analysis: Change data set name
Change the name of this set to “Normalized Data” using the same steps as before.
Analysis: Filter Setup
Set up the filter as
indicated, hit
Add/Update on the
Gene filter, then hit
Accept and select the
resulting data set.
Analysis: Useful Data
MA Plots: Raw Myd88 Data
MA Plots: Quality Data
MA Plots: Int-limited Data
MA Plots: Normalized Data
MA Plots: Norm. Corr. Factor
MA Plots: Useful Data
Analysis: Useful Data
Analysis: Fold Ratio Setup
Analysis: Fold Ratio Output
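As a rough illustration of what an average fold ratio per gene means, the sketch below averages the log2 ratios of each gene across replicate slides and converts the mean back to a ratio (a geometric mean). Whether BASE averages in log space in exactly this way is an assumption; the gene names and values are invented.

```python
import math
from collections import defaultdict

# (gene, slide number, red/green ratio); hypothetical replicate measurements
measurements = [
    ("GeneA", 1, 2.0), ("GeneA", 2, 8.0),
    ("GeneB", 1, 0.5), ("GeneB", 2, 0.5),
]


def average_fold_ratios(measurements):
    """Average log2 ratios per gene, then convert the mean back to a ratio."""
    logs = defaultdict(list)
    for gene, _slide, ratio in measurements:
        logs[gene].append(math.log2(ratio))
    return {gene: 2 ** (sum(v) / len(v)) for gene, v in logs.items()}


fold = average_fold_ratios(measurements)
print(fold)
```

Averaging in log space treats a two-fold increase and a two-fold decrease symmetrically, which a plain arithmetic mean of ratios does not.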
Analysis: Change list name
Change the name of
this list as indicated
here.
Analysis: Fold Ratio Graphs
Analysis: t-test Setup
Analysis: t-test Output
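The slides do not state which t-test variant BASE applies. For a single grouping of replicate two-color slides, a natural choice is a one-sample t-test of each gene's log2 ratios against zero (no change), and the sketch below assumes that. The p-value is obtained by numerically integrating the Student t density with Simpson's rule, which is adequate for illustration but not tuned for extreme t values. All function names here are hypothetical.

```python
import math
import statistics


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density from 0 to |t|."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)


def one_sample_t(log_ratios):
    """Test whether the mean log ratio differs from zero; returns (t, p)."""
    n = len(log_ratios)
    mean = statistics.mean(log_ratios)
    sd = statistics.stdev(log_ratios)  # needs at least two replicates
    if sd == 0:
        raise ValueError("replicates are identical; t is undefined")
    t = mean / (sd / math.sqrt(n))
    return t, two_sided_p(t, n - 1)


# Hypothetical log2 ratios for one gene on three replicate slides
t_stat, p_value = one_sample_t([1.0, 2.0, 3.0])
print(t_stat, p_value)
```

With only a few replicates the test has little power, which is one reason the experiment in this tutorial uses ten replicate slides.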
Analysis: Change list name
Change the name of
this set to “myd88 pvalue” using the same
steps as before.
Analysis: t-test Graphs
Analysis: Experiment Explorer
EExplore: Single Gene View
EExplore: Gene List View
Fill out the table as
indicated, then hit
Add/Update.
EExplore: NCBI Links
EExplore: Gene List View
This additional row
will restrict hits to P
values of 5% or less.
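The effect of combining list criteria can be sketched as follows. Only the 5% p-value cutoff comes from the slide; the two-fold change threshold, the field names, and the example genes are hypothetical stand-ins for whatever the table on the slide specifies.

```python
import math

genes = [
    {"gene": "GeneA", "fold": 4.0, "p": 0.01},
    {"gene": "GeneB", "fold": 0.4, "p": 0.03},
    {"gene": "GeneC", "fold": 3.0, "p": 0.20},  # large change, not significant
    {"gene": "GeneD", "fold": 1.1, "p": 0.04},  # significant, tiny change
]


def select_hits(genes, max_p=0.05, min_fold=2.0):
    """Keep genes with p <= max_p that change at least min_fold up or down."""
    log_cut = math.log2(min_fold)
    hits = [
        g for g in genes
        if g["p"] <= max_p and abs(math.log2(g["fold"])) >= log_cut
    ]
    return sorted(hits, key=lambda g: g["p"])


hits = select_hits(genes)
print([g["gene"] for g in hits])
```

Comparing on the log scale lets a single threshold catch both up-regulated and down-regulated genes.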
EExplore: Single Gene View
EExplore: Gene List View
Open MS Excel and tell
it to open the file you
downloaded (typically
called base.tsv).
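If you prefer a script to Excel, a tab-separated export can also be read with Python's standard csv module. This is a sketch under assumptions: the column headers used here (Reporter, Fold ratio, P) are hypothetical, so check the header row of your own base.tsv and adjust the names.

```python
import csv
import io

# Stand-in for the downloaded file; the column names are assumptions.
sample = "Reporter\tFold ratio\tP\nGeneA\t4.0\t0.01\nGeneB\t0.4\t0.03\n"


def read_gene_list(handle):
    """Parse a tab-separated gene list into dictionaries with numeric fields."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "reporter": row["Reporter"],
            "fold": float(row["Fold ratio"]),
            "p": float(row["P"]),
        }
        for row in reader
    ]


rows = read_gene_list(io.StringIO(sample))
print(rows)

# For a real export, open the file instead of the in-memory sample:
# with open("base.tsv", newline="") as fh:
#     rows = read_gene_list(fh)
```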
Have Fun!
The rest of the analysis is largely driven by your biological understanding of the genes indicated in these lists. We cannot help much in the interpretation of this data.
Don’t forget to go back to the raw data sets and repeat this entire analysis for any other slide groupings.

Acknowledgements
MGH Lipid Metabolism Unit: Mason Freeman, Harry Bjorkbacka
MGH Molecular Biology Bioinformatics Group: Chuck Cooper, Xiaowei Wang
Harvard School of Public Health Biostatistics: Xiaoman Li
MGH Microarray Core: Glenn Short, Jocelyn Burke, Najib El Messadi, Jason Frietas, Zhiyong Ren
LUND (Sweden) Dept. Theoretical Physics & Dept. Oncology: Carl Troein, Lao H. Saal, Johan Vallon-Christersson, Sofia Gruvberger, Åke Borg, Carsten Peterson