Using Gene Ontology Annotations to Interpret DNA Array Data
Stefan Pierrou PhD, AstraZeneca
Spotfire Users Conference 2001-05-03
© 2001, AstraZeneca, Inc. - All Rights Reserved.
COPD Genomics
In collaboration with Southampton University
AstraZeneca-Southampton Collaboration
AZ R&D Lund and AZ R&D Charnwood
Stephen Holgate, Donna Davies & Ratko Djukanovic et al.,
Univ. of Southampton, U.K.
Analysis of Epithelial Gene Expression in COPD
Hypotheses:
• COPD - caused by smoke and exacerbated by infections
• COPD - characterised by altered epithelial genotypes and phenotypes
• In the absence of epithelial activation there is no development of chronic bronchitis, and no progression of airways remodeling which ultimately leads to irreversible obstruction
• Epithelial responses to stress (smoke) determine the pathological and clinical presentations of COPD
Molecular Sciences R&D Lund
Stefan Pierrou
Objective
To identify candidate genes associated with disease to
provide opportunities for development of novel
treatments of COPD.
Cycles of Tissue Damage in Pathogenesis of COPD
[Diagram; labels: Chronic irritation (smoking, infections, etc.); Genetic predisposition?; Epithelial activation, injury & remodeling; Mucociliary dysfunction; Mucus hypersecretion; Bacterial colonization; Bacterial products; Inflammatory response (proteases, chemokines, cytokines, oxidants); Disease progression]
COPD Pathology
Chronic airflow obstruction due to
chronic bronchitis and/or emphysema
Analysis of Epithelial Gene Expression in COPD
“Critical Path”
• Subjects: smokers with/without COPD; non-smokers
• Tissue source: brushings, bronchial biopsies, lung resection
• Primary cell-based model
• Stress related to COPD (smoke, oxidants, GFs): define the biochemical pathways initiated by COPD-related stresses
• Microarrays: identify differentially expressed genes
• Bioinformatics
• Tissue expression pattern: RT-PCR, IHC/in situ
• Functional assays: cytokine production, differentiation, proliferation, secretion, motility
• Candidate targets
Analysis of Epithelial Gene Expression in COPD
Study Design
Day 0: Clinical assessment
Day 14: Reversibility testing (salbutamol)
Day 21: Sputum induction to characterize inflammatory cells & mediators
Day 28: Bronchoscopy to obtain samples of bronchial epithelium
Day 70: Bronchoscopy to obtain bronchial biopsies
Analysis of Epithelial Gene Expression in COPD
General Exclusion Criteria
(1) Atopy (positive skin prick tests and history)
(2) Asthma (reversibility to salbutamol <12%)
(3) Respiratory diseases other than COPD
(4) Other conditions which might compromise bronchoscopy
(5) Recent respiratory or other infections (6 weeks)
(6) Recent treatment with oral or inhaled corticosteroids
Tests Performed
Clinical screening
MRC scale
St. George's questionnaire
Allergy testing
Histamine challenge
Diary card and peak flow
Serum, DNA
Sputum induction
Blood gases
Full Lung function
Salbutamol reversibility
CT Thorax
Bronchoscopies
Brushings
Biopsies-IHC, ISH
Data Analysis
Sort and Select
p-value
P call
E-lab
Excel Results Sheet
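The sort-and-select step listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: it takes "P call" to mean a per-sample detection call where "P" stands for present, and the field names, the 0.05 cut-off, the 50% present fraction and the example rows are all hypothetical rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    probe_id: str   # hypothetical probe identifier
    p_value: float  # p-value from the group comparison
    calls: list     # detection call per sample: "P", "M" or "A" (assumed coding)


def select_probes(results, max_p=0.05, min_present_fraction=0.5):
    """Keep probes passing both filters, sorted by ascending p-value."""
    selected = []
    for r in results:
        if not r.calls:
            continue  # no detection calls, nothing to judge
        present_fraction = sum(1 for c in r.calls if c == "P") / len(r.calls)
        if present_fraction >= min_present_fraction and r.p_value <= max_p:
            selected.append(r)
    return sorted(selected, key=lambda r: r.p_value)


# Hypothetical example rows, not real data.
example = [
    ProbeResult("probe_1", 0.001, ["P", "P", "P", "A"]),
    ProbeResult("probe_2", 0.200, ["P", "P", "P", "P"]),
    ProbeResult("probe_3", 0.010, ["A", "A", "M", "P"]),
    ProbeResult("probe_4", 0.0005, ["P", "P", "P", "P"]),
]

for r in select_probes(example):
    print(r.probe_id, r.p_value)
```

The selected rows could then be written to a results sheet for manual review.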
Brushing + control + stimulated (Sammon), incl. ALI
Clinical Data Overlay
GOAC - The Gene Ontology Annotation Campaign: some background
Analysis and clustering of gene expression data most often generates lists of incomprehensible gene names
Clustering of gene expression data according to protein function classification
• Classification is currently done manually
• Need for automation
• Gene Ontology is a tool to make this happen
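A minimal sketch of what automating the classification might look like: given a gene list from a cluster and a lookup of functional terms per gene, the genes are regrouped by term. The gene names and the annotation lookup are hypothetical placeholders, and this is not the tool described in the talk.

```python
from collections import defaultdict


def group_by_class(gene_list, annotations):
    """Map each functional class to the genes in gene_list annotated with it."""
    groups = defaultdict(list)
    for gene in gene_list:
        # Genes without any annotation are collected under "unclassified".
        for term in annotations.get(gene, ["unclassified"]):
            groups[term].append(gene)
    return dict(groups)


# Hypothetical annotation lookup: gene -> functional terms.
annotations = {
    "gene_a": ["transcription factor"],
    "gene_b": ["DNA helicase"],
    "gene_c": ["transcription factor", "DNA helicase"],
}
cluster = ["gene_a", "gene_b", "gene_c", "gene_d"]

for term, genes in sorted(group_by_class(cluster, annotations).items()):
    print(f"{term}: {', '.join(genes)}")
```

A gene with several annotations appears under each of its terms, so group sizes can add up to more than the length of the gene list.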
History - why GO?
• Need for data reduction based on biological information
• Data overlay tool discussions with Spotfire
• Spotfire created a plug-in for GO
• GO has since started to become the de facto standard for annotation of genes
GO Consortium - www.geneontology.org
• Drosophila (fruit fly) - FlyBase
• Saccharomyces (budding yeast) - Saccharomyces Genome Database (SGD)
• Mus (mouse) - Mouse Genome Database (MGD) & Gene Expression Database (GXD)
• Arabidopsis (brassica or mustard family) - The Arabidopsis Information Resource (TAIR)
• Caenorhabditis (nematode) - WormBase
GOC Collaborators
• Academic
  • SwissProt - annotations ongoing
  • InterPro - annotations ongoing - currently 1/2 done
• Corporate
  • Celera - uses GO for Drosophila
  • Incyte - sponsor to Stanford group
• GOC Sponsor - AstraZeneca
The Gene Ontology
• Molecular function describes the tasks performed by individual gene products; examples are transcription factor and DNA helicase.
• Biological process describes broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions.
• Cellular component encompasses subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and origin recognition complex.
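As an illustration of how the three ontologies can sit side by side in one record, the sketch below stores a gene product's terms per ontology. The class, the field names and the gene product "gene_x" are hypothetical; the term names are the examples given above, and the combination shown is not a real annotation.

```python
from dataclasses import dataclass, field

# The three ontologies named above.
ASPECTS = ("molecular_function", "biological_process", "cellular_component")


@dataclass
class GeneProductAnnotation:
    symbol: str
    terms: dict = field(default_factory=lambda: {a: [] for a in ASPECTS})

    def add(self, aspect, term):
        """Attach a term under one of the three ontologies."""
        if aspect not in ASPECTS:
            raise ValueError(f"unknown ontology: {aspect}")
        self.terms[aspect].append(term)


# Hypothetical gene product, for illustration only.
gp = GeneProductAnnotation("gene_x")
gp.add("molecular_function", "DNA helicase")
gp.add("biological_process", "mitosis")
gp.add("cellular_component", "nucleus")

for aspect in ASPECTS:
    print(aspect, gp.terms[aspect])
```

Keeping the ontologies separate lets the same gene list be summarised by function, by process or by location without mixing the three views.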
GOC Annotation status - as of April 29 2001

                          SGD      FlyBase   MGI
Biological Process        5,684    1,306     3,461
Molecular Function        5,780    5,290     4,574
Cellular Component        2,350    1,347     3,545
Total gene prod. annot.   6,373    5,628     5,603
The GOAC development project
How to make the use of Gene Ontology annotations a reality
Starting point for the GOAC development project
• Overlay expression data & visualise
• Gene ontology DB & browser
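One possible reading of "overlay expression data" on an ontology is to summarise, for every term, the expression values of all genes annotated at that term or at any of its descendants. The sketch below does this with a plain mean. The small hierarchy, the gene names and the values are hypothetical placeholders, and this is not the Spotfire plug-in's actual method.

```python
from statistics import mean

# Hypothetical fragment of a GO-like hierarchy: child term -> parent terms.
# GO is a directed acyclic graph, so a term may have more than one parent.
PARENTS = {
    "purine metabolism": ["metabolism"],
    "mitosis": ["cell cycle"],
    "metabolism": ["biological process"],
    "cell cycle": ["biological process"],
    "biological process": [],
}


def ancestors(term, parents):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, []))
    return seen


def overlay(expression, annotations, parents):
    """Average the values of all genes annotated at or below each term."""
    per_term = {}
    for gene, value in expression.items():
        terms = set()
        for t in annotations.get(gene, []):
            terms |= ancestors(t, parents)
        for t in terms:  # each gene counts once per term
            per_term.setdefault(t, []).append(value)
    return {t: mean(values) for t, values in per_term.items()}


# Hypothetical log ratios and annotations.
expression = {"gene_a": 2.0, "gene_b": 4.0, "gene_c": -1.0}
annotations = {
    "gene_a": ["mitosis"],
    "gene_b": ["mitosis"],
    "gene_c": ["purine metabolism"],
}
summary = overlay(expression, annotations, PARENTS)
for term in sorted(summary):
    print(term, round(summary[term], 2))
```

Collecting ancestors into a set before counting ensures a gene reaching the same parent through several paths contributes only once to that term.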
GOAC components
• Excel support file
• Gene ontology DB & browser (GO MySQL DB)
• Annotation database with GO terms (Oracle DB)
• Overlay expression data & visualise
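How the components might fit together can be sketched as a join between a term table, an annotation table and an expression table. The example uses an in-memory SQLite database purely as a stand-in for the MySQL and Oracle databases; the table names, column names, term identifiers and values are hypothetical and do not reflect the actual GO or AstraZeneca schemas.

```python
import sqlite3

# In-memory stand-in; all table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE go_term (term_id TEXT PRIMARY KEY, name TEXT, ontology TEXT);
    CREATE TABLE annotation (gene TEXT, term_id TEXT REFERENCES go_term(term_id));
    CREATE TABLE expression (gene TEXT PRIMARY KEY, log_ratio REAL);
""")
conn.executemany("INSERT INTO go_term VALUES (?, ?, ?)", [
    ("T1", "mitosis", "biological process"),            # placeholder IDs,
    ("T2", "purine metabolism", "biological process"),  # not real accessions
])
conn.executemany("INSERT INTO annotation VALUES (?, ?)", [
    ("gene_a", "T1"), ("gene_b", "T1"), ("gene_c", "T2"),
])
conn.executemany("INSERT INTO expression VALUES (?, ?)", [
    ("gene_a", 2.0), ("gene_b", 4.0), ("gene_c", -1.0),
])

# Join expression values to terms and summarise per term.
rows = conn.execute("""
    SELECT t.name, COUNT(*) AS n_genes, AVG(e.log_ratio) AS mean_log_ratio
    FROM expression e
    JOIN annotation a ON a.gene = e.gene
    JOIN go_term t ON t.term_id = a.term_id
    GROUP BY t.name
    ORDER BY t.name
""").fetchall()

for name, n_genes, mean_log_ratio in rows:
    print(name, n_genes, mean_log_ratio)
```

The per-term summary rows are the kind of table that could then be handed to a visualisation tool.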
Summary
• DNA microarray data can be analysed for a selected set of genes or for complete profiles
• We suggest the use of a controlled vocabulary such as GO for profile analysis
• The possibility of overlaying expression data on a structure such as GO will be essential
• The Spotfire plug-in developed in Göteborg will fill this need
Acknowledgement
• AstraZeneca
  • Bo Servenius
  • Robert Virtala
  • Krzysztof Pawlowski
  • Jacob Sjöberg
  • Dan Gustavsson
• Spotfire
  • Tobias Fändriks