CANCER GENOMICS
AAKROSH RATAN
PHS 5500: SPECIAL TOPICS IN PUBLIC HEALTH - PUBLIC HEALTH GENOMICS
14TH MARCH, 2016
HTTP://BIMS.VIRGINIA.EDU/FACULTY/AAKROSH-RATAN
RATAN@VIRGINIA.EDU
OUTLINE
• Historical perspective.
• Challenges of cancer genomics.
• Some lessons from WGS studies.
• Genomic resources.
• A brief introduction to RNA-Seq, non-coding RNA and
single cell genomics in cancer.
WHAT IS CANCER?
• Cancer is a generic term for a large group of diseases
that can affect any part of the body. Other terms used
are malignant tumors and neoplasms. One defining
feature of cancer is the rapid creation of abnormal
cells that grow beyond their usual boundaries, and
which can then invade adjoining parts of the body and
spread to other organs, the latter process is referred to
as metastasizing. Metastases are the major cause of
death from cancer.
Source: WHO
HISTORICAL MILESTONES
• Theodor Boveri, a German biologist, proposed:
• A malignant tumour cell is a cell with a specific defect; it has lost
properties that a normal tissue cell retains.
• In other words: cancer is a disease of the genome.
HISTORICAL MILESTONES
FIGURE 1.1 Historical milestones in cancer genomics. Key milestones in the field of cancer genomics are depicted starting with the elucidation of the structure of DNA by Watson and Crick in 1953. These milestones are depicted over a line graph of the total number of publications listed in the Pubmed database of the National Center for Biotechnology Information (NCBI) with the keywords “Cancer + (Genetics or Gene)” (in blue), or “Cancer + (Genomics or Genome)” (in green) from 1945 to 2013.
Source: Cancer Genomics
Historical Perspective and Current Challenges of Cancer Genomics
COMPREHENSIVE GENOMIC ANALYSES
[Figure 9.1, from Chapter 9, Bioinformatics for Cancer Genomics: workflow from experimental assay technique to data type to result, followed by data integration and interpretation. Molecules examined: DNA, histones, mRNA.]
• RNA-seq, microarray (transcriptome): SNPs, RNA edits, isoforms, ncRNA, gene expression, novel/fusion transcripts
• SNP arrays (genome): SNPs, CNVs, LOH
• WGS (genome): SNPs, indels, CNVs, LOH, SVs
• WES (exome): SNPs, LOH
• Targeted PCR: diagnosis, verification
• Bisulfite-seq (epigenome): methylation, gene regulation
• Integration & interpretation (gene lists): filtering gene lists, mechanism of oncogenesis, classifying disease subtypes, diagnostic biomarkers, driver versus passenger aberrations
FIGURE 9.1 Through the application of high-throughput sequencing technologies, the genome, the epigenome and the transcriptome can be examined in great detail, providing a comprehensive picture of the state of health or any alterations leading to disease. Such experiments allow the identification of both small and large variations in individual samples.
Source: Cancer Genomics
Historical Perspective and Current Challenges of Cancer Genomics
ONE REASON WE ANALYZE GENOMES
TYPES OF VARIANTS
Source: Alkan et al., 2011
TYPICAL CANCER GENOMIC INVESTIGATION
• Tumour and adjacent healthy tissue samples are sequenced. After alignment,
detection tools identify alterations and aberrations, which are then annotated
and analysed individually (Level I), for example for likely functional
implications, and collectively (Level II), for example to identify relevant
gene pathways and networks.
Determine which somatic variants are statistically
significant in the complete population of patients with
that cancer type and determine which genes and
pathways are essential to this tumor type.
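A minimal sketch of the tumour-versus-normal comparison described above, assuming variants have already been called and are represented as (chromosome, position, ref, alt) tuples. The function names, variants, and patient labels are hypothetical, and real pipelines rely on dedicated somatic callers and statistical models rather than a plain set difference.

```python
from collections import Counter

# Each variant is a (chromosome, position, ref allele, alt allele) tuple.
# All variants and patient labels below are made up for illustration.

def somatic_variants(tumor_calls, normal_calls):
    """Return variants seen in the tumor but not in the matched normal."""
    return set(tumor_calls) - set(normal_calls)

def recurrence(per_patient_somatic):
    """Count in how many patients each somatic variant appears."""
    counts = Counter()
    for variants in per_patient_somatic.values():
        counts.update(set(variants))
    return counts

tumor_p1 = [("chr7", 100, "A", "T"), ("chr1", 50, "G", "C"), ("chr2", 10, "C", "T")]
normal_p1 = [("chr1", 50, "G", "C")]   # germline variant, present in both samples
tumor_p2 = [("chr7", 100, "A", "T"), ("chr3", 77, "G", "A")]
normal_p2 = [("chr3", 77, "G", "A")]

cohort = {
    "patient1": somatic_variants(tumor_p1, normal_p1),
    "patient2": somatic_variants(tumor_p2, normal_p2),
}
recurrent = recurrence(cohort)
for variant, n_patients in recurrent.most_common():
    print(variant, n_patients)
```

Variants shared across many patients are the starting point for the population-level significance question on the slide; the counting here only illustrates the idea and is not a significance test.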
GENETIC HETEROGENEITY
Source: Cancer Genome Landscapes. Bert Vogelstein et al., Science 339, 1546 (2013);
Source: doi:10.1038/nmeth.3440
Source: doi:10.1038/nrg3767
HOW MANY GENES ARE MUTATED IN A TYPICAL HUMAN CANCER?
• Melanomas and lung
tumors: ~200
nonsynonymous mutations
per tumor
• Other solid tumors: ~33-66
per tumor (95% of these
are SNVs)
• Pediatric tumors &
Leukemias: ~9.6 per tumor
MUTATIONAL TIMING
Genetic alterations and the progression of colorectal cancer. The major signaling pathways that drive
tumorigenesis are shown at the transitions between each tumor stage. One of several driver genes that encode
components of these pathways can be altered in any individual tumor. Patient age indicates the time intervals
during which the driver genes are usually mutated. Note that this model may not apply to all tumor types. TGF-β,
transforming growth factor–β.
Source: Cancer Genome Landscapes. Bert Vogelstein et al., Science 339, 1546 (2013);
OTHER TYPES OF ALTERATIONS IN TUMORS
Alterations affecting protein-coding
genes in selected tumors
TMPRSS2-ERG Oncogene
Source: Cancer Genome Landscapes. Bert Vogelstein et al., Science 339, 1546 (2013);
DRIVER VS. PASSENGER MUTATIONS
• Driver gene mutation: Mutation that confers selective growth
advantage.
• Driver gene: Gene that harbors the driver mutation. NB: A driver gene
can harbor passenger mutations.
• Methods to identify such genes based on:
• Frequency of mutations in a gene, compared to other genes in the
same/related tumors after corrections.
• Predicted effects of mutations.
• Confusion over definition of “Driver gene” in literature.
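The frequency-based idea above can be sketched as a one-sided binomial test: given a background mutation rate, how surprising is it that a gene is mutated in k of n tumors? This is only an illustration under strong assumptions (a single uniform background rate, independent tumors, no correction for gene-specific covariates); the function names and the numbers are hypothetical and do not reproduce any published tool.

```python
from math import comb

def prob_gene_hit(background_rate, coding_length):
    """Probability that a gene carries at least one mutation in one tumor,
    assuming each base mutates independently at the background rate."""
    return 1.0 - (1.0 - background_rate) ** coding_length

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical inputs: 1 mutation per megabase, a 3,000 bp coding sequence,
# and a gene found mutated in 12 of 200 tumors.
p_hit = prob_gene_hit(background_rate=1e-6, coding_length=3000)
p_value = binomial_upper_tail(k=12, n=200, p=p_hit)
print(f"per-tumor hit probability: {p_hit:.5f}")
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value only says the gene is mutated more often than the assumed background predicts; the "corrections" mentioned on the slide (for example, for gene-specific mutation rates) are exactly what this sketch leaves out.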
ONE METHOD TO IDENTIFY AND
CLASSIFY DRIVER GENES
Distribution of mutations in two oncogenes (PIK3CA and IDH1) and two tumor suppressor genes (RB1 and VHL).
The distribution of missense mutations (red arrowheads) and truncating mutations (blue arrowheads) in
representative oncogenes and tumor suppressor genes are shown. The data were collected from genome-wide
studies annotated in the COSMIC database (release version 61). For PIK3CA and IDH1, mutations obtained from
the COSMIC database were randomized by the Excel RAND function, and the first 50 are shown.
Source: Cancer Genome Landscapes. Bert Vogelstein et al., Science 339, 1546 (2013);
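The mutation patterns in the figure (recurrent missense hotspots in oncogenes, scattered truncating mutations in tumor suppressor genes) suggest a simple ratio-based classifier. The sketch below is a simplified reading of the ratiometric "20/20 rule" described in the cited Vogelstein et al. paper, as I understand it: a gene looks oncogene-like if more than 20% of its mutations are missense at recurrent positions, and tumor suppressor-like if more than 20% are inactivating. The input format, the definition of "recurrent" (two or more missense mutations at one position), the order of the checks, and the example genes are all assumptions.

```python
from collections import Counter

def classify_gene(mutations, threshold=0.20):
    """Classify a gene from a list of (position, kind) mutations,
    where kind is 'missense' or 'truncating'."""
    total = len(mutations)
    if total == 0:
        return "unclassified"
    missense_by_position = Counter(pos for pos, kind in mutations if kind == "missense")
    # Missense mutations that fall on a position hit at least twice.
    recurrent_missense = sum(c for c in missense_by_position.values() if c >= 2)
    truncating = sum(1 for _, kind in mutations if kind == "truncating")
    # Assumption: the inactivating-mutation check takes precedence.
    if truncating / total > threshold:
        return "tumor suppressor-like"
    if recurrent_missense / total > threshold:
        return "oncogene-like"
    return "unclassified"

# Hypothetical mutation lists, not taken from COSMIC.
hotspot_gene = [(545, "missense")] * 8 + [(50, "missense"), (200, "truncating")]
scattered_gene = [(10, "truncating"), (45, "truncating"), (80, "missense"),
                  (120, "truncating"), (300, "missense")]

print(classify_gene(hotspot_gene))
print(classify_gene(scattered_gene))
```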
HOW MANY DRIVER GENES EXIST?
• Identify candidate driver genes in 3,205 samples from 12 cancer types
• frequency-based algorithm MuSiC : 232
• functional impact bias tool OncodriveFM: 259
• 68 of those candidate driver genes were common
• Cancer Gene Census:
• ~1% of human genes are implicated in cancer via mutations.
• ~90% have somatic mutations in cancer
• ~20% bear germline mutations that predispose to cancer
• ~10% show both somatic and germline mutations.
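The counts above can be turned into a quick overlap calculation. The numbers are the ones on the slide; the variable names are illustrative only.

```python
# Candidate driver gene counts from the slide (3,205 samples, 12 cancer types).
music_genes = 232        # frequency-based algorithm (MuSiC)
oncodrivefm_genes = 259  # functional impact bias tool (OncodriveFM)
shared_genes = 68        # candidates reported by both

union = music_genes + oncodrivefm_genes - shared_genes
jaccard = shared_genes / union
frac_music = shared_genes / music_genes
frac_oncodrivefm = shared_genes / oncodrivefm_genes

print(f"candidates from either tool: {union}")
print(f"Jaccard overlap: {jaccard:.3f}")
print(f"share of MuSiC calls confirmed: {frac_music:.3f}")
print(f"share of OncodriveFM calls confirmed: {frac_oncodrivefm:.3f}")

# Cancer Gene Census percentages from the slide, checked by inclusion-exclusion:
# somatic + germline - both should cover all census genes.
somatic_pct, germline_pct, both_pct = 90, 20, 10
either_pct = somatic_pct + germline_pct - both_pct
print(f"census genes with somatic or germline mutations: {either_pct}%")
```

The two methods agree on roughly 16% of the combined candidate list, which illustrates why the number of driver genes depends heavily on how "driver" is defined and detected.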
DARK MATTER
• ~5-7 “hits” in driver genes needed to develop solid tumors.
• But as we saw, molecular genetic studies report 0-2 driver mutations in
pediatric cancers, 3-6 in several adult tumors.
• Missing mutations
• Technical issues with WGS.
• Sample issues.
• Intronic or Intergenic mutations
• Epi-Driver genes (Altered through non-DNA changes)
SIGNALING PATHWAYS IN TUMORS
Cancer cell signaling pathways and the cellular processes they regulate. Most of the driver genes can be classified
into one or more of 12 pathways (middle ring) that confer a selective growth advantage. These pathways can
themselves be further organized into three core cellular processes.
Source: Cancer Genome Landscapes. Bert Vogelstein et al., Science 339, 1546 (2013);
HYPOTHESES RELATED TO ORIGIN AND EVOLUTION OF CANCER
Source: Maugeri-Saccà et al., 2013
CANCER TRANSCRIPTOME SEQUENCING AND ANALYSIS
• RNA-seq libraries: fragmentation or randomly primed amplification of cDNA
molecules, followed by the addition of universal sequencing adaptors
• Replicates are important.
• Alignment to the reference is done using a splice-aware aligner, e.g., STAR
• Differential expression: DESeq2, edgeR
• Allele-specific expression: GeneiASE, ASE-TIGAR (requires DNA of sample)
• Null hypothesis is that the ratio of observed alleles will be balanced at
heterozygous sites
• Deviation shows how a mutation can affect transcription
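The null hypothesis for allele-specific expression can be illustrated with an exact two-sided binomial test at a single heterozygous site, where each read is assumed to come from either allele with probability 0.5. This is a generic sketch, not the method implemented by GeneiASE or ASE-TIGAR; the function name and read counts are hypothetical, and real analyses also have to handle mapping bias and overdispersion.

```python
from math import comb

def ase_binomial_p(ref_reads, alt_reads):
    """Exact two-sided p-value for allelic balance (p = 0.5) at one site.

    Sums the probabilities of all outcomes that are no more likely than
    the observed reference-allele count.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * 0.5**n for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so symmetric outcomes compare as equal.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))

# Hypothetical read counts at two heterozygous sites.
print(f"balanced site (20 vs 20): p = {ase_binomial_p(20, 20):.3f}")
print(f"skewed site (30 vs 10):   p = {ase_binomial_p(30, 10):.4f}")
```

A small p-value at a site flags a departure from balanced expression of the two alleles, which is the deviation the slide refers to.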
CANCER TRANSCRIPTOME SEQUENCING AND ANALYSIS
• Fusion analysis:
• Align against a reference: TopHat-Fusion, DeFuse
• De novo assembly: Trans-ABySS
• Somatic SNV analysis: coverage can be higher on
genes that are expressed
GENOMIC RESOURCE PROJECTS
• The Cancer Genome Atlas (TCGA)
• aims to catalog and discover major cancer-causing somatic lesions in over 20 types of
adult cancers
• ~20 research institutes starting in 2006.
• http://cancergenome.nih.gov
• International Cancer Genome Consortium
• aims to sequence 25 000 cancer genomes, supplementing these with epigenomic and
transcriptomic studies for each case over a 10-year period
• ~50 cancer types, involves several groups from several countries.
• offer guidelines for projects if they want to generate data that would be included later.
PORTALS
• Portals provide visualization, analysis and download of large-scale
cancer genomics data sets
• cBio portal (Memorial Sloan-Kettering Cancer Center)
• http://www.cbioportal.org/
• Integrative analysis of complex cancer genomics and clinical
profiles using the cBioPortal. Sci. Signal. 6, pl1 (2013).
• http://www.tumorportal.org/
• Broad Institute
ROLE OF NON-CODING RNAS
• MicroRNAs are a class of non-coding RNAs that are estimated to regulate expression of
two thirds of the mammalian genome by binding to promoter, coding and untranslated
regions of messenger RNAs, proteins or other non-coding RNAs
• MicroRNAs are frequently located within cancer-associated genomic regions (CAGR)
and can act as tumor suppressors or oncogenes
• Aberrant microRNA gene expression signatures characterize cancer cells and microRNA
profiling can be applied in diagnosis, prognosis, and treatment of cancer patients
• Circulating microRNAs are potential non-invasive biomarkers in cancer
• Restoring or blocking microRNA function is a potential treatment method for specific
types of cancer
• Ultraconserved genes are deregulated in cancer and the unique expression profile of
these genes is characterized in chronic lymphocytic leukemia and colorectal cancers
SINGLE CELL
GENOMICS
• Addresses key issues in cancer research
• resolving intratumor
heterogeneity
• tracing cell lineages
• understanding rare tumor
cell populations
• measuring mutation rates …
• Methods for SCS still evolving
Methods for isolating single cancer cells from abundant and
rare populations. (a) Methods for isolating single cells from
abundant cellular populations include: micromanipulation by
robotics or mouth pipetting, serial dilutions, flow-sorting,
microfluidics platforms and laser-capture microdissection. (b)
Methods for isolating single cells from rare cellular
populations include: CellSearch, DEP-Array, CellCelector,
MagSweeper and nano-fabricated filters
SOBERING FACTS (SOURCE: WHO)
• 14 million new cases and 8.2 million cancer related deaths in 2012.
• The number of new cases is expected to rise by about 70% over the next 2 decades.
• Among men, the 5 most common sites of cancer are lung, prostate, colorectum, stomach, and liver cancer.
• Among women, the 5 most common sites are breast, colorectum, lung, cervix, and stomach cancer.
• Around one third of cancer deaths are due to the 5 leading behavioural and dietary risks: high body mass
index, low fruit and vegetable intake, lack of physical activity, tobacco use, alcohol use.
• Tobacco use is the most important risk factor for cancer causing around 20% of global cancer deaths and
around 70% of global lung cancer deaths.
• Cancer causing viral infections such as HBV/HCV and HPV are responsible for up to 20% of cancer deaths in
low- and middle-income countries.
• More than 60% of world’s total new annual cases occur in Africa, Asia and Central and South America.
These regions account for 70% of the world’s cancer deaths.
• It is expected that annual cancer cases will rise from 14 million in 2012 to 22 million within the next 2 decades.