Download From Tissue to Mutations

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

The Cancer Genome Atlas wikipedia , lookup

Transcript
Digital Comprehensive Summaries of Uppsala Dissertations
from the Faculty of Medicine 1028
From Tissue to Mutations
Genetic Profiling of Colorectal Cancer
LUCY MATHOT
ACTA
UNIVERSITATIS
UPSALIENSIS
UPPSALA
2014
ISSN 1651-6206
ISBN 978-91-554-9042-3
urn:nbn:se:uu:diva-231508
Dissertation presented at Uppsala University to be publicly examined in Rudbecksalen,
Rudbecklaboratoriet, Dag Hammarskjölds väg 20, Uppsala, Friday, 7 November 2014 at
09:15 for the degree of Doctor of Philosophy (Faculty of Medicine). The examination will
be conducted in English. Faculty examiner: Assc. Professor Alberto Bardelli (University of
Torino).
Abstract
Mathot, L. 2014. From Tissue to Mutations. Genetic Profiling of Colorectal Cancer. Digital
Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1028.
47 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9042-3.
Comprehensive characterisation of the mutational landscapes of solid tumours is a multistep
process involving the collection of suitable samples, the extraction of nucleic acids and the
preparation of these materials for mutational analyses. In this thesis, I aimed to develop a
streamlined process for the analysis of colorectal cancer (CRC) patient samples in order to
identify novel mutations that hallmark the development of advanced disease.
Papers I and II outline a technique for serial extraction of nucleic acids from frozen tissue that
we developed and subsequently implemented on a robotic platform to enable high-throughput
processing. The extracted nucleic acids were validated in downstream processes relevant for
genetic analyses, including traditional Sanger and next generation sequencing techniques.
In Paper III, we developed a genotyping method based on multiplex ligation-dependent
genome amplification. The method was designed such that InDel polymorphisms of between
30 and 70 % prevalence in a European population were selected and amplified in a multiplex
PCR assay. DNA from 24 patient-matched colorectal tumour and normal tissues was genotyped
and paired with a high match probability.
In Paper IV, we performed targeted resequencing of 107 primary CRCs, of which
approximately half developed metastatic disease or had distant metastases at the time of
diagnosis. We chose to analyse 676 genes based on their involvement in key signalling pathways
in CRC. We found an enrichment of mutations in the Eph receptor tyrosine kinase gene family
in metastatic patients, indicating a potential role for these genes in CRC metastasis.
This thesis outlines a series of procedures that can be employed in a high-throughput setting
for the analysis of solid tumours. We applied these methods to the analysis of colorectal tumours
and propose a link between novel somatic mutations and metastatic disease.
Keywords: Nucleic acid extraction, genotyping, targeted sequencing, mutation analysis,
colorectal cancer, metastatic disease
Lucy Mathot, Department of Immunology, Genetics and Pathology, Rudbecklaboratoriet,
Uppsala University, SE-751 85 Uppsala, Sweden.
© Lucy Mathot 2014
ISSN 1651-6206
ISBN 978-91-554-9042-3
urn:nbn:se:uu:diva-231508 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-231508)
Éist le fuaim na habhann agus gheobhfaidh tú breac.
Do mo thuismitheoirí
List of Papers
This thesis is based on the following papers, which are referred to in the text
by their Roman numerals.
I
Mathot, L., Lindman, M. and Sjöblom, T. (2011) Efficient and scalable serial extraction of DNA and RNA from frozen tissue samples.
Chem Commun. 47: 547-549 – Reproduced by permission of The
Royal Society of Chemistry.
II
Mathot, L., Wallin, M. and Sjöblom, T. (2013) Automated serial extraction of DNA and RNA from biobanked tissue specimens. BMC
Biotechnol, 13, 66.
III Mathot, L., Falk-Sörqvist, E., Moens, L., Allen, M., Sjöblom, T.
and Nilsson, M. (2012) Automated genotyping of biobank samples
by multiplex amplification of insertion/deletion polymorphisms.
PloS ONE, 7, e52750.
IV Mathot, L.*, Ljungström, V.*, Moens, L., Adlerteg, T., FalkSörqvist, E., Mayrhofer, M., Sundström, M., Micke, P., Botling, J.,
Isaksson, A., Birgisson, H., Glimelius, B., Nilsson, M† and Sjöblom,
T†. (2014) Somatic Eph receptor mutations in primary colorectal
cancers are associated with metastasis. Manuscript.
* These authors contributed equally to this work.
†
Equal last authors.
BMC Biotechnology and PLOS ONE are open-access journals and the authors retain rights to their work.
Contents
Introduction ................................................................................................... 11
Cancer ....................................................................................................... 11
Genes and pathways involved in cancer development ........................ 11
Genomic instability in cancer .............................................................. 12
Characterisation of cancer genomes on a large scale ............................... 13
Sample availability and quality ............................................................ 13
Nucleic acid isolation ........................................................................... 14
Sample identification and matching..................................................... 16
Mutational analysis .............................................................................. 17
Colorectal cancer ........................................................................................... 22
Epidemiology and aetiology ................................................................ 22
Pathology and staging .......................................................................... 23
Genetic basis of disease development.................................................. 23
Instability phenotypes in colorectal cancer .......................................... 24
Candidate CRC driver mutations ......................................................... 26
Therapeutic interventions..................................................................... 27
Prognostic markers in metastatic disease ............................................. 28
Present investigations .................................................................................... 30
Aims.......................................................................................................... 30
Serial extraction of DNA and RNA from frozen tissue............................ 31
Genotyping of patient-matched tumour/normal tissue ............................. 32
Deep sequencing reveals mutations associated with CRC metastasis ...... 33
Concluding remarks and future perspectives ................................................ 35
Acknowledgements ....................................................................................... 38
References ..................................................................................................... 40
Abbreviations
CIMP
CIN
CRC
ctDNA
DNA
DSBR
FAP
FFPE
HNPCC
InDel
MLGA
MMR
MSI
MSS
NAPI
NGS
PARP
PCR
RNA
RT-PCR
SNP
STR
T/N
TNM
TSG
5-FU
CpG island methylator phenotype
Chromosomal instability
Colorectal cancer
Circulating tumour DNA
Deoxyribonucleic acid
Double strand break repair
Familial adenomatous polyposis
Formalin-fixed, paraffin-embedded
Hereditary non-polyposis colorectal cancer
Insertion/deletion polymorphism
Multiplex ligation-dependent genome amplification
Mismatch repair
Microsatellite instability
Microsatellite stable
Nucleic acid purification and isolation
Next generation sequencing
Poly ADP ribose polymerase
Polymerase chain reaction
Ribonucleic acid
Reverse transcription polymerase chain reaction
Single nucleotide polymorphism
Short tandem repeat
Tumour/normal
Tumour, nodes, metastases
Tumour suppressor gene
5-Fluorouracil
Introduction
Cancer
Cancer is a genetic disease, caused by mutations in genes controlling proliferation and the regulation of cellular activities. Genetic alterations found in
cancer cells include subtle alterations (point mutations, small insertions and
deletions) and chromosomal aberrations (translocations, insertions and deletions). Some of these cancer-causing mutations are present in the germline,
resulting in a hereditary susceptibility to cancer; however, the vast majority
are somatic mutations, accumulated over time. In common solid tumours, it
is estimated that an average of between thirty-three and sixty-six genes display subtle somatic alterations1. Consequently, no single gene defect causes
cancer, as may be the case in other genetic diseases, such as Cystic fibrosis
or Duchenne muscular dystrophy. Instead, cancer development is a multistep
process, where some of the mutations acquired provide a growth advantage
to the cell, causing it to proliferate uncontrollably2,3.
Genes and pathways involved in cancer development
Although much remains to be discovered about the complexities of cancer
development, in particular in terms of the progression to advanced disease
stages, many of the genes and alterations responsible have been identified.
Genes involved in cancer development can be divided into oncogenes, tumour suppressor genes (TSGs) and stability genes4. Oncogenes and TSGs,
when mutated, drive disease progression by stimulating cell growth through
constitutive activation of a proliferation signal, or by inhibiting cell-cycle
arrest or cell death, respectively. For oncogenes, the activation of one allele
is generally sufficient to confer a growth advantage to the cell, e.g. activating
KRAS mutations in colorectal cancer (CRC). Oncogene activation can arise
by point mutations, gene amplifications, chromosomal translocations and
genomic rearrangements creating fusion genes. Conversely, for TSGs, both
alleles generally require inactivation, e.g. APC mutations and loss of heterozygosity in CRC. This is known as the “two-hit” hypothesis, first proposed
by Knudson in 19715. Like oncogenes, the inactivation of TSGs can also
occur through various mechanisms; point mutations resulting in missense or
nonsense mutations that render the protein product inactive or truncated,
small insertions/deletions, deletion of a chromosomal region (or entire
11
chromosomes), rearrangements disrupting the coding sequence or silencing
by DNA hypermethylation.
Mutations in, or the dysregulation of stability genes promote tumourigenesis indirectly, by increasing mutation rate. Stability genes maintain genomic
integrity and are involved in repairing and sustaining the genome during
normal replication and following exposure to mutagens6. This class of genes
include those involved in base excision repair, nucleotide excision repair,
mismatch repair (MMR) and double-strand break repair (DSBR). Inactivation of these stability genes leads to an overall increased mutation rate,
termed genomic instability, increasing the probability for a cancer-initiating
mutation.
Although there are many genes mutated in cancer, only a limited number
of pathways are affected. Mutations in genes belonging to the same pathway
lead to similar phenotypes. Alteration events within the same pathway are
generally mutually exclusive, i.e. a mutation of one gene in the pathway is
sufficient and it is unlikely that two genes of the same pathway will be affected in cancer4. Understanding which pathway the mutated gene belongs to
can provide opportunities for therapeutic intervention.
Genomic instability in cancer
Genomic instability has been described as an “enabling characteristic” in
cancer whereby an increased mutation rate is often required to accelerate
tumour progression2,3. The maintenance of genomic integrity is usually so
well-controlled by surveillance systems involving stability genes that mutation rates in normal cells are low (1 – 2.5 × 10-8 mutations per generation7,8)
and a sufficient number of mutations would take a longer time to accumulate
without increasing mutability.
Instability in cancer is linked with almost all solid tumours9, both hereditary and sporadic. In some hereditary cancer syndromes, a germline mutation
in one of the stability genes results in a predisposition to cancer development, often at a younger age compared to sporadic cancers of the same tissue
of origin. For example, mutations involved in genes regulating DSBR such
as BRCA1 and BRCA2 predispose to breast cancer while germline mutations
in MMR genes in hereditary non-polyposis colorectal cancer (HNPCC) predispose to CRC10.
In non-inherited cancers, the situation is more complex. In many sporadic
tumours, where no known predisposing condition exists, it is unclear whether an increased mutability is required for cancer initiation, or if it is sufficient for genomic instability to be acquired during progression. In addition,
conflicting studies on CRCs have reported both a lack of evidence of genomic instability in early lesions while, conversely, mathematical models
have predicted that instability is a required initiating event11,12. Intriguingly,
some solid tumours do not show genomic instability, indicating that some
12
cancers develop with a normal mutation rate, or one that is increased by extrinsic factors.
Bearing the inherent genomic instability of cancer in mind, it is evident
that the delineation of driver mutations in cancer is not a trivial task, since
many other genes become mutated along the way, i.e. passengers, which are
not required for tumour progression. Driver mutations are those driving tumour progression, they are selected for and confer a growth advantage to the
cell (an estimated 0.4 % increase in the difference between cell birth and cell
death rate for each driver mutation)13. Passenger mutations in cancer are
those that arise as a result of instability and increased cell division but which
do not confer a growth advantage to the cell. Therefore, passenger mutations
are somatic but not necessarily selected for1. As a consequence, it is important to be able to distinguish passenger mutations from mutations that are
driving tumour growth, since these are the alterations that must be targeted
in order to halt or decrease the rate of tumour progression. Studies suggest
that the majority of solid tumours require five to eight driver mutations to
develop, however, this may be an underestimation, and more driver genes
are likely to be discovered as technologies improve1.
Characterisation of cancer genomes on a large scale
Large-scale efforts to systematically characterise many different types of
solid tumours at a mutational level have been a major focus of the cancer
genetics field for the past decade14–24, greatly broadening our knowledge of
cancer genomes and the complexities underlying this disease. Characterisation of the cancer genome is a multi-step process, with each step presenting
its own challenges and pitfalls. From the time a sample is taken from the
patient to the final diagnosis or new research finding, many factors must be
taken into consideration. A properly conducted study requires adequate sample storage facilities, means to extract nucleic acids in a robust fashion, quality control of these extracted components and sensitive techniques for downstream applications such as mutation detection by sequencing.
Sample availability and quality
Translational cancer research is dependent on the availability of a large
number of samples of high quality. Gaining access to patient material has
been identified as one of the main bottlenecks in large-scale cancer research
efforts25. Issues with consent, ethical permits and the availability of the correct samples for each kind of analysis performed are some of the difficulties
commonly encountered.
Biobanking, particularly in the field of cancer research, is becoming an
increasingly important resource. Tumour biobanks are specimen and patient
13
information storage facilities that are designed to collect, store and distribute
samples of tumour and normal tissue26. These samples are intended for further use in fundamental and translational cancer research and greatly impact
the development of targeted or personalised anti-cancer treatment. A large
tumour sample repository is vital for studies such as large-scale next generation sequencing (NGS) efforts, where hundreds to thousands of tumour/normal (T/N) paired specimens are necessary for deducing less commonly mutated genes that may be involved in tumourigenesis27.
Issues of biobanking in cancer research
Researchers face two major problems when it comes to cancer biobanking.
Firstly, that they cannot always gain access to a sufficient number of samples
and secondly, that the available samples are not suitable for the required
purposes28. In addition, although biospecimens stored in biobanks are generally considered to give an accurate representation of the biology of a patient's disease, this is often not the case. The biospecimen may instead reflect
a combination of disease biology and the biospecimen's response to a wide
range of biological stresses, thus providing irreproducible data and confounding the results of many downstream applications, which ultimately
leads to the misinterpretation of artefacts29.
As biospecimen quality is a critical issue for tumour banks25,30, cancer tissue biobanking involves striking a balance between maintaining diagnostic
utility of tissue specimens while preserving nucleic acid integrity31. This is
important since biological materials are vulnerable to rapid change once
removed from the patient and need to be preserved so that the in vitro state
resembles as closely as possible that of the in vivo state28.
Nucleic acid isolation
Nucleic acid purification and isolation (NAPI) from biological specimens is
a process that has been employed by researchers for more than a century,
ever since Miescher isolated “nuclein” from leukocytes in 186932. Since
then, several techniques have been developed for optimal isolation of both
DNA and RNA from biological sources.
Methods of nucleic acid preparation from tissues
When dealing with tissue samples, the specimen may be fresh, frozen or
stabilised in a fixative and each type needs to be processed differently depending on the nature of the sample. Fresh and frozen tissues produce biomolecules of higher integrity than fixed specimens and require less processing prior to nucleic acid isolation33. Fresh tissues, however, are susceptible to endonuclease digestion unless processed immediately. Fixed specimens have been chemically modified to preserve biomolecule integrity for
long periods of storage at room temperature. The most commonly used fixa14
tive is formaldehyde, which denatures DNA and introduces sequencing artefacts by crosslinking cytosine nucleotides34–36. Tissues fixed in formaldehyde
are generally embedded in paraffin wax to maintain tissue morphology37.
Hence, paraffin must be removed prior to nucleic acid isolation either by
heat treatment or removal using an organic solvent such as xylene38. Formalin-fixed, paraffin-embedded (FFPE) tissue also affects the quality of RNA
that can be extracted, often due to incomplete lysis and high tissue content of
RNases that can degrade RNA during fixation39.
To recover the cell-associated and tissue-linked molecules, in addition to
removing debris and large molecules which may interfere with the extraction
of nucleic acids, the sample must first be disrupted by homogenisation, using
physical means, enzymatic digestion, detergents or a combination of all
three. Physical disruption utilises various apparatuses and technologies, e.g.
bead mill homogenisers, shaking-type bead mills, blade homogenisers,
freeze fracturing, grinders etc.40. Enzymatic digestion techniques such as the
use of proteinase K are widely employed for the removal of contaminating
proteins and nuclease digestion. An additional advantage of proteinase K in
nucleic acid isolation is that it retains activity in the presence of denaturing
agents and detergents which are commonly used for destabilising and removing proteins when isolating DNA and RNA from complex biological
mixtures41,42.
Following sample disruption and denaturation, various techniques can be
used to isolate nucleic acids (Table 1). Many of these are traditional, wellestablished methods and may be employed either in liquid phase with the
application of organic solvents (phenol-chloroform followed by ethanol precipitation)43 or salting out procedures44, or in solid phase with the selective
binding of the nucleic acid to a solid support (mainly silica) under chaotropic
conditions45–47.
High-throughput tissue processing
NAPI is critical for many major analytical procedures and poor sample preparation leads to sub-optimal results in downstream processes. Recent genome sequencing efforts have led to an increase in the number of samples
that nucleic acids are routinely prepared from. Consequently, there is a requirement not only for high quality biomolecules, but also for methods by
which samples can be prepared in a high-throughput manner. It is also desirable that both DNA and RNA can be serially extracted from the same tissue
lysate, not only to enable certain biological questions to be addressed e.g.
relating genomic and transcriptomic analyses, but also to reduce consumption of both biological samples and expensive reagents.
Robotic platforms provide a means for simplifying many routine laboratory procedures, such as nucleic acid extraction. High-throughput processing
allows for the simultaneous preparation of many samples, thus reducing the
need for time-consuming and labour-intensive protocols. Automating a pro15
cess calls for a reduction in the use of instrumentation such as centrifuges,
since these components are some of the primary causes of system failures.
The use of robotic platforms when extracting nucleic acids is increasing,
with many suppliers of nucleic acid extraction kits offering an automatic
alternative. Most of these instruments use solid phase extraction methods,
where silica columns or magnetised beads facilitate the isolation of DNA
and RNA based on hydrogen-binding interactions. However, magnetic separation techniques, such as the method described in Papers I and II, have an
advantage over column-based techniques when it comes to automation since
they obviate the need for centrifugation. Consequently, magnetic separation
is a simple and efficient way to separate nucleic acid-bound particles during
binding and elution, and does not generate shear forces that may lead to nucleic acid degradation48.
Table 1. Established and routinely used NAPI techniques.
Isolation technique
Suitable
sample types
Phenolchloroform43
Whole blood,
cultured cells,
tissue homogenates
Silica or glass
particles45–47
Salting out44
Whole blood,
cultured cells,
tissue homogenates
Whole blood,
cultured cells,
tissue homogenates
Advantages
Disadvantages
Suitable for
automation*
Inexpensive,
high DNA yield
Time consuming,
phenol is corrosive,
toxic and flammable, interference
downstream
No
Flexible, rapid,
yields of high
molecular
weight
Laborious, low
yields
No
Non-toxic
Laborious, requires
centrifugation
No
Chromatography (anionexchange and
silica gel
membranes)45–
Non-toxic, no
mechanical
Nucleic acids may
Yes, but
All sample
disruption,
bind irreversibly,
centrifugation
eluted nucleic
types
columns become
may cause
acids are ready
overloaded
problems
47,49
to use
Non-toxic, no
Lower yields than
shearing of
Magnetic bead
other silica-based
nucleic acids,
Most sample
Yes
based isolatmethods, nucleic
eluted nucleic
types
ion50–52
acids may bind
acids are ready
irreversibly
to use
*Suitability for automation based on the use of non-toxic substances and minimal additional
components on the robotic platform/liquid handler.
Sample identification and matching
In recent years, whole genome sequencing efforts have revealed the extent of
human genetic variation53. Genotyping techniques have made use of these
variable regions in order to successfully differentiate and match individuals
16
at a genetic level. This is becoming increasingly important in large biobanks,
where sample mix-up due to manual handling is a rare yet inevitable occurrence leading to misinterpretation of results. There are usually at least two
tissue specimens per cancer patient (tumour and normal sections for comparison), and more for those with metastases.
It is vital that large-scale genomic analyses utilise reliable input data, and
a major feature of this requirement is that patient data is matched to the correct sample. In order to accomplish this, it is desirable that all sample pairs
(i.e. tumour and normal tissue) are typed and matched following nucleic acid
extraction to ensure that no mix-ups have occurred. While essential, tissue
matching can be labour-intensive and time-consuming, and should ideally be
carried out in an automated fashion in 96- or 384-well format in order to be
feasible for large studies. Many of the available commercial kits for genotyping are simple and PCR-based but are time-consuming, may require expensive analysis software such as GeneMapper® (Applied Biosystems), or
are not sufficiently discriminatory to type samples on a stand-alone basis e.g.
the PCR-based Investigator DIPplex Kit from Qiagen54.
Genotyping techniques commonly make use of highly polymorphic short
tandem repeats (STRs) found throughout the genome. These di-, tri- or tetranucleotide repeats show a high level of allelic variation with regard to the
number of repeats and can be analysed using simple PCR-based techniques.
However, biallelic variations such as single nucleotide polymorphisms
(SNPs) and insertion/deletion (InDel) polymorphisms have recently been a
subject of growing interest in the forensic field55. Despite being less informative than STRs, InDels have a number of advantages, including (1) high
abundance, (2) high variability within and between populations (3) location
within regions of genomic stability (4) ability to type short fragments in
degraded samples and (5) lower mutation rates than STRs. Like STRs, it is
also possible to use multiplex PCR-based genotyping methods to study InDels56, which we make use of in the method described in Paper III.
Mutational analysis
As cancer is essentially a genetic disease, resulting from the accumulation of
mutations over time, identifying genetic alterations that arise specifically in
tumour cells forms the basis for studying the disease at a genetic level. In
order to do this, the genomic sequence in both the tumour and normal cells
must be determined and compared, thus generating a list of somatic mutations. Many of the most frequently mutated genes in a variety of cancer
types have been identified in this way using traditional methods like Sanger
sequencing. Technological improvements in sequencing during the past
twenty years have produced more sensitive sequencing methods capable of
detecting less frequently mutated genes present in subpopulations of the
tumour. However, the identification of genetic mutations with a low fre17
quency of occurrence requires a large number of samples to confidently
identify less common driver mutations.
Next generation sequencing
NGS consists of sequencing technologies that generally rely on a combination of template preparation, sequencing, imaging and genome alignment57.
Several commercially available platforms e.g. SOLiD (ABI), 454 (Roche),
Illumina (Illumina Inc.), PacBio RS II (Pacific Biosystems), Ion Torrent and
Ion Proton (Life Technologies) use massively parallel sequencing technologies with distinct nucleotide signal detection methods to generate millions of
reads of DNA/RNA sequence. Additional, more recent techniques include
the Oxford Nanopore Technologies’ strand sequencing method, which aims
to provide longer read lengths without using amplification, labelling or optical instrumentation; instead individual DNA strands are passed through a
nanopore by a processive enzyme, generating electronic signals58,59. NGS
has been widely implemented for de novo genome sequencing, targeted
resequencing of exons, transcriptome sequencing and epigenome sequencing
and has played a particularly important role in cancer research with worldwide efforts to catalogue all mutations found in each cancer type ongoing.
Much of the large-scale somatic mutation analyses in cancer have utilised
the Illumina sequencing technique17–24, requiring three major processing
steps; 1) library preparation of genomic DNA fragments, 2) amplification of
templates by isothermal ‘bridging’ to form DNA clusters and 3) sequencingby-synthesis where fluorescently-labelled reversible terminators are incorporated (Figure 1). Incorporated nucleotides are then identified by excitation of
the fluorophores and imaging of fluorescent signals60. Illumina technology
can be employed for whole genome, exome, transcriptome or targeted gene
panel sequencing using a variety of DNA library preparation techniques and
sequencing platforms.
NGS has revealed a large number of previously unknown alterations in
cancer, including point mutations, InDels, copy number alterations and
structural rearrangements. However, the number of novel recurrent mutations found by NGS efforts has been low in many cancer types, partly due to
the fact that many of the important tumour initiating mutations are present at
a high frequency and thus suitable for detection by older techniques. NGS
efforts have generated a vast quantity of data, increasing the complexity of
the categorisation and classification of each cancer type on a genetic level.
However, functional studies to assign a role to these mutations in cancer
have not yet been completed for most of these novel alterations. Additionally, it has been demonstrated that there are a higher number of unique causal
gene mutations in each tumour than previously thought and significant intraand inter-tumour heterogeneity lead to highly complex mutational profiles61.
Recently, studies that elucidated mutational spectra at single cell level in
breast cancer have further characterised the complexities of heterogeneity in
18
cancer, indicating that bulk analysis of tumours may conceal important lesions contained within individual tumour cell subpopulations, and even within individual tumour cells62,63. Appreciating the implications of these findings and the frequency of unique or private events within individual tumours
provides us with an incentive to work towards personalised diagnostics and
treatment.
a.
b.
c.
d.
e.
f.
g.
h.
A
T
G
i.
C
Figure 1. Illumina cluster generation and sequencing-by-synthesis chemistry.
(a) DNA fragments with adaptor sequence on either end. (b) Denatured singlestranded fragments are annealed to complementary oligonucleotides on the flow-cell
surface. (c) Solid-phase bridge amplification. (d) Nucleotides are incorporated to
build a double-stranded bridge. (e) Templates are denatured to produce singlestranded templates anchored to the flow-cell surface. (f) The process is repeated to
produce a cluster of molecules. (g) Reverse stands are removed. (h) Sequencing
primers are annealed to the single-stranded template and four labelled reversible
terminators are added to the flow-cell. (i) Fluorescently-labelled nucleotides are
incorporated based on the sequence of the template.
Targeted sequencing strategies
Poor depth of coverage (the average number of times each nucleotide is sequenced) in whole genome and exome sequencing results in insufficient data
for the detection of low frequency variants in samples with low tumour cell
content or a high degree of heterogeneity. In addition, despite the cost reduc19
tion in NGS in recent years, storage of large datasets and data analysis requirements mean that costs remain high for comprehensive genome or exome sequencing. In cancer diagnostics, depth of coverage of approximately
1000-fold is desirable, whereas 30-fold coverage may suffice in constitutional disease testing64. In addition, the expense involved in sequencing an
entire genome or exome to identify a small number of mutations that may be
relevant for a clinician or to provide the basis for the choice of a particular
therapy means that these strategies are not feasible outside the field of research. Instead, focus is now on the use of targeted NGS whereby a limited
(and clinically relevant) gene set is selected and enriched for64. In a clinical
setting, where sample volume is high but the number of targetable mutations
is relatively low, it is more time and cost-effective to sequence only the genomic regions of interest65. Indeed, sequencing has already been implemented within clinical settings, albeit to a limited degree, to aid in treatment decision-making (e.g. patients with KRAS mutations are not treated with antiEGFR therapies).
In a research setting, targeted resequencing of genes likely to harbour
driver mutations (by assessing the ratio of non-synonymous to synonymous
mutations within that gene, or an unexpected high mutation rate relative to
gene size) at a greater read depth may help to identify disease-causing alterations that can be targeted by specific treatments in the future. By limiting the
region of interest, a higher coverage of selected gene targets can be
achieved66. This information may highlight regions of particular interest and
importance in cancer and identify possible drug target strategies.
There are currently a number of methods by which libraries for targeted
resequencing can be prepared, most of which are based on hybrid capture
(array-based or in-solution), multiplexed PCR or molecular inversion probes.
In this thesis, targeted sequencing was performed using a variation of amplicon sequencing i.e. where highly multiplexed PCR is used to amplify
genomic DNA that has been hybridized to a selector probe, thus facilitating
the preparation of up to tens of thousands of amplicons in a single reaction67,68. Custom gene panels can be designed for the preparation of sequencing libraries for a specific application, and some companies now provide
cancer panels where genes that are commonly mutated in cancer can be targeted. This enriched library can then be sequenced on a massively parallel
sequencing platform and analysed.
Bioinformatic analysis and mutation-calling
NGS produces a large amount of data, requiring computational tools for
analysis. Generally, these technologies generate fluorescent signals corresponding to the sequence template, which are recorded by an imagecapturing device, resulting in readouts of nucleotides. A bioinformatics pipeline is necessary for analysing this data in order to detect somatic variants69.
In amplicon-based sequencing, this consists of: 1) base calling and quality
20
control, 2) removal or “trimming” of sequencing adaptors, 3) alignment of
trimmed reads to the reference sequence, 4) amplicon mapping of aligned
reads, 5) variant calling and 6) variant annotation. For somatic mutation
analysis, variant calling involves the detection of variant alleles present in
the tumour sample but not in the normal sample. This is a challenging task as
many types of errors can produce false positives, e.g. PCR amplification
errors, sequencing errors, and errors in read alignment. In addition, somatic
mutation analysis in solid tumours is confounded by normal cell contamination in the tumour and the presence of tumour cell subclones. There are a
number of variant-callers for somatic mutation detection that are publicly
available, e.g. MuTect70 and VarScan 271. However, many others are developed and used privately within research groups or projects. Currently no best
practice for somatic mutation calling exists. Discrepancies between individual callers using the same data highlight the issue that the choice of which
algorithm to use must be carefully considered72.
21
Colorectal cancer
Epidemiology and aetiology
CRC is the third most commonly diagnosed cancer, with over 1.4 million
new cancer cases diagnosed worldwide in 201273. CRC is characterised by
the development of normal colonic epithelial cells into malignant polyps, a
process that takes approximately 8-10 years in the absence of a known predisposing condition74. The strongest risk factor in the development of CRC is
a family history of the disease, but only 10-15 % of CRCs are caused by a
hereditary component. Known risk factors for sporadic disease development
include environmental factors (such as diet, alcohol use and cigarette smoking), increasing age and a sedentary lifestyle74,75. The prevalence is higher in
males than in females (male:female ratio of 1.2:1)73,76. It has been suggested
that hormone replacement therapy in postmenopausal females has a protective effect, which may explain the lower incidence in women77. Ulcerative
colitis and Crohn’s disease are inflammatory bowel diseases that account for
approximately two-thirds of CRC incidence, with the risk of disease development increasing with a longer duration of illness74,78.
Several syndromes are associated with an increased risk of CRC development and for each, a particular mutation (or group of mutations) has been
identified which gives rise to an increased risk for the disease. Mutations in
these pre-disposing genes result in the development of numerous benign
growths in the colon and rectum, some of which may become cancerous.
Examples of pre-disposing syndromes include familial adenomatous polyposis (FAP) caused by a mutation in the APC gene79 and HNPCC/Lynch syndrome caused by mutations in MMR genes80,81. Other, less common Mendelianly-inherited CRC syndromes include Peutz-Jeghers syndrome (mutant
LKB1/STK11), juvenile polyposis (mutant SMAD4, BMPR1A), MUTYHassociated polyposis (mutant MUTYH), hereditary mixed polyposis (mutant
GREM1) and more recently, inherited syndromes caused by mutations in the
POLE and POLD1 genes82,83. Whilst these syndromes account for less than 5
% of all colorectal cancer cases, it is estimated that a hereditary component
is implicated in 15- 30 % of CRCs. Therefore, it is reasonable to assume that
many mutations predisposing to CRC remain unidentified, perhaps due to
small sample sizes. It is also possible that other contributing factors accelerate disease development in families that are exposed to the same environment.
22
Pathology and staging
Most CRCs are located in the sigmoid colon and rectum, however a high
number are also found in other areas of the large intestine, with the particular
location seemingly linked with the type of genetic instability present. Carcinomas and adenocarcinomas of the colon and rectum are classified as malignant when they have penetrated through the muscularis mucosa into the
submucosa. Spread via lymphatic and blood vessels can occur early in disease progression and result in systemic disease. Tumours are clinically
staged using the tumour, nodes, metastases (TNM) classification (previously
Dukes classification) and staging is based on the depth of tumour penetration, the involvement of regional lymph nodes and the presence or absence
of distant metastases78. These individual parameters are then evaluated collectively to give a pathological stage between 0 and IV (Figure 3).
M - Distant metastases
N - Lymph node
T - Primary Tumour
Tis
T1
T2
T3
T4
Carcinoma in situ
Invasion of submucosa
Invasion of muscularis propria
Invasion of pericolorectal tissue
Invasion of peritoneum or adjacent organs
N0
N1
N2
No regional node involvement
Metastases in 1-3 nodes
Metastasis in 4 or more nodes
M0
M1
No distant metstases
Distant metastases in
one or more organs
Colorectal Tumour
M1
M0
N0
N1
T4
T1/T2
Tis
T1/T2
T3
Stage 0
Stage I
Stage IIA
Stage IIIA
Stage IIB
Stage IIIB
Stage IIC
Stage IIIC
N2
T3/T4
T3/T4
Stage IV
Figure 2. Colorectal cancer staging. Staging according to the TNM classification
system78,84.
Genetic basis of disease development
CRC develops due to a series of well-characterised mutations that arise in a
particular order and affect both oncogenes and TSGs. The progression of
CRC has been extensively studied and characterised on a genetic level, due
to the availability of biopsies from various stages in disease progression, i.e.
from benign polyps to advanced carcinomas. The evolution of benign polyps
to malignant tumours was first recognised by Muto et al.85 in 1975 and the
23
genetic changes at each stage were described in 1990 by Fearon and Vogelstein86, revealing that the inactivation of three TSGs and the activation of
one oncogene are required events87. Mutations in the Wnt pathway, which
regulates proliferation by the phosphorylation of β-catenin, thereby preventing its accumulation in the cell, are the earliest event in CRC tumourigenesis
and appear in most cases to be required for tumour initiation88. Activating
mutations in the Ras-MAPK pathway affecting KRAS, BRAF or NRAS are
found in up to 60 % of all CRCs89–91 while point mutations activating the
kinase activity of PIK3CA, leading to activation of the PI3K-Akt pathway
are found at a frequency of 15 – 25 %92. Allelic loss and point mutations in
TP53 are implicated later in tumour progression, eventually affecting 60 –
70 % of colorectal tumours93. Other relatively common genes that are targeted by mutations in CRC and become inactivated include FBXW7 mutations
(20 %)15,94, PTEN (10 %)95, SMAD2 (5 %), SMAD4 (10 – 15 %)96 and
TGFβRII (25 %)97,98. The progression from normal epithelium to carcinoma
is depicted in Figure 4, with the most frequent mutation event at each stage
indicated.
IV
Chromosomal
Instability
Metastasis
III
TP53/SMAD2/SMAD4
Disease Stage
(e)
?
?
Advanced
Carcinoma
II
(d)
BAX/TFG-βRII
Early
Carcinoma
(c)
PIK3CA
PTEN
Large
Adenoma
I
(b)
KRAS
BRAF
Small
Adenoma
0
APC
Normal
Epithelium
(a)
MMR germine mutation /
MLH1 hypermythylation
Microsatellite
Instability
Time
Figure 3. Simplified genetic model for colorectal cancer evolution. Development
of disease from stages 0-IV occurs by accumulation of mutations in (a) Wnt/MMR,
(b) Ras-MAPK, (c) PI3K-Akt, (d) p53/TGFβ and (e) unknown pathways.
Instability phenotypes in colorectal cancer
There are two main types of genomic instability in colorectal cancer, chromosomal instability (CIN) and microsatellite instability (MSI). These CRC
types are generally considered to be mutually exclusive and comprise a dif24
ferent mutational profile depending on the type of genomic instability9.
However, the genes affected are grouped within the same pathways.
Chromosomal instability
The majority (up to 85 %) of sporadic CRCs as well as tumours from FAP
patients exhibit large chromosomal abnormalities, both structural and numerical, known as CIN. The rate of chromosomal loss or gain is estimated to be
in excess of 10-2 per generation in these tumours99. These chromosomal aberrations are so complex that there is no representative karyotype for CIN
CRC. It is currently unknown what causes CIN, although alterations in genes
regulating the formation of the mitotic spindle and segregation of chromosomes at mitosis may contribute91. Inactivation of APC has also been proposed as a potential initiator of CIN100, but this has been disputed in other
studies, as some cancers with APC mutations do not lead to CIN. CIN tumours follow the “traditional pathway” for tumourigenesis, involving the
progression of a precursor adenomatous polyp to an advanced carcinoma,
with loss of function of APC and mutations in KRAS, PIK3CA, TP53,
SMAD2 and SMAD4 being the most common alterations. CIN tumours are
typically found in the distal colon and are shown to have a worse prognosis
than microsatellite unstable tumours.
Microsatellite instability
The MMR system maintains genomic integrity by repairing replicationassociated lesions like base-base mismatches and insertion/deletion loops
arising as a result of DNA polymerase slippage. Mutations in MMR genes,
most commonly MLH1, MSH2, MSH6, and PMS2 give rise to MSI, where
widespread insertions and deletions occur in repetitive DNA sequences
known as microsatellites. Microsatellites are short repeat sequences in the
genome that are particularly prone to replication errors due to slippage of
polymerase machinery, leading to an overall increased rate of mutation in
the genome. MSI was first recognized in HNPCC101, where approximately
95 % of tumours are MSI+. MSI also occurs in approximately 15 % of sporadic CRC, due to hypermethylation of the MLH1 promoter, resulting in loss
of expression of MLH1102. MSI tumours are characterised by mutations in
BRAF, TGFβRII and PTEN and are primarily located in the proximal colon,
having arisen from adenomatous or serrated polyps. These types of tumours
generally have a better prognosis than CIN tumours.
CpG island methylator phenotype
A more recently described alternative pathway progresses from a serrated
hyperplastic polyp, and is characterised by a CpG island methylator phenotype (CIMP). CIMP tumours have a higher incidence of hypermethylation at
multiple promoter regions, including promoter hypermethylation of MLH1,
resulting in MSI103. CIMP tumours also have a high frequency of BRAF mu25
tations104. There is a strong link between features of CIMP and sporadic MSI
tumours and CIMP tumours also mainly arise in the proximal colon. However, these tumours have unique clinical feature in cases that are microsatellite
stable (MSS), and these patients have a particularly poor prognosis105.
Other CRC subgroups
Although the three instability phenotypes described above are those most
commonly used to classify CRC, Jass proposed a more comprehensive characterisation of CRC with five molecular subgroups as summarised in Table
2. Group 1 is more commonly referred to as sporadic MSI-high CRC, group
4 as MSS CRC arising either sporadically, FAP-associated or MUTYHassociated polyposis, and group 5 as HNPCC.
Table 2. Molecular and clinical features of CRC groups 1-5106.
Feature
Group 1
Group 2
Group 3
Group 4
Group 5
MSI status
Methylation
High
+++
Stable/Low
+++
Stable/Low
++
Stable
+/-
High
+/-
Precursor
lesion
Serrated
polyp
Serrated
polyp
Adenoma
Adenoma
Frequency
12 %
8%
57 %
3%
Serrated
polyp or
adenoma
20 %
In addition, recent studies have classified sporadic CRCs into three subgroups: 1) non-hypermutated (generally CIN, MSS tumours), 2) hypermutated MSI tumours and 3) hypermutated MSS tumours with somatic POLE
mutations20,82, although the prognostic implications of the third subgroup are
currently unknown.
Candidate CRC driver mutations
Although changes in the aforementioned genes and pathways are the most
common alterations in CRC, other mutations appear to be involved in disease development, some of which have only come to light in the NGS era.
Recurrent mutations in candidate cancer genes first identified by Sjöblom et
al.14 and Wood et al.15 have since been identified by other studies, as well as
a myriad of other potentially novel driver mutations20,107–112, including those
described in Paper IV. Although numerous bioinformatic tools are available
for determining driver mutations, several of these (such as MutSig CV113)
are biased with prior knowledge like background mutation rates and replication time, meaning that novel driver mutations may be continuously bypassed. It is however, sometimes necessary to input prior knowledge when
designing such tools, otherwise, large genes that can accumulate many passenger mutations in a background of genomic instability will continuously
surface as putatively significant genes. Although tools for rank ordering the
26
most likely driver mutations are useful, only functional and biological studies can definitively classify a candidate cancer gene.
Due to the time-consuming nature of functional studies in using cell lines
and animal models, the majority of novel mutations found at an intermediate
frequency of 2 – 20 % have yet to be assigned a role in disease, and their
importance in CRC development remains elusive. Examples of cancer genes
that are mutated at a lower frequency, but have a proposed functional role in
CRC development are shown in Table 3. Although the selected genes are by
no means exhaustive, considering that many hundreds of candidate cancer
genes have been identified it is evident that much remains to be done on a
functional level.
Table 3. Examples of CRC genes mutated at low frequency with assigned functions.
Gene
Mutation prevalence*
Functional role
TCF7L2
12 %
TSG, modulates proliferation, encodes transcription
factor downstream of Wnt114
FAM123B
11 %
TSG, negative regulator of Wnt signalling by promoting degradation of β-catenin115
ARID1A
9%
PRDM2
8%
FAT1
6%
TSG, encodes component of SWI/SNF chromatin
remodelling complex regulating gene expression116,117
TSG, histone methyltransferase, induces cell cycle
arrest and apoptosis118
TSG, negative regulator of Wnt signalling encodes
cadherin-like protein which binds β-catenin119
Oncogene, histone methyltransferase, enhancermediated transcriptional regulation120
* Mutation prevalence obtained from the cBioPortal for Cancer Genomics121,122. Genes that
are chosen are mutated at a frequency between 2 and 20 % in CRC and have been characterised by functional studies.
KMT2D
5%
Therapeutic interventions
Surgery is currently the main treatment for patients with potentially curable
disease. Radiotherapy is also frequently employed for CRC therapy, as both
a pre- and post-operative treatment for rectal cancer, or as a palliative treatment in advanced colon cancer. In Sweden, the use of preoperative radiotherapy has improved the survival of rectal cancer patients and is currently
standard practice123. For advanced CRC, a number of systemic treatment
options are currently used in clinical practice, which may complement resection of the tumour in advanced disease stages. These include fluoropyrimidines such as 5-fluorouracil (5-FU) and capecitabine, and cytoxic agents
such as irinotecan and oxaliplatin.
27
Interestingly, the instability phenotype of the tumour can have a direct
consequence for treatment efficacy, as with MSI tumours which in some
studies demonstrate a lack of response to 5-FU treatment124. This suggests
that the assessment of genomic instability can have worthwhile predictive
consequences for therapy. Targeted therapies such as bevacizumab (a
VEGF-A inhibitor) and cetuximab or panitumumab (EGFR inhibitors) may
also be implemented, although it is important to note that acquired resistance
has been a major issue in such targeted therapeutics125–127. Currently, it is
recommended that all patients are tested for mutations in KRAS prior to
treatment with EGFR blockade128 due to the lack of response in patients with
KRAS mutations129. However, these patients should also be assessed for mutations in other downstream targets of EGFR signalling i.e. KRAS, NRAS,
BRAF and PIK3CA to predict response due to intrinsic mechanisms of resistance130. These treatments may be used in various combinations depending on disease stage, level of patient tolerance and mutational profile131. In
stage IV disease, FOLFOX (leucovorin, 5-FU and oxaliplatin) or FOLFIRI
(leucovorin, 5-FU and irinotecan) in combination with cetuximab or panitumumab followed by liver resection have shown promising results for treating liver metastases in CRC132,133. However, the benefits of adjuvant treatment at different stages of disease are complicated and controversial (in particular in stage II CRC) and correct disease staging is of utmost importance,
as this determines the treatment used134. In CRC, the five-year survival rate
for localised disease is approximately 91 %, whereas it is only 13 % in cases
where distant metastases are found135.
Prognostic markers in metastatic disease
CRCs primarily metastasise to the liver, likely due to the nature of portal
circulation and the fact that the sinusoids in the liver do not represent a major barrier to extravasation. CRCs also metastasise to the lungs, bone and
brain, although this occurs less frequently. There are four main elements to
the metastatic process in solid tumours: 1) increased angiogenesis of the
primary tumour, 2) intravasation of tumour cells, 3) extravasation by proteolysis of the extra-cellular matrix and 4) migration into the surrounding tissue136. A number of classifications have been proposed to illustrate the types
of genes that are important during each process; metastasis initiating genes
that provide a growth advantage to the primary tumour (thus favouring intravasation), metastasis progression genes that provide growth advantages
both in the primary tumour and at the metastatic site and metastasis virulence genes that provide a selective growth advantage at the metastatic site
only137. These distinctions may provide an explanation for the opposing
views that metastatic disease is either a pre-determined or evolving process,
as it may be the case that both of these hypotheses are plausible. Indeed, a
recent study has confirmed that most of the genetic alterations known to be
28
early events in CRC are shared between the primary and metastatic tumours,
and only a small number of mutations are unique to the metastatic tumours138.
Despite the great deal of knowledge available regarding frequent mutations causing CRC, the pathogenesis of the disease is still not fully understood, in particular with regard to the development of distant metastases.
Approximately 20%–25% of patients with CRC present with metastatic disease at time of diagnosis, and 20%–25% of patients will develop metastases
later. This leads to a relatively high overall mortality rate of 40%–
45%128. Therefore, there is a great need for prognostic and predictive genetic
markers that can help to stratify patients at higher risk of developing advanced metastatic disease and to aid in decision-making for adjuvant treatment.
The prognostic value of BRAF mutations appears promising, but only for
a poor overall survival139,140. Large-scale NGS efforts have identified genes
where mutations appear to have a protective effect for metastatic disease
development, e.g. in FBXW720. Nevertheless, no specific mutations are currently associated with disease free survival and testing for MSI and CIN are
the sole predictors for the risk of relapse141. Increased knowledge of the
mechanisms involved in metastasis may help us to understand what genes to
search for alterations in. However, as it is often difficult to obtain biopsies of
metastatic tumours (as many are not resectable), profiling these tumours for
specific gene mutations remains a challenging task. We have identified a
group of mutations potentially related to metastatic disease development in
Eph receptor tyrosine kinases (detailed in Paper IV), although the functional
role of these alterations remains to be determined.
29
Present investigations
Aims
The aim of this thesis was to establish robust, high-throughput methods for
the analysis of CRC tissue, starting from fresh frozen tissue and leading to
the identification of variants with high sensitivity. The specific aims of each
paper were as follows:
I
Develop a method for serial nucleic acid extraction from fresh
frozen tissue that is suitable for automation.
II
Implement the technique developed in Paper I on a robotic
platform and assess the integrity of the extracted nucleic acids, in addition to their performance in downstream applications.
III
Establish a method for the rapid identification of biobanked
samples from the same individual by targeting InDel polymorphisms.
IV
Perform a pathway-oriented mutational analysis of a large cohort of CRCs to discover mutations associated with metastasis
using targeted deep sequencing.
30
Serial extraction of DNA and RNA from frozen tissue
With increasing biobanking of biological samples, methods for large-scale
extraction of nucleic acids are in demand. The lack of such techniques designed for extraction from tissues results in a bottleneck in downstream genetic analyses, particularly in the field of cancer research. Although several
methods for nucleic acid extraction from tissue are available, not all are cost
effective or suitable for automation due to centrifugation, disposable plastic
ware and the use of toxic reagents. For those solutions available on an automated platform, one is restricted in the choice of technology due to closed
systems. We aimed to develop an open automated procedure for tissue homogenisation and serial extraction of DNA and RNA into separate fractions
from the same frozen tissue specimen, thus negating the need for multiple
specimens from the same patient.
Results and discussion
A process for serial extraction of nucleic acids from the same frozen tissue
sample based on magnetic silica particles with differential affinities for DNA
and RNA was developed in Paper I and automated on a Tecan Freedom Evo
robotic workstation in Paper II. A schematic representation of the extraction
process is shown in Figure 5. Fresh-frozen human normal and tumour tissue
samples (n = 864) from breast and colon were serially extracted in batches of
96 samples, and the yields and quality of DNA and RNA were determined.
The DNA was evaluated in several downstream analyses, and the stability of
RNA was determined after 9 months of storage.
Tissue
sample
Lysis
Selective
beads
Capture
Wash
Elute
DNA
RNA
Figure 4. Schematic representation of the DNA/RNA extraction process. The
tissue sample is lysed and DNA is bound to selective paramagnetic silica beads,
washed and eluted. The process is repeated for RNA using the same lysate with
DNA removed.
31
The extracted DNA performed consistently well across numerous processes
including Sanger sequencing, PCR-based STR analysis, HaloPlex target
enrichment and deep sequencing on an Illumina platform, and gene copynumber analysis using microarrays. The RNA has performed well in RTPCR analyses and maintains integrity upon storage. Comparing our results
with established techniques, such as DNA extraction using a phenolchloroform based method; we saw no decrease in performance in downstream applications. Instead, we demonstrate an advantage of our methodology concerning reduction in the time taken to perform the extractions.
The technology developed in Papers I and II enabled the simultaneous
processing of many tissue samples, reducing the sample preparation bottleneck in cancer research. The open automation format also facilitated integration with upstream and downstream devices for automated sample quantitation or storage. Our technology was used for the extraction of DNA used in a
large scale sequencing effort on 107 matched T/N CRC specimens, which is
described in Paper IV.
Genotyping of patient-matched tumour/normal tissue
Characterisation of cancer genomes entails mutational analyses of vast numbers of patient-matched tumour and normal tissue samples. However, this
increases the risk of patient sample mix up due to manual handling, confounding the interpretation of results. Correctly matching T/N pairs from the
same patient is essential for somatic mutation analyses, as material from the
normal counterpart is required in order to filter away germline polymorphisms and determine the contribution of acquired mutations to the disease.
Genotyping methods are often laborious and time-consuming and scalable
sample identification procedures are essential for pathology biobanks and
research efforts where large numbers of patient samples are analysed simultaneously. In Paper III, we aimed to develop a method using multiplex
amplification of prevalent InDel polymorphisms in order to match samples
from the same individual based on their unique genetic variation. We also
sought to establish a program for the rapid analysis and comparison of profiles between many samples to facilitate large-scale efforts in the field of
cancer genomics. The method should also be suitable for low amounts of
input DNA, as well as fragmented DNA, since these are often limiting factors when analysing DNA from FFPE specimens.
Results and discussion
A method originally described by Isaksson et al.142 was adapted in Paper III
for InDel genotyping using multiplex ligation dependent genomic amplification (MLGA). Fifty-three prevalent InDel polymorphisms found in a European population were targeted by restriction digestion of genomic DNA fol32
lowed by ligation to selector probes and multiplex amplification by PCR
with fluorescently-labelled primers. The PCR products were separated in a
capillary sequencer and a peak spectrum was obtained and analysed using an
in-house fragment analysis program.
Twenty-four fresh frozen T/N colorectal samples from different patients
were successfully matched using this method. The technique was found to be
sensitive with as little input DNA as 0.3 ng. The technique was also successful when using DNA extracted from FFPE tissue, despite a lower number of
amplifiable fragments. As the technique is PCR-based and a program for the
simultaneous analysis of many pairs at once was developed for the assay,
this technique can be scaled up for use in 96- or 384-well formats. 107 T/N
CRC pairs were successfully genotyped with the assay prior to the somatic
mutation analysis study described in Paper IV.
Deep sequencing reveals mutations associated with
CRC metastasis
In Paper IV the aim was to uncover mutations associated with metastasis in
CRC by employing a targeted deep sequencing strategy. The genetic model
for CRC development is well defined and the stepwise progression of the
disease is largely understood91. CRC progression occurs over many years,
with the development from a large adenoma founder cell to an advanced
carcinoma founder cell taking as long as 26.5 years143. However, there remains a large gap in our understanding of the genetic basis for the development of CRC from an advanced carcinoma to metastatic disease, a process
taking only 1.8 years on average143. This presents a significant problem for
clinicians, since patients presenting with stage II and II disease cannot easily
be stratified into high and low risk groups, where adjuvant therapy may or
may not be beneficial. This also suggests that the majority of mutations required for metastases are already present in the advanced carcinoma. As
genetic events driving metastases are suspected to occur late in the disease,
conventional sequencing by Sanger and low coverage NGS techniques may
not have sufficient sensitivity to discover base level somatic mutations driving the development of metastases. We therefore performed a targeted deep
sequencing strategy (~1000-fold coverage) in 107 CRCs in an attempt to
uncover less frequent mutations that may be associated with late stage disease.
Results and discussion
Previous approaches for cataloguing the mutation spectrum in CRC were
limited either by the number of patients analysed or a low depth of coverage14,20,107. In this study, we present pathway-oriented mutational profiles of
33
a large number of primary CRCs, of which approximately half eventually
developed metastatic disease. We observed the expected frequencies of mutations in known CRC genes such as APC, TP53 and KRAS, confirming the
sensitivity of our somatic mutation calling. In addition, we discovered a high
mutation rate in genes such as TCF7L2, SETD1B and BMPR2 in microsatellite repeat regions in MSI tumours, suggesting a greater sensitivity of our
mutation detection software in regions that are difficult to analyse. Interestingly, we uncovered a significantly higher mutation rate in the Ephrin receptor tyrosine kinase family in MSS primary CRC tumours that proceeded to
develop distant metastases compared to those that showed no such progression. Due to the pattern of mutations found in these genes, we propose that
they may have a metastasis suppressor role in their wild type state, and that
mutations may inactivate this function, thus promoting a more aggressive
disease progression. The Ephrin receptor gene family has long been thought
to have a role in metastatic disease development and expression analyses
have linked both over- and under-expression with cancer progression in
many different tissue types. However, specific mutations associated with
metastases have not yet been characterised. As a number of patients had
mutations in more than one Ephrin receptor gene, we hypothesise that the
inactivating effect of these mutations may be cumulative, requiring a number
of mutations to promote metastases. This proposal is also supported by the
high level of sequence similarity between genes in this family. In conclusion,
although a significant difference was observed between metastatic and nonmetastatic patient groups, a larger number of patients would be required to
confirm our findings, as well as functional analyses of the mutations found
in this study.
34
Concluding remarks and future perspectives
Problems associated with sample preparation and correctly identifying patient material have long been recognised, as well as the vast amounts of preanalytical variables that may confound interpretation of results. Currently,
the Biorepositories and Biospecimen Research Branch (BBRB) of the National Cancer Institute in the US, and the European Organisation for Research and Treatment of cancer (EORTC) are developing guidelines for the
standardisation and improvement of biospecimen handling to ensure that
findings are reproducible and reflective of disease state in vivo.
Unfortunately, the failure rate in drug development for the treatment of
cancer indicates that much remains to be done. Studies by Amgen and Bayer
HealthCare attempting to reproduce pre-clinical data have reported dismally
low success rates from 11 – 25 %, even with studies from papers considered
to show “landmark” findings144,145. Although one cannot only blame poor
experimental design and non-standardised sample preparation for the irreproducibility of pre-clinical findings (cancer in itself is a particularly difficult disease due to both inter- and intra-patient heterogeneity), these are certainly some of the limiting factors. Problems with poor sample handling and
preparation, combined with the complexity of the disease have resulted in a
wealth of problems for drug development. The field of personalised medicine, once an exciting buzzword for finding the ultimate cure to cancer, has
become littered with reports of resistance to therapy and relatively small
gains in overall or disease-free survival130,146-150.
For treatment of advanced CRC, fluoropyrimidines and cytotoxic agents,
in combination with surgical resection are still the best treatment options
available. Considering that these adjuvant therapies were used prior to the
influx of vast quantities of knowledge on the genetic basis of CRC and that
targeted treatments have not delivered the results we had hoped for, it is
worth questioning the current model of drug development. There is a considerable need to find better strategies for tackling advanced stage cancer. Despite the fact that CRC is a heterogeneous disease, many patients are treated
with the same therapy, a “one size fits all” approach. Improving the classification of CRCs at a genetic level, and appreciating the fact that different
subtypes respond better or worse to various cytotoxic dugs may help to improve survival rates. In this thesis I attempted to address some of these issues
by: 1) developing a means for the comprehensive analysis of a large number
of colorectal tissues, 2) using these methods to prepare high quality input
35
materials and 3) analysing this material with the latest sequencing technologies in order to further develop our understanding of advanced stage CRC.
In much of the developed world where screening measures for early disease detection in at-risk populations are in place, such as Papanicolaou
smears for cervical cancer, mammography for breast cancer and colonoscopy for colorectal cancer, there have been promising results for the prevention
and treatment of cancer. As distant metastases cause the majority of deaths
from cancer, early detection of disease is imperative. In the Nordic countries,
where a faecal occult blood screening programme has been set up to identify
patients eligible for colonoscopy for CRC detection, results have been positive, with estimates showing that up to 1% of all premature deaths in a particular age-group can be affected by invitation to the programme151. In fact,
during the past two decades mortality from CRC has declined, in particular
in Northern and Western Europe. This has in part been attributed to improved screening resulting in earlier detection128.
New sequencing technologies for the non-invasive detection of cancer in
the circulation are another promising method for potentially detecting tumours at an early stage. There have been a wealth of publications in the field
of circulating tumour DNA (ctDNA) in the last few years, and it is clear that
as amplification techniques become less error-prone, sequencing methods
become more sensitive, and mutation detection software can distinguish true
from false positives in an unbiased manner; screening can successfully be
performed even in populations not defined as being at-risk152-157. As sequencing costs continue to decrease, one may be able to use the mutational information acquired by sequencing primary tumours to identify cancers in the
blood at an early stage, thus preventing advanced disease. This year, Bettegowda and colleagues proposed that the detection of early stage cancers by
ctDNA analysis was possible in 50 % of the patients studied155, while Misale
et al. demonstrated the potential of “liquid biopsies” for evaluating acquired
resistance to therapy in metastatic disease158, indicating that these types of
screening strategies may indeed be feasible in the future.
For targeted and personalised therapies, there have been a number of successful developments e.g. Gleevec/Imatinib in the treatment of chronic myeloid leukaemia159. In addition, synthetic lethality approaches in the case of
poly ADP ribose polymerase (PARP) inhibitors160,161 have shown efficacy in
the treatment of the appropriate subset of patients, i.e. cancers with deficient
DSBR. Efforts to move away from personalised therapy by targeting processes cancer cells depend on, e.g. by promoting oxidative damage in cancer
cells by MTH1 inhibition162 are also promising, but it remains to be seen
how these will behave in a clinical setting.
Mechanisms of inherent and acquired resistance have caused many problems for tackling disease e.g. as is the case with PARP inhibitors163. This is
particularly problematic for EGFR blockade in CRC, due to intrinsic (mutations in genes downstream of EGFR, EGFR copy number, HER3 overex36
pression) and extrinsic (genetic alterations in the Ras pathway, mutations in
EGFR and amplification of MET) mechanisms of resistance130. However, as
our knowledge of the mechanisms in cancer increases, ways to circumvent
the problems of acquired resistance are being proposed. For example, in
CRC it has recently been proposed that crizotinib could be used in combination with EGFR blockade in order to circumvent resistance due to amplification of the MET receptor in suitable patients164. In fact, few genetic alterations that are suitable targets for therapy have been identified in CRC. Therefore sequencing and other methods for profiling cancer genomes provide
value, to identify possible targets and to allow a better understanding of the
genetic mechanisms of innate and acquired resistance, potentially providing
a means and rationale for overcoming these problems.
37
Acknowledgements
This work was carried out at the Department of Immunology, Genetics and
Pathology, at Uppsala University. Financial support was provided by grants
from the Swedish Cancer Foundation, The Uppsala/Umeå Comprehensive
Cancer Consortium (U-CAN), Bio-X VINNOVA, IMI and the Swedish
Foundation for Strategic Research.
I would like to express my gratitude to a number of people:
First of all, I would like to thank my supervisor Tobias Sjöblom, for taking
me in and giving me this great opportunity, for allowing me to develop as a
scientist and for encouraging me to delve into the world of business. Thank
you for the confidence you have always shown in me.
To my co-supervisor Mats Nilsson, it has been a pleasure to work with you.
Thank you for all of the patience you have shown and for always having a
solution for everything!
Thank you to all of the people working at IGP, especially the IGP administration and Christina Magnusson, for taking care of the PhD students and
for always being there to listen to our problems and queries. Thanks also to
past and present members of our office (the best one at Rudbeck) for making
it a pleasure to come to work in the mornings.
Thank you to the Uppsala Genome Center for the valuable resource you
provide and to Simin Tahmasebpoor for preparing all of the tissue samples
used in this thesis.
To my colleagues, in particular the members of the Molcan group; Chatarina, Evangelia, Ivaylo, Muhammad, Sara, Snehansghu, Tanja, Tom and Verónica and the now-dwindling Selector group; Elin and Lotte, thank you for
all the problem-solving discussions and extra-curricular activities!
To all the people that have worked with me on the projects included in this
thesis, thank you for some great collaborative efforts. I would particularly
like to thank Bengt Glimelius for always being ready to shed some light on
the clinical side of CRC. Special thanks to Monica for showing me how to
38
work PCR product-free J and Viktor for valuable discussions and light relief during the “Comedia” months.
To the unofficial U-CAN group, thanks for all the fun evenings we shared
together. I hope there will be many more! Thanks in particular to Tony for
the imaginative email subject lines.
To the ExScale team: Karin, Ammar, Sten and Anna, thank you for showing
me that there is a world outside of research; working with all of you has been
a wonderful and eye-opening experience.
To the MGC teaching crew: Anja, Diego and Vasil, thanks for all the fun we
managed to have despite the hectic schedules (which version was it again?).
I’m sad that there will be no more “red red” emails!
To all the friends I have made since coming to Sweden, thank you for taking
me in and making me feel that I had a home away from home. To the Irish
contingent in Uppsala; Nicola, Lesley, Paul, Frank, and Emma for always
being a source of teabags, Cadburys and sanity, go raibh maith agaibh!
Thank you Fiona, Linnea and Tanja, for becoming such dear friends to me.
To my friends at home, thank you for always making it seem that it was only
yesterday when we last saw each other. In particular, thank you Saoirse and
Eoin for being the best friends one could ever wish for.
To my “Swedish family”: Anette, Josefin, Mikael and Erik, thanks for all the
support you have shown me over the past few years and for welcoming me
so much into your lives.
To my godparents Mary and Liam, thank you for always supporting me in
my endeavours and never forgetting to send birthday cards.
Johan, det går inte att beskriva hur mycket jag älskar dig för allt du gör. Tack
för att du alltid finns där för mig.
To my beloved family, Dad, Mam, Judith, Catherine, Tim and Johnny, for
your love and all of the support you have given me, thank you. I am so lucky
to have a family like you! Dad, a special thanks to you for being the reason I
am doing research, you have always encouraged me to learn more.
Finally, thank you to all of the patients that agreed to donate samples to these
studies and made this work possible.
39
References
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
40
Vogelstein, B. et al. Cancer Genome Landscapes. Science 339, 1546–1558
(2013).
Hanahan, D. & Weinberg, R. A. The hallmarks of cancer. Cell 100, 57–70
(2000).
Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell
144, 646–74 (2011).
Vogelstein, B. & Kinzler, K. W. Cancer genes and the pathways they control.
Nat Med 10, 789–99 (2004).
Knudson, A. G. Mutation and Cancer: Statistical Study of Retinoblastoma.
Proc. Natl. Acad. Sci. U. S. A. 68, 820–823 (1971).
Kinzler, K. W. & Vogelstein, B. Gatekeepers and caretakers. Nature 386, 761–
763 (1997).
Tomlinson, I. P. M., Novelli, M. R. & Bodmer, W. F. The mutation rate and
cancer. Proc. Natl. Acad. Sci. U. S. A. 93, 14800–14803 (1996).
Nachman, M. W. & Crowell, S. L. Estimate of the mutation rate per nucleotide
in humans. Genetics 156, 297–304 (2000).
Lengauer, C., Kinzler, K. W. & Vogelstein, B. Genetic instabilities in human
cancers. Nature 396, 643–9 (1998).
Hoeijmakers, J. H. J. Genome maintenance mechanisms for preventing cancer.
Nature 411, 366–374 (2001).
Sieber, O. M., Heinimann, K. & Tomlinson, I. P. M. Genomic instability--the
engine of tumorigenesis? Nat. Rev. Cancer 3, 701–708 (2003).
Michor, F., Iwasa, Y., Vogelstein, B., Lengauer, C. & Nowak, M. A. Can
chromosomal instability initiate tumorigenesis? Semin. Cancer Biol. 15, 43–49
(2005).
Bozic, I. et al. Accumulation of driver and passenger mutations during tumor
progression. Proc. Natl. Acad. Sci. U. S. A. 107, 18545–18550 (2010).
Sjoblom, T. et al. The consensus coding sequences of human breast and colorectal cancers. Science 314, 268–74 (2006).
Wood, L. D. et al. The genomic landscapes of human breast and colorectal
cancers. Science 318, 1108–13 (2007).
Stephens, P. J. et al. The landscape of cancer genes and mutational processes in
breast cancer. Nature 486, 400–4 (2012).
Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 489, 519–525 (2012).
Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 499, 43–49 (2013).
Cancer Genome Atlas Research Network et al. Comprehensive molecular
characterization
of
gastric
adenocarcinoma.
Nature
(2014).
doi:10.1038/nature13480
Cancer Genome Atlas Network. Comprehensive molecular characterization of
human colon and rectal cancer. Nature 487, 330–7 (2012).
21. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of urothelial bladder carcinoma. Nature 507, 315–322 (2014).
22. Cancer Genome Atlas Network. Comprehensive molecular portraits of human
breast tumours. Nature 490, 61–70 (2012).
23. Cancer Genome Atlas Research Network. Comprehensive molecular profiling
of lung adenocarcinoma. Nature 511, 543–550 (2014).
24. Cancer Genome Atlas Network. Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–15 (2011).
25. Riegman, P. H., Morente, M. M., Betsou, F., de Blasio, P. & Geary, P. Biobanking for better healthcare. Mol. Oncol. 2, 213–22 (2008).
26. Oosterhuis, J. W., Coebergh, J. W. & van Veen, E. B. Tumour banks: wellguarded treasures in the interest of patients. Nat. Rev. Cancer 3, 73–7 (2003).
27. Hudson, T. J. et al. International network of cancer genome projects. Nature
464, 993–8 (2010).
28. Carter, A. & Betsou, F. Quality Assurance in Cancer Biobanking. Biopreservation Biobanking 9, 7 (2011).
29. Moore, H. M. et al. Biospecimen Reporting for Improved Study Quality. Biopreservation Biobanking 9, 57–70 (2011).
30. Hewitt, R. E. Biobanking: the foundation of personalized medicine. Curr.
Opin. Oncol. 23, 112–9 (2011).
31. Mee, B. C. et al. Maintaining breast cancer specimen integrity and individual
or simultaneous extraction of quality DNA, RNA and proteins from Allprotectstabilized and nonstabilized tissue samples. Biopreservation Biobanking 9, 10
(2011).
32. Choudhuri, S. Some major landmarks in the path from nuclein to human genome (1). Toxicol. Mech. Methods 16, 137–59 (2006).
33. Farragher, S. M., Tanney, A., Kennedy, R. D. & Paul Harkin, D. RNA expression analysis from formalin fixed paraffin embedded tissues. Histochem. Cell
Biol. 130, 435–45 (2008).
34. Douglas, M. P. & Rogers, S. O. DNA damage caused by common cytological
fixatives. Mutat. Res. 401, 77–88 (1998).
35. McGhee, J. D. & von Hippel, P. H. Formaldehyde as a probe of DNA structure. 3. Equilibrium denaturation of DNA and synthetic polynucleotides. Biochemistry (Mosc.) 16, 3267–76 (1977).
36. Williams, C. et al. A high frequency of sequence alterations is due to formalin
fixation of archival specimens. Am. J. Pathol. 155, 1467–71 (1999).
37. Carpi, F. M., Di Pietro, F., Vincenzetti, S., Mignini, F. & Napolioni, V. Human
DNA extraction methods: patents and applications. Recent Pat. DNA Gene Seq.
5, 1–7 (2011).
38. Goelz, S. E., Hamilton, S. R. & Vogelstein, B. Purification of DNA from formaldehyde fixed and paraffin embedded human tissue. Biochem. Biophys. Res.
Commun. 130, 118–26 (1985).
39. Von Ahlfen, S., Missel, A., Bendrat, K. & Schlumpberger, M. Determinants of
RNA Quality from FFPE Samples. PLoS ONE 2, (2007).
40. Schmitt, M. et al. European Organisation for Research and Treatment of Cancer (EORTC) Pathobiology Group standard operating procedure for the preparation of human tumour tissue extracts suited for the quantitative analysis of
tissue-associated biomarkers. Eur J Cancer 43, 835–44 (2007).
41. Ebeling, W. et al. Proteinase K from Tritirachium album Limber. Eur. J. Biochem. FEBS 47, 91–7 (1974).
41
42. Hilz, H., Wiegers, U. & Adamietz, P. Stimulation of proteinase K action by
denaturing agents: application to the isolation of nucleic acids and the degradation of ‘masked’ proteins. Eur. J. Biochem. FEBS 56, 103–8 (1975).
43. Sambrook, J. & Russell, D. Molecular cloning, a laboratory manual. (Cold
Spring Harbor Laboratory Press, 2001).
44. Miller, S. A., Dykes, D. D. & Polesky, H. F. A simple salting out procedure for
extracting DNA from human nucleated cells. Nucleic Acids Res. 16, 1215
(1988).
45. Vogelstein, B. & Gillespie, D. Preparative and analytical purification of DNA
from agarose. Proc. Natl. Acad. Sci. U. S. A. 76, 615–9 (1979).
46. Boom, R. et al. Rapid and simple method for purification of nucleic acids. J.
Clin. Microbiol. 28, 495–503 (1990).
47. Boom, R. et al. Improved silica-guanidiniumthiocyanate DNA isolation procedure based on selective binding of bovine alpha-casein to silica particles. J.
Clin. Microbiol. 37, 615–9 (1999).
48. Berensmeier, S. Magnetic particles for the separation and purification of nucleic acids. Appl Microbiol Biotechnol 73, 495–504 (2006).
49. Endres, H. N., Johnson, J. A. C., Ross, C. A., Welp, J. K. & Etzel, M. R. Evaluation of an ion-exchange membrane for the purification of plasmid DNA. Biotechnol. Appl. Biochem. 37, 259–266 (2003).
50. Bartl, K., Wenzig, P. & Kleiber, J. Simple and broadly applicable sample preparation by use of magnetic glass particles. Clin. Chem. Lab. Med. CCLM
FESCC 36, 557–9 (1998).
51. Hourfar, M. K. et al. High-throughput purification of viral RNA based on novel aqueous chemistry for nucleic acid isolation. Clin Chem 51, 1217–22 (2005).
52. Hawkins, T. L., O’Connor-Morin, T., Roy, A. & Santillan, C. DNA purification and isolation using a solid-phase. Nucleic Acids Res. 22, 4543–4 (1994).
53. Feuk, L., Carson, A. R. & Scherer, S. W. Structural variation in the human
genome. Nat. Rev. Genet. 7, 85–97 (2006).
54. Friis, S. L. et al. Typing of 30 insertion/deletions in Danes using the first commercial indel kit--Mentype(R) DIPplex. Forensic Sci. Int. Genet. 6, e72–4
(2012).
55. Fondevila, M. et al. Forensic performance of two insertion-deletion marker
assays. Int. J. Legal Med. 126, 725–37 (2012).
56. Pereira, R. et al. A new multiplex for human identification using insertion/deletion polymorphisms. Electrophoresis 30, 3682–90 (2009).
57. Metzker, M. L. Sequencing technologies - the next generation. Nat. Rev. Genet.
11, 31–46 (2010).
58. Schneider, G. F. & Dekker, C. DNA sequencing with nanopores. Nat. Biotechnol. 30, 326–328 (2012).
59. Oxford Nanopore Technologies. Oxford Nanopore introduces DNA ‘strand
sequencing’ on the high-throughput GridION platform and presents MinION, a
sequencer the size of a USB memory stick. Press Release (2012). at
<https://www.nanoporetech.com/news/press-releases/view/39>
60. Bentley, D. R. et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–9 (2008).
61. Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed
by multiregion sequencing. N Engl J Med 366, 883–92 (2012).
62. Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature
472, 90–94 (2011).
63. Wang, Y. et al. Clonal evolution in breast cancer revealed by single nucleus
genome sequencing. Nature (2014). doi:10.1038/nature13600
42
64. Hagemann, I. S., Cottrell, C. E. & Lockwood, C. M. Design of targeted, capture-based, next generation sequencing tests for precision cancer therapy. Cancer Genet. 206, 420–431 (2013).
65. Bodi, K. et al. Comparison of Commercially Available Target Enrichment
Methods for Next-Generation Sequencing. J. Biomol. Tech. JBT 24, 73–86
(2013).
66. Johansson, H. et al. Targeted resequencing of candidate genes using selector
probes. Nucleic Acids Res. 39, e8 (2011).
67. Dahl, F., Gullberg, M., Stenberg, J., Landegren, U. & Nilsson, M. Multiplex
amplification enabled by selective circularization of large sets of genomic
DNA fragments. Nucleic Acids Res. 33, e71 (2005).
68. Dahl, F. et al. Multigene amplification and massively parallel sequencing for
cancer mutation discovery. Proc. Natl. Acad. Sci. U. S. A. 104, 9387–92
(2007).
69. Altmann, A. et al. A beginners guide to SNP calling from high-throughput
DNA-sequencing data. Hum. Genet. 131, 1541–1554 (2012).
70. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure
and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
71. Koboldt, D. C. et al. VarScan 2: Somatic mutation and copy number alteration
discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
72. Xu, H., DiCarlo, J., Satya, R. V., Peng, Q. & Wang, Y. Comparison of somatic
mutation calling methods in amplicon and whole exome sequence data. BMC
Genomics 15, 244 (2014).
73. Ferlay J, S. I. GLOBOCAN 2012 v1.0, Cancer Incidence and Mortality
Worldwide. (2013). at <http://globocan.iarc.fr>
74. Cunningham, D. et al. Colorectal cancer. Lancet 375, 1030–47 (2010).
75. Boyle, P. & Langman, J. S. ABC of colorectal cancer: Epidemiology. BMJ
321, 805–8 (2000).
76. Jemal, A. et al. Global cancer statistics. CA. Cancer J. Clin. 61, 69–90 (2011).
77. Chlebowski, R. T. et al. Estrogen plus progestin and colorectal cancer in postmenopausal women. N. Engl. J. Med. 350, 991–1004 (2004).
78. Hamilton, S. R., A., L.A. Pathology and Genetics of Tumours of the Digestive
System. (IARC Press, 2000).
79. Bodmer, W. F. et al. Localization of the gene for familial adenomatous polyposis on chromosome 5. Publ. Online 13 August 1987 Doi101038328614a0 328,
614–616 (1987).
80. Lynch, H. T. & Krush, A. J. Cancer family ‘G’ revisited: 1895-1970. Cancer
27, 1505–1511 (1971).
81. Lynch, H. T. & Smyrk, T. Hereditary nonpolyposis colorectal cancer (Lynch
syndrome): An updated review. Cancer 78, 1149–1167 (1996).
82. Briggs, S. & Tomlinson, I. Germline and somatic polymerase ε and δ mutations
define a new class of hypermutated colorectal and endometrial cancers. J.
Pathol. 230, 148–153 (2013).
83. Palles, C. et al. Germline mutations affecting the proofreading domains of
POLE and POLD1 predispose to colorectal adenomas and carcinomas. Nat.
Genet. 45, 136–144 (2013).
84. National Cancer Institute. Rectal Cancer Treatment (PDQ®). National Cancer
Institute at <http://www.cancer.gov/cancertopics/pdq/treatment/rectal/Health
Professional/page3>
85. Muto, T., Bussey, H. J. R. & Morson, B. C. The evolution of cancer of the
colon and rectum. Cancer 36, 2251–2270 (1975).
43
86. Fearon, E. R. & Vogelstein, B. A genetic model for colorectal tumorigenesis.
Cell 61, 759–67 (1990).
87. Kinzler, K. W. & Vogelstein, B. Lessons from hereditary colorectal cancer.
Cell 87, 159–170 (1996).
88. Powell, S. M. et al. APC mutations occur early during colorectal tumorigenesis. Nature 359, 235–7 (1992).
89. Vogelstein, B. et al. Genetic alterations during colorectal-tumor development.
N Engl J Med 319, 525–32 (1988).
90. Rajagopalan, H. et al. Tumorigenesis: RAF/RAS oncogenes and mismatchrepair status. Nature 418, 934 (2002).
91. Fearon, E. R. Molecular Genetics of Colorectal Cancer. Annu. Rev. Pathol.
Mech. Dis. 6, 479–507 (2011).
92. Samuels, Y. et al. High Frequency of Mutations of the PIK3CA Gene in Human Cancers. Science 304, 554 (2004).
93. Baker, S. J. et al. p53 Gene Mutations Occur in Combination with 17p Allelic
Deletions as Late Events in Colorectal Tumorigenesis. Cancer Res. 50, 7717–
7722 (1990).
94. Akhoondi, S. et al. FBXW7/hCDC4 Is a General Tumor Suppressor in Human
Cancer. Cancer Res. 67, 9006–9012 (2007).
95. Danielsen, S. A. et al. Novel mutations of the suppressor gene PTEN in colorectal carcinomas stratified by microsatellite instability- and TP53 mutationstatus. Hum. Mutat. 29, E252–E262 (2008).
96. Leary, R. J. et al. Integrated analysis of homozygous deletions, focal amplifications, and sequence alterations in breast and colorectal cancers. Proc. Natl.
Acad. Sci. U. S. A. 105, 16224–16229 (2008).
97. Markowitz, S. et al. Inactivation of the type II TGF-beta receptor in colon
cancer cells with microsatellite instability. Science 268, 1336–1338 (1995).
98. Grady, W. M. et al. Mutational Inactivation of Transforming Growth Factor β
Receptor Type II in Microsatellite Stable Colon Cancers. Cancer Res. 59, 320–
324 (1999).
99. Lengauer, C., Kinzler, K. W. & Vogelstein, B. Genetic instability in colorectal
cancers. Nature 386, 623–627 (1997).
100. Fodde, R. et al. Mutations in the APC tumour suppressor gene cause chromosomal instability. Nat. Cell Biol. 3, 433–8 (2001).
101. Aaltonen, L. A. et al. Clues to the pathogenesis of familial colorectal cancer.
Science 260, 812–816 (1993).
102. Herman, J. G. et al. Incidence and functional consequences of hMLH1 promoter hypermethylation in colorectal carcinoma. Proc. Natl. Acad. Sci. U. S. A. 95,
6870–6875 (1998).
103. Toyota, M. et al. CpG island methylator phenotype in colorectal cancer. Proc.
Natl. Acad. Sci. U. S. A. 96, 8681–8686 (1999).
104. Kambara, T. et al. BRAF mutation is associated with DNA methylation in
serrated polyps and cancers of the colorectum. Gut 53, 1137–1144 (2004).
105. Issa, J.-P. CpG island methylator phenotype in cancer. Nat. Rev. Cancer 4,
988–993 (2004).
106. Jass, J. R. Classification of colorectal cancer based on correlation of clinical,
morphological and molecular features. Histopathology 50, 113–130 (2007).
107. Bass, A. J. et al. Genomic sequencing of colorectal adenocarcinomas identifies
a recurrent VTI1A-TCF7L2 fusion. Nat. Genet. 43, 964–8 (2011).
108. Kim, T. M., Laird, P. W. & Park, P. J. The landscape of microsatellite instability in colorectal and endometrial cancer genomes. Cell 155, 858–68 (2013).
44
109. Cajuso, T. et al. Exome sequencing reveals frequent inactivating mutations in
ARID1A, ARID1B, ARID2 and ARID4A in microsatellite unstable colorectal
cancer. Int. J. Cancer 135, 611–623 (2014).
110. Seshagiri, S. et al. Recurrent R-spondin fusions in colon cancer. Nature 488,
660–4 (2012).
111. Tahara, T. et al. Colorectal Carcinomas With CpG Island Methylator Phenotype 1 Frequently Contain Mutations in Chromatin Regulators. Gastroenterology 146, 530–538.e5 (2014).
112. Tuupanen, S. et al. Identification of 33 candidate oncogenes by screening for
base-specific mutations. Br. J. Cancer (2014). doi:10.1038/bjc.2014.429
113. Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for
new cancer-associated genes. Nature 499, 214–8 (2013).
114. Angus-Hill, M. L., Elbert, K. M., Hidalgo, J. & Capecchi, M. R. T-cell factor 4
functions as a tumor suppressor whose disruption modulates colon cell proliferation and tumorigenesis. Proc. Natl. Acad. Sci. U. S. A. 108, 4914–9 (2011).
115. Major, M. B. et al. Wilms tumor suppressor WTX negatively regulates
WNT/beta-catenin signaling. Science 316, 1043–1046 (2007).
116. Patsialou, A., Wilsker, D. & Moran, E. DNA-binding properties of ARID family proteins. Nucleic Acids Res. 33, 66–80 (2005).
117. Xie, C., Fu, L., Han, Y., Li, Q. & Wang, E. Decreased ARID1A expression
facilitates cell proliferation and inhibits 5-fluorouracil-induced apoptosis in
colorectal carcinoma. Tumour Biol. J. Int. Soc. Oncodevelopmental Biol. Med.
(2014). doi:10.1007/s13277-014-2074-y
118. Chadwick, R. B. et al. Candidate tumor suppressor RIZ is frequently involved
in colorectal carcinogenesis. Proc. Natl. Acad. Sci. 97, 2662–2667 (2000).
119. Morris, L. G. T. et al. Recurrent somatic mutation of FAT1 in multiple human
cancers leads to aberrant Wnt activation. Nat. Genet. 45, 253–261 (2013).
120. Guo, C. et al. KMT2D maintains neoplastic cell proliferation and global histone H3 lysine 4 monomethylation. Oncotarget 4, 2144–2153 (2013).
121. Gao, J. et al. Integrative Analysis of Complex Cancer Genomics and Clinical
Profiles Using the cBioPortal. Sci. Signal. 6, pl1–pl1 (2013).
122. Cerami, E. et al. The cBio Cancer Genomics Portal: An Open Platform for
Exploring Multidimensional Cancer Genomics Data. Cancer Discov. 2, 401–
404 (2012).
123. SRCT. Improved Survival with Preoperative Radiotherapy in Resectable Rectal Cancer. N. Engl. J. Med. 336, 980–987 (1997).
124. Bracht, K., Nicholls, A. M., Liu, Y. & Bodmer, W. F. 5-Fluorouracil response
in a large panel of colorectal cancer cell lines is associated with mismatch repair deficiency. Br. J. Cancer 103, 340–346 (2010).
125. Bardelli, A. & Siena, S. Molecular Mechanisms of Resistance to Cetuximab
and Panitumumab in Colorectal Cancer. J. Clin. Oncol. 28, 1254–1261 (2010).
126. Kopetz, S. et al. Phase II Trial of Infusional Fluorouracil, Irinotecan, and
Bevacizumab for Metastatic Colorectal Cancer: Efficacy and Circulating Angiogenic Biomarkers Associated With Therapeutic Resistance. J. Clin. Oncol. 28,
453–459 (2010).
127. Misale, S. et al. Emergence of KRAS mutations and acquired resistance to
anti-EGFR therapy in colorectal cancer. Nature 486, 532–536 (2012).
128. Schmoll, H. J. et al. ESMO Consensus Guidelines for management of patients
with colon and rectal cancer. A personalized approach to clinical decision making. Ann. Oncol. 23, 2479–2516 (2012).
129. Van Cutsem, E. et al. Cetuximab and Chemotherapy as Initial Treatment for
Metastatic Colorectal Cancer. N. Engl. J. Med. 360, 1408–1417 (2009).
45
130. Van Emburgh, B. O., Sartore-Bianchi, A., Di Nicolantonio, F., Siena, S. &
Bardelli, A. Acquired resistance to EGFR-targeted therapies in colorectal cancer. Mol. Oncol. (2014). doi:10.1016/j.molonc.2014.05.003
131. Segal, N. H. & Saltz, L. B. Evolving Treatment of Advanced Colon Cancer.
Annu. Rev. Med. 60, 207–219 (2009).
132. Adam, R. et al. Patients with initially unresectable colorectal liver metastases:
is there a possibility of cure? J. Clin. Oncol. Off. J. Am. Soc. Clin. Oncol. 27,
1829–1835 (2009).
133. Folprecht, G. et al. Survival of patients with initially unresectable colorectal
liver metastases treated with FOLFOX/cetuximab or FOLFIRI/cetuximab in a
multidisciplinary concept (CELIM study). Ann. Oncol. 25, 1018–1025 (2014).
134. Gill, S. et al. Pooled analysis of fluorouracil-based adjuvant therapy for stage II
and III colon cancer: who benefits and by how much? J. Clin. Oncol. Off. J.
Am. Soc. Clin. Oncol. 22, 1797–806 (2004).
135. Siegel, R., DeSantis, C. & Jemal, A. Colorectal cancer statistics, 2014. CA.
Cancer J. Clin. 64, 104–117 (2014).
136. Chambers, A. F., Groom, A. C. & MacDonald, I. C. Metastasis: Dissemination
and growth of cancer cells in metastatic sites. Nat Rev Cancer 2, 563–572
(2002).
137. Nguyen, D. X. & Massagué, J. Genetic determinants of cancer metastasis. Nat.
Rev. Genet. 8, 341–352 (2007).
138. Brannon, A. R. et al. Comparative sequencing analysis reveals high genomic
concordance between matched primary and metastatic colorectal cancer lesions. Genome Biol. 15, 454 (2014).
139. Roth, A. D. et al. Prognostic role of KRAS and BRAF in stage II and III resected colon cancer: results of the translational study on the PETACC-3,
EORTC 40993, SAKK 60-00 trial. J. Clin. Oncol. Off. J. Am. Soc. Clin. Oncol.
28, 466–474 (2010).
140. Gavin, P. G. et al. Mutation profiling and microsatellite instability in stage II
and III colon cancer: an assessment of their prognostic and oxaliplatin predictive value. Clin. Cancer Res. Off. J. Am. Assoc. Cancer Res. 18, 6531–6541
(2012).
141. Mouradov, D. et al. Survival in stage II/III colorectal cancer is independently
predicted by chromosomal and microsatellite instability, but not by specific
driver mutations. Am. J. Gastroenterol. 108, 1785–1793 (2013).
142. Isaksson, M. et al. MLGA--a rapid and cost-efficient assay for gene copynumber analysis. Nucleic Acids Res 35, e115 (2007).
143. Jones, S. et al. Comparative lesion sequencing provides insights into tumor
evolution. Proc. Natl. Acad. Sci. U. S. A. 105, 4283–8 (2008).
144. Prinz, F., Schlange, T. & Asadullah, K. Believe it or not: how much can we
rely on published data on potential drug targets? Nat. Rev. Drug Discov. 10,
712–712 (2011).
145. Begley, C. G. & Ellis, L. M. Drug development: Raise standards for preclinical
cancer research. Nature 483, 531–533 (2012).
146. Turner, N. C. & Reis-Filho, J. S. Genetic heterogeneity and cancer drug resistance. Lancet Oncol. 13, e178–185 (2012).
147. Groenendijk, F. H. & Bernards, R. Drug resistance to targeted therapies: Déjà
vu all over again. Mol. Oncol. (2014). doi:10.1016/j.molonc.2014.05.004
148. Workman, P., Al-Lazikani, B. & Clarke, P. A. Genome-based cancer therapeutics: targets, kinase drug resistance and future strategies for precision oncology.
Curr. Opin. Pharmacol. 13, 486–496 (2013).
46
149. Chong, C. R. & Jänne, P. A. The quest to overcome resistance to EGFRtargeted therapies in cancer. Nat. Med. 19, 1389–1400 (2013).
150. Von Manstein, V. et al. Resistance of Cancer Cells to Targeted Therapies
Through the Activation of Compensating Signaling Loops. Curr. Signal
Transduct. Ther. 8, 193–202 (2013).
151. Sigurdsson, J. A., Getz, L., Sjönell, G., Vainiomäki, P. & Brodersen, J. Marginal public health gain of screening for colorectal cancer: modelling study,
based on WHO and national databases in the Nordic countries. J. Eval. Clin.
Pract. 19, 400–407 (2013).
152. Newman, A. M. et al. An ultrasensitive method for quantitating circulating
tumor DNA with broad patient coverage. Nat Med 20, 548–54 (2014).
153. Diehl, F. et al. Circulating mutant DNA to assess tumor dynamics. Nat Med 14,
985–90 (2008).
154. Diehl, F. et al. Detection and quantification of mutations in the plasma of patients with colorectal tumors. Proc. Natl. Acad. Sci. U. S. A. 102, 16368–73
(2005).
155. Bettegowda, C. et al. Detection of circulating tumor DNA in early- and latestage human malignancies. Sci. Transl. Med. 6, 224ra24 (2014).
156. Forshew, T. et al. Noninvasive identification and monitoring of cancer mutations by targeted deep sequencing of plasma DNA. Sci. Transl. Med. 4,
136ra68 (2012).
157. Swisher, E. M. et al. Tumor-specific p53 sequences in blood and peritoneal
fluid of women with epithelial ovarian cancer. Am. J. Obstet. Gynecol. 193,
662–7 (2005).
158. Misale, S. et al. Blockade of EGFR and MEK Intercepts Heterogeneous Mechanisms of Acquired Resistance to Anti-EGFR Therapies in Colorectal Cancer.
Sci. Transl. Med. 6, 224ra26–224ra26 (2014).
159. Druker, B. J. et al. Efficacy and Safety of a Specific Inhibitor of the BCR-ABL
Tyrosine Kinase in Chronic Myeloid Leukemia. N. Engl. J. Med. 344, 1031–
1037 (2001).
160. Bryant, H. E. et al. Specific killing of BRCA2-deficient tumours with inhibitors of poly(ADP-ribose) polymerase. Nature 434, 913–917 (2005).
161. Farmer, H. et al. Targeting the DNA repair defect in BRCA mutant cells as a
therapeutic strategy. Nature 434, 917–921 (2005).
162. Gad, H. et al. MTH1 inhibition eradicates cancer by preventing sanitation of
the dNTP pool. Nature 508, 215–221 (2014).
163. Edwards, S. L. et al. Resistance to therapy caused by intragenic deletion in
BRCA2. Nature 451, 1111–1115 (2008).
164. Bardelli, A. et al. Amplification of the MET receptor drives resistance to antiEGFR therapies in colorectal cancer. Cancer Discov. 3, 658–673 (2013).
47
Acta Universitatis Upsaliensis
Digital Comprehensive Summaries of Uppsala Dissertations
from the Faculty of Medicine 1028
Editor: The Dean of the Faculty of Medicine
A doctoral dissertation from the Faculty of Medicine, Uppsala
University, is usually a summary of a number of papers. A few
copies of the complete dissertation are kept at major Swedish
research libraries, while the summary alone is distributed
internationally through the series Digital Comprehensive
Summaries of Uppsala Dissertations from the Faculty of
Medicine. (Prior to January, 2005, the series was published
under the title “Comprehensive Summaries of Uppsala
Dissertations from the Faculty of Medicine”.)
Distribution: publications.uu.se
urn:nbn:se:uu:diva-231508
ACTA
UNIVERSITATIS
UPSALIENSIS
UPPSALA
2014