Download Whole Genome Low Pass Sequencing

SUPPLEMENTARY INFORMATION

METHODS

Tumor sample collection and cell culture
Resected brain tumor specimens were collected at Henry Ford Hospital (Detroit, MI) with written informed consent from patients, under a protocol approved by the Henry Ford Hospital Institutional Review Board, and graded pathologically according to the WHO criteria. A portion of each tumor specimen was snap frozen and stored in liquid nitrogen. An adjacent portion was used for cell culture. Tumors were dissociated enzymatically and neurospheres enriched in cancer stem-like cells (CSC) were cultured, as described in detail 1,2. Neurosphere cultures were serially passaged in vitro. No mycoplasma contamination was identified in the subset of samples tested. Cells with passages between 7 and 18 were used for mouse implants and molecular analysis, except for those designated “high passage”, where passage 40 was used.

Patient derived xenografts (PDX)
Orthotopic xenografts: Following IACUC guidelines in an institutionally approved animal use protocol, GBM neurosphere cell suspensions were implanted into 8-week-old female nude mice (NCRNU, Taconic Farms) as described 3. A minimum of 8 mice were implanted with each neurosphere line. Animals were anesthetized with a mixture of ketamine and xylazine. Dissociated neurosphere cells (3 × 10^5) were injected using a Hamilton syringe at a defined intracranial location: AP+1.0, ML+2.5, DV-3.0. Animals were monitored daily by an observer blinded to the group allocation and sacrificed upon first signs of neurological deficit or weight loss greater than 20%. Brains were harvested and placed in a coronal matrix to cut 2 mm sections, with the first cut across the implant site. Brain sections were alternately frozen in dry ice and embedded in OCT for storage at -80 °C, or formalin fixed and paraffin embedded (FFPE).
Subcutaneous xenografts: Dissociated neurosphere cells (1 × 10^6) were injected in the flank of nude mice. Animals were sacrificed and tumors excised when the diameter reached 10 mm.
Drug Treatment: HF3077 PDXs were treated with capmatinib (purchased from Matrix Scientific Products, Columbia, SC). Capmatinib suspensions in 0.5% methylcellulose/0.1% Tween 80 were prepared every week and administered by oral gavage using a 20g x 1.5” gavage needle (Cadence) at a dose of 30 mg/kg once a day (5 days/week) until the end of the study. Control animals received vehicle-only mock gavage. Forty-five days after implant, animals were randomized to control or treatment groups. Each mouse was followed until death with no censoring, and mean survival differences were estimated using a t-distribution to estimate 95% confidence intervals. With a sample size of 9 mice per group, a two-sided 95% confidence interval for the difference in mean survival would extend 0.92 SD from the observed difference in mean survival, assuming the CI is based on a large-sample z statistic. Equivalently, we expected 80% power to detect a difference in mean survival of 1.4 SD, for the common standard deviation, when n=9 animals per group and alpha=0.05. Animals were monitored daily and sacrificed upon first signs of neurological deficit or weight loss greater than 20%. Kaplan-Meier survival curves were compared by log-rank test.
To evaluate brain penetrance of capmatinib, 2 h after administration of the last capmatinib 30 mg/kg dose, blood samples were drawn, animals were sacrificed, brains were harvested, and 2 mm coronal sections were frozen in OCT. Tumor tissue was dissected from the frozen blocks. Capmatinib concentration in homogenized tumor tissue and plasma was quantified by LC-MS/MS for 3 treated animals and one control.
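The sample-size statement in the drug treatment paragraph can be checked with a short calculation. The sketch below assumes two equal groups, a common standard deviation, and a normal (z) approximation throughout; the function names are illustrative and are not part of the original analysis. Under these assumptions the confidence-interval half-width reproduces the 0.92 SD figure, while the z-based detectable difference comes out near 1.32 SD, slightly below the 1.4 SD quoted, which would be consistent with the quoted value coming from a more conservative t-based calculation.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width_sd(n_per_group: int, confidence: float = 0.95) -> float:
    """Half-width of a z-based CI for a difference in two means, in SD units."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of a difference in means with equal n and common SD = 1.
    return z * sqrt(2 / n_per_group)


def detectable_difference_sd(n_per_group: int, alpha: float = 0.05,
                             power: float = 0.80) -> float:
    """Smallest difference (in SD units) detectable under a z approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


half_width = ci_half_width_sd(9)
difference = detectable_difference_sd(9)
print(f"CI half-width: {half_width:.2f} SD")
print(f"Detectable difference (z approximation): {difference:.2f} SD")
```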

Xenograft tumor macrodissection of frozen tissue
Brain samples of 3 randomly selected animals per xenograft line were used. Frozen 2 mm coronal sections were transferred to a cryostat (Cryotome E, ThermoElectronCorporation) set to -16 °C. Sections of 6 µm were cut and stained with hematoxylin to locate the tumor. Tumor tissue was excised from the frozen block with a scalpel into a pre-chilled microtube and stored at -80 °C.

Nucleic Acids isolation
Genomic DNA was isolated from frozen tumor samples, macrodissected xenograft tumor (3 biological replicates), and neurosphere cultures using the QIAamp DNA mini Kit (Qiagen #51304), with on-column RNase A digestion, following manufacturer instructions. DNA was isolated from blood using the DNA QIAamp Blood kit (Qiagen).
Total RNA was extracted from frozen tumor samples, macrodissected xenograft tumor (3 biological replicates), and neurosphere cultures using MirVana (Ambion #AM1560), followed by DNAse treatment using DNA-free (Ambion AM1906).

Fluorescence in situ hybridization (FISH)
FISH on matching tumor samples/neurospheres/PDX: FISH probes were prepared from purified BAC clones (BACPAC Resource Center, https://bacpacresources.org). Probes were labeled with Orange-dUTP or with Green-dUTP (Abbott Molecular Inc., Abbott Park, IL), by nick translation.

Locus                        Human BAC Clones
12q13.3-q14.1 (CDK4 gene)    RP11-181L23, RP11-571M6, RP11-277A02
7p11.2 (EGFR gene)           RP11-708P5, CTD-2026N22, RP11-148P17
8q24.21 (MYC gene)           CTD-3056O22
7q31 (MET gene)              RP11-95I20, RP11-564A14, RP11-39K12
4q12 (PDGFRA gene)           RP11-58C6 and RP11-977G3
4q11 (Ch. 4 control)         RP11-365H22
7q11.22 (Ch. 7 control)      RP11-747K2, RP11-668K3
8q11.21 (Ch. 8 control)      CH17-311E13 and CH17-425G9

Metaphase slides were prepared from neurosphere cell cultures that were harvested and fixed in methanol:acetic acid (3:1), according to standard cytogenetic procedures.
Tumor touch preparations were prepared by imprinting thawed tumor tissue onto positively-charged glass slides and fixing them in methanol:acetic acid (3:1) for 30 min, then air-drying. Frozen tumor and macrodissected xenograft tumor samples were prepared as described 4. The FISH probes were denatured at 75 °C for 5 min and held at 37 °C for 10-30 min until 10 µL of probe was applied to each sample slide. Slides were coverslipped and hybridized overnight at 37 °C in the ThermoBrite hybridization system (Abbott Molecular Inc.). The posthybridization wash was with 2X SSC/0.2% TWEEN 20 at 73 °C for 3 min, followed by a brief water rinse. Slides were air-dried and then counterstained with VECTASHIELD mounting medium with 4',6-diamidino-2-phenylindole (DAPI) (Vector Laboratories Inc., Burlingame, CA).
Image acquisition was performed at 1000x system magnification with a COOL-1300 SpectraCube camera (Applied Spectral Imaging-ASI, Vista, CA) mounted on an Olympus BX43 microscope. Images were analyzed using FISHView v7 software (ASI), and 100-200 interphase nuclei were scored for each sample in addition to analysis of 50-100 metaphase spreads for each cell line.
FISH on paired primary/recurrent FFPE gliomas: Fluorescence in situ hybridization was performed using RPS6/Con 9, CDK4/Con 12, EGFR/Con 7, MYC/Con 8, PDGFRA/Con 4, C-Met/Con 7, and TERT/Con 5 FISH probes from Empire Genomics (Buffalo, N.Y.). The slides were hybridized with the FISH probes according to the manufacturer's instructions with slight modifications. The slides were then examined under a fluorescence microscope (Nikon 80i) equipped with multiple filters, and signals were manually counted in 50 cells for each slide.

Immunohistochemistry
Sections of formalin fixed, paraffin embedded human glioma surgical samples, tumor xenografts, or multicellular spheroids were deparaffinized with xylene and rehydrated through graded alcohol into phosphate buffered saline. Antigens were unmasked by a 10 min incubation in boiling citrate buffer, and sections were stained with anti-Met rabbit monoclonal antibody (D1C2) (Cell Signaling #8198) or anti-phospho-Met (Tyr1234/1235) rabbit monoclonal antibody (D26) (Cell Signaling #3077), visualized with Betazoid DAB (Biocare BDB2004), and counterstained with Envision Flex Hematoxylin (Dako K8008). Images were captured using an Eclipse E800M microscope equipped with a Nikon DS-Fi2 color digital camera (Nikon).

Reverse Transcription and PCR
cDNA was prepared from 1 µg of DNAseI-treated total RNA isolated from tumor, neurosphere and xenografts using Superscript III Reverse Transcriptase and oligo dT (Thermo Fisher Scientific). cDNA was used as a template for PCR in an iCycler instrument (BioRad), using Platinum Taq DNA Polymerase (Thermo Fisher Scientific) and the following oligos:
Human MET: exon 2 forward (M2F): 5’ AGCAATGGGGAGTGTAAAGAGG and exon 8 reverse (M8R): 5’ GTAAGTAAAGTGCCACCAGCC
Human CAPZA2: exon 1 forward (C1F): 5’ GTAAGTAAAGTGCCACCAGCC
Human EGFR: forward 5’ GCAGCGATGCGACCCTCCGGG and reverse 5’ CTATTCCGTTACACACTTTGCGG
Human b-actin: forward 5’ CCGACAGGATGCAGAAGGAG and reverse 5’ CATCTGCTGGAAGGTGGACA

LC-MS/MS Quantitation of Capmatinib and Crizotinib in Mouse Plasma and Tumor
For mouse plasma sample analysis, 25 µL of each sample was precipitated with 200 µL of acetonitrile. This suspension was vortexed for 30 min and centrifuged at 4k rpm for 15 min, after which 100 µL of the extract was aliquoted and mixed with 200 µL of acetonitrile/water (1/2, v/v) prior to LC-MS/MS analysis.
The extracted plasma samples were analyzed on a Waters Acquity UPLC system coupled with a Waters Xevo TQ-S triple quadrupole mass spectrometer operated in positive mode. The capillary voltage was set to 0.5 kV and the collision energy to 32 eV. Capmatinib (purchased from Matrix Scientific Products, Columbia, SC) and crizotinib (purchased from LC Laboratories, Woburn, MA) were separated using a Waters Acquity UPLC BEH C18 column (1.7 µm, 2.1 x 30 mm) and detected by multiple reaction monitoring transitions, m/z 413.04>354.07 for capmatinib and m/z 450.04>260.18 for crizotinib. Mobile phase A was 0.1% acetic acid/water and B was 0.1% acetic acid/acetonitrile. The LC gradient was 10% B (0-0.3 min), 10-95% B (0.3-1.3 min), 95% B (1.3-1.7 min), 10% B (1.7-2.0 min), and the flow rate was 0.5 mL/min. The column temperature was 40 °C. The injection volume was 2 µL. Under these conditions, the retention time was 0.85 min for capmatinib and 0.74 min for crizotinib. The method was validated with an analytical range of 1 – 1000 ng of capmatinib and crizotinib in untreated CD-1 mouse plasma.
Mouse tumor tissue samples were homogenized in methanol:water (80:20, v/v) to a concentration of 100 mg (tissue)/mL. The homogenates were vortexed for 10 min and centrifuged at 15k rpm for 5 min, then 100 µL of the supernatant was transferred into an HPLC vial for LC-MS/MS analysis. The tissue homogenates were analyzed using the same method as described above. The method was validated with an analytical range of 1 – 1000 ng/mL of capmatinib and crizotinib in untreated mouse tumor tissue homogenates.

Whole Exome Sequencing
Library Construction and Sequencing
The sequencing libraries were prepared using the KAPA library prep protocol (catalog number KK8234, KAPA Biosystems, Wilmington, MA).
The exomes were captured using the SureSelect XT Human All Exon V5 kit (Agilent Technologies, Santa Clara, CA). Samples were then sequenced 2x100 bp to about 340x depth on the Illumina HiSeq 2000.

BAM File Generation
The raw output (BCL) files of the Illumina sequencer were converted to FASTQ files using Illumina's offline basecalling software CASAVA version 1.8.2. The FASTQ files were then aligned to the reference genome (hg19 for human) using BWA version 0.7.0 5 for DNA samples, with parameters suitable for the aligner. The aligned BAM files were subjected to duplicate marking, realignment, and recalibration using Picard version 1.112 (http://picard.sourceforge.net) and GATK version 1.5 6, when applicable, before any downstream analyses were conducted.

Whole Genome Low Pass Sequencing
Library Construction and Sequencing
The Illumina compatible libraries were prepared using the KAPA DNA Library preparation kit (Catalog No. KK8232) as per the manufacturer’s protocol. In brief, DNA was fragmented to a median size of 200 bp by sonication. Fragmented DNA ends were polished and 5′-phosphorylated. After addition of 3′-A to the ends, indexed Y-adapters were ligated and the samples were PCR amplified. The resulting DNA libraries were quantified and validated by qPCR, and sequenced on Illumina’s HiSeq 2000 in a paired-end read format for 76 cycles. The resulting BCL files containing the sequence data were converted into “.fastq.gz” files and individual sample libraries were demultiplexed using CASAVA version 1.8.2 with no mismatches.

RNA Sequencing
Library Construction and Sequencing
The Illumina compatible libraries were prepared using Illumina’s TruSeq RNA Sample Prep kit v2, as per the manufacturer’s protocol. In brief, Poly-A RNA was enriched using Oligo-dT beads.
Enriched Poly-A RNA was fragmented to a median size of 150 bp using chemical fragmentation and converted into double stranded cDNA. Ends of the double stranded cDNA were polished, 5′-phosphorylated, and 3′-A tailed for ligation of the Y-shaped indexed adapters. Adapter ligated DNA fragments were PCR amplified, quantified and validated by qPCR, and sequenced on Illumina’s HiSeq 2000 in a paired-end read format for 76 cycles. The resulting BCL files containing the sequence data were converted into “.fastq.gz” files and individual sample libraries were demultiplexed using CASAVA version 1.8.2 with no mismatches.

BAM File Generation
RNA sequencing BAM files were generated and analyzed using the Pipeline for RNAseq Data Analysis (PRADA) (http://sourceforge.net/projects/prada/) 7. In brief, PRADA uses Burrows-Wheeler alignment, Samtools, and the Genome Analysis Toolkit to align RNAseq reads to a reference database composed of whole genome sequences (hg19) and transcriptome sequences (Ensembl64). Details of the PRADA pipeline are described in its manuscript.

Targeted Resequencing
Library Construction and Sequencing
The Illumina compatible libraries were prepared using the KAPA DNA Library preparation kit (Catalog No. KK8232) as per the manufacturer’s protocol. In brief, DNA was fragmented to a median size of 200 bp by sonication. Fragmented DNA ends were polished and 5′-phosphorylated. After addition of 3′-A to the ends, indexed Y-adapters were ligated and the samples were PCR amplified. The resulting DNA libraries were enriched for targeted regions using NimbleGen SeqCap EZ Choice Library 4 RXN (Catalog No. 06740251001) and NimbleGen SeqCap EZ Reagent Kit Plus v2 (Catalog No. 06953247001) as per the manufacturer’s protocol. The enriched libraries were quantified and validated by qPCR, and sequenced on Illumina’s HiSeq 2000 in a paired-end read format for 76 cycles.
The resulting BCL files containing the sequence data were converted into “.fastq.gz” files and individual sample libraries were demultiplexed using CASAVA version 1.8.2 with no mismatches.

BAM File Generation
Sequencing FASTQ files were aligned to the reference genome (hg19 for human) and processed to BAM files by the same pipeline as in whole exome sequencing.

Pacific Biosciences (PacBio) Long Read Sequencing
Library Construction and Sequencing
The DNA libraries were prepared following the Pacific Biosciences 20 kb Template Preparation Using BluePippin Size-Selection System protocol. No DNA shearing was performed since the samples were already fragmented. The DNA was size selected on a BluePippin system (Sage Science Inc., Beverly, MA, USA) using a cutoff range of 7 kb to 50 kb. The DNA damage repair, end repair and SMRTbell ligation steps were performed as described in the template preparation protocol with the SMRTbell Template Prep Kit 1.0 reagents (Pacific Biosciences, Menlo Park, CA, USA). The sequencing primer annealing and the P6 polymerase binding reactions were prepared according to the BindingCalculator (Pacific Biosciences BindingCalculator-master_v2.3.1.1). The libraries were sequenced on a PacBio RSII instrument at loading concentrations (on-plate) of 80 pM, 90 pM and 100 pM using the MagBead OneCellPerWell v1 collection protocol, DNA sequencing kit 4.0, SMRT cells v3 and 4-hour movies.

Filtering the sequencing reads
Reads and subreads were filtered based on their length and quality values, using smrtpipe.py from the SMRT-Analysis package.

Structural Variation Analysis
Canu (version 1.2) 8 was used for assembling the filtered PacBio sequence subreads with the parameters suggested for low coverage data.
The assembled contigs were aligned by nucmer (version 3.23) 9 to the human genome reference (hg19), and the contigs having sequence fragments aligned to the MET-CAPZA2 region of chromosome 7 were selected for structural variation analysis. For the selected contigs, we performed a blastn search 10 against the mouse genome using the sequence fragments aligned to the MET-CAPZA2 region of hg19, in order to make sure that they originated from human (Supplementary Table 3). Sequence fragments shared by two contigs were identified by pairwise alignment of the contigs using nucmer. Two contigs were considered to be connected only if they shared a sequence fragment that was at least 5,000 bp long with at least 99% identity. The high-confidence shared sequence fragments were used for connecting the contigs into a circular form in HF3035. In HF3077, only two contigs (tig01141776 and tig01141835) were aligned to the MET-CAPZA2 region of chromosome 7, and the two contigs shared a 621 bp sequence with 95.6% identity between the 3’ end of tig01141776 and the 5’ end of tig01141835.

Gene Fusion and Gene Expression Analysis
To detect transcript fusions, PRADA aligned RNAseq reads to a reference database composed of whole genome sequences (hg19) and transcriptome sequences (Ensembl64). Two lines of evidence were required for identification of a gene fusion: 1) a minimum of two discordant read pairs mapping to a candidate gene pair; 2) a minimum of one junction spanning read mapping to a junction that connected exons between the candidate gene pair, with its pair mate mapping to either of the two genes. Several filters were applied to remove false positives and artifacts, of which the most prominent is based on significant sequence similarity between the two fusion genes (using BLASTN, Expect value = 0.01).
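The two evidence requirements for calling a gene fusion can be written as a simple filter. The sketch below is only an illustration of the stated thresholds: the `FusionCandidate` record, the `passes_evidence` function, and the read counts are hypothetical and do not reflect PRADA's actual data structures or output format.

```python
from dataclasses import dataclass


@dataclass
class FusionCandidate:
    """Hypothetical record of read support for one candidate gene pair."""
    gene_a: str
    gene_b: str
    discordant_pairs: int  # read pairs with one mate in each gene
    junction_reads: int    # junction-spanning reads whose mate maps to either gene


def passes_evidence(candidate: FusionCandidate,
                    min_discordant: int = 2,
                    min_junction: int = 1) -> bool:
    """Apply both evidence thresholds described in the text."""
    return (candidate.discordant_pairs >= min_discordant
            and candidate.junction_reads >= min_junction)


# Made-up counts for illustration only.
candidates = [
    FusionCandidate("GENE1", "GENE2", discordant_pairs=5, junction_reads=3),
    FusionCandidate("GENE3", "GENE4", discordant_pairs=1, junction_reads=4),
    FusionCandidate("GENE5", "GENE6", discordant_pairs=6, junction_reads=0),
]
kept = [(c.gene_a, c.gene_b) for c in candidates if passes_evidence(c)]
print(kept)
```

Only the first candidate satisfies both requirements; the sequence-similarity filter described above would be applied as a separate step.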
Gene expression was measured as ‘reads per kilobase per million’ (RPKM) to normalize for gene length and library size. Specific details of the PRADA pipeline are described elsewhere 7.

Structural Variant Detection
To detect structural variants, we applied SpeedSeq 11 (with default parameters) to whole genome sequencing from both tumor and matching normal samples. We filtered somatic variants by requiring at least 4 supporting reads in the tumor and no supporting reads in its matching normal.

EGFR Intragenic Rearrangement
General User dEfined Supervised Search for intragenic fusion (GUESS-if), a module of PRADA, was also used to search for EGFR intragenic rearrangements, as previously described 12. In brief, using the same rationale as in PRADA gene fusion identification, GUESS-if looked for reads spanning abnormal junctions that were not present in known transcripts. To ensure high accuracy, we required at least 10 reads spanning exon 1-8 of EGFR.

Validation of Somatic Single Nucleotide Variants
To validate our somatic single nucleotide mutation calls, we performed targeted resequencing at high coverage (>1,400x). We selected 792 unique bases, which had been found to be mutated in tumor, neurosphere, or xenografts but not in all of them. These sites corresponded to 1368 sSNVs. In total, 1287 of 1368 mutations called from the exome sequencing data were detected in the high coverage data, resulting in a true positive validation rate of 94%. Evidence for recovered somatic mutation was observed in 1001 of 2646 wild type nucleotides. The variant allelic fractions (VAFs), i.e. the number of reads harboring the variant allele divided by all reads covering that base, of exome and validation sequencing were highly correlated (Pearson correlation = 0.92).
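The validation rate and the VAF definition given above reduce to two ratios. The following minimal sketch uses the counts reported in the text for the validation rate; the helper name `variant_allele_fraction` and the read counts in the VAF example are illustrative assumptions.

```python
def variant_allele_fraction(variant_reads: int, total_reads: int) -> float:
    """VAF: reads harboring the variant allele divided by all reads covering the base."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


# Counts reported in the text: 1287 of 1368 exome calls were recovered.
validated, called = 1287, 1368
validation_rate = validated / called
print(f"Validation rate: {validation_rate:.1%}")

# Illustrative VAF with made-up read counts (30 variant reads out of 120).
example_vaf = variant_allele_fraction(30, 120)
print(f"Example VAF: {example_vaf:.2f}")
```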

Somatic Single Nucleotide Variant Calling
Somatic single nucleotide variants (sSNVs) from tumor and patient-matched normal samples were detected using the MuTect algorithm (version 1.14) with default parameters 13. The search for somatic small insertions/deletions (indels) was performed using Pindel 14 with tumor and patient-matched normal samples. All sSNVs and small indels were annotated by ANNOVAR (version 2012-10-23) 15. Only exonic or splicing sSNVs were selected for analysis. Mutation counts for individual samples are available in Supplementary Table 4.

Inference of Cellular Frequency and Mutational Clusters
We defined the cellular frequency of a mutation as the fraction of cells harboring the mutation. Estimation of cellular frequency was performed using PyClone version 0.12.7 16. For each set of patient, neurosphere, and xenograft samples, PyClone was run on the somatic mutations whose sites were covered over all the samples, using multi-sample joint analysis mode with the PyClone beta binomial density and parental copy number priors. Allelic copy numbers were estimated by applying Sequenza 17 to exome sequencing data. Default options for PyClone were used. To avoid potential artifacts from sequencing coverage, we limited the analysis to the mutations at sites covered with at least 50X over all samples from the same patient. PyClone inferred clusters of mutations whose cellular frequencies co-vary over samples. Our analysis was limited to mutation clusters with at least two mutations.

Removing Putative Mouse Reads in Short Read Sequencing Data
Sequencing reads derived from xenograft samples are a mixture of reads from human and mouse. We utilized Xenome 18 to select sequencing reads arising from human. The selected human reads were then aligned to the human genome using the same pipeline as for patient and neurosphere samples.
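The two restrictions described under Inference of Cellular Frequency and Mutational Clusters (at least 50X coverage in every sample of a patient, and clusters of at least two mutations) can be sketched as follows. This is an illustration of the filtering logic only: the function, the dictionary layout, and the example values are hypothetical and are not PyClone's input or output format.

```python
from collections import Counter


def filter_mutations(depths, cluster_of, min_depth=50, min_cluster_size=2):
    """Keep mutations covered at >= min_depth in every sample,
    then drop those whose cluster has fewer than min_cluster_size members."""
    covered = [m for m, per_sample in depths.items()
               if min(per_sample.values()) >= min_depth]
    sizes = Counter(cluster_of[m] for m in covered)
    return [m for m in covered if sizes[cluster_of[m]] >= min_cluster_size]


# Made-up depths per sample and made-up cluster assignments.
depths = {
    "m1": {"tumor": 120, "neurosphere": 80, "xenograft": 60},
    "m2": {"tumor": 55, "neurosphere": 70, "xenograft": 90},
    "m3": {"tumor": 200, "neurosphere": 40, "xenograft": 75},  # below 50X in one sample
    "m4": {"tumor": 65, "neurosphere": 66, "xenograft": 67},   # alone in its cluster
}
cluster_of = {"m1": 0, "m2": 0, "m3": 0, "m4": 1}

kept_mutations = filter_mutations(depths, cluster_of)
print(kept_mutations)
```

In the study the coverage restriction was applied before clustering and the cluster-size restriction afterwards; the sketch combines both in one function for brevity.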

Identification of Copy Numbers from Low Pass Sequence Data
We used NBICSeq version 0.5.2 19 with a bin size of 1,000 bp and a BIC penalty of 3 to estimate somatic copy number alterations in low pass sequencing data from tumor and patient-matched normal samples.

Detecting TERT Promoter Mutations
We evaluated whole genome low pass sequencing and whole exome sequencing for the presence of TERT mutations in a supervised way using GATK pileup. We required a minimum of 2 variant alleles (combined from WGS and WES) for detection of TERT promoter mutations.

Variant change    Variant site         Patients
C228T             5:1295228-1295228    7
C250T             5:1295250-1295250    5

Detecting ATRX Indels
Indels were called using Pindel (version 0.2.4t) with the default parameters, except that the maximum allowed mismatch rate was set to 0.1 14. Somatic indels were further filtered to require a minimum of 5 supporting tumor reads.

Analysis of B-allele-frequency segments
B-allele-frequency segments were inferred by applying Sequenza (version 2.1.1) 17 to whole exome sequencing data with the default parameters. Analysis of B-allele fractions using whole genome sequencing in our sample cohort revealed loss of heterozygosity (LOH) of chromosome 10 in two cases with diploid chromosome 10, suggesting these cases had first lost a single copy of the chromosome which was subsequently duplicated (Supplementary Fig. 1). We evaluated chromosome 10 LOH using Affymetrix SNP6 profiles from 320 IDH-wildtype TCGA glioblastoma 12, and found that 27 of 52 tumors with diploid chromosome 10 similarly showed LOH, underscoring the importance of aberrations in chromosome 10 in gliomagenesis and evolution (Supplementary Fig. 6).

Data used for longitudinal analysis in glioma patient tumors
Segmented copy number profiles for thirteen TCGA GBM patients and fourteen TCGA LGG patients were obtained from the TCGA portal https://tcga-data.nci.nih.gov/tcga/. Copy number profiles for ten patients from MD Anderson Cancer Center (MDACC) and fourteen patients from either Samsung Medical Center (SMC) or Seoul National University Hospital (SNUH) were previously processed 20,21. Additional copy number data for seven patients from MD Anderson were generated by applying NBICseq version 0.5.2 19 to low pass whole genome sequencing. For fusion detection and structural variant calling, the same pipelines as described in the corresponding method subsections were applied to unaligned RNA sequencing files and whole genome sequencing BAM files from TCGA GBM, TCGA LGG, and MD Anderson patients. Sequencing data for the TCGA cohort were downloaded from CGHub. Fusion calls for Samsung Medical Center cohort patients were previously processed 21. Shown below is a summary table of data used in the analysis.

Table: The number of patients used in the longitudinal analysis
Cohort      CNV    RNASEQ    WGS
MDACC       17     9         7
SMC/SNUH    14     0         0
TCGA GBM    13     6         10
TCGA LGG    14     14        13
Total       58     29        30
Note: Patients do not necessarily have both RNAseq and WGS.

Predicting double minute candidates
After visualizing segmented copy numbers in the Integrative Genomics Viewer (IGV) 22, we manually scrutinized potential extrachromosomal DNA candidate regions by searching for complex patterns of copy number amplifications. In cases where structural variations and gene fusions were available, we projected those variation breakpoints onto the copy number IGV view plots to obtain additional evidence for the presence of potential extrachromosomal DNAs.

Statistical Analysis
We conducted all computations with R 3.0.13 and used standard statistical tests as appropriate.

More explanation on CAPZA2-MET fusion transcripts
Chimeric RNA fusions have been previously reported in GBM 23-25 and may be therapeutically targetable, in particular when involving receptor tyrosine kinases 26,27. We performed RNA sequencing and detected fusion transcripts in all samples except for a single neurosphere line (HF3203) with disqualifying quality control values 7. From this unbiased screen, multiple fusions joining the CAPZA2 coding start with the 5’ UTR of MET were identified in the primary tumors of HF3035, HF3077 and HF3055 (Fig. 3b). Additional CAPZA2-MET variants resulted in an in-frame transcript consisting of CAPZA2 exon 1 and MET starting from exon 3 (HF3035, HF3077) and exon 6 (HF3035). The CAPZA2-MET fusions were associated with outlier gene expression of MET, while CAPZA2 expression was comparable between samples with and without CAPZA2-MET fusions (Supplementary Fig. 7a). The presence of multiple parallel fusion transcripts suggested complex chromosomal rearrangements, which were associated with focal amplification of a 200 kb area on 7q31 (Fig. 3b). Amplification of the 7q31 genomic area carrying the adjacent CAPZA2 and MET genes has been previously reported in glioma 28. To assess the frequency of MET-activating somatic alterations in glioblastoma, we analyzed the DNA copy number profiles of 486 TCGA IDH wildtype glioblastoma samples. A focal amplification of the MET locus, ranging in size from 150 kb to 5.1 Mb and associated with a highly significant increase in expression relative to samples with broad 7q amplification or diploid MET copy number, was identified in ten cases (2.1%) (Supplementary Fig. 7b). RNA sequencing data was available for one of the ten TCGA cases and no fusions involving MET were detected in those samples. CAPZA2-MET fusions have been infrequently reported in other cancers 29,30. Clinical response of a glioblastoma carrying MET amplification to the MET and ALK inhibiting agent crizotinib has been recorded 31.
In spite of convincing evidence supporting fusion events in the GBM samples from HF3035, HF3055 and HF3077, no sequencing reads manifesting the presence of CAPZA2-MET fusion transcripts or the focal 7q31 genomic amplification were identified in the HF3055 and HF3077 neurospheres, and only weak support was found in the HF3035 neurosphere. However, identical CAPZA2-MET fusions and 7q31 DNA amplifications resurfaced at high frequency in all xenografts derived from the HF3035 and HF3077 neurospheres, with identical breakpoints (Supplementary Fig. 3b). None of the HF3055 xenografts carried CAPZA2-MET fusions or 7q31 amplification, in line with the absence of focal 7q31 amplification in the primary HF3055 tumor. To exclude the possibility that the CAPZA2-MET fusion events were artifacts resulting from sequencing, we validated the event in all samples from HF3035 using RT-PCR, which confirmed both wildtype MET and CAPZA2-MET mRNA in the tumor and PDX but not in the neurosphere (Supplementary Fig. 3a). MET protein was abundantly present in the HF3035 and HF3077 tumors as measured using immunohistochemistry, undetectable in the neurospheres, and re-expressed in the PDX (Supplementary Fig. 3c).

SUPPLEMENTARY REFERENCES
1. Hasselbach, L.A. et al. Optimization of High Grade Glioma Cell Culture from Surgical Specimens for Use in Clinically Relevant Animal Models and 3D Immunochemistry. J Vis Exp 83, e51088 (2014).
2. deCarvalho, A.C. et al. Gliosarcoma stem cells undergo glial and mesenchymal differentiation in vivo. Stem Cells 28, 181-90 (2010).
3. Berezovsky, A.D. et al. Sox2 promotes malignancy in glioblastoma by regulating plasticity and astrocytic differentiation. Neoplasia 16, 193-206 e25 (2014).
4. Graveel, C. et al. Activating Met mutations produce unique tumor profiles in mice with selective duplication of the mutant allele. Proc Natl Acad Sci U S A 101, 17198-203 (2004).
5. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-60 (2009).
6. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297-303 (2010).
7. Torres-Garcia, W. et al. PRADA: pipeline for RNA sequencing data analysis. Bioinformatics 30, 2224-6 (2014).
8. Berlin, K. et al. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol 33, 623-30 (2015).
9. Delcher, A.L. et al. Alignment of whole genomes. Nucleic Acids Res 27, 2369-76 (1999).
10. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. Basic local alignment search tool. J Mol Biol 215, 403-10 (1990).
11. Chiang, C. et al. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Methods 12, 966-8 (2015).
12. Brennan, C.W. et al. The somatic genomic landscape of glioblastoma. Cell 155, 462-77 (2013).
13. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 31, 213-9 (2013).
14. Ye, K., Schulz, M.H., Long, Q., Apweiler, R. & Ning, Z. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25, 2865-71 (2009).
15. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38, e164 (2010).
16. Roth, A. et al. PyClone: statistical inference of clonal population structure in cancer. Nat Methods 11, 396-8 (2014).
17. Favero, F. et al. Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data. Ann Oncol 26, 64-70 (2015).
18. Conway, T. et al. Xenome--a tool for classifying reads from xenograft samples. Bioinformatics 28, i172-8 (2012).
19. Xi, R. et al. Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion. Proc Natl Acad Sci U S A 108, E1128-36 (2011).
20. Kim, H. et al. Whole-genome and multisector exome sequencing of primary and post-treatment glioblastoma reveals patterns of tumor evolution. Genome Res 25, 316-27 (2015).
21. Kim, J. et al. Spatiotemporal Evolution of the Primary Glioblastoma Genome. Cancer Cell 28, 318-28 (2015).
22. Robinson, J.T. et al. Integrative genomics viewer. Nat Biotechnol 29, 24-6 (2011).
23. Singh, D. et al. Transforming fusions of FGFR and TACC genes in human glioblastoma. Science 337, 1231-5 (2012).
24. Zheng, S. et al. A survey of intragenic breakpoints in glioblastoma identifies a distinct subset associated with poor survival. Genes Dev 27, 1462-72 (2013).
25. Bao, Z.S. et al. RNA-seq of 272 gliomas revealed a novel, recurrent PTPRZ1-MET fusion transcript in secondary glioblastomas. Genome Res 24, 1765-73 (2014).
26. Yoshihara, K. et al. The landscape and therapeutic relevance of cancer-associated transcript fusions. Oncogene (2014).
27. Mertens, F., Johansson, B., Fioretos, T. & Mitelman, F. The emerging complexity of gene fusions in cancer. Nat Rev Cancer 15, 371-81 (2015).
28. Mueller, H.W. et al. Identification of an amplified gene cluster in glioma including two novel amplified genes isolated by exon trapping. Hum Genet 101, 190-7 (1997).
29. Kim, H.P. et al. Novel fusion transcripts in human gastric cancer revealed by transcriptome analysis. Oncogene 33, 5434-41 (2014).
30. Yoshihara, K. et al. The landscape and therapeutic relevance of cancer-associated transcript fusions. Oncogene 34, 4845-54 (2015).
31. Chi, A.S. et al. Rapid radiographic and clinical improvement after treatment of a MET-amplified recurrent glioblastoma with a mesenchymal-epithelial transition inhibitor. J Clin Oncol 30, e30-3 (2012).
