Supplementary methods

Variant calling and annotation pipelines

For tumors where no matching germline DNA was available, single-nucleotide variants (SNVs) and indels were identified using the intersection of calls from GATK UnifiedGenotyper (1) and VarScan (2). For tumor–normal paired samples, variants were identified through the intersection of calls from at least two variant callers, including VarScan, MuTect (3) and IndelGenotyper (1). For non-paired samples, variants were filtered out if present in the Exome Variant Server, the 1000 Genomes database (4) or any of 60 unrelated normal samples run through the same targeted sequencing panel. Variants overlapping repetitive sequences or covered at less than 10X in any of the tumors or available matching germline DNA were excluded. Variants were annotated with information from Ensembl Release 58 (5), using the Ensembl Perl API including the Variant Effect Predictor (6), and for their occurrence in COSMIC (7). The functional effect of missense variants was evaluated using the in silico prediction tools SIFT (8) and PolyPhen-2 (9).

Merkel quantification and insertion site detection

Targeted-capture data were used to determine the viral integration site in each sample. Raw sequencing reads were aligned to the hg19 reference genome, which included the MCV sequence. To detect the insertion site of the virus, Socrates was used to predict fusions on the aligned capture data (10). Reads captured for MCV that overlapped the breakpoint were used to predict the integration sites. The Socrates output was then filtered for fusions involving MCV to identify junctions between the virus and other parts of the genome.

Whole genome copy number variation calling

Copy-number calling was performed on the low-coverage whole genome sequencing data (aligned to hg19 with BWA) using Control-FREEC (version 6.7) (11). The tumors were analyzed without control samples and the genome was assessed in 50 kbp bins.
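To make the 50 kbp binning concrete, the following Python sketch assigns read start positions to fixed-width bins and expresses each bin's read count relative to the median bin. It only illustrates read-depth binning in general and is not Control-FREEC's algorithm, which applies further normalization and segmentation. The function names `bin_read_counts` and `depth_ratios` and the toy read positions are hypothetical.

```python
from collections import Counter
from statistics import median

BIN_SIZE = 50_000  # 50 kbp bins, as stated in the text


def bin_read_counts(positions, bin_size=BIN_SIZE):
    """Count reads per fixed-width bin from 0-based read start positions.

    Bins that contain no reads are simply absent from the result.
    """
    return Counter(pos // bin_size for pos in positions)


def depth_ratios(counts):
    """Express each bin's read count relative to the median bin count."""
    mid = median(counts.values())
    return {b: n / mid for b, n in sorted(counts.items())}


# Toy read start positions on one chromosome (hypothetical data).
reads = [10, 49_999, 50_000, 75_000, 120_000,
         149_999, 150_001, 160_000, 170_000, 199_999]

counts = bin_read_counts(reads)
ratios = depth_ratios(counts)

print(dict(counts))
print(ratios)
```

In this toy input the fourth bin holds twice as many reads as the median bin, which is the kind of relative depth difference a read-depth caller interprets as a candidate gain.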
For data cleaning, telomeres, centromeres and the rDNA repeat-rich short arms of the acrocentric chromosomes 13, 14, 15, 21 and 22 were excluded from the analysis. The data were then filtered for the ENCODE Project’s blacklisted regions of hg19 (12). Finally, all bins that showed read depth aberrations of the same type (loss/gain) in at least 50% of the samples were removed as suspected technical artifacts. The mutational burden of the samples was assessed by counting the number of distinct copy number calls for each sample as well as the proportion of the genome affected by these changes (Supplementary Table S8). A list of all copy number intervals for genes represented on the panel is shown in Supplementary Table S9 (filtered for recurrent events). Focal copy number profiles proximal to viral integration can be reviewed in Supplementary Figure S1 (copy number plots with the Merkel insertion site highlighted). Cumulative gains and losses are displayed in Fig. 1D (filtered for intervals of at least 250 kbp).

TERT promoter, PIK3CA, HRAS and AKT1 mutation detection

To screen for mutations in the TERT promoter, samples were analyzed using an amplicon-based deep sequencing approach. Screening was restricted to the TERT promoter positions 124 and 146 bp upstream of the TERT translational start site (13). The primer sequences for the TERT promoter were:

TERT_Forward: 5’ACACTGACGACATGGTTCTACACAGCGCTGCCTGAAACTCG-3’
TERT_Reverse: 5’TACGGTAGCAGAGACTTGGTCTCGTCCTGCCCCTTCACCTTC-3’

The reaction mixture, prepared with the FastStart High Fidelity PCR System (Roche Diagnostics), included 1x PCR buffer, 4.5 mM MgCl2, 250 nM of each primer, 200 µM of dNTPs, 5% DMSO, 0.5 U of FastStart High Fidelity Enzyme, 2 µl of DNA (>1 ng/µl) and PCR-grade water in a total volume of 10 µl.
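As a worked example of the dilution arithmetic behind a 10 µl reaction such as the one above, the sketch below applies C1·V1 = C2·V2 to each component. The final concentrations are taken from the text, but every stock concentration is a hypothetical placeholder chosen for illustration, and the calculation ignores any MgCl2 that the PCR buffer may already supply.

```python
TOTAL_UL = 10.0  # total reaction volume, from the text

# name: (final concentration, ASSUMED stock concentration), same units.
# Final values come from the text; stock values are hypothetical.
components = {
    "PCR buffer (x)":      (1.0, 10.0),
    "MgCl2 (mM)":          (4.5, 25.0),
    "forward primer (nM)": (250.0, 10_000.0),
    "reverse primer (nM)": (250.0, 10_000.0),
    "dNTPs (uM)":          (200.0, 10_000.0),
    "DMSO (%)":            (5.0, 100.0),
}


def stock_volume(final_conc, stock_conc, total_ul=TOTAL_UL):
    """Volume of stock to pipette, from C1 * V1 = C2 * V2."""
    return final_conc * total_ul / stock_conc


volumes = {name: stock_volume(f, s) for name, (f, s) in components.items()}
volumes["enzyme"] = 0.5 / 5.0  # 0.5 U from an ASSUMED 5 U/ul stock
volumes["DNA"] = 2.0           # 2 ul of template, from the text
# Water makes up whatever remains of the total volume.
volumes["water"] = TOTAL_UL - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name:22s} {vol:5.2f} ul")
```

In practice these per-reaction volumes would be multiplied by the number of reactions (plus some excess) to prepare a master mix; the stock values should be replaced with those of the reagents actually used.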
PCR was run on a Veriti 96-well Thermal Cycler (Applied Biosystems) using the 48.48 Access Array standard protocol established by Fluidigm [Access Array™ System for Illumina Sequencing Platform User Guide (PN 100-3770)]. After barcoding of the PCR products with Fluidigm barcodes, the products were pooled and run on an Illumina MiSeq according to the manufacturer’s instructions (v2 150-bp kit). The same approach was also used to validate mutations in exons 2 and 3 of HRAS, exons 10 and 21 of PIK3CA and exon 3 of AKT1 using the following primers:

HRAS_EX2_Forward: 5’ACACTGACGACATGGTTCTACAAGGAGACCCTGTAGGAGGAC-3’
HRAS_EX2_Reverse: 5’TACGGTAGCAGAGACTTGGTCTCTCTATAGTGGGGTCGTATTCG-3’
HRAS_EX3_Forward: 5’ACACTGACGACATGGTTCTACACCTACCGGAAGCAGGTGGTC-3’
HRAS_EX3_Reverse: 5’TACGGTAGCAGAGACTTGGTCTTTCACCTGTACTGGTGGATGTCC-3’
PIK3CA_EX10_Forward: 5’ACACTGACGACATGGTTCTACAGTAACAGACTAGCTAGAGAC-3’
PIK3CA_EX10_Reverse: 5’TACGGTAGCAGAGACTTGGTCTAATCTCCATTTTAGCACTTACCTGTGAC-3’
PIK3CA_EX21_Forward: 5’ACACTGACGACATGGTTCTACAGATGACATTGCATACATTCG-3’
PIK3CA_EX21_Reverse: 5’TACGGTAGCAGAGACTTGGTCTTTGTGTGGAAGATCCAATCC-3’
AKT1_EX3_Forward: 5’ACACTGACGACATGGTTCTACACCCAAATCTGAATCCCGAGAG-3’
AKT1_EX3_Reverse: 5’TACGGTAGCAGAGACTTGGTCTAGTGTGCGTGGCTCTCAC-3’

Sanger sequencing

Mutations in TP53 and CDKN2A were validated using high-resolution melt analysis and Sanger sequencing following previously published protocols (14,15). Other mutations shown in Fig. 2 were validated using Sanger sequencing (see Supplementary Table S5 for primer sequences). When possible, matching normal DNA was run in parallel. PCR amplification was conducted on a LightCycler 480 (Roche Diagnostics). The reaction mixture included 1x PCR buffer, 2.5 μM MgCl2, 200 nM of each primer, 200 μM of dNTPs, 5 μM of SYTO 9 (Invitrogen), 0.5 U of HotStarTaq polymerase (Qiagen), 10 ng DNA and PCR-grade water in a total volume of 10 μl.
PCR conditions included an activation step of 15 min at 95°C followed by 55 cycles of 95°C for 10 sec, annealing for 10 sec (comprising 10 cycles of a touchdown from 65°C to 55°C at 1°C/cycle followed by 35 cycles at 55°C), and extension at 72°C for 30 sec. All analyses were performed in duplicate. Following PCR amplification, a 1:10 dilution of the PCR product was sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems). The sequencing products were purified with Agencourt CleanSEQ beads (Beckman Coulter), followed by capillary electrophoresis on an ABI 3730 DNA sequencing instrument (Applied Biosystems). Data analysis was conducted with Sequencher software, version 4.6 (Gene Codes).

Cell culture and drug sensitivity analysis

MCC13, MCC14 and MCC26 cells were grown in RPMI supplemented with 10% FCS. Drug stock solutions were prepared as recommended. The MEK inhibitors PD325901 and AZD6244 were purchased from Selleck Chemicals. For drug sensitivity curves, cells were treated with increasing concentrations of drug over 72 h in a 96-well plate format. After fixing and staining of the cells with sulforhodamine B, absorbance was read on a SpectraMax plate reader. Drug curves were determined relative to vehicle-treated control cells using GraphPad Prism version 6.01 software. Exact GI50 values are detailed in Supplementary Table S11.

References

1. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome research 2010;20(9):1297-303.
2. Koboldt DC, Chen K, Wylie T, Larson DE, McLellan MD, Mardis ER, et al. VarScan: variant detection in massively parallel sequencing of individual and pooled samples. Bioinformatics 2009;25(17):2283-5.
3. Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nature biotechnology 2013;31(3):213-9.
4. 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, et al. An integrated map of genetic variation from 1,092 human genomes. Nature 2012;491(7422):56-65.
5. Flicek P, Ahmed I, Amode MR, Barrell D, Beal K, Brent S, et al. Ensembl 2013. Nucleic acids research 2013;41(Database issue):D48-55.
6. McLaren W, Pritchard B, Rios D, Chen Y, Flicek P, Cunningham F. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 2010;26(16):2069-70.
7. Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic acids research 2011;39(Database issue):D945-50.
8. Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome research 2001;11(5):863-74.
9. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nature methods 2010;7(4):248-9.
10. Schroder J, Hsu A, Boyle SE, Macintyre G, Cmero M, Tothill RW, et al. Socrates: identification of genomic rearrangements in tumour genomes by re-aligning soft clipped reads. Bioinformatics 2014.
11. Boeva V, Popova T, Bleakley K, Chiche P, Cappo J, Schleiermacher G, et al. Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics 2012;28(3):423-5.
12. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 2012;489(7414):57-74.
13. Xie H, Liu T, Wang N, Bjornhagen V, Hoog A, Larsson C, et al. TERT promoter mutations and gene amplification: promoting TERT expression in Merkel cell carcinoma. Oncotarget 2014;5(20):10048-57.
14. Mitchell G, Ballinger ML, Wong S, Hewitt C, James P, Young MA, et al. High frequency of germline TP53 mutations in a prospective adult-onset sarcoma cohort. PLoS One 2013;8(7):e69026.
15. Lim AM, Do H, Young RJ, Wong SQ, Angel C, Collins M, et al. Differential mechanisms of CDKN2A (p16) alteration in oral tongue squamous cell carcinomas and correlation with patient outcome. International journal of cancer 2014;135(4):887-95.