No evidence for viral sequences in lepidic
Supplemental Methods

Preparation of nucleic acid (DNA/RNA) and sequencing.
Ten cryostat sections (20 µm) of each tumor sample were submitted to nucleic acid extraction. The presence of at least 50% tumor cells was verified under a light microscope using standard HES staining. The sections were first ground with the cell lyser, and the nucleic acids were extracted with the RNeasy Micro Kit according to the manufacturer’s instructions (Qiagen GmbH, Hilden, Germany). The extracted RNA was treated with TurboDNase (Invitrogen Inc., Carlsbad, CA) and then reverse-transcribed into cDNA using SuperScript III reverse transcriptase (Invitrogen Inc., Carlsbad, CA) and random hexamer primers. The cDNA was amplified with Phi29 polymerase as previously described [16]. The extracted DNAs were sequenced directly without prior amplification, whereas the extracted RNAs were amplified after a depletion step intended to remove as much ribosomal RNA as possible. Illumina sequencing on a HiSeq 2000 was conducted with a mean depth per sample of 1.5×10^8 single reads of 100 nucleotides (nt). For the ATLL sample, the DNA and RNA runs produced 1.54×10^8 and 1.71×10^8 reads, respectively. For the PEL sample, the DNA and RNA runs produced 1.31×10^8 and 1.44×10^8 reads, respectively, whereas a mean depth per AIS sample of 1.8×10^8 (range 1.5–1.9×10^8) single reads of 100 nt was obtained.

Pilot Study.
The quality of the reads was assessed by FastQC (web site: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), then the sequences were selected and trimmed according to their quality scores with cutadapt (web site: https://code.google.com/p/cutadapt/). Human sequences were filtered out by mapping the reads to the Homo sapiens hg19 reference with bowtie2 using the "sensitive" flag option [17]. For the RNA runs, the percentage of reads mapped to rRNA was 8.6% for the HTLV1 sample and 25% for the HHV8 sample. Bowtie2 was used on the remaining reads to search for their similarity with the two reference genomes (HTLV1 Acc J02029 and HHV8 Acc AF148805), then the reads were assembled with SPAdes [18] to produce 4 sets of contigs.

Patient study.
The quality of the reads was assessed by FastQC, then the sequences were selected and trimmed according to their quality scores with cutadapt. Human sequences were filtered out by mapping the reads to the Homo sapiens hg19 reference with bowtie2 using the "sensitive" flag option. This host filtering step eliminated an average of 99.5% of reads per sample (range 99.4-99.7%). The remaining reads of the two control samples were assembled with SPAdes [18] to produce a set of control-contigs (43714 contigs with 5.7×10^6 bases). For the five other samples, the remaining sequences were further filtered by mapping against the control-contigs using bowtie2 ("sensitive" parameters). This second filtering step eliminated an average of 76% of reads per sample (range 61-91%).
At the end of the filtering steps, each sample consisted of a set of sequences containing between 0.3×10^5 and 2.2×10^5 reads. These groups of reads were assembled both individually and mixed together with the SPAdes assembler [18]. The mixed assembly produced by SPAdes was composed of 596 contigs with an average length of 312 bases (maximum length = 2982 bases). For all contigs and singlets, taxonomic assignment was attempted by similarity search against sequences of the EMBL database (STD section) using BLASTN [19] and against sequences of the UniProt database using BLASTX, with an E-value cutoff of 1E-3. All the possible assignments were counted with a weight equivalent to the number of reads of each contig. The results were sorted by homemade software (taxoptimizer/rankoptimizer) and explored using the Krona visualization system [20].

Supplemental figures.

Figure S1. Results of the control processes for the RNA library from the HHV8 sample (pilot study). A: Annotation of the HHV8 genome (Acc AF148805). B1: Bowtie2 mapping of the RNA library; reads mapped in sense are drawn in red, reads mapped in antisense are drawn in green. The most expressed genes are noted (mainly K proteins). B2: zoom of the same mapping. C: blastn mapping of the contigs built with the RNA library.

Figure S2. Results of the control processes for the DNA library from the HHV8 sample (pilot study). A: Annotation of the HHV8 genome (Acc AF148805). B: Bowtie2 mapping of the DNA library; reads mapped in sense are drawn in red, reads mapped in antisense are drawn in green. C: blastn mapping of the contigs built with the DNA library.

Figure S3. Results of the control processes for the RNA library from the HTLV1 sample (pilot study). A: Annotation of the HTLV1 genome (Acc J02029). B: Bowtie2 mapping of the RNA library; reads mapped in sense are drawn in red, reads mapped in antisense are drawn in green. C: blastn mapping of the contigs built with the RNA library.

Figure S4. Results of the control processes for the DNA library from the HTLV1 sample (pilot study). A: Annotation of the HTLV1 genome (Acc J02029). B: Bowtie2 mapping of the DNA library; reads mapped in sense are drawn in red, reads mapped in antisense are drawn in green. C: blastn mapping of the contigs built with the DNA library.

Figure S5. Krona representation of the taxonomic incidence from the RNA library of the HHV8 sample.

Figure S6. Krona representation of the taxonomic incidence from the DNA library of the HHV8 sample.

Figure S7. Krona representation of the taxonomic incidence from the RNA library of the HTLV1 sample.

Figure S8. Krona representation of the taxonomic incidence for the branch "Viruses" from the RNA library of the HTLV1 sample.

Figure S9. Krona representation of the taxonomic incidence from the DNA library of the HTLV1 sample.

Figure S10. Krona representation of the taxonomic incidence for the branch "Viruses" from the DNA library of the HTLV1 sample.

Figure S11. Krona representation of the taxonomic incidence from the global assembly of the 5 BAC samples.
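
The Krona charts listed above rest on the read-weighted counting described in the Patient study section, where every possible assignment of a contig is counted with a weight equal to the number of reads in that contig. The Python sketch below illustrates that tally only. It is not the authors' taxoptimizer/rankoptimizer code: the function name `weighted_taxon_counts`, the contig identifiers, the read counts and the taxon labels are all hypothetical, and it assumes the candidate assignments were already filtered by E-value.

```python
from collections import Counter


def weighted_taxon_counts(assignments, reads_per_contig):
    """Tally taxa, weighting each assignment by its contig's read count.

    assignments: dict mapping contig id -> list of candidate taxon labels
    reads_per_contig: dict mapping contig id -> number of reads in the contig
    A singlet is treated as a contig built from a single read.
    """
    counts = Counter()
    for contig, taxa in assignments.items():
        # Assumption: a contig with no recorded read count weighs one read.
        weight = reads_per_contig.get(contig, 1)
        # Count each taxon at most once per contig, even if listed twice.
        for taxon in set(taxa):
            counts[taxon] += weight
    return counts


# Hypothetical toy data, for illustration only (not from the study).
assignments = {
    "contig_1": ["Homo sapiens"],
    "contig_2": ["Bacteria", "Homo sapiens"],
    "singlet_7": ["Bacteria"],
}
reads_per_contig = {"contig_1": 120, "contig_2": 30, "singlet_7": 1}

counts = weighted_taxon_counts(assignments, reads_per_contig)
for taxon, n_reads in counts.most_common():
    print(taxon, n_reads)
```

With this weighting, a large contig assembled from many reads contributes more to a chart sector than a singlet does, so sector sizes track read abundance rather than the number of contigs.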