* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Design considerations for highly specific and efficient
Gene regulatory network wikipedia , lookup
RNA interference wikipedia , lookup
Gene expression profiling wikipedia , lookup
Transcriptional regulation wikipedia , lookup
Synthetic biology wikipedia , lookup
Non-coding RNA wikipedia , lookup
Promoter (genetics) wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
RNA silencing wikipedia , lookup
Molecular evolution wikipedia , lookup
Nucleic acid analogue wikipedia , lookup
SNP genotyping wikipedia , lookup
Endogenous retrovirus wikipedia , lookup
Non-coding DNA wikipedia , lookup
Gene expression wikipedia , lookup
Real-time polymerase chain reaction wikipedia , lookup
Silencer (genetics) wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Secreted frizzled-related protein 1 wikipedia , lookup
Deoxyribozyme wikipedia , lookup
Zinc finger nuclease wikipedia , lookup
Community fingerprinting wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Genome evolution wikipedia , lookup
GE Healthcare Design considerations for highly specific and efficient synthetic crRNA molecules Anja van Brabant Smith, Emily M. Anderson, Shawn McClelland, Elena Maksimova, Tyler Reed, Steve Lenger, Žaklina Strezoska, Hidevaldo Machado Dharmacon, part of GE Healthcare, 2650 Crescent Drive, Suite #100, Lafayette, CO 80026, US Introduction Off-target analysis should include gaps In order to understand the parameters affecting CRISPR-Cas9 gene editing efficiency, we systematically transfected synthetic CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA) reagents targeting components of the proteasome into a reporter cell line in which knockout of proteasome function results in fluorescence of a ubiquitin-EGFP fusion protein that is normally degraded by the proteasome pathway. We evaluated the functionality of > 1100 crRNA sequences in this system; using these data, we developed and trained an algorithm to score crRNAs based on how likely they are to produce functional knockout of targeted genes. We further tested our algorithm by designing synthetic crRNAs to genes unrelated to the proteasome and examined their ability to knock out gene function using additional phenotypic assays. To augment our functionality algorithm, we developed an optimized alignment program to perform rapid, flexible, and complete specificity analysis of crRNAs, including detection of gapped alignments. We have combined this comprehensive specificity check with our functionality algorithm to select and score highly specific and functional crRNAs for any given gene target. DMSO-treated cells B MG132-treated cells Transfect synthetic crRNA:tracrRNA Cas9 expression plasmid C Cas9:crRNA:tracrRNA-treated PPIB negative control PSMD7 siRNA-treated PSMD14 Figure 5. Potential off-target sites in the genome for any given crRNA include not just mismatches but gaps as well. Gaps can exist in the crRNA strand or in the DNA target strand. Many commonly used web-based crRNA specificity tools do not fully account for gaps when performing alignments. Gap in DNA strand NNNNNN N NN NNNNmNNNNNNN-NNNNNNN |||| ||| ||| ||||||| NNNNNNNN-NNNNNNNNNNN N NN NNNNNNNN DNA RNA Gap in RNA strand Complete off-target analysis is important for crRNA specificity Functional assay for crRNAs A Flaws m = mismatch - = gap VCP VCP siGENOME pool CRISPR target ‘A’ Figure 1. Recombinant U2OS cell line stably expressing a mutant human ubiquitin fused to EGFP. A Gly76Val mutation results in an uncleavable ubiquitin moiety fused to EGFP, which results in constitutive degradation of the protein and little detectable EGFP fluorescence. (A) Inhibition of proteasome activity with chemical treatment (MG132) results in EGFP fluorescence. (B) Schematic of a CRISPR-Cas9 experiment using synthetic crRNA and tracrRNA reagents. (C) Inhibition of proteasome activity with CRISPR-Cas9 or siRNAs targeting proteasome components results in EGFP fluorescence. P A M CRISPR ‘A’ incorrectly rated as GOOD Specificity CRISPR ‘A’ correctly rated as POOR Specificity Incomplete alignment to the genome Intended target ALL alignments to the genome P A M +++ P A M + P A M P A M + P A M P A M --- P A M - P A M + P A M ++ P A M ++ P A M 6 5 P A M Intended target Controls Exon 1 4 Exon 2 mismatch in alignment gap in alignment Exon 3 3 Exon 4 Exon 5 2 Exon 6 Exon 7 1 0 0 6 6 5 5 100 200 300 400 500 600 700 800 Likelihood of cleavage crRNA functionality is position- and sequence-dependent Controls Exon 1 4 4 Exon 8 Exon 2 Exon 3 3 Exon 4 Exon 5 2 Exon 6 Exon 7 1 0 GFP signal, fold over negative control 0 6 5 100 200 300 400 500 600 700 Figure 6. Schematic demonstrating the effect of incomplete alignment during off-target analysis for a crRNA. Exon 9 3 Exon 10 Exon 11 2 Exon 12 1 crRNA: - + - + - + - + 0 800 800 6 6 5 5 900 1000 1100 1200 1300 1400 1500 Target/Off-target Sequence PAM Intended target GGTCATCTGGGAGAAAAGCG TGG OT1 GGTCCTCTGGGAGAAAAGACG CAG OT2 GGT-ATCTGGGAGAAAAGCA TGG OT3 GGTC-TCTGGGAGAAAAG-G AAG Controls Exon 1 4 4 Exon 8 Exon 2 Exon 3 3 Exon 9 3 Exon 10 Exon 4 Exon 5 2 Exon 6 Exon 7 1 0 100 200 300 400 500 600 700 Exon 13 Exon 14 3 Exon 15 Exon 11 2 Exon 12 1 800 800 6 6 5 5 Exon 16 2 Exon 17 1 0 0 4 900 1000 1100 1200 1300 1400 0 1500 1500 1600 1700 1800 1900 2000 2100 2200 2300 2400 2500 Exon 8 Exon 9 3 Exon 10 Exon 11 2 Exon 12 4 Exon 13 Exon 14 3 Exon 15 Exon 16 2 Exon 17 1 1 0 1500 0 800 900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000 2100 2200 2300 2400 37 Target cut site distance from CDS start in transcript (nts) Figure 2. Ubiquitin[G76V]-EGFP U2OS cells stably expressing Cas9 were transfected with 266 synthetic crRNA:tracrRNA complexes targeting the coding region of the VCP gene. EGFP fluorescence was measured 72 hours post- transfection; an increase in EGFP fluorescence indicates functional knockout of the VCP gene resulting in disruption of proteasome function. crRNAs in different exons are indicated by the different colors. The data indicate that crRNAs vary in their ability to cause functional gene disruption. 4 Indel% 2500 6 0 16 OT1 0 OT2 OT3 Figure 7. Ubi[G76V]-EGFP U2OS cells stably expressing Cas9 were transfected with 25 nM synthetic crRNA:tracrRNA and genomic DNA was isolated 72 hours after transfection. Potential off-target (OT) sites containing mismatches as well as gaps in the RNA or the DNA strand were analyzed for off-target cleavage with a mismatch detection assay (using T7EI endonuclease). Red nucleotides indicate mismatches, red dashes indicate gaps, and underlined red nucleotides indicate insertions relative to the target DNA sequence. The off-targets were identified using the Dharmacon specificity tool and are not identified by other common web-based crRNA specificity tools. 5 4 Exon 13 Exon 14 3 Exon 15 Exon 16 The optimal crRNAs balance functionality and specificity Machine learning for algorithm development 2 Exon 17 1 0 1500 1600 1700 1800 1900 2000 2100 2200 2300 2400 2500 Figure 3. A training set consisting of 10 genes and 1115 crRNA target sites was used to train our functionality algorithm. Features examined include nucleotide composition, nearest neighbor effects, PAM sequence, position in exons, and distance from the start codon. Receiver Operating Characteristic (ROC) shows good fit of training set data. The ROC measures the area under the curve of True Positive Rate vs False Positive Rate. The ROC for our test set data is 0.78. Gene Exon 1 Exon 2 Exon 3 Exon 4 CDS region of exon Exon 5 Transcript A Exon 1 Exon 2 Exon 3 Transcript B CRISPR ‘A’ CRISPR ‘B’ Specificity Rating Functionality Rating CRISPR targets POOR AVERAGE GOOD CRISPR target ‘A’ – May function, but poor specificity CRISPR target ‘B’ – OPTIMAL PICK, scores well in both functionality & specificity Functionality of algorithm-designed crRNAs in other assays Normalized ApoONE assay BCL2L1 PLK1 Conclusions • crRNAs vary in specificity and in efficiency for creating functional gene knockouts WEE1 3 3 3 2.5 2.5 2.5 2 2 2 1.5 1.5 1.5 1 1 1 0.5 0.5 0.5 Figure 8. Schematic of how any given crRNA may differ with regard to its specificity and functionality. Our algorithm balances these two attributes to pick the crRNAs predicted to have the highest specificity and functionality. • We used a high-throughput fluorescence assay to develop and train an algorithm to score crRNAs for likelihood of producing functional gene knockouts • We developed an optimized alignment program to perform rapid and complete specificity analysis of crRNAs 0 0 H1 H2 • We demonstrate that targets with gaps in either the RNA or DNA strand can be cleaved and are therefore important to identify during specificity checking 0 H1 H2 H1 H2 crRNAs Figure 4. Box plot representation of the functionality of crRNAs targeting BCL2L1, PLK1 or WEE1 as determined by the ApoONE homogeneous assay (Promega) 48 hours after transfection. For the box plots, crRNAs were divided into bottom half (H1) and top half (H2) based on their functionality score. The medians as well as the distribution of data between the lower and upper quartile demonstrate that high-scoring crRNAs have increased functionality. gelifesciences.com/dharmacon GE, imagination at work and GE monogram are trademarks of General Electric Company. Dharmacon is a trademark of GE Healthcare companies. All other trademarks are the property of General Electric Company or one of its subsidiaries. ©2015 General Electric Company—All rights reserved. Version published July 2015. GE Healthcare UK Limited, Amersham Place, Little Chalfont, Buckinghamshire, HP7 9NA, UK