This post was contributed by guest blogger Søren Hough, the Head Science Writer at Desktop Genetics.
One of the most important steps in the CRISPR experimental process is validating edits. Regardless of which CRISPR genome editing system you use, there remains a chance that the observed phenotype was caused by an off-target mutation and not an edit in the target gene.
The validation process, also known as CRISPR genotyping, is critical to demonstrating causal relationships between genotype and assayed phenotype. Verifying these connections can help alleviate the reproducibility crisis in biology. It is key to address these concerns as CRISPR use grows across the life sciences and to establish standardized validation techniques for academia, industry, and especially the clinic.
Popular validation assays are insufficient
As discussed in CRISPR 101: Validating Your Genome Edit, there are a variety of options for CRISPR genotyping. The most common options are mismatch cleavage assays, such as Surveyor™ and T7E1, and Sanger sequencing. However, recent studies suggest that neither mismatch cleavage assays nor Sanger sequencing may be adequate standards for validating edits.
Mismatch cleavage assays rely on pairing between the edited strand and wild-type strand of the host DNA. When these strands hybridize, the nuclease can detect strands with mismatches and cleave them. The results are then visualized using gel electrophoresis.
Surveyor™ and T7E1 have been widely adopted due to their relative simplicity and low cost. The problem with these assays is that they do not provide sequence-level data. They also have a limit of detection of ~5%. This means they do not reliably detect editing events that occur in less than 5% of the population (Fu et al. 2013, Vouillot et al. 2015).
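To make the gel readout concrete, here is a minimal sketch of a commonly used way to turn band intensities into an estimated indel frequency. It is not taken from the studies cited above. It assumes the PCR product re-anneals at random and that every duplex containing at least one edited strand is cleaved, so the uncleaved fraction is approximately (1 - p)^2 for an indel frequency p. The function name and the example intensities are hypothetical.

```python
import math


def estimate_indel_percent(uncut: float, cut_1: float, cut_2: float) -> float:
    """Estimate indel frequency (%) from gel band intensities.

    uncut  -- intensity of the full-length (uncleaved) band
    cut_1, cut_2 -- intensities of the two cleavage products

    Assumption: random re-annealing, so the uncleaved fraction is (1 - p)^2
    and p = 1 - sqrt(1 - fraction_cleaved).
    """
    total = uncut + cut_1 + cut_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cut_1 + cut_2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical densitometry values: 25% of the signal is in the cleaved bands.
print(round(estimate_indel_percent(75.0, 15.0, 10.0), 1))
```

Because the estimate depends on resolving faint cleavage bands against background, low-frequency edits are hard to quantify this way, which is consistent with the ~5% limit noted above.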
Meanwhile, Sanger sequencing is laborious and time-consuming, and it cannot be applied to heterogeneous populations (Bell et al. 2014). Further, Sanger sequencing has a lower detection limit of 20-50% (although this has been improved in some studies) (Davidson et al. 2012, Tsiatis et al. 2010). As the field moves toward standardized thresholds for validating CRISPR experiments, many are turning to next-generation sequencing options over older assays.
Biased sequencing methods
There are two primary methods of off-target detection: biased and unbiased. Biased techniques only sequence certain sites in the genome predicted to contain off-target cleavage events. Unbiased techniques search the whole genome for off-target sites irrespective of in silico prediction.
These techniques differ in important ways, but can also complement one another by providing both broad and specific details on genome sequencing. Used in concert, these approaches can provide the researcher with a reasonable level of certainty that the effects they see are not due to off-targets. This is a valuable step toward enhancing confidence in, and reproducibility of, a study’s findings.
Table 1: Biased Genotyping Options
|Technique||Sequence Information||Detection Limit||Advantages||Disadvantages|
|Mismatch Cleavage Assay||Not Provided||5%||Inexpensive, simple||Low-throughput, low sensitivity|
|Sanger Sequencing||Provided||20% (variable)||Sequence-level data||Low-throughput, inappropriate for heterogeneous populations, low sensitivity|
|Targeted Amplicon Sequencing||Provided||0.01% (variable)||Sequence-level data, extremely sensitive||Does not sequence all DSBs, may miss unpredicted off-target breaks|
Prediction algorithms: A good place to start for biased validation
At the moment, many software tools predict off-target effects of sgRNAs using computational methods. They identify possible off-target sites across the genome and pinpoint the location of mismatches based on the sequences of the genome and sgRNA. This is a good starting point for most researchers as it provides a list of putative off-target sites that they can later sequence for mutations.
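As a rough illustration of what such a tool does, the sketch below scans a sequence for protospacer-like sites and reports how many mismatches each one carries. It is a toy, not the algorithm of any particular tool: it assumes an SpCas9-style NGG PAM, searches only the forward strand, ignores bulges, and uses made-up sequences. The function names and the default mismatch cutoff are illustrative assumptions.

```python
from typing import List, Tuple


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(
    genome: str, protospacer: str, max_mismatches: int = 3
) -> List[Tuple[int, str, int]]:
    """Return (start, site + PAM, mismatch count) for forward-strand candidates.

    Assumptions: the PAM is NGG and lies immediately 3' of the site;
    the reverse strand and bulged alignments are not searched.
    """
    genome = genome.upper()
    protospacer = protospacer.upper()
    n = len(protospacer)
    hits = []
    for start in range(len(genome) - n - 3 + 1):
        pam = genome[start + n : start + n + 3]
        if pam[1:] != "GG":
            continue  # no NGG PAM next to this window
        site = genome[start : start + n]
        mismatches = count_mismatches(site, protospacer)
        if mismatches <= max_mismatches:
            hits.append((start, site + pam, mismatches))
    return hits


# Made-up example: one perfect site and one site with two mismatches.
guide = "GACGTTACCGGATCAAGCTA"
off_target = "GACGTTACCGGATCTAGCTT"
toy_genome = "TTT" + guide + "AGG" + "CCCC" + off_target + "TGG" + "AAA"

for start, site, mismatches in find_candidate_sites(toy_genome, guide):
    print(start, site, mismatches)
```

The output of a scan like this is the list of putative sites that would then be amplified and sequenced.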
One method a researcher can use to test predicted off-target sites following a CRISPR experiment is targeted amplicon sequencing. Targeted amplicon sequencing is highly sensitive, with detection limits as low as 0.01% (Hendel et al. 2015). Such a low detection limit means the investigator can be relatively certain that their samples don’t have off-target mutations at the tested sites if none are detected using this technique.
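A back-of-the-envelope calculation shows why sensitivity at this level depends on read depth. The sketch below asks how many reads are needed to see at least one mutant read with a given probability, assuming each read is an independent draw from the population. It deliberately ignores sequencing and PCR error, which in practice set the floor on what can be called a real mutation, so treat the numbers as a lower bound. The function names are hypothetical.

```python
import math


def detection_probability(frequency: float, depth: int) -> float:
    """Probability of sampling at least one mutant read at the given depth."""
    return 1.0 - (1.0 - frequency) ** depth


def reads_needed(frequency: float, confidence: float = 0.95) -> int:
    """Smallest depth at which at least one mutant read is seen with the
    requested probability (independent sampling, no sequencing error)."""
    if not 0.0 < frequency < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("frequency and confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


# A mutation present in 0.01% of alleles needs on the order of
# tens of thousands of reads at that amplicon.
depth = reads_needed(0.0001, confidence=0.95)
print(depth, round(detection_probability(0.0001, depth), 3))
```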
Frequencies of off-target mutations are essential data points for investigators looking to definitively link genotype and phenotype. It is also key to perform these validations as translational researchers begin to use CRISPR as a therapy. Low frequency off-target effects may generate irreproducible data in a research setting, but these events could have disastrous health effects in the clinic. NGS-based methods provide the most complete information profile regarding putative off-target sites including both the edit rate and the repair product sequence.
Targeted amplicon sequencing doesn’t tell the whole story
Even though progress has been made with off-target prediction algorithms, their genome-wide search criteria are not exhaustive. Mismatch tolerance settings are often limited to off-target sites with fewer than 4 mismatched bases. The off-target list is also generally weighted by the position of each mismatch along the length of the gRNA, given the stricter sequence requirement at the PAM-proximal 3’ end (Fu et al. 2013; Pattanayak et al. 2013).
This approach misses sites with more mismatches (e.g. six nucleotides) that may still lead to off-target double-stranded breaks (Tsai et al. 2015). Additionally, current algorithms do not take into account other elements, including those relating to DNA structure (e.g. epigenetic modification, bulges), that may also impact off-target edits. As a result, sequencing only the sites predicted by conventional algorithms may not provide a full picture of the impact of CRISPR editing in the model cell line or organism.
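The position weighting mentioned above can be sketched with a toy penalty function. The linear weights here are purely illustrative and are not the scoring scheme of any published tool. The only idea carried over from the text is that mismatches near the PAM-proximal 3’ end count more heavily than mismatches at the 5’ end.

```python
def weighted_mismatch_penalty(guide: str, site: str) -> float:
    """Sum a per-position penalty over all mismatches.

    Position 1 is the PAM-distal 5' end and position n is the PAM-proximal
    3' end. The linear weight (position / n) is a made-up illustration.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    n = len(guide)
    penalty = 0.0
    for index, (g, s) in enumerate(zip(guide.upper(), site.upper())):
        if g != s:
            penalty += (index + 1) / n
    return penalty


guide = "GACGTTACCGGATCAAGCTA"
distal_mismatch = "TACGTTACCGGATCAAGCTA"    # differs at the 5' end
proximal_mismatch = "GACGTTACCGGATCAAGCTT"  # differs next to the PAM

print(weighted_mismatch_penalty(guide, distal_mismatch))
print(weighted_mismatch_penalty(guide, proximal_mismatch))
```

A site with a low penalty would rank as a more likely off-target. A scheme like this still says nothing about chromatin state or bulges, which is the limitation described above.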
Several options exist for unbiased off-target detection, including Digenome-seq (Kim et al. 2015) for in vitro analysis, IDLV for in vivo detection (Gabriel et al. 2011, Wang et al. 2015, Osborn et al. 2016) and HTGTS (Frock et al. 2015) for cell-based experiments. These strategies can be used in concert with in silico prediction to create a more comprehensive list of off-target editing events. Two of the most common cell-based methods are genome-wide, unbiased identification of double-strand breaks (DSBs) evaluated by sequencing (GUIDE-seq) (Tsai et al. 2015) and direct in situ breaks labeling, enrichment on streptavidin and next-generation sequencing (BLESS) (Crosetto et al. 2013).
GUIDE-seq and BLESS detect double-stranded breaks and do not require high sequencing read counts, making them fast and viable options for multiplex sequencing in many laboratories. Nevertheless, unbiased detection isn’t as sensitive as targeted amplicon sequencing. For example, GUIDE-seq seems to have a minimum detection limit of 0.1% (Tsai et al. 2015). This contrasts with detection frequencies of 0.01% in amplicon sequencing (Hendel et al. 2015), a significant difference as CRISPR experiments move closer to the clinic (Tsai and Joung 2016).
Table 2: Unbiased Genotyping Options
|Technique||Detection Limit||Assay Type||Advantages||Disadvantages|
|GUIDE-seq||0.1%||Cell-based||Searches the genome for all DSBs, doesn't require high read counts, fast multiplexing||Requires delivery of dsODN (potentially toxic)|
|Digenome-seq||0.1%||Cell-free (in vitro)||Works across all cell types||Must be verified with cell-based method|
|IDLV||1%||Cell-based||Programmable, can detect DSBs in live cells||Not as sensitive as other unbiased methods, high background|
|BLESS||Not reported||Cell-based (in vitro)||Can be used on tissue from whole animal models, no exogenous component required (e.g. dsODN), doesn't require high read counts (fast multiplexing)||Requires large cell population, sensitive to time since cell fixing|
|HTGTS||Not reported||Cell-based||Identifies translocations||Limited by chromatin configuration, produces many false negatives|
Combining sequencing techniques can ensure validated experiments
Unbiased detection methods are excellent for finding evidence of DSBs throughout the genome. However, their decreased sensitivity means that the best option moving forward may be to integrate both biased and unbiased approaches. As suggested in a review by Tycko et al., 2016, unbiased sequencing and in silico prediction should give a broad picture of all possible editing events in the genome; from there, amplicon sequencing can evaluate and validate off-target sites in a highly accurate manner.
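In practice, integrating the two approaches amounts to merging two candidate lists into one panel for amplicon sequencing. The sketch below is a minimal illustration of that bookkeeping. It assumes sites are identified by exact (chromosome, position) pairs, whereas a real pipeline would match sites within a window. The coordinates and the function name are invented for the example.

```python
from typing import Dict, Iterable, Set, Tuple

Site = Tuple[str, int]  # (chromosome, position); exact matching is an assumption


def build_amplicon_panel(
    predicted: Iterable[Site], detected: Iterable[Site]
) -> Dict[Site, Set[str]]:
    """Merge in silico predictions and unbiased detections into one panel,
    recording which approach nominated each site."""
    panel: Dict[Site, Set[str]] = {}
    for site in predicted:
        panel.setdefault(site, set()).add("in_silico")
    for site in detected:
        panel.setdefault(site, set()).add("unbiased")
    return panel


# Invented coordinates for illustration only.
predicted_sites = [("chr1", 100), ("chr2", 200)]
detected_sites = [("chr2", 200), ("chr5", 500)]

panel = build_amplicon_panel(predicted_sites, detected_sites)
for site in sorted(panel):
    print(site, sorted(panel[site]))
```

Sites nominated by both approaches are natural first priorities, while sites found only by the unbiased method are exactly the ones prediction alone would have missed.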
Using both of these approaches may not be necessary for every CRISPR experiment. Off-target events that involve more than 3 mismatched bases or that are sequence-independent are rare, and they are detectable only with genome-wide unbiased methods. However, most investigators use single cell clones for in vitro CRISPR experiments. The likelihood that a single cell clone derived from the pool contains both the rare off-target event and the desired edit is low. Therefore, unbiased sequencing may not be worth the cost and labor when single clones are selected. Conversely, translational research may require the rigor of both forms of off-target analysis in order to meet clinical approval.
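The single-clone argument is simple arithmetic, sketched below under the assumption that the on-target edit and a given off-target event occur independently in each cell. The rates used are hypothetical placeholders, not measurements from any of the cited studies.

```python
def joint_probability(on_target_rate: float, off_target_rate: float) -> float:
    """Chance that one randomly picked cell carries both the desired edit
    and a given off-target event, assuming the two are independent."""
    for rate in (on_target_rate, off_target_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    return on_target_rate * off_target_rate


def any_clone_affected(off_target_rate: float, n_clones: int) -> float:
    """Chance that at least one of n independently derived clones carries
    the off-target event."""
    return 1.0 - (1.0 - off_target_rate) ** n_clones


# Hypothetical rates: 40% on-target editing, 0.1% at one off-target site.
print(joint_probability(0.40, 0.001))
print(round(any_clone_affected(0.001, 10), 4))
```

With rates like these, a handful of clones is unlikely to include the off-target event, whereas a bulk population of millions of cells destined for a patient almost certainly will, which is why the bar differs for translational work.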
It is key to maintain a consistent set of standards as the field seeks to generate reproducible, quality data on the role of genetic networks in biological systems. NGS will also play a significant role in clinical therapeutic development as CRISPR is used not only to study disease but also to treat patients. For more information and a detailed overview of the aforementioned sequencing techniques, please see “Methods for Optimizing CRISPR-Cas9 Genome Editing Specificity” by Tycko et al. 2016 and “Defining and improving the genome-wide specificities of CRISPR–Cas9 nucleases” by Tsai and Joung 2016.
Many thanks to our guest blogger Søren Hough. We additionally thank Monica Sentmanat, Victor Dillard, and Ayokunmi Ajetunmobi for their contributions to this work.
Søren Hough is the Head Science Writer at Desktop Genetics, a company that provides free sgRNA design tools for CRISPR experiments. He spent many years in several biochemistry laboratories and now works on communicating the latest science both inside and outside the field.
1. Fu, Yanfang, et al. "High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells." Nature biotechnology 31.9 (2013): 822-826. PubMed PMID: 23792628. PubMed Central PMCID: PMC3773023.
2. Vouillot, Léna, Aurore Thélie, and Nicolas Pollet. "Comparison of T7E1 and surveyor mismatch cleavage assays to detect mutations triggered by engineered nucleases." G3: Genes| Genomes| Genetics 5.3 (2015): 407-415. PubMed PMID: 25566793. PubMed Central PMCID: PMC4349094.
3. Bell, Charles C., et al. "A high-throughput screening strategy for detecting CRISPR-Cas9 induced mutations using next-generation sequencing." BMC genomics 15.1 (2014): 1. PubMed PMID: 25409780. PubMed Central PMCID: PMC4246457.
4. Davidson, C. J., et al. "Improving the limit of detection for Sanger sequencing: a comparison of methodologies for KRAS variant detection." Biotechniques 53.3 (2012): 182-188. PubMed PMID: 22963480.
5. Tsiatis, Athanasios C., et al. "Comparison of Sanger sequencing, pyrosequencing, and melting curve analysis for the detection of KRAS mutations: diagnostic and clinical implications." The Journal of Molecular Diagnostics 12.4 (2010): 425-432. PubMed PMID: 20431034. PubMed Central PMCID: PMC2893626.
7. Pattanayak, Vikram, et al. "High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity." Nature biotechnology 31.9 (2013): 839-843. PubMed PMID: 23934178. PubMed Central PMCID: PMC3782611.
8. Tsai, Shengdar Q., et al. "GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases." Nature biotechnology 33.2 (2015): 187-197. PubMed PMID: 25513782. PubMed Central PMCID: PMC4320685.
9. Kim, Daesik, et al. "Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells." Nature methods 12.3 (2015): 237-243. PubMed PMID: 25664545.
10. Gabriel, Richard, et al. "An unbiased genome-wide analysis of zinc-finger nuclease specificity." Nature biotechnology 29.9 (2011): 816-823. PubMed PMID: 21822255.
11. Wang, Xiaoling, et al. "Unbiased detection of off-target cleavage by CRISPR-Cas9 and TALENs using integrase-defective lentiviral vectors." Nature biotechnology 33.2 (2015): 175-178. PubMed PMID: 25599175.
12. Osborn, Mark J., et al. "Evaluation of TCR gene editing achieved by TALENs, CRISPR/Cas9, and megaTAL nucleases." Molecular Therapy (2015). PubMed PMID: 26502778.
13. Frock, Richard L., et al. "Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases." Nature biotechnology 33.2 (2015): 179-186. PubMed PMID: 25503383. PubMed Central PMCID: PMC4320661.
14. Crosetto, Nicola, et al. "Nucleotide-resolution DNA double-strand break mapping by next-generation sequencing." Nature methods 10.4 (2013): 361-365. PubMed PMID: 23503052. PubMed Central PMCID: PMC3651036.
15. Tsai, Shengdar Q., and J. Keith Joung. "Defining and improving the genome-wide specificities of CRISPR-Cas9 nucleases." Nature Reviews Genetics 17.5 (2016): 300-312. PubMed PMID: 27087594.
16. Tycko, Josh, Vic E. Myer, and Patrick D. Hsu. "Methods for Optimizing CRISPR-Cas9 Genome Editing Specificity." Molecular Cell 63.3 (2016): 355-370. PubMed PMID: 27494557.
Additional Resources on the Addgene Blog
- Learn to Use Cas9 Activators
- Perform Genome-Wide Screens with CRISPR
- Generate Mouse Models with CRISPR/Cas9