This post was contributed by John Doench of the Broad Institute.
For more infomation on gRNA design, see our post: How to Design Your gRNA for CRISPR Genome Editing
Whether designing a small number of sgRNAs for a gene of interest, or an entire library of sgRNAs to cover a genome, the ease of programing the CRISPR system presents an embarrassment of riches of potential sgRNAs. How to decide between them? By taking into account both on-target efficacy and the potential for off-target activity, experiments utilizing CRISPR technology can provide a straightforward means of determining loss-of-function phenotypes for any gene of interest.
Predicting sgRNA efficacy
We have recently examined sequence features that enhance on-target activity of sgRNAs by creating all possible sgRNAs for a panel of genes and assessing, by flow cytometry, which sequences led to complete protein knockout (1).
Some sequences worked better than others, and we also saw that variations in the protospacer-adjacent motif (PAM) led to differences in activity: specifically, CGGT tended to serve as a better PAM than the canonical NGG sequence. By examining the nucleotide features of the most-active sgRNAs from a set of 1,841 sgRNAs, we derived scoring rules and built a website implementation of these rules to design sgRNAs against genes of interest, available here: http://www.broadinstitute.org/rnai/public/analysis-tools/sgrna-design.
Once sgRNA sequences most-likely to give high activity are identified, some filtering can be used to further winnow down a list. For example, basic features of the target gene can be used to eliminate some sgRNAs, such as those that target near the C’ terminus of a protein, as frameshifts are less-likely to be deleterious if most of the protein has already been translated. While every protein will be different, it seems reasonable that target sites in the first half of a protein will likely lead to a functional knockout. Indeed, for some of the genes we examined, even targets very close to the C’ terminus disrupted expression. Certainly, for any gene of interest, it would be unwise to make conclusions on the basis of the activity of a single sgRNA, and thus diversity of target sites across a gene should be examined.
Avoiding off-target sites
The off-target activity of sgRNAs is important to consider. Several papers have reached far-different conclusions regarding the extent of these effects, and certainly at least one reason for these observed differences is the expression levels of Cas9 and sgRNA used in these studies (2,3). Additionally, the ability to predict off-target sites in the genome is still in its infancy. While the basic landscape of mismatches that can lead to cutting has been established, and can be used to identify sites that are likely to give rise to an off-target effect, as yet there is not enough data to fully predict which sites will and will not show appreciable levels of modification. To further confound matters, it has recently been shown that bulges in either the RNA or DNA – that is, non-symmetrical basepairing of the strands – can give rise to off-target activity (4). Predicting such basepairing interactions is far-more computationally intensive, and thus existing algorithms ignore these potential off-target sites.
Importantly, recent whole-genome sequencing of cells modified by CRISPR indicates that the consequences of off-target activity, at least for the experimental conditions used, led to no detectable mutations (5). Indeed, when working with single-cell clones, the authors note that “clonal heterogeneity may represent a more serious obstacle to the generation of truly isogenic cell lines than nuclease-mediated off-target effects.” Further, several genetic screens using genome-wide libraries have shown high concordance between different sequences targeting the same gene, suggesting that off-target effects did not overwhelm true signal in these assays (6-8). Again, the experimental strategy is clear: for any gene of interest, one should require that multiple sgRNAs of different sequence give rise to the same phenotype in order to conclude that the phenotype is due to an on-target effect.
How can it go wrong?
Even with optimized on-target design, and proper avoidance of off-target effects both explicitly when designing sgRNAs and experimentally by the use of multiple sgRNAs, it is apparent that not all genes are equally amenable to targeting in all cellular contexts. One major reason is the chromatin state of a target site. For genes that are in more restricted chromatin or, potentially, different locations in the nucleus, Cas9 will be less effective at finding the target (9). Achieving biallelic knockout of such a gene in a high percentage of target cells might therefore not be practical. Here, single cell-cloning might be necessary, and complementary technologies such as RNAi may be a better experimental choice (while still relying, of course, on multiple different sequences of small RNA to interpret a phenotype!)
In sum, selection of sgRNAs for an experiment needs to balance maximizing on-target activity while minimizing off-target activity, which sounds obvious but can often require difficult decisions. For example, would it be better to use a less-active sgRNA that targets a truly unique site in the genome, or a more-active sgRNA with one additional target site in a region of the genome with no known function? For the creation of stable cell models that are to be used for long-term study, the former may be the better choice. For a genome-wide library to conduct genetic screens, however, a library composed of the latter would likely be more effective, so long as care is taken in the interpretation of results by requiring multiple sequences targeting a gene to score in order to call that gene as a hit. Indeed, existing genome-wide libraries have not taken into account on-target activity, and new libraries will surely incorporate such design rules in the near future.
This is exciting time for functional genomics, with an ever-expanding list of tools to probe gene function. The best tools are only as good as the person using them, and the proper use of CRISPR technology will always depend on careful experimental design, execution, and analysis.
For more information about designing CRISPR-Cas9 sgRNAs, read John's recent Nature Biotechnology article – "Rational design of highly active sgRNAs for CRISPR-Cas9–mediated gene inactivation".
Thank You to our Guest Blogger!
John Doench, PhD, is a Senior Group Leader at the Broad Institute. He really likes small RNAs.
- Doench, J. G. et al. Rational design of highly active sgRNAs for CRISPR-Cas9–mediated gene inactivation. Nat Biotechnol (2014).
- Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol 31, 822–826 (2013).
- Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31, 827–832 (2013).
- Lin, Y. et al. CRISPR/Cas9 systems have off-target activity with insertions or deletions between target DNA and guide RNA sequences. Nucleic Acids Research (2014).
- Veres, A. et al. Brief Report. Stem Cell 15, 27–30 (2014).
- Shalem, O. et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science 343, 84–87 (2014).
- Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human cells using the CRISPR-Cas9 system. Science 343, 80–84 (2014).
- Koike-Yusa, H., Li, Y., Tan, E.-P., Del Castillo Velasco-Herrera, M. & Yusa, K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat Biotechnol 32, 267–273 (2014).
- Wu, X. et al. Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat Biotechnol (2014).
Read More About CRISPR-Cas9:
- How to Design Your gRNA for CRISPR Genome Editing (also by by John Doench)
- CRISPR-Cas9 FAQs Answered!