A detailed analysis of Jinek et al.’s seminal CRISPR publication


CRISPRs have attracted enormous attention since a recent publication by Jinek et al. The discovery has been covered extensively, both in the scientific literature and in popular and mass media publications:

  • The Independent (fancy picture)
  • The Economist ("[...]where doctors put normal genes into the cells of people who suffer from genetic diseases such as Tay Sachs or cystic fibrosis." Note how simply doctors these days just "put" normal genes into the cells of people, of course only the right cells... you've had yours today already, haven't you?)
  • NY Times (pragmatic report)
  • Swiss national television (fancy picture)

While informing the public about this discovery is of tremendous importance, many reports seem to emphasize promising future applications without conveying what the development actually looked like when it was made in the laboratory. That is, the real data obtained by the scientists in those breakthrough moments is rarely, if ever, shown. The eLife publication is open access, so this cannot be a matter of accessibility. Rather, it seems that journalists prefer to present fancy 3D renderings of DNA and proteins with fluorescent numbers and pseudo-code on a black background, straight out of the Matrix, for whatever reason. But the real science is just as appealing, and therefore this post shows and explains the real data.[1]


The subject of interest is a protein-RNA complex which can cut double-stranded DNA (dsDNA) at virtually any position and which is much cheaper and easier to use than existing methods. This complex is called the CRISPR/Cas9 system. The CRISPR repeats were first described in 1987[Ishino et al.], when their involvement in cutting dsDNA was still completely unclear. The discovery back then looked like this:


A pattern is immediately visible (the point being science is not always that difficult). The underlined repeats are of dyadic symmetry and regularly separated by (non-uniform) short spacer sequences. The authors wrote:

[Excerpt from Ishino et al.]

This structure was "unusual" to them. I can only imagine how much time they spent wondering what this was before they wrote this paragraph. Only in 2004/2005 was it recognized that the non-uniform sequences originate from foreign DNA[Pourcel et al., Mojica et al., Bolotin et al.]. The random-looking sequences to the right of the dyadic elements in the above figure are foreign DNA fragments from previous viral attacks on the cell. The dyadic elements are also referred to as palindromic because they can be folded and base-paired onto themselves[2].

Gradually, attention grew, and in 2010 the first Science and Nature reviews on the topic appeared.[Horvath et al., Marraffini et al.] Application to dsDNA was alluded to at that point, but nothing was certain. "[...]Other potential applications of CRISPRs await further development to determine their plausibility. For example, a crRNP complex in P. furiosus can cleave a target RNA at a specific site dictated by the sequence of the crRNA guide. This activity could in principle have applications in molecular biology to specifically cleave RNA molecules in vitro, and could be extended to DNA molecules if other crRNP complexes are proven to have DNA endonuclease activity.[...]" Remarkably, neither of these reviews cites any work by Jinek, Doudna or Charpentier.

Then the two breakthrough papers were published [Jinek et al. 2012, Jinek et al. 2013]. In the following paragraphs, the results of the 2013 paper are summarized.
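To make the notion of dyadic symmetry concrete, here is a minimal Python sketch. It checks whether a DNA sequence equals its own reverse complement, and it counts how many bases at the two ends could pair with each other if the strand folded back on itself. The function names and the hairpin sequence are hypothetical and chosen for illustration only; they are not taken from Ishino et al. or from any real CRISPR locus. GAATTC is the well-known EcoRI recognition site and serves as a simple palindromic example.

```python
# Watson-Crick pairing partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_dyad_symmetric(seq: str) -> bool:
    """True if the sequence reads the same as its reverse complement."""
    return seq.upper() == reverse_complement(seq)


def stem_length(seq: str) -> int:
    """Count how many bases pair when the two ends fold onto each other.

    Pairing is checked from the outermost bases inwards and stops at the
    first mismatch; whatever remains in the middle would form the loop.
    """
    seq = seq.upper()
    left, right = 0, len(seq) - 1
    paired = 0
    while left < right and COMPLEMENT[seq[right]] == seq[left]:
        paired += 1
        left += 1
        right -= 1
    return paired


# Hypothetical toy sequence: a 4 bp stem around a TTTT loop.
toy_hairpin = "GCGGTTTTCCGC"

print(is_dyad_symmetric("GAATTC"))  # perfect palindrome
print(stem_length(toy_hairpin))     # number of paired bases at the ends
```

A perfect palindrome pairs along its whole length, whereas a repeat with a short unpaired centre, like the toy hairpin, still folds into a stem with a loop.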

The Jinek Publication[3]

The purpose of this article is to follow the reasoning of the original paper to illustrate how the experimental evidence led to the conclusions. Assume that, as a scientist, you do not know in detail how the CRISPR/Cas9 system works or what it can be used for. All you know is that it has to be transferred to the nucleus and that it is programmed by RNA. How can you prove your claim? In the publication, the target is a clathrin gene, which encodes a protein participating in vesicle formation at the cell surface. First, it is good to show that the protein of interest, Cas9, can indeed be expressed by the target (human) cells.


The black line at 170 kDa[4] indicates a protein of roughly the weight one would expect for the Cas9 construct, which is expressed under a CMV promoter and carries an HA epitope (facilitating detection/purification), a nuclear localization signal (NLS) and a fluorescent tag (GFP). This experiment shows that transfection[5] of human embryonic kidney cells ("HEK293T cells") with the Cas9 construct works and that the cells express the desired protein. Of course, the cells could be expressing other proteins of similar weight, but this experiment makes it highly plausible that the results of the subsequent experiments are indeed due to Cas9 activity. Next, since the protein is believed to be active in the nucleus, where the DNA is, can it be shown that the protein reaches the nucleus after HEK293T cells have been transfected with a DNA fragment ("vector") encoding the above construct? The images below show the GFP signal, the cells, and an overlay separately. It is not easy to see, but the GFP signal appears to come from within the cell nuclei, and therefore Cas9 is most likely in the nucleus as well.


The GFP signal of the four cells in the bottom left corner in particular overlaps with the positions of their nuclei. Now, since RNA is required to program the nuclease, can the target HEK cells express the guide RNA while at the same time being transfected with engineered Cas9? A Northern blot shows that indeed (third column from the left) a guide RNA ("CLTA1 sgRNA") of 62 nucleotides in length (= 20 nt of guiding sequence + 42 nt of RNA required to bind Cas9) is expressed by the cells.


In the fourth column from the left ("CLTA1 sgRNA + Cas9"), the signal is even a bit stronger, suggesting the stabilization of the sgRNA by Cas9. Possibly binding of the sgRNA by Cas9 protects it from degradation. So, the pieces are in place. The cells can be transfected and they're shown to transcribe sgRNA and translate foreign Cas9 simultaneously. It remains to show that Cas9 is operative.

What happens to the DNA if the sgRNA and Cas9 together are expressed in a cell?

Transfecting HEK293T cells with a Cas9 vector and a vector for a clathrin-targeting sgRNA, followed by isolation of the cell contents, resulted in the following cell-lysate image.


For the moment, only consider the columns under the "Cas9-mCherry" label. In all of these columns Cas9 is present (mCherry is another fluorescent label). To demonstrate the workings of Cas9, the authors use the so-called Surveyor assay. This method makes it possible to identify breaks in double-stranded DNA, albeit only indirectly. When an agent like Cas9 breaks dsDNA, the cell immediately repairs the DNA using its own repair mechanisms (such as non-homologous end joining, NHEJ). These mechanisms are, however, error-prone and can introduce errors that show up as mismatches, much like a genomic scar. In the Surveyor assay, the transfected cells are lysed and the DNA is extracted, amplified with PCR and then incubated in vitro with the nuclease Cel-1. This protein recognizes the mismatches resulting from NHEJ and cuts the dsDNA again at these positions. The products of this second nuclease activity are what is finally analysed and shown in the above figure.

The authors show step by step what effect Cas9 alone, sgRNA + Cas9, and sgRNA + Cas9 + Cel-1 have. If only Cas9 is present (column 2, -/-), DNA of roughly 400 bp is found. Cas9 + Cel-1 again yields DNA of 400 bp, as does sgRNA + Cas9 (important: Cas9 was active here, but in the absence of Cel-1 the mismatches are not recognized). But then, in column 5 (+/+), a dim line at a little less than 200 bp is visible. This dim line is what the whole excitement is about. One might object that Cel-1 itself could have some sort of nuclease activity and cleave the DNA on its own. But no: the third column (-/+), with Cas9 and Cel-1 but without sgRNA, is negative. A positive confirmation is provided by comparison to the ZFN results. ZFN (zinc-finger nuclease) is a highly engineered, very expensive system that cuts at almost the same position in the DNA.

From the image, the next question immediately arises: How can the dim line be made stronger? Is there enough Cas9? Is there enough sgRNA? Does the sgRNA bind sufficiently strongly to Cas9? Should the guiding sequence be longer? Checking for sgRNA availability, the authors found this:
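Before turning to that result, the lane-by-lane logic of the Surveyor assay described above can be summarized in a small Python sketch. It is an idealized model, not the authors' analysis. The function name is hypothetical, the 400 bp amplicon size is simply read off the description of the gel, and the cut position in the middle of the amplicon is an assumption made so that the fragments land near the 200 bp band mentioned in the text. Cas9 is taken to be present in every lane, as in the "Cas9-mCherry" columns.

```python
from typing import List

AMPLICON_BP = 400    # approximate size of the PCR product, as read off the gel
CUT_OFFSET_BP = 200  # ASSUMPTION: Cel-1 cuts roughly in the middle of the amplicon


def expected_bands(sgrna: bool, cel1: bool) -> List[int]:
    """Return the band sizes (in bp) expected in one lane; Cas9 is always present.

    Cas9 only cuts when it is programmed by the sgRNA. The cell repairs the
    cut imperfectly, but the resulting scar only becomes visible on the gel
    if Cel-1 is added to cut the PCR product at the mismatch.
    """
    bands = {AMPLICON_BP}  # the uncut PCR product is present in every lane
    scar_present = sgrna   # Cas9 + sgRNA leaves a repair scar in some cells
    if scar_present and cel1:
        bands.add(CUT_OFFSET_BP)
        bands.add(AMPLICON_BP - CUT_OFFSET_BP)
    return sorted(bands, reverse=True)


# Print the four combinations in the order sgRNA / Cel-1.
for sgrna in (False, True):
    for cel1 in (False, True):
        label = ("+" if sgrna else "-") + "/" + ("+" if cel1 else "-")
        print(label, expected_bands(sgrna, cel1))
```

Only the +/+ combination produces a band besides the 400 bp product, which is exactly the pattern the controls are designed to establish. On a real gel this extra band is dim because only a fraction of the cells carry a repair scar.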


When additional sgRNA is added, in column 5 (Cas9-HA-NLS-GFP plasmid + in vitro transcribed CLTA1 sgRNA added to lysate), the lines are indeed stronger, even stronger than the ZFN signal. Here, the authors first transfected cells with Cas9 and sgRNA plasmids (just like before). The cells were then lysed (so the Cas9 protein was available in vitro) and the lysate was mixed with additional sgRNA. The signal gets stronger. So more sgRNA is better. However, this does not explain why: is it because there is then more active Cas9 overall? Or is it just because expression of the sgRNA from the plasmid is not efficient enough? The authors then added even more sgRNA:


The signal is strongest when both sgRNA transcribed from plasmids and in vitro transcribed sgRNA are added to the system (right-most column). It is concluded that either sgRNA expression or its loading into Cas9 is the limiting factor of Cas9 nuclease performance. At the time of these experiments, the simplest next step was probably to extend the region of the sgRNA which is involved in binding to Cas9. So they did.


v1.0 represents the originally used system. In v2.1, the presumed Cas9 binding region is extended by 4 base pairs (red base pairs to the right of GAA) and the 3'-end by 5 nucleotides. In v2.2, the Cas9 binding region is extended by 10 base pairs and the 3'-end by 5 nucleotides (both compared to v1.0).

Again, a Surveyor Assay was carried out.


The results are not as clear as in the above case. Overall, v2.1 and v2.2 appear to perform similarly to each other and better than v1.0 (7 - 8 % cleavage compared to 4 % cleavage for v1.0). This suggests that extending the Cas9 binding region is more important than extending the 3'-region of the sgRNA. Extension of the guiding sequence length was not examined. Furthermore, RNA can be stabilized in vivo by modifications at the 5'- or 3'-ends, neither of which was examined further in these experiments. The authors suggest that more research in this direction is necessary.
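Cleavage percentages like the ones quoted above are derived from the band intensities on the gel. As a rough sketch, and assuming the simplest possible definition (signal in the cleaved bands divided by the total signal in the lane), the calculation could look as follows in Python. The function name and the intensity values are made up for illustration; they are not measurements from the paper, and whether the authors quantified cleavage in exactly this way is an assumption.

```python
from typing import Sequence


def percent_cleaved(uncut: float, cut_bands: Sequence[float]) -> float:
    """Percentage of the total lane signal found in the cleaved bands.

    `uncut` is the intensity of the full-length PCR product, `cut_bands`
    holds the intensities of the Cel-1 cleavage products.
    """
    cut = sum(cut_bands)
    total = uncut + cut
    if total <= 0:
        raise ValueError("lane contains no signal")
    return 100.0 * cut / total


# Hypothetical intensities in arbitrary units, chosen to mimic the
# magnitudes mentioned in the text (about 4 % versus about 8 %).
print(percent_cleaved(uncut=96.0, cut_bands=[2.0, 2.0]))
print(percent_cleaved(uncut=92.0, cut_bands=[4.0, 4.0]))
```

With values this small, a difference of a few percentage points corresponds to very faint bands, which is one reason why the comparison between the sgRNA versions is less clear-cut than the earlier experiments.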

Finally, the above results are all in agreement with a visual model like this:



Hopefully, the above provides some understanding of the research and makes the paper more accessible. In any case, it will be interesting to see where this leads. However, and this is very important to consider: what are the ethical implications of this technology? Given superb selectivity and sensitivity, the system could hypothetically be used to engineer the genome of living humans at will (and these changes could be inherited by later generations). It is not yet foreseeable when the first genome engineering applications will be studied in clinical trials. To put it differently, who would want something injected that has the sole purpose of cutting the host DNA? Let it be clear that, to date, the only "test" in a (dysfunctional) single-cell human embryo was not perfectly successful[Liang et al.]: "[...]Off-target cleavage was also apparent in these 3PN zygotes as revealed by the T7E1 assay and whole-exome sequencing.[...]".

Most recently, in April 2015, a number of leading geneticists met at a conference in California to discuss the implications of Cas9-based technology. In a perspectives publication, the authors end by strongly recommending against ...germline genome modifications for clinical applications in humans as long as societal, environmental and ethical implications of such activity... are not conclusively discussed. We can but hope that these recommendations are taken into account by researchers when starting new Cas9-based research.

This post is adapted from the original post by Martin Hediger. If you want to learn more about the potential impact of CRISPR on research and medicine, watch the following discussion panel, in which Prof. Jinek provided his point of view alongside the bioethicist Effy Vayena:





[1] General background on the CRISPR/Cas9 system can be found on Wikipedia. A good introduction to the topic can also be found on YouTube.


[2] A base pair (bp) is a unit consisting of two nucleobases bound to each other by hydrogen bonds.


[4] Kilodalton (kDa); one dalton is roughly the mass of a proton.


[5] Transfection refers to the process of introducing foreign DNA into a host.



Martin Hediger is currently working at DSM Nutritional Products AG in Sisseln, Switzerland. He completed his graduate studies in the Department of Chemistry at the University of Copenhagen and is interested in data science, life sciences and molecular biology.

This article reflects the personal opinion of the author(s) and does not necessarily correspond to that of Reatch or its members.
