Category Archives: Journal Article

Control of meiotic crossover by DNA methylation in Arabidopsis

We discussed a paper by Yelina et al. (2015) titled “DNA methylation epigenetically silences crossover hot spots and controls chromosomal domains of meiotic recombination in Arabidopsis” (PMID: 26494791; doi: 10.1101/gad.270876.115) in our journal club today. It’s a pretty interesting paper with an intriguing topic and smart experimental design.

The authors previously identified two meiotic crossover hot spots, 3a and 3b, in subtelomeric regions of chromosome 3 in Arabidopsis (Yelina et al., 2012). These crossover hot spots have low CG methylation compared to the genome-wide average and to the region between 3a and 3b. They then tested whether increasing DNA methylation in 3a and 3b could suppress the meiotic crossover rate by expressing inverted-repeat (IR) transgenes, which would trigger RdDM in the targeted regions. Interestingly, the meiotic crossover rate decreased significantly in several IR-expressing lines (Figures 1 and 2, Table 1). Other RdDM markers, such as increased H3K9me2 and denser nucleosome occupancy, were also detected in the IR-targeted regions (Figure 3). These results indicate that crossover rate is negatively correlated with DNA methylation in euchromatic regions.

The obvious next question is whether the crossover rate would rise significantly if genome-wide demethylation occurred. The authors used met1/+ plants to test this hypothesis. However, overall crossover rates were similar in Col/Ler and met1 Col/Ler. Instead, regional remodeling of crossovers around subtelomeric and pericentromeric regions was observed (Figure 4C). They showed that this remodeling in met1 depended on the crossover interference pathway (Figure 4G). The analysis of crossovers in the met1 mutant suggests that genome-wide demethylation affects crossovers differently in euchromatic and centromeric regions. Very interestingly, the met1 mutation increases crossovers in euchromatic regions but has the opposite effect in pericentromeric regions (Figure 5D). They further showed that DNA double-strand break (DSB) levels were similar in met1 and WT Arabidopsis (Figure 6), which ruled out the possibility that crossover remodeling in met1 was due to altered DSB formation.

A few brilliant techniques and experiments were used in this research. I think it’s very smart to study meiotic crossover by studying pollen DNA. More information about pollen typing can be found in Drouaud and Mézard (2011). The crossover detection system, which uses GFP/RFP transgenes inserted at different positions on the same chromosome, is also very neat. More information about the GFP/RFP lines can be found in Yelina et al. (2012).


Drouaud, J., & Mézard, C. (2011). Characterization of meiotic crossovers in pollen from Arabidopsis thaliana. DNA Recombination: Methods and Protocols, 223-249.

Yelina, N. E., Choi, K., Chelysheva, L., Macaulay, M., De Snoo, B., Wijnker, E., … & Mezard, C. (2012). Epigenetic remodeling of meiotic crossover frequency in Arabidopsis thaliana DNA methyltransferase mutants. PLoS Genet, 8(8), e1002844.

Yelina, N. E., Lambing, C., Hardcastle, T. J., Zhao, X., Santos, B., & Henderson, I. R. (2015). DNA methylation epigenetically silences crossover hot spots and controls chromosomal domains of meiotic recombination in Arabidopsis. Genes Dev, 29, 2183-2202.


Arabidopsis RNASE THREE LIKE2 modulates the expression of protein-coding genes via 24-nucleotide small interfering RNA-directed DNA methylation

Elvira-Matelot E, Hachet M, Shamandi N, Comella P, Saez-Vasquez J, Zytnicki M, Vaucheret H

The Arabidopsis RTL2 is an RNASE THREE LIKE protein with one RNAseIII domain and two dsRNA-binding domains. Its transient over-expression in plants is known to enhance the production of exogenous siRNAs. Here the authors investigate its role in the production of endogenous siRNAs.

The ectopic expression of RTL2 stimulates the production of siRNAs from artificial and natural inverted-repeat constructs. Both its domains are necessary for the RTL2-dependent production of siRNAs from dsRNAs. Conversely, in other cases the over-expression of RTL2 reduces the production of siRNAs from dsRNAs. These opposite effects of RTL2 on dsRNA substrates likely depend on the structure and/or sequence of the dsRNAs.

Interestingly, the over-expression of RTL2 also stimulates the production of RNA molecules larger than 24 nt from artificial and natural inverted-repeat constructs. The authors also demonstrate that RTL2 cannot substitute for the function of DCL2, DCL3 and DCL4. The suggested hypothesis is that RTL2 could process dsRNAs into RNA molecules longer than 24 nt, which could subsequently be processed by DCL proteins into siRNAs. In some cases cleavage by RTL2 results in better processing by DCLs; in other cases it results in worse processing.

By a sliding window approach combined with the analysis of reads at each position of the genome, a total of 481 sRNA loci are found to be differentially expressed in the rtl2 mutant compared to wt: 183 are RTL2-dependent loci, 298 are RTL2-sensitive loci. These RTL2-targeted sRNA loci produce siRNAs that are mostly dependent on DCL2DCL3DCL4, Pol IV, Pol V and DRM2, indicating the involvement of these sRNA loci in the RdDM process.

Recent work reported that siRNA precursors, named P4R2 RNAs (Pol IV- and RDR2-dependent), preferentially start with an A or a G and preferentially end with a U (Blevins et al., 2015; Zhai et al., 2015). The authors found that the RTL2 loci show different 5’ and 3’ nucleobase preferences compared to the total Pol IV loci previously identified. In the rtl2 mutant, this tendency reverts to the behavior observed for the total Pol IV loci, suggesting that RTL2 modifies the 5’ and 3’ end compositions of the target loci, making them similar to those of the total P4R2 RNAs.


RTL2 targets mainly TEs and intergenic regions but also protein-coding genes, influencing the DNA methylation and mRNA expression level of the target loci.

The biogenesis of siRNAs is not yet completely understood, and these findings suggest that the siRNA precursors might undergo multiple successive processing steps carried out by different proteins. The one precursor, one siRNA model might not be valid for all sRNA genes, leaving the question: what are the genetic/epigenetic features of the sRNA genes that differentiate the production and processing of their sRNA precursors?


A One Precursor One siRNA Model for Pol IV-Dependent siRNA Biogenesis

A One Precursor One siRNA Model for Pol IV-Dependent siRNA Biogenesis (Zhai J, Bischof S, Wang H, Feng S, Lee TF, Teng C, Chen X, Park SY, Liu L, Gallego-Bartolome J, Liu W, Henderson IR, Meyers BC, Ausin I, Jacobsen SE, PMID: 26451488)

In this work the authors demonstrate that the Arabidopsis Pol IV-dependent siRNA precursors, named P4RNAs, are not as long as was previously assumed: P4RNAs are in fact 30 to 40 nt. The characterization of P4RNA length and sequence composition gives insight into the mechanisms of Pol IV transcription initiation and termination and of DCL processing of the P4RNAs into siRNAs.

P4RNAs are the precursors of Pol IV siRNAs

P4RNAs are 30 to 40 nt, as shown by the size distribution of the PATH libraries, and are dependent on both Pol IV and RDR2, suggesting that in vivo the two enzymes work in tight association.

Multiple experiments confirm that these long RNAs are the precursors of siRNAs and not misprocessed siRNAs. For example, in the dcl2/3/4 mutant, siRNAs are mostly lost while P4RNAs increase in abundance, yet AGO4 still selectively binds the remaining 22-24-nt siRNAs and not the longer RNAs. At Pol IV siRNA loci, siRNAs and P4RNAs show positively correlated abundances. Interestingly, when the analysis is restricted to Pol IV siRNA loci with a strand bias in siRNA accumulation and DNA methylation, P4RNA accumulation shows the same strand bias. This result suggests that Pol IV-derived strands, rather than the RDR2-derived strands, are strongly favored to become the final 24-nt siRNAs.

Because P4RNAs are so short, on average only one 24-nt siRNA is processed from each P4RNA precursor.
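
As a quick sanity check on that claim, here is a minimal sketch (simple arithmetic, not an analysis from the paper) counting how many non-overlapping 24-nt siRNAs fit end to end in a precursor of a given length. The function name `max_sirnas` and the assumption of non-overlapping, end-to-end dicing are illustrative only.

```python
def max_sirnas(precursor_len, sirna_len=24):
    """Number of non-overlapping siRNAs of sirna_len that fit in the precursor.

    Assumption (not from the paper): siRNAs are cut end to end with no overlap.
    """
    return precursor_len // sirna_len


# Precursor lengths spanning the reported 30 to 40 nt range, plus 48 nt for contrast.
for length in (30, 35, 40, 48):
    print(length, max_sirnas(length))
```

Any precursor in the 30 to 40 nt range yields a single 24-mer under this assumption; a second one would need at least 48 nt.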

P4RNA 5’ end

Pol IV is demonstrated to have retained the same TSS preference from its evolutionary ancestor Pol II (Y/R rule) but the two polymerases are here shown to occupy different genomic territories.

At 5’ end, P4RNAs are enriched in A, as it is known for the siRNAs, and the majority appear to have a 5’ monophosphate: I think this last result was in some way expected because of the cloning technique used to construct the PATH libraries.

P4RNA 3’end

P4RNAs that perfectly match the genome are shown to have an enrichment of ACU in their last three positions, but more than 50% of all P4RNAs have mismatches at their 3’ ends, and these non-templated P4RNAs have a different nucleotide pattern at their 3’ end. The 3’ end mismatches are enriched in CG dinucleotides, with C the last matched base and G the first mismatched base, that is, where a C is found on the template DNA. The level of nucleotide mismatches at the 3’ end is strongly decreased in ddm1/dcl3 compared to dcl3, indicating that DNA methylation influences the misincorporation of nucleotides by Pol IV. In this model, DNA cytosine methylation causes the termination of Pol IV transcription, giving rise to the short siRNA precursors. What still remains unclear to me is: why exactly after 30 to 40 nt? It would be interesting to know how frequently a methylated C is found after a Pol II-like TSS in the genome.

In contrast to the P4RNAs, only 1% of the total siRNAs have mismatches at their 3’ end. This result, together with the shared 5’ A enrichment and strand bias between siRNAs and P4RNAs, suggests that siRNAs are preferentially cleaved from the 5’ portion of their P4RNA precursors.


Another recent paper, “Identification of Pol IV and RDR2-dependent precursors of 24 nt siRNAs guiding de novo DNA methylation in Arabidopsis” (Blevins T, Podicheti R, Mishra V, Marasco M, Wang J, Rusch D, Tang H, Pikaard CS, PMID: 26430765), confirms the short nature of the siRNA precursors but with one main difference: here, the precursors of siRNAs are found to have a strong preference for a 5’ purine, with similar frequencies for A and G. Compared to precursors with 5’ A, those with 5’ G have a 3’ end pattern more similar to that of siRNAs, suggesting that these 5’ G precursors might be processed from their 3’ portion to give rise to siRNAs. It would be interesting to understand why these 5’ G siRNA precursors were not observed in the previously described work.


CRISPR/Cas9 ‘toolbox’ of vectors for plants

Since today’s seminar speaker here is Dr. Daniel Voytas from Minnesota, I went looking for some of his recent papers. I came across this recent one, from him and his collaborators, reporting a series of vectors optimized for plant transgenesis with various types of CRISPR activities, including multi-plexed targeting and transcriptional activation. At a glance the experiments look convincing, and the vector series is available at Addgene. Folks in my lab who are interested in multi-plexed CRISPR/Cas9 targeting might want to look at this study.

A CRISPR/Cas9 Toolbox for Multiplexed Plant Genome Editing and Transcriptional Regulation. (2015; Plant Physiology. doi: 10.1104/pp.15.00636)


miRNAs in Ectocarpus are distinct, but share similarities with those of plants/animals

microRNAs and the evolution of complex multicellularity: identification of a large, diverse complement of microRNAs in the brown alga Ectocarpus

James E. Tarver, Alexandre Cormier, Natalia Pinzon, Richard S. Taylor, Wilfrid Carre, Martina Strittmatter, Herve Seitz, Susana M. Coelho and J. Mark Cock

PMID: 26101255

This paper focuses on miRNA analyses in Ectocarpus (esp), discussing the evolutionary background and the origin of miRNA loci for given lineages of organisms.  I found it interesting because it took a system with a less established miRNA background and placed its mechanism and role in a broad context.  Brown algae are thought to have independently evolved multicellularity, which gives particular insight into the possible role of miRNAs in this process.  Additionally, Ectocarpus has a large suite of homologues of genes known to be associated with miRNA function in plants and animals, making the origin and similarities of this mechanism interesting (Table 2).

miRNA sequencing

To address questions about the miRNA makeup of Ectocarpus, the authors performed sRNA-seq on male and female NILs, aligning reads with bowtie-2 and characterizing loci with mirDeep (animal and plant versions).  They followed a strict set of requirements for identification of miRNAs, namely that:

  1. The hairpin must include at least 15 paired bases
  2. Both mir and mir* must be present
  3. Dicing must be precise
  4. The 3p product must extend 2 nt beyond the 5p product
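
The four requirements above can be sketched as a simple filter. This is only an illustration, not the authors' pipeline: the `Candidate` fields, the `passes_criteria` name, and in particular the 0.75 cutoff used to stand in for "precise dicing" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one candidate hairpin locus."""
    paired_bases: int        # bases of the miRNA paired within the hairpin
    mir_reads: int           # reads matching the mature miRNA
    star_reads: int          # reads matching the miRNA* strand
    precise_fraction: float  # fraction of reads at the exact mir/mir* positions
    overhang_3p: int         # nt by which the 3p product extends past the 5p product


def passes_criteria(c, min_paired=15, min_precision=0.75, overhang=2):
    """Apply the four listed rules; min_precision is an assumed cutoff."""
    has_pairing = c.paired_bases >= min_paired
    has_both_strands = c.mir_reads > 0 and c.star_reads > 0
    is_precise = c.precise_fraction >= min_precision
    has_overhang = c.overhang_3p == overhang
    return has_pairing and has_both_strands and is_precise and has_overhang


good = Candidate(paired_bases=18, mir_reads=500, star_reads=1,
                 precise_fraction=0.9, overhang_3p=2)
no_star = Candidate(paired_bases=18, mir_reads=500, star_reads=0,
                    precise_fraction=0.9, overhang_3p=2)
print(passes_criteria(good), passes_criteria(no_star))
```

Note that a locus with many mature reads but no sequenced star read fails outright, so the rule on mir* presence ties discovery to sequencing depth.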

This analysis resulted in 63 families with a total of 64 loci, most of which were new; many loci from previous studies failed to meet the requirements and were filtered out.  The clear result is that nearly all miRNAs found within Ectocarpus have no other family members.  Even when looking only at the seed region of a locus, the authors found that even low identity cutoffs (>75%) kept the vast majority of loci in separate families.

One hypothesis of the paper was that miRNA expression would differ between male and female individuals, but this was not supported by northern blot (Figure 1).

To predict miRNA targets, the authors used the tool TAPIR, looking for high-complementarity targets.  This process yielded 160 targets, available in Table S3.  Despite the lack of family expansion, several of the miRNAs apparently target the same genes redundantly.

Origin of miRNAs

The authors suggest that the main genomic origin for miRNA loci in Ectocarpus is likely intronic regions of transcribed genes, but only by deduction.  A large proportion of miRNAs were located within protein-coding genes, in intronic regions and commonly on the same strand as the gene.  However, miRNAs found within genes did not co-express significantly with those genes, hurting the case that the two would be expressed simultaneously…  I don’t know if this evidence is damning, as there could be any number of factors affecting the measured expression levels…

When looking at the evolutionary origins of esp-MIRs, the authors found some interesting results.  According to the authors, loss of miRNA loci occurs very slowly, and usually only in exceptional cases.  Is this true?  A paper with a differing opinion is cited, but is discounted as an “over-estimation”.  To examine mirs in related species, they looked at two closely related algae, as well as two more distant diatoms.  Despite the relationship, NO mir loci matches were found through blast search.  Would we expect this?  This means that this lineage has evolved its own set of miRNAs.

As for multicellularity, the authors contend that the presence of miRNAs is associated with developmental complexity.  They support this argument by saying that there is a correlation between the number of cell types/developmental characters and the complexity of miRNA systems in an organism.  Looking purely at the number of families, they show that higher order plants and animals have more while lower organisms have fewer.  I’m not sure how convincing this is, as it’s highly speculative and doesn’t say much about an evolutionary mechanism.

Mechanistically, esp-mirs seem to have commonalities with plant and animal miRNAs, having fold-backs similar to those of land plants and 21-mers as the most common mature length.  The paper indicates that esp-AGO2 is 40% identical to HsAGO2, but they don’t speak to AtAGO2.  One important difference is a vastly higher mir:mir* ratio in terms of read detection (>400 fold).

Other thoughts…

If this is the case, it seems likely that the authors may be missing significant portions of miRNAs, considering the required depth for identifying a star sequence.  Similar to our lab’s philosophy with shortstack (setting a very high bar for miR discovery), the authors seem to be concerned with false positives, striving for only including confirmed miRs, and even pleading for higher standards for mir identification in the field.  Considering this, it seems interesting to me that they do not speak to a requirement for high expression levels for a mir, something that I thought was tacitly required, though we don’t implement this standard either.


High temps trigger 24mer siRNAs in Arabidopsis?

I came across this article, published about a year ago, in Plant Physiology and Biochemistry and was intrigued:

Insight into small RNA abundance and expression in high- and low-temperature stress response using deep sequencing in Arabidopsis

doi: 10.1016/j.plaphy.2014.09.007

The experiment was small RNA-seq on heat-stressed (HT .. 36C for one day), normal temperature (NT) and low-temperature (LT .. 4C for one day ) Arabidopsis plants. The basic observation that I found intriguing was that the high-temp stress library had quite a few 24mers, while the cold and normal temp. libraries had a lot fewer. (See Figure 1 from the paper below):


Does this imply that heat stress activates the 24nt siRNA pathway?

The major issue with the experiment however is a lack of replication and independent verification: Just one library per treatment so there are no biological replicates to assess the reproducibility, so the result I think is provisional at best.
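
With replicated libraries in hand, the comparison itself is simple: tabulate the fraction of reads of each length per library and compare the 24-nt fraction across treatments. A minimal sketch, assuming reads are already adapter-trimmed and available as plain sequence strings; the function name `length_fractions` and the toy reads are made up for illustration and are not data from the paper.

```python
from collections import Counter


def length_fractions(reads):
    """Return {read_length: fraction of reads} for one small RNA library."""
    counts = Counter(len(r) for r in reads)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


# Toy library (hypothetical sequences, not from the study).
toy_library = ["A" * 21, "C" * 24, "G" * 24, "T" * 24]
fractions = length_fractions(toy_library)
print(fractions.get(24, 0.0))
```

Running this per replicate would give a spread of 24-mer fractions for each temperature, which is exactly what the single-library design cannot provide.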

However, there are some earlier reports that also show effects of heat stress on bulk small RNA functions. Ito et al. (2011) showed that the Arabidopsis ONSEN retrotransposon was transcribed and reverse-transcribed in response to heat stress, but that actual transpositions were prevented by the het-siRNA pathway. Zhong et al. (2013) showed that heat stress de-activated the trans-acting siRNA pathway (which mostly makes 21 and 22mer siRNAs that can also behave like heterochromatic siRNAs). I think that understanding the role of heat-induced small RNA profile changes will be quite interesting.

Annotation of Soybean small RNAs reveals non-canonical phased siRNAs

Siwaret Arikit (a,b), Rui Xia (a,b), Atul Kakrana (b), Kun Huang (a,b), Jixian Zhai (a,b), Zhe Yan (c), Oswaldo Valdés-López (d), Silvas Prince (e), Theresa A. Musket (e), Henry T. Nguyen (e), Gary Stacey (c), and Blake C. Meyers (a,b)

a Department of Plant and Soil Sciences, University of Delaware, Newark, Delaware 19711

b Delaware Biotechnology Institute, University of Delaware, Newark, Delaware 19711

c Division of Plant Science, University of Missouri, Columbia, Missouri 65211

d Unidad de Morfologia y Función, FES Iztacala, Universidad Nacional Autónoma de México, Los Reyes Iztacala, Tlalnepantla 54090, Mexico

e National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, Missouri 65211

PMID: 25465409

doi: 10.1105/tpc.114.131847


This article is an interesting take on the challenges associated with small RNA annotation from Blake Meyers.  Done on a large scale in soybean, this project sought to classify small RNA loci based on more modern interpretations, making use of both small RNA-seq libraries and degradome PARE sequencing.

First, the authors re-evaluated miRBase-20 genes based on several rules, trying to separate genes that act canonically as miRNAs from ones that don’t.  This came in the form of several classes: (1) miRNAs that are weakly expressed but resemble siRNAs, (2) genes that are likely siRNAs, (3) genes that marginally meet the strict definition of a miRNA and (4) well characterized and defined miRNAs.  530 plant miRNAs aligned to the soybean genome, and fell under the following classifications: (1) 191 weakly expressed, (2) 203 siRNA-like, (3) 15 marginal miRNAs and (4) 121 highly expressed and canonical miRNAs.  This breakdown made some of the failings of miRBase pretty apparent, as so few of these genes could be clearly defined as miRNAs in soy.  It also seems clear that these genes make up a spectrum, as the classes had to be defined by some seemingly arbitrary cutoffs for strand and abundance ratios.  It is a challenge to define these classes.  The authors also used these cutoffs to filter and identify new candidate miRNAs, through which they found numerous canonical and novel genes.  The mapping procedure used in this study is described only vaguely: they just mention using Bowtie to map perfectly matched reads and filtering out structural RNAs.  It looks like they allow multi-mapping reads with up to 20 alignments.  I would expect that if this procedure were refined using a method like butter, we might see fewer ambiguous and weakly expressed miRNAs.  Are these erroneous?
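
To put the four classes in proportion, here is a small tally using only the counts quoted above; the short class labels are paraphrases chosen for the example.

```python
# Counts as quoted in this post for miRBase-20 entries aligned to soybean.
counts = {
    "weakly expressed": 191,
    "siRNA-like": 203,
    "marginal miRNA": 15,
    "canonical miRNA": 121,
}

total = sum(counts.values())
print("total:", total)
for label, n in counts.items():
    # Share of all aligned miRBase entries falling in this class.
    print(f"{label}: {n} ({n / total:.1%})")
```

The four classes do add up to the 530 aligned entries, and the canonical class is 121 of 530, a bit under a quarter.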

Another portion of this article I found interesting was their attempts to identify phasiRNA loci, where they identified 504 loci with a “stringent threshold” for their phasing P-value.  Almost all of the found loci overlapped protein coding genes.  The intriguing part about their PHAS loci identifications is that they found some non-canonical patterns of phasing from variants of TAS3 loci.  These included circumstances that required 3-hits from a miRNA to trigger phasi induction, as well as phasing in a downstream direction.  If we have PHAS loci like this in a dataset analyzed by shortstack, in my understanding it should be annotated without a problem… (is this correct?).

The most highly represented group of genes targeted by phasiRNAs in soybean encodes NB-LRR proteins, which have over 300 members characterized in legumes.  The authors cite several hypotheses for why this family is so plentiful among targets: that the phasiRNAs act as regulators in the absence of a pathogen trigger, or that this is control over a rapidly expanding gene family, citing studies by Shivaprasad et al., 2012 and Kallman et al., 2013, respectively.  Could it be both?  I will have to read some of their cited papers to get more context for phasiRNA gene regulation.

This paper also has a huge amount of information on tissue specificity of phasi and micro-RNA genes, providing a more complete picture on this regulation in soy.  They saw wide diversity in tissue specific small RNA expression, finding several sub-groups of highly specific sRNA genes.  Overall, a very interesting article with a large amount of content, making it hard to show all of it here.

Rcount: dealing with multi-mapping reads in RNAseq data

Rcount: simple and flexible RNA-Seq read counting

Marc W. Schmid* and Ueli Grossniklaus

Institute of Plant Biology and Zürich-Basel Plant Science Center, University of Zurich, 8008 Zürich, Switzerland

Bioinformatics. doi:10.1093/bioinformatics/btu680, PMID: 25322836

Nate showed me this paper today which is of some interest to us given my obsession with finding a better way to deal with the issue of multi-mapping reads in small RNA-seq data (e.g., with the butter program). This paper describes a tool called Rcount, which is a counter for ‘normal’ mRNA-seq data. As described in the paper, Rcount takes in a BAM file, and deals with multireads. According to figure 1 (copied below), the way they do this is to use the density of local uniquely mapped reads and make a probability assessment… the more uniquely mapped reads in an area, the more likely it is that the multi-read also came from that location. They then place it, noting their calculated probability in the SAM line with a custom tag. Rcount then performs another task (dealing with counting reads that overlap more than one gene annotation) and counts up reads in annotated genes for the user.
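
The idea of weighting a multi-read by the local density of uniquely mapped reads can be sketched in a few lines. This is an illustration of the general approach as described above, not Rcount's actual code: the function names, the dictionary input, the uniform fallback when no unique reads are present, and the random placement step are all assumptions made for the example.

```python
import random


def placement_probabilities(unique_counts):
    """Probability of each candidate locus, proportional to nearby unique reads.

    unique_counts maps a candidate locus name to the number of uniquely
    mapped reads in its neighborhood. Falls back to a uniform split when
    no candidate has any unique reads (an assumed tie-breaking rule).
    """
    total = sum(unique_counts.values())
    if total == 0:
        n = len(unique_counts)
        return {locus: 1.0 / n for locus in unique_counts}
    return {locus: count / total for locus, count in unique_counts.items()}


def place_multiread(unique_counts, rng):
    """Pick one locus for a multi-read and report the probability it was given."""
    probs = placement_probabilities(unique_counts)
    loci = list(probs)
    chosen = rng.choices(loci, weights=[probs[locus] for locus in loci], k=1)[0]
    return chosen, probs[chosen]


rng = random.Random(0)  # fixed seed so the placement is reproducible
candidates = {"locusA": 30, "locusB": 10}
print(placement_probabilities(candidates))
print(place_multiread(candidates, rng))
```

With 30 unique reads near one candidate and 10 near the other, the multi-read is assigned to the first with probability 0.75; the returned probability is the kind of value that could be recorded alongside the chosen alignment.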

Rcount is clearly geared toward counting reads in annotated genes with reference to mRNA-seq data. For that reason, I doubt the program itself will be that useful for small RNA-seq data, where we are not generally interested in counting reads in pre-defined intervals (like gene annotations). But it is striking that Rcount is using pretty much exactly the method that my butter program uses for assigning reads … using the density of the unique mappers to create a probability set used to guide decisions on multi-mappers. I think Nate is going to try and use Rcount for small RNA-seq data.

I don’t think this precludes continued development of butter or its successor, because Rcount is pretty clearly geared toward mRNA-seq data. But it is worth testing, if possible, against butter and other methods for small RNA-seq to try and determine for our own lab purposes an optimal method for aligning multi-mapped small RNA-seq reads that is both precise and reproducible.

– Mike Axtell

Exosome is not related to small RNA or RdDM based silencing in plants

DOI: 10.1371/journal.pgen.1003411   PMID:  23555312

I looked at the paper The Role of the Arabidopsis Exosome in siRNA–Independent Silencing of Heterochromatic Loci.  This was an interesting topic for me, as it connected some aspects of RNA metabolism and use I hadn’t originally thought of as interacting.  The paper examines the premise that the exosome might be indirectly involved in the regulation of small RNA production and RdDM in plants, as has been reported in yeast.  This effect was previously examined in Bühler M et al. 2008, in the yeast model organism S. pombe, where exosome-deficient mutants were found to have vastly altered levels of siRNAs.  This is thought to be caused by a buildup of aberrant non-degraded ncRNAs interacting with small RNAs and siRNA machinery.

The authors used small RNA sequencing to determine the global make-up of annotated small RNAs in Arabidopsis, looking for a difference in exosome deficient plants.  The authors couldn’t find an effect in small RNA quantity or distribution, considering both the type of small RNA (miRNA, genic RNA, ncRNA…) as well as the possible target location (TEs, inverted/tandem/double repeats..)

Despite the apparent lack of an effect of the exosome mutation on small RNAs, the authors did find a downstream effect: mutant plants had increased transcript levels from RdDM-regulated heterochromatic loci.  They examined this in the context of Pol IV and Pol V mutants, in which exosome deficiency led to a dramatic loss of regulation at these (2) loci.  Subsequent data show that this is not due to a decrease in methylation at these sites.  Histone association is lower at these loci in exosome-deficient plants, and the exosome associates with flanking scaffold regions, leading the group to speculate that there is a cooperative effect between these structures, acting independently of RdDM.

Looking at the data in this paper, there are still several enigmatic results that are poorly explained by their model.  Their data indicate that there is a combinatorial effect between the exosome and RNA Pol V in silencing heterochromatic loci, but later data contradict this, showing that plants carrying mutations in both have higher enrichment when pulled down by histone (figure 6a).  I struggled to find reasoning for this observation in the paper, but perhaps I missed it in my reading.  Despite some of these confusing points, I thought the results provided an interesting context for forms of heterochromatin regulation other than RdDM.  Overall a broad and interesting read.  My take-away points are 1) that the exosome in plants (as opposed to yeasts) must have some layer of insulation between RNA degradation and RdDM machinery and 2) that there are alternative forms of RNA-directed heterochromatin regulation.

Hope this isn’t too far off topic, just thought it was interesting.

“Clean” gene replacement in Physcomitrella is hard.

Paper: Recombination products suggest the frequent occurrence of aberrant gene replacement in the moss Physcomitrella patens by Wendeler et al.

The Plant Journal .. doi: 10.1111/tpj.12749 .. PMID: 25557140

This paper examines in detail what happens around the PpCOL2 locus of Physcomitrella patens during gene replacement experiments. They find that complex re-arrangements were very frequent. In particular, a number of transformed lines where PCR analysis across the predicted genome / replacement construct junctions was positive had other copies of the target locus in the genome. They find that this is RAD51-dependent (there are two non-redundant RAD51 genes in Physco, -a and -b).

The main take-away for me in this paper was that PCR analysis of junctions is insufficient to screen for gene replacement lines in Physco. For instance, in their Southern blotting experiments, the authors find that “gene replacement with correct recombination junction fragments and deletion of the original sequence was obtained in only two [sic] out of 9 targeted lines” .. where these “9 targeted lines” were all positive for both junction PCRs. Yikes. The Southern blot is in figure 3B.

This matches our own past experiences of gene replacement in Physcomitrella. It is not trivial to actually get rid of the targeted sequence. Perhaps with a-miRNAs and also perhaps CRISPR-Cas9, gene replacement in this organism might be deprecated.

– Mike Axtell