Monthly Archives: January 2015

AGO4 and AGO6 are more specific in mediating RdDM than we expect

Paper: Specific but interdependent functions for Arabidopsis AGO4 and AGO6 in RNA-directed DNA methylation by Duan et al.

EMBO J. doi: 10.15252/embj.201489453 PMID:25527293

AGO6 has previously been considered functionally redundant with AGO4. This paper, however, shows that the redundancy between AGO4 and AGO6 in mediating RdDM is much smaller than we would expect. The authors profiled AGO4- and AGO6-dependent methylation by genome-wide bisulfite sequencing. Interestingly, DNA methylation is redundantly regulated by AGO4 and AGO6 at only a small subset of loci. At more than half of the hypomethylated loci, DNA methylation is reduced to a similar extent in ago4-6 and in ago6-2, and no significant further reduction is observed in the double mutant. This result indicates that AGO4 and AGO6 have related yet specific functions in RdDM.

The authors also try to show the distinct functions of AGO4 and AGO6 by studying their subcellular localization. Their conclusion is that AGO4 and AGO6 show different co-localization patterns with the DNA-dependent RNA polymerases. However, I am not quite convinced by the immunostaining figures. The AGO4 and AGO6 signals are scattered throughout the nucleus, and the co-localization with Pol IV or Pol V is not obvious. Even though the co-localization data are not convincing to me, I do agree with the authors that studying the localization of AGO4 and AGO6, especially their co-localization with Pol IV or Pol V, is very important.

The other thing I am interested in is that the authors studied the accumulation of Pol V transcripts, as well as Pol V occupancy, in ago4 and ago6 mutants. A very interesting result is that Pol V occupancy at most of the tested IGN loci clearly decreases in the ago6 mutant. The accumulation of most tested Pol V transcripts decreases in the ago6 mutant but increases in the ago4 mutant. These results indicate that AGO6 is required for Pol V recruitment. It is very intriguing that AGO4 and AGO6 show such distinct effects on Pol V occupancy. In my own study, I am trying to pull down AGO4-associated Pol V transcripts. It might be interesting to see whether Pol V transcripts can also be pulled down by AGO6. We have to note that only a small number of Pol V transcripts were studied here. It remains unclear whether this small subset represents the real genome-wide pattern.

One last thing to mention: another paper from the Slotkin lab (McCue et al. 2015) shows that AGO6 can load 21-22 nt siRNAs and establish RdDM, which also distinguishes it from AGO4. In conclusion, these two proteins may have more specific functions than we expect.

Areas Around MicroRNA Target Sites Are Typically Unstructured So As to Not Hinder RISC

One of my projects is to observe what effects (if any) the sequences flanking miRNA target sites and the structure of the RNA transcript have on a miRNA's efficacy.  I found a rather old paper (published in about 2013) reporting that the areas flanking miRNA target sites are typically unstructured.  The paper uses a computational approach to reach this result.  I think it is interesting because my experiment could very well either substantiate this study with experimental data or contradict it.

In this article, Selection on Synonymous Sites for Increased Accessibility around miRNA Binding Sites in Plants, the researchers retrieved the genomes and miRNAs for Arabidopsis thaliana, Zea mays, Oryza sativa, and Populus trichocarpa.  They also downloaded expression data for miRNAs and their targets in A. thaliana from the Massively Parallel Signature Sequencing project.  Using RNAfold, the researchers determined delta G open (the difference between the free energy of all secondary structures and the free energy of all structures in which the target site is unpaired), delta G local (the free energy of the local secondary structure of the miRNA target sites), and the GC content, since higher GC content typically correlates with more structure.  They also calculated Z-scores for each of these values.
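To make the Z-score idea concrete, here is a minimal Python sketch that compares the GC content of one window against windows placed at random positions in the same transcript.  The function names, the default of 1000 control windows, and the choice of random windows as the control are assumptions made for illustration; the paper's own randomization procedure and its RNAfold-based free-energy terms are not reproduced here.

```python
import random
import statistics


def gc_content(seq):
    """Fraction of G or C nucleotides in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def z_score_vs_random_windows(transcript, site_start, window,
                              n_controls=1000, seed=0):
    """Z-score of one window's GC content against randomly placed windows.

    transcript : full mRNA sequence (must be at least `window` long)
    site_start : 0-based start of the window of interest
    window     : window length in nucleotides
    n_controls : number of random control windows (an arbitrary default)
    """
    rng = random.Random(seed)
    observed = gc_content(transcript[site_start:site_start + window])
    max_start = len(transcript) - window
    controls = [
        gc_content(transcript[s:s + window])
        for s in (rng.randint(0, max_start) for _ in range(n_controls))
    ]
    mean = statistics.mean(controls)
    sd = statistics.pstdev(controls)
    if sd == 0:
        return 0.0  # all control windows identical; no meaningful Z-score
    return (observed - mean) / sd
```

A negative Z-score here means the window is GC-poor relative to the rest of the transcript.  The same observed-versus-control comparison would apply to delta G open or delta G local once those values have been computed with a folding program.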

Compared to the randomized sites, the areas near miRNA target sites were found to be depleted in GC nucleotides and typically unstructured.  This trend holds regardless of the expression level of the mRNA target and of the miRNA that targets it.  What is really interesting is that the Z-score of delta G open (essentially a measure of how much energy is needed to “open” the miRNA target site) shows an obvious trend of decreasing as one moves closer to the miRNA target site and increasing as one moves away from it (in either the 5′ or 3′ direction) in all the species analyzed.  This trend was apparent but much “weaker” in Arabidopsis.  Also of note, targets show the exact same trend whether the miRNA represses them by cleavage or by translational repression.  I wonder if there is anything worth experimenting on here or if it’s merely an artifact of the data.

Again, I found this study interesting, but some problems jump out at me.  Firstly, programs like RNAfold are not totally accurate in determining the structure of transcripts in vivo.  I wonder how the data would change if they used Sally Assmann’s DMS-seq data for Arabidopsis.  Another issue is that the study only takes the 17 nucleotides upstream and 13 nucleotides downstream of the target site into account in these analyses.  This is because Kertesz et al. 2007 found that this region played an important role in animal miRNA repression efficiency.  I wonder how this squares with the collaboration we did with Christophe, in which he hypothesized that because plant miRNAs bind extensively to their targets, flanking sequence context doesn’t change miRNA efficacy.
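For anyone who wants to look at the same window, the region described above can be cut out of a transcript with a few lines of Python.  This is only a sketch: the function name and the 0-based, half-open coordinate convention are assumptions, and only the 17/13 defaults come from the study.

```python
def split_site_and_flanks(transcript, site_start, site_end,
                          upstream=17, downstream=13):
    """Return (upstream_flank, target_site, downstream_flank) as strings.

    Coordinates are 0-based and half-open. Flanks are truncated when the
    target site lies close to either end of the transcript.
    """
    if not 0 <= site_start <= site_end <= len(transcript):
        raise ValueError("site coordinates fall outside the transcript")
    up = transcript[max(0, site_start - upstream):site_start]
    site = transcript[site_start:site_end]
    down = transcript[site_end:site_end + downstream]
    return up, site, down
```

Changing the `upstream` and `downstream` arguments makes it easy to ask whether a wider window would tell a different story in plants.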

In conclusion, I still think this paper is worth reading (or at least skimming through).  One way or another, the experiment I mentioned will be important to this study.  There are ways to transiently express genes in other species (such as Arabidopsis & rice), so it may be worth testing transient expression in these systems and seeing whether the data differ from Nicotiana transient expression.


Note the papers can be found here:


doi: 10.1093/molbev/mss109

plantDARIO – a web-based tool for small RNA-seq analysis in select plant genomes

Patra et al. (2014). plantDARIO: web based quantitative and qualitative analysis of small RNA-seq data in plants. Frontiers in Plant Science.

doi: 10.3389/fpls.2014.00708
PMID: 25566282

This manuscript describes a web-based service for the annotation of small RNA-producing genes in Arabidopsis thaliana, Beta vulgaris, and Solanum lycopersicum (the authors also state that they plan to extend the number of plant species to “…include most of the available plant genomes.”). Users provide aligned small RNA data in BAM or BED format, and the authors provide a script for condensing reads aligned to the same position, which reduces the burden of large data transfers. The web server parses the aligned small RNA data with respect to several pre-loaded annotation tracks, including known miRNAs (from miRBase), known tasi-RNAs, tRNAs, and other ncRNAs from Rfam. Global stats are spit out for the library. Clusters of reads that don’t overlap any annotated regions are flagged, and some miRNA-finding and snoRNA-finding programs are run. Results can be integrated into publicly available genome browsers for the species of interest, hosted on other servers.
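The read-condensation idea is simple enough to sketch in a few lines of Python. This is not the authors' script: the function name and the assumption of tab-separated BED6 input (chrom, start, end, name, score, strand) are hypothetical, and it only illustrates how many alignments to one position can be collapsed into a single record with a count.

```python
from collections import Counter


def condense_bed(lines):
    """Collapse BED6 alignment lines that share chrom, start, end, and strand.

    Returns a sorted list of (chrom, start, end, strand, count) tuples.
    Lines with fewer than six columns get "." as the strand.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        strand = fields[5] if len(fields) > 5 else "."
        counts[(chrom, start, end, strand)] += 1
    return [key + (n,) for key, n in sorted(counts.items())]
```

Because small RNA libraries often contain the same sequence thousands of times, a table like this can be far smaller than the original alignment file, which is presumably what makes uploading practical.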

I found this manuscript interesting for a couple of reasons. First, I had often wondered about how to make my own small RNA-seq program, ShortStack, available as a web service. I have not done this, primarily because the input for ShortStack is raw small RNA-seq data, or BAM files of aligned small RNA-seq data, along with the reference genome. These would be tedious for users to upload because of the file sizes. The large files could also place a big demand on the server, as could the large number of CPU cycles the analysis might require. It looks like the authors of plantDARIO have gotten around this issue by outsourcing the alignments to the user and enforcing a read-condensation scheme.

The second thing I found interesting about this work was a brief mention of the alignment methods. In particular, the authors state “Unlike many other mapping tools, segemehl has full support for multiple-mapping reads which is very important for small RNA-seq”. I am quite interested in improving how multi-mapped small RNA-seq reads are placed and used (see butter). I had not heard of the program “segemehl” before. The relevant paper is Otto et al., 2014, which I will need to put on my reading list.

The third thing I was interested in was the method for annotating small RNA clusters that didn’t overlap a known gene. The authors are using a tool called “blockbuster”, which was described in another earlier paper from this group, Langenberger et al. 2009. Will have to check this out too.

My final thoughts on this paper have to do with comparing a web-based service like plantDARIO to a stand-alone program like ShortStack. The authors of this paper make a plug for a web-based service and ding stand-alone programs by stating “The other sncRNA prediction tools need to be downloaded, installed and run locally, requiring more than basic computer skills.” Well yes, this is true. But a stand-alone program has significant advantages over their approach to web-based analysis. With a stand-alone program, you can use any genome assembly or assembly version you want. With their approach, you are limited to whatever they have pre-configured. Moving to a new species, or even updating to a newer genome assembly version, is not possible except by asking the authors to update their site. There is a lot more flexibility to be gained with a stand-alone program.

In any event, an interesting read. I’m looking forward to trying out the tool, and to reading some more of the background methods, especially alignments and de novo cluster finding.

PS. One error: My ShortStack paper is erroneously cited as “Allen et al. (2013)” instead of “Axtell (2013)”. The author lists of my paper and a 2004 paper from the Carrington Lab, with Ed Allen as lead author, appear to have been swapped in the ref. cited section.

–Mike Axtell

“Clean” gene replacement in Physcomitrella is hard.

Paper: Recombination products suggest the frequent occurrence of aberrant gene replacement in the moss Physcomitrella patens by Wendeler et al.

The Plant Journal. doi: 10.1111/tpj.12749 PMID: 25557140

This paper examines in detail what happens around the PpCOL2 locus of Physcomitrella patens during gene replacement experiments. The authors find that complex rearrangements were very frequent. In particular, a number of transformed lines in which PCR analysis across the predicted genome / replacement construct junctions was positive had other copies of the target locus in the genome. They find that this is RAD51-dependent (there are two non-redundant RAD51 genes in Physco, -a and -b).

The main take-away for me in this paper was that PCR analysis of junctions is insufficient to screen for gene replacement lines in Physco. For instance, in their Southern blotting experiments, the authors find that “gene replacement with correct recombination junction fragments and deletion of the original sequence was obtained in only two [sic] out of 9 targeted lines”, where these “9 targeted lines” were all positive for both junction PCRs. Yikes. The Southern blot is in Figure 3B.

This matches our own past experience with gene replacement in Physcomitrella. It is not trivial to actually get rid of the targeted sequence. Perhaps with a-miRNAs, and perhaps also CRISPR-Cas9, gene replacement in this organism might be deprecated.

– Mike Axtell