Review for "The rust fungus Melampsora larici-populina expresses a conserved genetic program and distinct sets of secreted protein genes during infection of its two host plants, larch and poplar"

Completed on 21 Dec 2017 by Benjamin Schwessinger. Sourced from https://www.biorxiv.org/content/early/2017/12/06/229971.


Comments to author

Author response is in blue.

I love this work. It is such a fundamental biological
question of how an obligate biotrophic fungus infects two highly distinct plant
species. Some real fascinating biology.

Here are some thoughts and comments on this
manuscript:

· Some active voice in the
abstract would make it more accessible.

· Line numbers would have been
great.

Intro:

· ‘is qualified of macrocyclic’
should read ‘as’ not ‘of’

· the citation for Germain et
al., 2017 needs to be corrected

MM:

· This also refers to the
results. It would be great to see how many of the RNAseq reads did not map to
the reference gene models. Could you identify novel genes that were previously
missed from the annotation as the adequate expression data for both hosts were
missing? Is this novel RNAseq data being incorporated into new rounds of
annotation? If you had poplar RNAseq data, it would be great to compare the RNAseq
mapping rate overlapping with gene models in poplar vs. larch.

· PLEASE deposit all analysis scripts
on github and NOT on demand. This would be great for people that want to
compare RNAseq and microarray data. I really liked your quantile comparison
approach. Scripts need not be perfect. Every little helps!

· do I understand correctly that
you only included genes in your diff analysis that were expressed both in
RNAseq and microarray analysis?

Results/Discussion:

· for the KOG enrichment analysis,
it would be nice to show how many genes miss any annotation. I guess this will
be around 50%. This refers to the KOG analysis in ‘Secreted proteins is the
only overrepresented category among DEGs detected on larch’. I see that these
are mentioned later on.

· I was wondering if the increase
in specifically expressed SP genes on larch vs. poplar (Figure 6 B) could be an
artefact of the micro-array vs. RNAseq analysis. Were all these larch specific
genes expressed in the microarray at all? It would be much more convincing and
reassuring to see some qRT-PCR analysis of the differentially expressed SP in
poplar vs. larch. Using the identical technique would very much strengthen the
argument made.

· in addition to the SSP gene
family expression analysis (these could really be alleles of each other as
well), did you observe any allele-specific expression using SNPs as markers?
E.g. only one of the SNPs is expressed in one host vs. the other.

· One important consideration for
comparing aecial vs. telial phases of rusts is that in the aecial phase rusts
are mono-karyotic haplotypes. Hence all genes required for this life-phase need
to be allelic, aka have two copies in the diploid phase. In future, when fully
phased Mlp genomes are available, it would be interesting to see if some of the
poplar-specific SSP are singletons with a corresponding allele in one haploid
genome.

And I forgot: Figure 4A would be better as an upset plot, e.g. http://vcg.github.io/upset/....



I love this work. It is such a fundamental biological question of how an obligate biotrophic fungus infects two highly distinct plant species. Some real fascinating biology.
——
Many thanks, Benjamin, for your interest in this work and for your very interesting and constructive comments. We were enthusiastic to receive comments (and not only retweets) on this version submitted to bioRxiv. We added a note about that in our paper, now accepted at MPMI.
Here are our answers to your comments.
——
MM:
· This also refers to the results. It would be great to see how many of the RNAseq reads did not map to the reference gene models. Could you identify novel genes that were previously missed from the annotation as the adequate expression data for both hosts were missing? Is this novel RNAseq data being incorporated into new rounds of annotation? If you had poplar RNAseq data, it would be great to compare the RNAseq mapping rate overlapping with gene models in poplar vs. larch.
——
Answer: The number of unmapped genes can be determined from supplemental Table 1. Regarding identification of new genes, this will be taken care of in another study that will hopefully find its way into publication later this year. As you are aware, JGI has been able to produce a new version of the reference genome anchored onto a genetic map. A new annotation has been produced too. The RNAseq data reported here have been used to support this new annotation. We are now performing a detailed analysis of this new genome version, with more comparative analyses than for version 1 and with more rust genomes available. Comparison of expression data between versions 1 and 2 of the genome will also be included.
As for poplar RNAseq data, we do have some under analysis as well, but only at very early time points of infection, i.e. only partial transcriptomes (tip of the iceberg issue) that cannot be fully used for direct comparison. In the near future we look forward to obtaining complete transcriptomes by RNAseq all along the cycle (not only infection). Then we should be able to address your question.
——
· PLEASE deposit all analysis scripts on github and NOT on demand. This would be great for people that want to compare RNAseq and microarray data. I really liked your quantile comparison approach. Scripts need not be perfect. Every little helps!
——
Answer: You are right. Cecile and Clemence are working to resolve this. The link to GitHub will be integrated in the final proof of the article.
——
· do I understand correctly that you only included genes in your diff analysis that were expressed both in RNAseq and microarray analysis?
——
Answer: For the comparison, we did not consider genes for which expression is not available on microarrays. Indeed, only ~13K genes were supported by oligonucleotides on the Nimblegen DNA array. The others either had no oligonucleotides, or their oligonucleotides could not discriminate between close transcripts within gene families. This does not mean they are not expressed. In the RNAseq data, based on the profiles of our saturation curves, we considered genes with 0 mapped reads to be not expressed.
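The filtering rule described in this answer can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' analysis script: the function name `comparable_genes`, the gene IDs and the read counts are all hypothetical.

```python
def comparable_genes(array_probed_genes, rnaseq_read_counts):
    """Return {gene: expressed_in_rnaseq} for genes measurable on both platforms.

    array_probed_genes: set of gene IDs with usable oligonucleotide probes
                        (hypothetical input).
    rnaseq_read_counts: dict mapping gene ID -> number of mapped reads
                        (hypothetical input).
    """
    result = {}
    for gene, reads in rnaseq_read_counts.items():
        if gene not in array_probed_genes:
            # No usable array measurement: the gene is left out of the
            # comparison, which says nothing about whether it is expressed.
            continue
        # Rule stated in the answer: 0 mapped reads means "not expressed".
        result[gene] = reads > 0
    return result


if __name__ == "__main__":
    probed = {"geneA", "geneB"}                       # made-up gene IDs
    counts = {"geneA": 120, "geneB": 0, "geneC": 45}  # made-up read counts
    print(comparable_genes(probed, counts))
```

Here `geneC` is dropped because it has no array probe, even though it has mapped reads, which mirrors the point that absence from the comparison is not absence of expression.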
——
Results/Discussion:
· for the KOG enrichment analysis, it would be nice to show how many genes miss any annotation. I guess this will be around 50%. This refers to the KOG analysis in ‘Secreted proteins is the only overrepresented category among DEGs detected on larch’. I see that these are mentioned later on.
——
Answer: The number is in fact a bit higher, which is not surprising for a rust fungus. This information is available in one of the large supplemental tables.
——
· I was wondering if the increase in specifically expressed SP genes on larch vs. poplar (Figure 6 B) could be an artefact of the micro-array vs. RNAseq analysis. Were all these larch specific genes expressed in the microarray at all? It would be much more convincing and reassuring to see some qRT-PCR analysis of the differentially expressed SP in poplar vs. larch. Using the identical technique would very much strengthen the argument made.
——
Answer: Indeed, and we had and still have the same question. The best way to answer this point would be to obtain RNAseq data on poplar, something we are looking forward to. Yet some genes are truly specifically expressed and others preferentially expressed (with all the gradation you can expect). The quantile normalization clearly helps to compare the datasets, but we are still cautious and prefer to always refer to the original dataset for each approach. As for the RT-qPCR, this was planned, but the RNA we had from the oligoarray series was too old and in too bad a shape for this. A brand new time course series on poplar is planned for next spring.
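To make the idea of a quantile-based comparison concrete, here is a minimal generic sketch. It is not the procedure used in the manuscript: it simply ranks genes within each platform, assigns them to quantile bins, and reports the bin difference for shared genes. The function names, the number of bins and the expression values are assumptions chosen for illustration.

```python
def quantile_bins(values, n_bins=4):
    """Map each gene to a quantile bin (0 = lowest) within one dataset.

    values: dict mapping gene ID -> expression value.
    Ties are broken by gene ID so the result is deterministic.
    """
    ranked = sorted(values, key=lambda gene: (values[gene], gene))
    n = len(ranked)
    return {gene: min(i * n_bins // n, n_bins - 1)
            for i, gene in enumerate(ranked)}


def bin_shift(array_expr, rnaseq_expr, n_bins=4):
    """Return {gene: rnaseq_bin - array_bin} for genes present in both datasets."""
    shared = sorted(array_expr.keys() & rnaseq_expr.keys())
    array_bins = quantile_bins({g: array_expr[g] for g in shared}, n_bins)
    rnaseq_bins = quantile_bins({g: rnaseq_expr[g] for g in shared}, n_bins)
    return {g: rnaseq_bins[g] - array_bins[g] for g in shared}


if __name__ == "__main__":
    # Made-up values: array intensities and RNAseq counts are on different scales.
    array_expr = {"g1": 10, "g2": 200, "g3": 3000, "g4": 40000}
    rnaseq_expr = {"g1": 5000, "g2": 1, "g3": 20, "g4": 300, "g5": 7}
    print(bin_shift(array_expr, rnaseq_expr))
```

Because only within-platform ranks are compared, the different measurement scales of the two technologies do not enter the result; a large positive or negative shift flags genes worth checking against the original datasets.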
——
· in addition to the SSP gene family expression analysis (these could really be alleles of each other as well), did you observe any allele-specific expression using SNPs as markers? E.g. only one of the SNPs is expressed in one host vs. the other.
——
Answer: This is something we are also very interested in. We have not yet mapped SNPs from the RNAseq data onto alleles; this will be part of the analysis of the new version of the genome later this year.
——
· One important consideration for comparing aecial vs. telial phases of rusts is that in the aecial phase rusts are mono-karyotic haplotypes. Hence all genes required for this life-phase need to be allelic, aka have two copies in the diploid phase. In future, when fully phased Mlp genomes are available, it would be interesting to see if some of the poplar-specific SSP are singletons with a corresponding allele in one haploid genome.
——
Answer: Version 2 is still not phased, but we are looking forward to a phased version (hopefully). We can also hope for single cell genomics on pycniospores one day!
——
And I forgot: Figure 4A would be better as an upset plot, e.g. http://vcg.github.io/upset/....

——
Answer: This is a very interesting link, Benjamin. I think that this « upset » way of looking at data can clearly add value compared to a single Venn diagram. Allow us some more time to get used to it. In order not to ‘upset’ our readers, we will not include an upset version of the data right now and will go with the classical Venn diagram view, but getting used to upset plotting is now on our wish list for 2018!
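For readers unfamiliar with the idea, the quantity an upset plot displays is the size of each exclusive intersection, i.e. how many items belong to exactly a given combination of sets. The sketch below computes those counts in plain Python; it is an illustration only, with a hypothetical function name and made-up set names and gene IDs, and it does not reproduce the data of Figure 4A.

```python
from collections import Counter


def exclusive_intersections(named_sets):
    """Count items by the exact combination of sets they belong to.

    named_sets: dict mapping set name -> set of item IDs.
    Returns a list of (frozenset_of_set_names, count), largest count first.
    """
    membership = {}
    for name, items in named_sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)
    counts = Counter(frozenset(names) for names in membership.values())
    # Sort by descending count, then by set names for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))


if __name__ == "__main__":
    # Made-up gene sets, e.g. genes detected in three conditions.
    gene_sets = {
        "A": {"g1", "g2", "g3", "g6"},
        "B": {"g2", "g3", "g4"},
        "C": {"g3", "g5"},
    }
    for combo, count in exclusive_intersections(gene_sets):
        print("&".join(sorted(combo)), count)
```

Each row of the output corresponds to one bar of an upset plot, which stays readable when the number of sets grows beyond what a Venn diagram can show.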
——