Preprint reviews by Benjamin Schwessinger

The rust fungus Melampsora larici-populina expresses a conserved genetic program and distinct sets of secreted protein genes during infection of its two host plants, larch and poplar

Cecile Lorrain, Clemence Marchal, Stephane Hacquard, Christine Delaruelle, Jeremy Petrowski, Benjamin Petre, Arnaud Hecker, Pascal Frey, Sebastien Duplessis

Review posted on 21st December 2017

I love this work. It addresses such a fundamental biological question: how an
obligate biotrophic fungus infects two highly distinct plant species. Some
really fascinating biology.

Here are some thoughts and comments on this
manuscript:

· Some active voice in the
abstract would make it more accessible.

· Line numbers would have been
great.

Intro:

· ‘is qualified of macrocyclic’
should read ‘as’ not ‘of’

· the citation for Germain et
al., 2017 needs to be corrected

MM:

· This also refers to the
results. It would be great to see how many of the RNAseq reads did not map to
the reference gene models. Could you identify novel genes that were previously
missed from the annotation because adequate expression data for both hosts were
missing? Is this novel RNAseq data being incorporated into new rounds of
annotation? If you had poplar RNAseq data, it would be great to compare the RNAseq
mapping rate overlapping with gene models in poplar vs. larch.

· PLEASE deposit all analysis scripts
on github and NOT on demand. This would be great for people that want to
compare RNAseq and microarray data. I really liked your quantile comparison
approach. Scripts need not be perfect. Every little helps!

· do I understand correctly that
you only included genes in your diff analysis that were expressed both in
RNAseq and microarray analysis?
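
To illustrate the mapping-rate comparison asked for in the first MM point, here is a minimal sketch. It assumes the total read count and the count of reads overlapping gene models are already available per sample (for example from an aligner summary). The sample names and numbers are hypothetical placeholders, not data from the manuscript.

```python
def gene_model_mapping_rate(total_reads, reads_on_gene_models):
    """Return the fraction of all reads that overlap annotated gene models."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= reads_on_gene_models <= total_reads:
        raise ValueError("reads_on_gene_models must lie between 0 and total_reads")
    return reads_on_gene_models / total_reads


# Hypothetical read counts, for illustration only (not data from the manuscript).
samples = {
    "larch_example": {"total": 1_000_000, "on_gene_models": 700_000},
    "poplar_example": {"total": 1_000_000, "on_gene_models": 850_000},
}

rates = {
    name: gene_model_mapping_rate(c["total"], c["on_gene_models"])
    for name, c in samples.items()
}

for name, rate in rates.items():
    # The remainder is the share of reads that did not fall on a gene model.
    print(f"{name}: {rate:.1%} on gene models, {1 - rate:.1%} elsewhere")
```

Reporting the "elsewhere" fraction per host would show directly whether larch samples carry more reads outside the current annotation.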

Results/Discussion:

· for the KOG enrichment analysis,
it would be nice to show how many genes lack any annotation. I guess this will
be around 50%. This refers to the KOG analysis in ‘Secreted proteins is the
only overrepresented category among DEGs detected on larch’. I see that these
are mentioned later on.

· I was wondering if the increase
in specifically expressed SP genes on larch vs. poplar (Figure 6 B) could be an
artefact of the micro-array vs. RNAseq analysis. Were all these larch specific
genes expressed in the microarray at all? It would be much more convincing and
reassuring to see some qRT-PCR analysis of the differentially expressed SP in
poplar vs. larch. Using the identical technique would very much strengthen the
argument made.

· in addition to the SSP gene
family expression analysis (the family members could really be alleles of each
other as well), did you observe any allele-specific expression using SNPs as
markers? E.g. only one of the SNP variants is expressed in one host vs. the other.

· One important consideration for
comparing aecial vs. telial phases of rusts is that in the aecial phase rusts
are monokaryotic haploids. Hence all genes required for this life phase need
to be allelic, i.e. have two copies in the diploid phase. In future, when fully
phased Mlp genomes are available, it would be interesting to see if some of the
poplar-specific SSPs are singletons with a corresponding allele in one haploid
genome.
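
The allele-specific expression check raised above could be sketched roughly as follows. This is only an illustration, not the authors' method. It assumes read counts for the two alleles at a heterozygous SNP are available per host and tests them against an equal 50:50 ratio with an exact binomial test. The function name, host labels and counts are hypothetical.

```python
from math import comb


def allelic_imbalance_p_value(allele_a_reads, allele_b_reads):
    """Two-sided exact binomial p-value against a 50:50 allelic ratio."""
    if allele_a_reads < 0 or allele_b_reads < 0:
        raise ValueError("read counts must be non-negative")
    n = allele_a_reads + allele_b_reads
    if n == 0:
        return 1.0  # no reads, so no evidence either way
    # Probability of every possible split of n reads under equal expression.
    probs = [comb(n, k) / 2**n for k in range(n + 1)]
    observed = probs[allele_a_reads]
    # Sum all outcomes that are at most as likely as the observed one
    # (small tolerance guards against floating-point ties).
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at one heterozygous SNP in each host.
snp_counts = {"larch_example": (10, 0), "poplar_example": (5, 5)}
for host, (a, b) in snp_counts.items():
    print(host, allelic_imbalance_p_value(a, b))
```

In practice one would apply this across many SNPs and correct for multiple testing; reference mapping bias can also skew the counts, so the 50:50 null is itself an assumption.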


And I forgot: Figure 4A would be better as an UpSet plot, e.g. http://vcg.github.io/upset/....
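
For readers unfamiliar with UpSet plots: the bars show how many items belong to each exact combination of sets. A minimal sketch of that underlying count, with hypothetical set names and gene identifiers (not the contents of Figure 4A), could look like this:

```python
def exclusive_intersections(named_sets):
    """Count items by the exact combination of sets they belong to."""
    membership = {}
    for name, items in named_sets.items():
        for item in items:
            # Record every set in which this item occurs.
            membership.setdefault(item, set()).add(name)
    sizes = {}
    for names in membership.values():
        key = frozenset(names)
        sizes[key] = sizes.get(key, 0) + 1
    return sizes


# Hypothetical gene sets, for illustration only.
gene_sets = {
    "set_A": {"g1", "g2", "g3", "g4"},
    "set_B": {"g3", "g4", "g5"},
    "set_C": {"g4", "g6"},
}

intersection_sizes = exclusive_intersections(gene_sets)
# Largest combinations first, as in a typical UpSet bar ordering.
for combo, size in sorted(
    intersection_sizes.items(), key=lambda kv: (-kv[1], sorted(kv[0]))
):
    print(" & ".join(sorted(combo)), size)
```

These counts are what a plotting tool would then draw as bars above the set-membership matrix.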



Comparative analysis highlights variable genome content of wheat rusts and divergence of the mating loci.

Christina A Cuomo, Guus Bakkeren, Hala Badr Khalil, Vinay Panwar, David Joly, Rob Linning, Sharadha Sakthikumar, Xiao Song, Xian Adiconis, Lin Fan, Jonathan M Goldberg, Joshua Z Levin, Sarah Young, Qiandong Zeng, Yehoshua Anikster, Myron Bruce, Meinan Wang, Chuntao Yin, Brent McCallum, Les J Szabo, Scot Hulbert, Xiaming Chen, John P. Fellers

Review posted on 02nd August 2016

Overall, I enjoyed reading the manuscript at hand, "Comparative analysis
highlights variable genome content of wheat rusts and divergence of the mating
loci." by Cuomo et al. I am especially excited by the mating type analysis of
wheat rust fungi, something I started doing myself for Puccinia striiformis f.
sp. tritici. There is a clear lack of understanding of mating types in rust fungi
and their role during infection and in generating genetic diversity.
I am a little bit disappointed by the lack of information in the Material and
Methods (MM) section. Especially information about how genome analysis was
performed is lacking. Indeed I thought I would be able to learn something for my
own project but this was not the case. I encourage the authors to explain their
analysis in more detail and provide scripts for reproducibility/teaching reasons.

Line numbers would be really helpful during the review process.

Please find below specific comments on the manuscript.
Major changes (++) Minor changes (+)

Abstract:
Nicely summarizes the main finding of the manuscript

Introduction:
page 4: "When biologically stressed, the fungus enters the sexual cycle and
survival teliospore structures are produced" This needs a reference. I am not
sure this holds true for the center of origin such as the Himalayan region where
the sexual host is continuously present and rusts are reproducing sexually all
the time.
"These complex interactions result in the production of up to five different
rust spore types, requiring very discrete developmental programs, resulting in
altered gene expression profiles." Needs reference.
Page 5:
"Genetic studies by crossing individual strains is not trivial due to the
difficulty of breaking teliospore dormancy in order to infect the alternate
hosts." The authors should consider recent work by Rodriguez-Algaba J et al. 2014
http://www.ncbi.nlm.nih.gov/pu....
"Wheat leaf rust, caused by Puccinia triticina Eriks (Pt), is the most commonly
occurring cereal rust disease worldwide." Needs a reference.
-> why are teliospores called a "survival structure"?
-> The authors are clearly experts in fungal mating types. Providing an
illustration of the different mating type proteins (pheromone, pheromone receptor
and homeodomain-containing transcription factors) would aid readers less familiar
with the field.

Results:
page 10: "The assembly of Pst totaled 117.31 Mb; this is comparable to
previously reported values (Cantu et al. 2011; Zheng et al. 2013)." This
statement is not correct. The assemblies in Cantu et al. 2011 and 2013 are only
describing contigs and the total length of these assemblies is around 50-70Mb.
-> I wonder how the high number of scaffold N's, esp. for Pst, influences the
interpretation of relative repeat content in the three genomes. My understanding
is that N's are most likely caused by repeats that are unable to be assembled by
short read sequencing technologies. Please comment.
-> the authors should consider including information provided in Cantu et al.
2013 when talking about analysis of Pst.
The part assessing assembly quality in regards to heterozygosity could be
explained better. It starts with "Regions of high heterozygosity could carry
enough differences to prevent haploid assembly and could inflate the gene count
for such regions, as alleles would appear as duplicated genes."
-> I encourage the authors to be more careful with the following statement
"Overall this suggests that independent assembly of both haplotypes is minimal
in all three wheat rust pathogens, as expected given the choice of assembly
strategies that take". It might well be that conserved genes are less
heterozygous in general and this could confound the authors' analysis. Indeed the
authors were able to identify two alleles for their mating type HD genes which
might well indicate independent assembly of both haplotypes. The authors
actually suggest this interpretation. In addition, simple presence/absence
polymorphisms between the two haplomes will also confound this analysis. Please
comment.
-> the nomenclature referring to candidate secreted effector proteins (CSEPs) is
not consistent from page 14 to 17. Sometimes they are referred to as effectors.
Please be consistent.
-> Nice to see expression data on the alternate host incorporated in this
manuscript.
-> The mating-type analysis would benefit from identifying SNPs in STEs and mfa
sequences. Are these loci heterozygous or homozygous? The latter would indicate
that genes might be only present in one haplome and not the other.
-> "Pt HD genes are functional in U. maydis" might be an overstatement. One
HD-domain protein can substitute for one U. maydis ortholog but when expressing
both rust orthologs in a double knockout (e.g. Uh553 (a1 b0)) the mating type
could not be rescued.
-> "Pt mating-type genes are functional during wheat infection" section needs a
control showing that mating type genes are actually targeted within Pt (e.g.
qPCR).++

Discussion:
"Notably, we find that Pst has the highest level of heterozygosity and that this
measure is larger than previously reported (Zheng et al. 2013). While some of
this difference could be attributed to the isolate sequenced, the much larger
size of the Pst-130 genome used in this previous study may result in an
under-estimation of heterozygosity, such as in cases where both alleles of a
gene were assembled independently." Reference Zheng et al. 2013 refers to CY32.
The genome size of assembly Pst-130 is actually only 65Mb. Please correct this statement.
-> I really enjoyed the discussion.

MM:
The MM referring to pages 11 to 16 could be improved, in particular the
following. ++
-> ortholog analysis and synteny analysis are totally missing.
-> heterozygosity analysis could be improved by providing details on the settings for
BWA. It is unclear how genic and intergenic SNP rates were calculated.
-> The part assessing assembly quality in regards to heterozygosity could be
explained better. No detail is provided on this part in the MM section.
-> any MM referring to "Core protein comparisons and orthology" is missing.
-> it would be great if the authors were to provide their CSEP annotation pipeline
and scripts.
-> providing the alignment for the Figure 4 phylo tree would be good.
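
To make concrete the level of detail I am asking for on the genic vs. intergenic SNP rates, here is one possible way such rates could be computed. This is a sketch under my own assumptions (1-based inclusive gene coordinates on a single contig, SNPs per kb as the unit), not a description of what the authors did. All names and numbers are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent (start, end) intervals, 1-based inclusive."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def snp_rates_per_kb(contig_length, gene_intervals, snp_positions):
    """Return SNPs per kb for genic and intergenic sequence on one contig."""
    genes = merge_intervals(gene_intervals)
    genic_bp = sum(end - start + 1 for start, end in genes)
    intergenic_bp = contig_length - genic_bp
    # A SNP is genic if it falls inside any merged gene interval.
    genic_snps = sum(
        any(start <= pos <= end for start, end in genes) for pos in snp_positions
    )
    intergenic_snps = len(snp_positions) - genic_snps

    def rate(n_snps, n_bp):
        return 1000 * n_snps / n_bp if n_bp > 0 else float("nan")

    return {
        "genic": rate(genic_snps, genic_bp),
        "intergenic": rate(intergenic_snps, intergenic_bp),
    }


# Hypothetical contig: 10 kb long, two genes, seven SNPs.
example_rates = snp_rates_per_kb(
    contig_length=10_000,
    gene_intervals=[(1001, 2000), (5001, 7000)],
    snp_positions=[1500, 5500, 6000, 3000, 8000, 9000, 9500],
)
print(example_rates)
```

Stating choices like these (how overlapping gene models are merged, whether introns and UTRs count as genic, how scaffold N's are excluded from the denominator) in the MM would make the reported rates reproducible.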
