Open preprint reviews by Willem van Schaik

Multifactorial Chromosomal Variants Regulate Polymyxin Resistance In Extensively Drug-Resistant Klebsiella pneumoniae

Miranda Pitt, Alysha Elliott, Minh Duc Cao, Devika Ganesamoorthy, Ilias Karaiskos, Helen Giamarellou, Cely S. Abboud, Mark A. T. Blaskovich, Matthew A. Cooper, Lachlan J. M. Coin

This is an interesting whole-genome sequencing based study to identify mechanisms that contribute to colistin resistance in K. pneumoniae. Mutations in mgrB, phoPQ and pmrAB are identified and complemented to confirm their role in colistin resistance. The major weakness of this study is that the authors are limited in their choice of isolates: they do not have the susceptible counterpart of each resistant strain, so it is impossible to identify all SNPs and indels that have accumulated in the resistant strain. This limits the scope of the study, as the authors now only study the ‘known knowns’ outlined above. It would be good if the authors included this limitation of their study in the discussion.

Some additional comments and suggestions are outlined below:
The abstract lacks quantitative data. l. 30: please provide an exact number. l. 31: ‘most common’: provide the number of strains.
The relevance of the ST2401 K. quasipneumoniae strain in the context of this study is unclear. It does not merit inclusion in the abstract, in my opinion.
l. 49: better to write plasmid-encoded carbapenem resistance genes
l. 54. The mortality associated with polymyxin-resistant Klebsiella infections seems awfully high. I believe the attributable mortality due to PMX-resistance is still not clear. See this interesting blog post: for further insights on this topic.
l. 58. I apologize for being a pedant, but the disturbance of the LPS leaflet of the outer membrane will not allow PMX to act on intracellular targets. For that to happen, the inner membrane needs to be disrupted as well.
In the discussion on mgrB it may be good to refer to Kidd et al., 2017. EMBO Mol Med who were the first to systematically study the role of this gene in K. pneumoniae.
l. 67. Specify that mcr-1 confers colistin resistance. It may also be relevant to note that mcr-1 appears to be relatively rare in Klebsiella.
l. 97. ‘glycerol was added to 20% (v/v)’ may be a better way of phrasing this line
l. 107. I assume cation-adjusted Mueller-Hinton broth was used? Please specify.
l. 130 – 132. I would really like to see a maximum-likelihood core genome tree here with additional reference isolates (downloadable from public databases), rather than a Neighbour-Joining tree of seven concatenated MLST alleles. As it stands, it is impossible to assess whether some of these strains (having the same ST) are truly clonally related.
l. 166. Incision should probably be replaced by introduction
l. 203. Provide exact number.
l. 225. It is not immediately obvious what is meant by (65, 66% variant allele frequency)
l. 235 – 237. Is it also not a possibility that in these strains mgrB has reverted to its wild-type state by excision of the IS element?
l. 252 – 270. This section is difficult to follow. While some mutations are proposed to act as suppressors, it appears that experimental evidence cannot confirm this, so it may be better to rewrite this paragraph to reflect this key finding.
l. 275 – 276. I am not entirely sure that it is correct to single out Brazil and Greece here.
l. 292. I am not entirely sure whether this claim of primacy is relevant. Clearly, a truncation is a loss-of-function mutation and those have been complemented previously.


Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION sequencing

Minh Duc Cao, Devika Ganesamoorthy, Alysha Elliott, Huihui Zhang, Matthew Cooper and Lachlan Coin

I have previously reviewed this manuscript on and my report and the authors' response to my previous comments can be found here

I believe this revised version is an improvement over the previous version I reviewed, but I still have a number of concerns that remain to be addressed. I remain poorly qualified to assess the bioinformatic and computational approaches and therefore focus on the interpretation of the data.

In my opinion, the section on MLST still needs some editing. Bacterial typing is a term that describes the use of methods to identify relatedness between strains. MLST is a sequence-based typing method that was first introduced in the late 1990s and has been widely used since. It has created a vocabulary which describes high-risk 'clones' based on their sequence type (ST). However, for MLST it is essential that STs are assigned with absolute certainty and clearly this is not the case in this dataset. Based on MinION data, the STs of two of three strains cannot be resolved. I feel the only valid conclusion from these analyses can be that the method to identify sequence types may work, but that sequence quality and sequence coverage are essential and need to be higher than in the data sets that were generated as part of this study.
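
The requirement that an ST be assigned with certainty can be illustrated with a minimal Python sketch. The allele numbers, ST labels and function names below are hypothetical and are not taken from the manuscript or from any real MLST scheme. The sketch only shows that an ST is defined by an exact match at all seven loci, so that a single unresolved allele can leave several STs equally compatible with the data.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical seven-locus allele profiles; the numbers are invented for illustration.
ST_TABLE: Dict[str, Tuple[int, ...]] = {
    "ST-A": (2, 1, 1, 1, 9, 4, 12),
    "ST-B": (2, 1, 1, 1, 9, 4, 14),
    "ST-C": (3, 3, 1, 1, 1, 1, 79),
}

# An observed profile; None marks a locus whose allele could not be called.
Profile = Tuple[Optional[int], ...]


def compatible_sts(observed: Profile) -> List[str]:
    """Return every ST that agrees with the observed profile at all called loci."""
    hits = []
    for st, profile in ST_TABLE.items():
        if all(o is None or o == p for o, p in zip(observed, profile)):
            hits.append(st)
    return sorted(hits)


def assign_st(observed: Profile) -> Optional[str]:
    """Assign an ST only if all loci are called and exactly one ST matches."""
    hits = compatible_sts(observed)
    if None not in observed and len(hits) == 1:
        return hits[0]
    return None  # unresolved: report the compatible STs rather than guessing


if __name__ == "__main__":
    complete = (2, 1, 1, 1, 9, 4, 12)
    one_locus_missing = (2, 1, 1, 1, 9, 4, None)
    print(assign_st(complete), compatible_sts(one_locus_missing))
```

In this toy table, the profile with one uncalled locus is compatible with two STs, which mirrors the situation in which two STs receive equally high scores.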

The next section ('strain typing by presence or absence of genes') is confusing. The authors frequently use the term 'strain type' where they should have used 'sequence type' (e.g. in the line 'and identified their strain types using the relevant MLST schemes'), so please check carefully where 'strain type' should be replaced with 'sequence type' and where 'strain type' means something other than 'sequence type'. The authors set up a method to classify strains on the basis of presence/absence of genes. I expressed my concerns about this methodology in my review of the previous version of the manuscript and the authors have responded to these concerns as follows: 'The gene presence/absence typing approach is designed to provide preliminary strain information extremely rapidly, using both 1D and 2D reads. It is primarily designed for the situation in which an exemplar strain has already been sequenced. We argue that this does have applicability, for example in an outbreak situation where it is very useful to know if a known strain is present in a new sample.' I believe this is a valid potential application, but it is not well explained in the manuscript and this point should be made more clearly. The limitation of the approach (i.e. that gene content does not necessarily correlate with sequence type) should also be highlighted.

Fig 1 adds little information because the key steps of the algorithms (the arrows to 'species typing, strain typing and resistance profile') are not explained in any detail. I believe that adding a schematic overview of this part of the pipeline would make this figure considerably more informative.

Minor comments:
The discussion section is long and unfocused.
Fig 3 serves little purpose in my opinion and may be better placed in the Supplementary data.
Fig 6. Panel c) text is very difficult to read.
p. 1, l. 52 Correct 'when to when to'
p. 2, l. 57: write quasipneumoniae
p. 8, line 44. Correct 'the our real-time analysis'
p. 16, line 40. Write 'an affine gap'


Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION™ sequencing

Minh Duc Cao, Devika Ganesamoorthy, Alysha Elliott, Huihui Zhang, Matthew Cooper and Lachlan Coin

The manuscript by Cao et al. describes a pipeline for the real-time analysis of sequencing data produced by Oxford Nanopore’s MinION system. This is an exciting development, potentially with major implications for clinical microbiology. Please note that I am not a computer scientist or bioinformatician, and I have therefore focused on the microbiological interpretation of the data in the manuscript.

The manuscript is interesting, but is, in places, poorly organized. I am also not convinced that the typing approach (based on gene absence/presence) developed by the authors will be useful in real-life situations. On the other hand, the near-real time identification of species and antibiotic resistance genes is exciting and is a convincing illustration of the power of the MinION sequencer.

Major points
p. 4, column 1: The authors should also reconsider whether the discussion on the correct identification of K. pneumoniae ATCC700603 fits in the ‘Results’ section. The finding that 20% of reads map to the closely related species K. pseudopneumoniae is unsurprising, as horizontal gene transfer is extremely common in the genus Klebsiella and this may lead to reads of one genome sequence mapping to multiple species. The authors do not satisfactorily explain how they assigned ST-489 to this strain, as in Table 3 two STs have equally high scores (ST-489 and ST-851). It is also confusing to read about STs being assigned to ATCC700603 in this part of the manuscript: the approach to assign STs is explained in the next section of the manuscript. I would urge the authors to re-structure their manuscript, to first give a more general outline of the approaches and pipeline and then to illustrate this by presenting and discussing ‘real-life’ data.

The assignment of STs to the genomes based on in silico MLST using MinION data is only partially successful: the authors can probably assign strains to clonal complexes, but sequence quality is (still?) too low to reliably assign an ST. I believe the authors should make this point more clearly (see also my remarks on K. pneumoniae ATCC700603 above). The use of gene absence/presence for typing purposes is somewhat problematic, as genes can be gained and lost quite easily (e.g. by gain or loss of a plasmid) and this would hide the close evolutionary links between closely related isolates. I can see why it may be interesting to perform this analysis in this context (i.e. ‘do we detect the strain which we know we are sequencing?’) but I am highly sceptical whether this is useful for practical typing purposes. The discussion on the different pan-genome sizes of K. pneumoniae, S. aureus and E. coli is naïve. As the authors rightly remark, these values are strongly skewed by the number of genomes that have been sequenced. This is illustrated by the recent analysis of 32 S. aureus genomes by Hennig et al. (doi:10.1186/1471-2105-16-S11-S3) resulting in a pan-genome size of 8647 with 1846 core genes (=21%). Based on these points, I would urge the authors to remove this section from their manuscript or, alternatively, to discuss the limitations of this approach.
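
The concern about plasmid gain or loss can be made concrete with a small Python sketch. Everything in it is an assumption for illustration: the gene names, the gene counts and the choice of the Jaccard distance as the presence/absence measure are mine and are not taken from the manuscript. Two hypothetical isolates share an identical chromosome, and one of them has lost a plasmid.

```python
from typing import Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Gene presence/absence distance: 1 - |shared genes| / |all genes seen|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical gene content: 90 chromosomal genes and a 10-gene plasmid.
chromosome = {f"chr_gene_{i}" for i in range(90)}
plasmid = {f"plasmid_gene_{i}" for i in range(10)}

isolate_1 = chromosome | plasmid   # carries the plasmid
isolate_2 = set(chromosome)        # same chromosome, plasmid lost

distance = jaccard_distance(isolate_1, isolate_2)

if __name__ == "__main__":
    print(f"presence/absence distance: {distance:.2f}")
```

In this toy case the two isolates differ in about a tenth of their combined gene content, although their chromosomes, and therefore any chromosomally encoded MLST loci, are identical. A presence/absence scheme would separate them while MLST would not.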

Minor issues:
p. 2. What is meant by ‘At the high level’?
p. 2. ‘With the emulation, we was able to stream the sequencing data with a hypothetical throughput of 120 times higher what we obtained.’ Change ‘we was’ to ‘we were’. This line is also somewhat confusing in this context, as the reader may wonder why this high ‘hypothetical throughput’ was not reached in one of the runs and/or why a new run was not performed to test whether this hypothetical throughput could be reached. These points are better explained on p. 10 and could be summarized by stating that the pipeline is scalable and could be adapted to much higher data throughputs (i.e. those that are expected from the PromethION platform).
p. 2. ‘where bioinformatics analysis methods were established’ should read ‘where bioinformatics analysis methods are well-established’ or something along those lines.
p. 4. Please correct K. variicolla to K. variicola.
p. 7. I believe the concept of a ‘probabilistic Finite State Machine’ needs more introduction for the non-expert audience at this point in the manuscript.
p. 13. resFinder should be ResFinder.
p. 15. ‘flank sequences’ should probably read ‘sequences flanking the antibiotic resistance genes’
