Completed on 9 Jan 2018
Thank you for Chiron! I enjoyed reading the paper and using the tool, and I was impressed by its accuracy. I think research into basecalling algorithms is very important, as they play such a major role in the usefulness of Oxford Nanopore sequencing compared to other platforms. Chiron, and its novel neural network structure, should help to push this field forward.
I was happy with the paper overall, but I had two major comments: the speed performance metrics are misleading, and there is no data on consensus accuracy. My specific comments are below.
Your "CPU rate" tests for Albacore and Chiron seem to be using a single thread. It is confusing to include these results with the "GPU rate" tests, which presumably use an entire GPU. It would be more realistic to compare an entire GPU with an entire CPU, which on most modern systems is at least four threads, often eight or more.
We tested the CPU rate on 4, 8 and 20 threads and observed full CPU utilization. Basecalling speed increased in proportion to the number of threads, so we believe the CPU is used efficiently in multi-threaded mode. We added an 8-core CPU rate to Table 2 and updated the table legend as follows:
“Single core CPU rate is calculated by dividing the number of nucleotides basecalled by the total CPU time for the basecalling analysis. 8 core CPU rate is estimated by multiplying single core cpu rate by 8, based on observed 100% utility of CPU processors in multi-threaded mode on 8 cores.”
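The calculation described in this legend can be written out as a short sketch. The function names and the sample figures below are hypothetical, chosen only to illustrate the arithmetic; they are not values from the paper, and the multi-core figure is an extrapolation that assumes throughput scales linearly with core count.

```python
def single_core_rate(nucleotides: int, cpu_seconds: float) -> float:
    """Bases basecalled per second of CPU time on one core."""
    if cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return nucleotides / cpu_seconds


def multi_core_rate(rate_per_core: float, cores: int, utilization: float = 1.0) -> float:
    """Extrapolated rate, assuming linear scaling at the given utilization."""
    return rate_per_core * cores * utilization


# Hypothetical figures for illustration only (not taken from the paper).
rate = single_core_rate(nucleotides=1_000_000, cpu_seconds=50_000.0)
print(rate)                            # bases per second on one core
print(multi_core_rate(rate, cores=8))  # extrapolated 8-core rate
```

The `utilization` parameter makes the 100% assumption explicit, so a measured value below 1.0 could be substituted if scaling turned out to be less than ideal.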
In the conclusion, you state that Chiron using a GPU is "faster than current data collection speed". However, at 450 bp/sec/pore (the current Nanopore sequencing rate), Chiron would only be able to keep up with about three in-strand pores. A MinION run can generate over 5 Gbp of reads, which would take over a month to basecall using your quoted GPU rate.
We have deleted this statement as it was misleading.
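The reviewer's arithmetic in this exchange can be checked with a rough sketch. The GPU rate of 1,500 bp/s used here is an assumed placeholder, not the figure quoted in the paper; the 450 bp/sec/pore sequencing rate and the 5 Gbp run size come from the comment above.

```python
SECONDS_PER_DAY = 86_400


def pores_supported(basecall_rate_bps: float, pore_rate_bps: float = 450.0) -> float:
    """Number of actively sequencing pores a basecaller can keep up with."""
    return basecall_rate_bps / pore_rate_bps


def days_to_basecall(total_bases: float, basecall_rate_bps: float) -> float:
    """Wall-clock days needed to basecall a run at a constant rate."""
    return total_bases / basecall_rate_bps / SECONDS_PER_DAY


# ASSUMPTION: placeholder GPU rate in bp/s, not the value reported in the paper.
ASSUMED_GPU_RATE = 1_500.0

print(pores_supported(ASSUMED_GPU_RATE))        # pores served in real time
print(days_to_basecall(5e9, ASSUMED_GPU_RATE))  # days for a 5 Gbp run
```

Under this assumed rate the basecaller keeps pace with a little over three pores and needs more than a month for a 5 Gbp run, which matches the order of magnitude the reviewer describes.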
In general, I think you need to be more upfront about the speed performance difference between Chiron and Albacore. Your results show that Chiron on a GPU is comparable in speed to Albacore on a single CPU thread. However, on a computer with 8 CPU threads and a single GPU, Albacore will be 10 times faster than Chiron, and if no GPU is available it will be more than 100 times faster. Chiron is therefore only a viable alternative to Albacore for small volumes of data.
We have included the following paragraph in the discussion to acknowledge the speed limitations of Chiron:
“Our model is substantially more computationally expensive than Albacore and somewhat more computationally expensive than BasecRAWller. This is to be expected given the extra depth in the neural network. Our model can be run in a GPU mode, which makes computation feasible on small to medium sized datasets on a modern desktop computer.”
In addition to the error rate metrics for basecalled reads, I would like to see error rate metrics for the consensus sequences produced by each basecaller's reads. For researchers who work with assembly or other high-read-depth analyses, consensus accuracy may be more important than individual-read accuracy.
I would suggest using either Racon or Canu to measure consensus accuracy, as they are widely used tools in the Nanopore sequencing community. I realise this would only be possible for your bacterial and viral read sets, where depth is sufficient for assembly and sequence consensus.
Thank you for this suggestion. We have calculated the consensus accuracy for the bacterial and viral datasets using Miniasm + Racon. We describe the approach in the Methods section:
“We assessed the quality of assemblies generated from reads produced by different base-callers. For each base-caller, a de-novo assembly is generated by the use of only Nanopore reads for the M. tuberculosis, E. coli and Lambda Phage genome. We use Minimap2 and Miniasm to generate a draft genome, then Racon is used to polish on the draft genome for 10 rounds.”
The results are presented in Table 2 and Figure 3, and summarised in the text as follows:
“In order to assess the quality of genomes assembled from reads generated by each basecaller, we used Miniasm together with Racon to generate a de-novo genome assembly for each of the bacterial and viral genomes (see Methods). The results presented in Table 2 demonstrate that Chiron assemblies for Phage lambda and E-coli samples have approximately half as many errors as those generated from Albacore (v1 or v2) reads. For M. tuberculosis, Chiron has fewer errors than Albacore v1, but slightly more than Albacore v2. The identity rate and relative length for each round of polishing with Racon are shown in Figure 3.”
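The assembly and polishing procedure described in the Methods quote above (Minimap2 and Miniasm for a draft, then 10 rounds of Racon) might look roughly like the following. This is a sketch under stated assumptions, not the authors' actual pipeline: the response does not give command lines, so the `ava-ont` and `map-ont` presets, the awk GFA-to-FASTA step, the file naming, and the helper `assembly_commands` are all illustrative choices. The function only builds the command strings and does not run them.

```python
import shlex


def assembly_commands(reads: str, prefix: str, polish_rounds: int = 10) -> list:
    """Build shell commands for a Minimap2 + Miniasm draft and Racon polishing.

    ASSUMPTION: presets and file names are illustrative, not from the paper.
    """
    q = shlex.quote
    overlaps = f"{prefix}.ava.paf"
    graph = f"{prefix}.gfa"
    draft = f"{prefix}.round0.fasta"
    cmds = [
        # All-vs-all read overlaps, then an unpolished assembly graph.
        f"minimap2 -x ava-ont {q(reads)} {q(reads)} > {q(overlaps)}",
        f"miniasm -f {q(reads)} {q(overlaps)} > {q(graph)}",
        # GFA segment ("S") lines carry the contig name and sequence.
        f"awk '/^S/ {{print \">\" $2; print $3}}' {q(graph)} > {q(draft)}",
    ]
    current = draft
    for i in range(1, polish_rounds + 1):
        mapping = f"{prefix}.round{i}.paf"
        polished = f"{prefix}.round{i}.fasta"
        # Map reads to the current draft, then polish it with Racon.
        cmds.append(f"minimap2 -x map-ont {q(current)} {q(reads)} > {q(mapping)}")
        cmds.append(f"racon {q(reads)} {q(mapping)} {q(current)} > {q(polished)}")
        current = polished
    return cmds


for cmd in assembly_commands("reads.fastq", "ecoli"):
    print(cmd)
```

Keeping each round's output as a separate file would allow identity and relative length to be evaluated per round, as reported in Figure 3.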
The abstract says, "the first deep learning model", but then the intro says, "one of the first". These comments seem to contradict each other. Can you be clearer?
We have revised the statement in the introduction to say:
“In this article we present Chiron, which is the first deep neural network model that can translate raw electrical signal directly to nucleotide sequence.”
We also state in the same paragraph that:
“Oxford Nanopore Technologies have also developed a segmentation free base-caller, Albacore v2.0.1, which was released shortly after Chiron v0.1.”
I am wary about including cloud-based Metrichor results in your comparison, as they aren't replicable. Is a version number possible for the Metrichor data? Or if (as seems likely) Metrichor uses similar code to Albacore, is there an equivalent Albacore version? At the very least, it would be useful to provide the date the reads were basecalled in Metrichor.
“The data is basecalled on Metrichor on Jun 3rd 2017 (Lambda), May 18th 2017 (E. coli), Jun 4th 2017 (M. tuberculosis), and June 20th 2017 (NA12878-Human).”
The performance comparison section says, "…with Chiron-BS in ??". Was this an issue in my PDF or is there missing text?
Now it's displayed correctly.
The read accuracies are shown using fractions (in the table, e.g. 0.1056) and as percentages (in the discussion, e.g. 2%). Please use a consistent formulation (my preference would be percentage).
Formulation has been changed to percentage.
I was confused by this phrase in the table caption: "against three other segmentation-based Nanopore basecallers." Albacore v2 is not segmentation-based but is in the table, so I think the caption should simply read "against four other Nanopore basecallers."
The description has been corrected.
Which version of Albacore did you use for the speed performance tests? I found v1.1.2 and v2.0.1 to have similar speed performance, but it would still be clearer to explicitly state the version.
We used v1.1.2 for the speed performance tests; this detail has been added to the description of the speed table.