Preprint reviews by Joshua Miller

Draft genome of the Reindeer (Rangifer tarandus)

Zhipeng Li, Zeshan Lin, Lei Chen, Hengxing Ba, Yongzhi Yang, Kun Wang, Wen Wang, Qiang Qiu, Guangyu Li

Review posted on 21st July 2017

In this Data Note the authors describe the first genome assembly for the reindeer. I am impressed by the amount and variety of analyses undertaken to demonstrate the quality of the assembly. That said, I think there are places in the manuscript where the methods could be more fully explained and broader context given to the results. Specific comments are below.


Line 20: it would be fair to mention that the amount of usable sequence was actually 615 Gb (line 66).

Lines 42-45: these two sentences should be re-worded for clarity.

Line 49: replace "special" with "this"

Table S1: what is the difference between sequence and physical coverage?

Lines 69-71: a fuller explanation of the k-mer analysis would be useful. Also, I noted that the distribution in Figure S1 is bimodal. Is this expected? Is it a problem for the analysis? Finally, why not use the traditional C-value estimate of genome size, or at least provide a comparison of the two estimates?

Lines 87-89: it is stated that the accumulation curves in Figure S2 are similar, but to me it looks like the slope for the reindeer is much steeper and more linear than the other genomes. Are they statistically the same? If the reindeer one is different why might that be?

Lines 89-96: why was the goat genome chosen for synteny analyses? Is the cow genome not more complete?

Figure S3: please expand the figure legend so that it contains more information as to what is being shown.

Table S4: indicate where % corresponds to % of the genome versus % of elements found.

I would suggest moving the reference to Table S6 from the end of Line 128 to the end of the sentence on Line 127. As it stands, when I went to look at the data I was expecting to see a summary of the functions annotated, not a comparison of how the different software packages performed. That said, a table summarizing the functions annotated would also be interesting.

Lines 130-131: state how many variants were found.

Lines 151-153: is this divergence time in line with previous estimates? Please provide citations.
