Preprint reviews by Andrew Parker Morgan

Introgression patterns between house mouse subspecies and species reveal genomic windows of frequent exchange

Kristian Karsten Ullrich, Miriam Linnenbrink, Diethard Tautz

Review posted on 3rd August 2017

Ullrich, Linnenbrink & Tautz make the interesting claim that gene flow between mouse populations with varying degrees of differentiation is highest at loci containing genes with functions in olfaction and adaptive immunity. This finding is intuitively very appealing and potentially exciting. It accords well with known roles of odorant receptors in mouse social behavior [1,2], and echoes recent descriptions of adaptive introgression of immune-gene alleles from Neanderthals and Denisovans into early modern humans [3].

However, I have some rather serious technical concerns that temper my enthusiasm for the manuscript's key results.


(a) The authors have chosen to create a single, synthetic consensus sequence to represent each population. They do so by taking the major allele at each variable site over fixed windows of 25kb. Genetic distances between populations are estimated from these sequences. Although phylogenetic trees are not explicitly constructed for each 25kb window, the distances imply a tree and indeed this is how they are interpreted by the authors. I struggle to understand what any of this means. The authors have thrown away the most useful information in the data by ignoring both allele frequencies and LD. A more rational approach would be based either on phased haplotypes (eg. chromosome painting) or on allele frequencies at approximately unlinked sites (eg. D-statistics).
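To make the frequency-based alternative concrete, here is a minimal sketch of Patterson's D (the ABBA-BABA statistic) computed from derived-allele frequencies at approximately unlinked sites. It assumes a four-taxon arrangement (((P1, P2), P3), outgroup); the function name, the input layout and the example frequencies are hypothetical and are not taken from the manuscript. Under ILS alone the ABBA and BABA patterns are expected to occur equally often, so D should be near zero; an excess of one pattern is what suggests gene flow between P3 and either P1 or P2. Significance is usually assessed with a block jackknife across the genome, which is not shown here.

```python
from typing import Iterable, Tuple

# One site = derived-allele frequencies in (P1, P2, P3, outgroup).
# This layout is an assumption made for illustration only.
Freqs = Tuple[float, float, float, float]


def patterson_d(sites: Iterable[Freqs]) -> float:
    """Patterson's D from per-site derived-allele frequencies.

    D > 0 indicates an excess of allele sharing between P2 and P3,
    D < 0 an excess of sharing between P1 and P3.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        # Expected weight of the ABBA pattern: derived in P2 and P3 only.
        abba += (1 - p1) * p2 * p3 * (1 - p_out)
        # Expected weight of the BABA pattern: derived in P1 and P3 only.
        baba += p1 * (1 - p2) * p3 * (1 - p_out)
    total = abba + baba
    if total == 0:
        raise ValueError("no informative (ABBA or BABA) sites")
    return (abba - baba) / total


# Made-up frequencies, purely to show the call.
example_sites = [
    (0.0, 1.0, 1.0, 0.0),  # pure ABBA site
    (1.0, 0.0, 1.0, 0.0),  # pure BABA site
    (0.1, 0.6, 0.9, 0.0),  # polymorphic site leaning towards ABBA
]
print(patterson_d(example_sites))
```

Unlike a major-allele consensus, this uses the full allele frequency at each site, so a variant at 51% and a variant at 100% contribute differently.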

(b) Even if we accept that a single, synthetic consensus sequence for each population is a meaningful entity, the "dK80" statistic does not necessarily capture "introgression" as claimed. It simply captures (a subset of) departures of the local phylogeny from the global phylogeny shown in Fig 1 -- which is assumed to exist and to be correct. An alternative hypothesis to introgression is incomplete lineage sorting (ILS). The authors seem to have done little to distinguish ILS from introgression. Previous work on house mice indicates that ILS is not rare in the mouse genome, although exactly how widespread it is remains a matter of vigorous debate [eg. 4].

(c) The close overlap between "mutually introgressed" regions and copy-number variable segmental duplications is a big red flag for me. SNV calling was performed with exceedingly liberal filters and little masking for copy number (line 385). The authors are completely correct that previous work has mostly ignored CNV regions -- but this is done with good reason. The approach in this paper conflates allelic and paralogous variation, yielding a local tree that may be distorted not only in branch length (line 394) but also topology. We cannot assume that individual copies of a duplicated gene will have the same evolutionary history, and so cannot lump them together.

A case in point is the Cwc22 gene on chr2 (line 334). The authors point to this locus as an example of "mutual introgression" facilitated by meiotic drive. However, we have recently shown (in great detail) that phylogenetic discordance in this locus is mostly due to non-allelic gene conversion between paralogs of Cwc22 [5].

A second putative example mentioned by the authors is the repeat-heavy long arm of the Y chromosome (line 330). In this case we can definitively rule out the hypothesis that the patterns observed by the authors are due to introgression, because the Y is inherited as a non-recombining unit. Using the same sequence data as these authors, we have shown in a recent preprint that Y chromosomes are completely differentiated between Fra, Ger, Ira, MUS and SPRE [6], in agreement with much previous work [eg. 7]. The same is probably true of CNV regions on the X chromosome (eg. SLX gene family, line 329): inter-(sub)specific gene flow is much lower on most of the mouse X chromosome than the genomic background, at least in part because of the outsized role of the X in hybrid sterility [8].

Again, the authors have made a really interesting observation that may be (and to my mind, probably is) true. The amylase pseudogene story is especially compelling. But I feel that the major conclusions have otherwise run ahead of the evidence. I hope that the authors are willing to undertake a re-analysis of their data with more well-established methods for detecting and describing gene flow.

[1] Hastie et al (1979) Cell.
[2] Godfrey et al (2004) PNAS.
[3] Abi-Rached et al (2011) Science.
[4] White et al (2009) PLoS Genetics.
[5] Morgan et al (2016) Genetics.
[6] Morgan & Pardo-Manuel de Villena (2017) bioRxiv.
[7] Geraldes et al (2008) Molecular Ecology.
[8] Teeter et al (2008) Genome Research.
