William J Harrison, Peter J Bex
Dear Will and Peter,
Congratulations on your new paper! We have read it with great interest. Below we list our concerns and comments. We hope you will find them useful; we wrote them in a constructive spirit, hoping to improve the manuscript.
1. You mention that three general classes of mechanism have been advanced to account for crowding (positional uncertainty, feature averaging, and source confusion). How do you consider grouping? Is it another mechanism? When do you think it occurs? Any assumption about grouping would place strong constraints on the way the model is built.
2. Lines 172-176. It is not clear why mixture modeling based on maximum likelihood would fail to predict the underlying distribution of a data set. This technique has been widely used in the visual short-term memory literature, as the authors properly cited. Some of us have also used it to explain visual masking and its interaction with spatial attention (Agaoglu, Agaoglu, Breitmeyer, & Ogmen, 2015; Agaoglu, Breitmeyer, & Ogmen, 2016).
3. Categorizing errors based on their distance to the nearest model prediction is technically equivalent to mixture modeling with three circular Gaussians, each centered on the error predicted by one of the models (averaging, substitution, etc.). So the method used here is qualitatively similar to mixture modeling, but quantitatively it seems rather arbitrary. The current analysis implicitly assumes that the best way to account for crowded responses is a mixture model with (at least) three components, and then goes on to quantify the weight of each component as a function of target-flanker spacing.
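To make the equivalence in point 3 concrete, here is a minimal sketch of what we have in mind. It is not the authors' analysis code. It assumes a full-circle (360 degree) response space, equal mixture weights, and a single shared concentration `kappa` for all components. The three predicted errors (0, 40, and 90 degrees) and all function names are hypothetical, chosen only for illustration. Under these assumptions, labeling each error by its nearest model prediction should give the same labels as a hard assignment to the most likely von Mises (circular Gaussian) component.

```python
import numpy as np

def circ_dist(a, b):
    """Absolute circular distance between angles (radians), in [0, pi]."""
    return np.abs(np.angle(np.exp(1j * (a - b))))

def nearest_prediction(errors, predictions):
    """Label each error with the index of the closest model prediction."""
    d = circ_dist(errors[:, None], predictions[None, :])
    return np.argmin(d, axis=1)

def mixture_hard_assignment(errors, predictions, kappa=5.0):
    """Label each error with its most likely von Mises component.

    Assumes equal weights and a shared kappa, so the normalizing constants
    cancel and the log-likelihood reduces to kappa * cos(error - mean).
    """
    log_lik = kappa * np.cos(errors[:, None] - predictions[None, :])
    return np.argmax(log_lik, axis=1)

# Hypothetical predicted errors for three models, in radians.
predictions = np.deg2rad(np.array([0.0, 40.0, 90.0]))

# Simulated response errors, uniform on the circle (illustration only).
rng = np.random.default_rng(0)
errors = rng.uniform(-np.pi, np.pi, size=1000)

labels_nearest = nearest_prediction(errors, predictions)
labels_mixture = mixture_hard_assignment(errors, predictions)

# Compare the two labelings.
print(np.array_equal(labels_nearest, labels_mixture))
```

The two labelings coincide because the cosine decreases monotonically with circular distance. The choice of a shared `kappa` and equal weights is therefore built into the nearest-prediction rule, which is the sense in which it seems arbitrary to us; a fitted mixture model would estimate those quantities from the data instead.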
The novel contribution of this study is a bit unclear to us. If it is to show that a population code of orientation selectivity can generate all types of errors, what exactly is the difference between your previous paper (CB 2015) and this manuscript?
The Põder & Wagemans (2007) study is highly relevant to this work. Ester and colleagues' studies also used a similar approach, and it would be very interesting to compare the differences in model parameters between similar and dissimilar flankers in Ester et al. (2015) with the differences between the one-gap and two-gap flanker conditions in this study.
In a recent study using the stimulus paradigm that you used previously (Agaoglu & Chung, 2016), we have shown that this particular stimulus paradigm is prone to eccentricity confounds. Perceptual errors are highly affected by the absolute orientation of the target and flankers, not just relative to each other. It is unclear how this affects the results reported here.
Line 34. We think it is fair to ask you to cite our relevant work (Agaoglu, Chung, & Ogmen, 2016) where you describe previous work on crowding and eye movements, since we presented a different point of view. The same holds for Pachai, Doering & Herzog 2016 (you cited only the reply to the reply). As scientists, we can agree to disagree, we hope.
Line 143. Except for N1, perceptual error does not seem to follow a linear trend. For A2 there is an increase in perceptual error only for the smallest flanker size. You may want to revise that sentence.
Line 270. We have supporting evidence for this sentence. The role of masking is indeed to increase random guessing and slightly decrease stimulus encoding precision (Agaoglu, Agaoglu, Breitmeyer, & Ogmen, 2015). However, ruling out metacontrast masking on this basis alone seems weak. Since the stimulus duration was 500 ms, we do not think there is any masking at all. You might also want to mention this to support the claim made in this sentence.