Daniele Fanelli, Rodrigo Costas, Ferric C Fang, Arturo Casadevall, Elisabeth M Bik
Review posted on 26th April 2017
Thanks for the opportunity to review and comment on this very interesting research. I've got one suggestion and one question for you. (I offer these in addition to supporting Chris Mebane's suggestions, particularly those with respect to further explication of the methodology and a more detailed description of the Lee & Schrank hypothesis.)
Suggestion: At various points in the paper you use the term "questionable," "questionable practices," or "questionable research practices" to refer to the image manipulation falling into your Category 2. However, from the paper's description of what you classified into that category, that type of thing clearly violates currently accepted standards for allowed types of image manipulation, whether due to ignorance, carelessness or malfeasance, meaning the behaviors themselves are not really questionable at all. So, in keeping with the terminology recently introduced in the NASEM report, Fostering Integrity in Research (disclosure: I was a member of the authoring panel of the report), it would seem more correct to refer to these as "detrimental," "detrimental practices," or "detrimental research practices."
Question: It's great to see efforts to empirically test hypotheses derived from a variety of theoretical perspectives, positing potential influences arising at multiple levels, from systemic to local to intra-individual. At the same time, this raises an interpretation question, in particular about the country-level associations observed, but perhaps also about the team-level associations observed. It's not clear to me how the methodology you've employed helps to avoid the exception fallacy, that is, the error of letting "exceptional cases" drive conclusions about the larger groups from which the cases are drawn. In multi-level analyses this concern is typically addressed through the use of multi-level models, in which the units of observation are distinguished from the units of analysis and the latter are specified at two, three or sometimes four different "levels" in the context of generalized linear modeling of some sort. It may be arguable whether, or to what extent, such methods help to avoid the exception fallacy, but they do represent an explicit recognition of the issue. Perhaps I missed it, but I don't think you've employed such multi-level modeling techniques here, nor do such techniques appear to have been employed in the 2017 Fanelli et al. PNAS publication.
To give just one example of how this might lead to an interpretation problem: if a case arises from, say, a country in which there are institutional-level policies about misconduct, one doesn't know whether the individual who engaged in the image manipulation was employed at an institution with or without such a policy. And regardless of whether their institution had such policies in place, one doesn't really know the extent to which the individual was even "exposed" or subject to the influence of the policy's presence or absence. If I'm right, then more caution is warranted in interpreting the associations beyond those at the individual level.