Martin Wegrzyn, Joana Aust, Larissa Barnstorf, Magdalena Gippert, Mareike Harms, Antonia Hautum, Shanna Heidel, Friederike Herold, Sarah M. Hommel, Anna-Katharina Knigge, Dominik Neu, Diana Peters, Marius Schaefer, Julia Schneider, Ria Vormbrock, Sabrina M. Zimmer, Friedrich G. Woermann, Kirsten Labudda
Review posted on 28th June 2018
It’s been a while since I read a manuscript as inspiring as “Thought experiment: Decoding cognitive processes from the fMRI data of one individual”. The authors had very limited resources yet managed to design a study shedding light on the ability to decode different cognitive states in a single individual. The study design has all the hallmarks of excellence: preregistration of hypotheses, out-of-sample prediction, and transparency of methods via code and data sharing. The dataset they provided alongside the manuscript is likely to become an important benchmark as well as a valuable educational tool. Conditional on a few minor fixes, I wholeheartedly endorse this manuscript.
- I applaud the authors for sharing code, statistical maps and preprocessed data. It really sets an example for the field. I was especially impressed by the detail and clarity of the notebooks. However, I also feel that there is huge potential for this dataset to become an important part of many neuroimaging courses as well as a benchmark for other decoding methods. For this to happen, the raw data need to be shared. I strongly encourage the authors to format the raw data in the Brain Imaging Data Structure (BIDS) and share it on a domain-specific repository such as http://OpenNeuro.org or FCP/INDI.
- Abstract: the idea of human experts making predictions is not explained well - I was confused when I first read “four blinded teams”
- I found the term “superordinate domain” more confusing than helpful, sticking to just “domain” or “cognitive domain” might help with accessibility of the text
- Please provide the full text of the instructions given to the participant.
- It is difficult to follow the study design. It would be very helpful if you included a figure depicting the design with length of blocks and runs as well as the sequence of all different categories.
- The authors z-scored the data across the train and test sets. This procedure causes information leakage. Z-scoring should be done separately on the train and test sets, or the rescaling parameters derived from the train set should be applied to the test set.
- The purpose of having human experts making predictions was not motivated well. Was the idea that humans might be able to make better predictions or combine sources of information, thus suggesting that improvements in automated methods are possible?
- Page 4: “similar blocks” repeated sentence
- What was the motivation for including the anatomical terms from NeuroSynth if the goal of the prediction was to perform cognitive decoding?
- Page 4 section 3.1: missing full stop after “Fig 4”
- Authors decided to use simple time shift and a boxcar function to average z-scored volumes. What motivated this decision in contrast to HRF convolution, GLM modelling (with explicit modelling of instruction TRs) and contrasts?
- Page 4 “Feature selection for correlation analyses” it’s not clear if the prediction accuracy is for domains or content
- Figures 1, 5 and 9: labels are hard to read due to overlapping; the captions would benefit from explaining that colors correspond to K-means derived clusters
- Page 7: the fact that using only the most significant voxels gave you better prediction accuracy is quite surprising and at odds with Sochat et al. 2015 (https://www.frontiersin.org/articles/10.3389/fnins.2015.00418/full). It would be interesting to hear your opinion on why this is the case.
- Page 7: it would be good to express the prediction accuracy of the Neurosynth method in the same way as the correlation method so the reader could compare them easily
- The manuscript would benefit from a figure or table directly comparing the domain and content prediction accuracy made by the correlation, Neurosynth and human experts.
- Is there any information available on what strategies each team of experts used?
- Figure 7 and 10 are missing a colorbar.
- If I understood the manuscript correctly, the correlation method was able to predict domain slightly better than any team of human experts, but human experts predicted content better than any automated method. Even though not mathematically impossible, this result is counterintuitive and worth discussing.
- “Analysis of fMRI data could benefit from splitting the data” - is this really supported by your analysis? Wouldn’t a one sample t-test also capture consistency of BOLD response across time?
- “If, on the other hand, the activity pattern for the language task resembled a default mode network, one would conclude that the task was not performed at all and not use the map to determine the degree of lateralization.” This is a tricky argument. First of all, the reason why preoperative mapping is necessary is the abnormal anatomy of patients (because of slow-growing tumors, for example). This makes assumptions about normal/expected activation patterns hard to uphold. Furthermore, the patterns might not be as reliable as you say - consider the secret rest run during which the DMN pattern was not present.
- “The present study showed how brief periods of self-generated thought can be decoded regarding the superordinate neuropsychological domains involved.” The use of “self-generated” is misleading here - only the secret run was truly self-generated, and that one you were not able to decode. The other runs followed instructions and thus should not be considered self-generated (at least this is how this term is used in the mind-wandering literature).
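Regarding the z-scoring point raised above: the leakage-free procedure derives the scaling parameters from the training set only and reuses them on the test set. A minimal sketch (with made-up array shapes, not the authors' actual data):

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 10))  # hypothetical train volumes x voxels
X_test = rng.normal(loc=5.0, scale=2.0, size=(20, 10))    # hypothetical held-out volumes

# Estimate mean and standard deviation on the training set only...
mu = X_train.mean(axis=0)
sd = X_train.std(axis=0)

# ...and apply the same rescaling to both sets, so no information
# about the test set leaks into the training data.
X_train_z = (X_train - mu) / sd
X_test_z = (X_test - mu) / sd
```

Z-scoring the concatenated train and test data instead would let test-set statistics influence the training features, inflating apparent prediction accuracy.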
John A. Borghi, Ana E. Van Gulick
Review posted on 26th March 2018
The paper "Data management and sharing in neuroimaging: Practices and perceptions of MRI researchers" presents the results of a survey examining how neuroimaging data are being managed and analyzed.
- I found the mixing-in of comments about "open access publishing" and "pre-registration" confusing in the context of data management; those seem to be topics separate from research data management.
- Similarly, talking about "data analysis parameters" seems to extend the definition of research data management beyond my personal intuition.
- Lines 101-102. This statement seems to be too broad. I don't believe data sharing has been proposed as a way to deal with suboptimal designs.
- Line 104. As far as I know Journal of Cognitive Neuroscience did require data sharing (via fMRIDC).
- Line 107: I had a brief look at the references cited to support the statement "The majority of researchers now appear to support the concept of sharing data", and I found them not to be as categorical on this topic.
- Line 350: In context of the sentence mentioning "majority" it is unclear what 40% refers to.
- Lines 375-378: The definition of "data sharing" proposed by the authors is so broad that it has limited use (for example, publishing a paper or giving a talk is considered data sharing). This makes interpreting the rest of the paper confusing.
- I appreciate that the authors share the raw results of the survey.
- I'm glad that the authors acknowledge the potential biases in the results.
Jessica-Lily Harvey, Lysia Demetriou, John McGonigle, Matthew B Wall
Review posted on 21st March 2018
"A short, robust brain activation control task optimised for pharmacological fMRI studies" by Harvey et al. proposes a new standardized task fMRI paradigm optimized for short duration and ease of use. It has a potential for broad adoption in clinical studies.
- Line 45: it is not clear what the difference is between “reliability coefficients for the tasks” and “Voxel-wise reliability metrics”. Maybe more intuitive labels could be used (ROI vs voxel)?
- I really enjoyed the outline of properties of a good pharmacological fMRI task. Maybe it would be good to highlight it in a box or a table. It makes a great guideline for future developments in this area.
- It might be beneficial to accompany the manuscript with video recordings of the visual and auditory stimuli of an example run for the two tasks. Such videos could be a great way to showcase the task to prospective users.
- It is not very clear how the instructions were delivered and whether they are standardized. I could not find the instructions in the accompanying code. If this task is supposed to be broadly used in a clinical setting, the instructions should be explained better or automated.
- The manuscript would benefit from a figure describing the experimental paradigm.
- Why is the buffer time (10s) at the end rather than the beginning? Putting it at the beginning would help with potential non-steady state effects often found in EPI sequences.
- Line 175: missing comma after “31ms”
- Line 182: please specify which template was used
- Line 190: missing citation for ICC (Intraclass Correlations: Uses in Assessing Rater Reliability, Patrick E. Shrout and Joseph L. Fleiss)
- Line 219: referencing figures is usually done with capitalized form (“Figures” vs. “figures”)
- Line 187: It seems that session effects were not modeled explicitly. It might be worth considering a model that removes the session mean.
- Figures 1 and 2: I noticed that effects were tested and reported in only one direction (positive). This made me think – what would the negative effect look like? It would probably be the task-negative/default mode network. It might be worth looking at an “any task vs null” contrast to see if this short protocol can also be used to reliably map that network.
- Figure 3: please consider drawing lines linking corresponding sessions 1 and 2 for each participant. You might also want to reconsider using yellow on white – it has poor contrast. (example: https://www.nature.com/articles/sdata201454/figures/3)
- Big kudos for sharing statistical maps on NeuroVault
- Line 364: It's great that the authors share the code, but I don't think figshare is the best place to do it. Uploading the code to GitHub and using Zenodo for long-term preservation would allow users of this paradigm to submit improvements and bugfixes.
- The source code should specify all of its dependencies with their versions. This could easily be done with "pip list" or "conda list".
- Finally, I strongly encourage the authors to share the raw data from this study. This should be relatively easy on a dedicated platform such as https://OpenNeuro.org
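On the dependency point above: exact versions can also be captured from within Python using only the standard library, which makes it easy to ship alongside the shared code (a minimal sketch; the output format is my choice, not a requirement):

```python
from importlib import metadata  # standard library, Python 3.8+

# Collect installed packages with their exact versions,
# analogous to the output of "pip list".
deps = sorted(
    (dist.metadata["Name"], dist.version)
    for dist in metadata.distributions()
    if dist.metadata["Name"]
)
for name, version in deps:
    print(f"{name}=={version}")
```

The printed lines follow the `name==version` convention of a pip requirements file, so they can be saved and later reinstalled verbatim.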
Samuel A. Nastase, Yaroslav O. Halchenko, Andrew C. Connolly, M. Ida Gobbini, James V. Haxby
Review posted on 24th February 2018
The paper "Neural responses to naturalistic clips of behaving animals in two different task contexts" describes a new benchmark dataset for brain decoding. It brings a unique quality and a breath of fresh air in the context of similar currently available datasets. It will no doubt be recognized as a valuable resource for years to come. Nonetheless, a few improvements would make the manuscript better.
- Please provide stimuli files in the /stimuli folder linking them to the individual events via the stim_file column in _events.tsv files. See the BIDS specification for details. This is probably the most important improvement to the dataset I came across.
- Consider distributing a preprocessed version of the dataset. This would allow scientists to run analyses on this dataset without needing to perform preprocessing themselves. In my experience, providing a preprocessed version of the data increases its reuse potential. You can run FMRIPREP directly on OpenNeuro (I recommend using the "--use-syn-sdc" option since the dataset does not include fieldmaps), and it will be available alongside your dataset. The manuscript should include information about the availability of this data and a brief description of FMRIPREP outputs (it's redundant, but convenient for the reader).
- Providing a figure with example frames from each category of stimuli would greatly help readers in understanding the paradigm.
- Similarly, plotting the distributions of selected QC parameters would also improve the manuscript.
- The manuscript would benefit from division into sections such as Introduction, Methods, Results, Discussion (where a comparison to other publicly available datasets could be added) and Conclusion.
- It might be useful to consider making it explicit in the title that this paper is a data descriptor.
Javier Rasero Daparte, Hannelore Aerts, Jesus M. Cortes, Sebastiano Stramaglia, Daniele Marinazzo
Review posted on 20th February 2018
Functional connectivity as measured by resting state fMRI could one day be an important clinical biomarker. This paper attempts to push forward our understanding of intrinsic brain connectivity during rest and while performing a cognitive task.
"Predicting functional networks from region connectivity profiles in task-based versus resting-state fMRI data" by Rasero et al. has the potential to meaningfully contribute to our understanding of resting state connectivity, but is held back by a confusing and unclear presentation.
After reading the abstract, introduction and methods section, it was not clear to me what the authors attempted to predict from connectivity measures. My best guess is that the task was to predict which brain network a brain region belongs to, given a vector of its connectivity measures with all other brain regions. A task formulated this way is, however, straightforward if we assume correspondence of connectivity measures across all the input samples. This assumption means that the first value of the vector always corresponds to connectivity with region A, the second with region B, etc., for all input samples. The consequence of such an encoding is that the connectivity vector for region A will have a correlation value of 1 at its first position. In other words, the identity of brain regions is represented as a noisy one-hot encoding. All the network or classifier has to do is figure out which brain regions correspond to which networks - something that could be done without any knowledge of brain connectivity.
This is just speculation since I was not able to grasp the details of the analysis to confirm what was being predicted and how connectivity was encoded.
To improve clarity in the future revision of the manuscript, I recommend adding a conceptual figure presenting the prediction task in terms of dependent and independent variables (features and labels).
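A toy simulation (entirely made-up numbers, not the authors' data) illustrates the concern: if each region's feature vector carries a (noisy) 1 at its own index, a trivial rule that only reads off the region's identity achieves perfect network classification without using any genuine connectivity information.

```python
import numpy as np

rng = np.random.default_rng(42)
n_regions, n_subjects = 8, 30
# Hypothetical network assignment of each region (the labels to predict).
networks = np.array([0, 0, 1, 1, 2, 2, 3, 3])

# Simulate per-subject connectivity matrices with a unit diagonal:
# each region's row has a 1 at its own index and noise elsewhere.
X, y = [], []
for _ in range(n_subjects):
    conn = rng.uniform(-0.3, 0.3, size=(n_regions, n_regions))
    np.fill_diagonal(conn, 1.0)
    X.append(conn)
    y.append(networks)
X = np.vstack(X)          # samples = regions x subjects
y = np.concatenate(y)

# A "classifier" that only looks at where the maximal value sits
# (i.e., the region's identity) recovers the network perfectly.
region_id = X.argmax(axis=1)
pred = networks[region_id]
print((pred == y).mean())  # -> 1.0
```

Any classifier trained on such vectors can exploit this shortcut, so high accuracy would not demonstrate that connectivity patterns per se are informative.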
More specific comments:
- The abstract is confusing. "In this study we use a large cohort of publicly available data to test to which extent one can associate a brain region to one of these Intrinsic Connectivity Networks looking only at its connectivity pattern, and examine at how the correspondence between resting and task-based patterns can be mapped in this context." This sentence is too long and convoluted.
- Page 3: "we will explore..." -> "We will explore..."
- Page 4: missing citation for the HCP project
- Page 4: "has been proved to increase the quality of the original data" citation needed.
- Page 4: "connectivity map" might be a better term than "correlation image"
- Page 4: How was the assignment of each brain region to a brain network performed? Shen and Yeo's parcellations differ in region definitions.
- Page 5: "Finally, the 282 resulting individual FC matrices were concatenated together" it's unclear if this was done separately for task and rest or the data was combined first. What dimension were the matrices concatenated along?
- Page 5: was the cross-validation performed across participants or nodes? Or both? If so why?
- Page 6: Table 1 is missing the MLP results
- Prediction accuracy on another dataset (with different acquisition parameters) would be good evidence of the robustness of your findings.
Matan Mazor, Noam Mazor, Roy Mukamel
Review posted on 16th December 2017
Mazor and colleagues, in their manuscript titled “Using experimental data as a voucher for study pre-registration”, propose a solution that could prevent bad actors from pretending that they preregistered their analysis protocol prior to data analysis. This is an interesting approach to an increasingly important problem. However, some technical issues might prevent this solution from being practically applicable.
- The proposed solution only works if the authors share raw data (which would be great, but sadly is not common).
- The verification process requires reviewers to reanalyze the data which seems like an unrealistic expectation.
- Differences between processing pipelines used by the authors and the reviewers could result in slightly different results (see Carp et al. 2012) and raise false concerns about changes to the preregistered protocol. This could be further exploited because the randomization scheme has a very limited set of possible orders, resulting in very similar results.
- “Bob then uses the Python script that he found in the protocol folder to generate a pseudorandom sequence of experimental events, based on the resulting protocol-sum.” Isn’t the code that translates the checksum into a random order provided by the authors? What if it always gives the same answer? Am I missing some detail?
- A more sophisticated attack would involve modifying already acquired data, temporarily rearranging it so that it would comply with a protocol defined post hoc. This would, however, require a highly motivated bad actor.
- RPR approaches do not necessarily provide time locking. One could imagine a situation in which a bad actor collects data, picks an analysis protocol post hoc, and submits to the first stage of a registered report pretending they did not acquire any data yet. This way they could game the system, but only assuming reviewers will not require changes to the acquisition protocol.
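For context, the checksum-to-randomization idea in the reviewed manuscript can be sketched roughly as follows (a minimal illustration; the hash choice, event labels, and protocol contents here are hypothetical, not necessarily those of the authors):

```python
import hashlib
import random

# Hypothetical contents of the preregistered protocol folder.
protocol_text = b"hypotheses, analysis plan, exclusion criteria"
protocol_sum = hashlib.sha256(protocol_text).hexdigest()

# Seed a PRNG with the checksum: the event order becomes a deterministic
# function of the exact protocol text, so any post-hoc change to the
# protocol would imply a different event order than the one present
# in the acquired data.
rng = random.Random(int(protocol_sum, 16))
events = ["faces", "houses", "tools", "rest"] * 4
rng.shuffle(events)
```

The concerns above still apply: verification only works if reviewers can independently regenerate the order from the protocol and compare it against the raw data.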
Jason D. Yeatman, Adam Richie-Halford, Josh K. Smith, Ariel Rokem
Review posted on 09th October 2017
The paper entitled “AFQ-Browser: Supporting reproducible human neuroscience research through browser-based visualization tools” is a beautifully written description of a software tool that takes the outputs of a specific diffusion MRI analysis method (AFQ) and creates interactive visualizations that make data exploration easy. The tool implements some truly innovative ideas, such as piggybacking on GitHub as a service for hosting data and visualizations, and representing data in a form that is appealing to data scientists with no prior MR experience. I hope that other tools will emulate those features. The manuscript also includes a thoughtful discussion of exploratory vs hypothesis-driven methods.
- The abstract gives the reader the wrong impression that the AFQ-Browser tool is more generic than it really is. It should be clarified that the tool only allows users to visualize and share outputs of AFQ analyses.
- When describing BrainBrowser and its involvement in MACACC dataset surely you meant “visualization” not “analysis”.
- It might be worth introducing the publication feature earlier in the paper. I was quite confused when reading about reproducibility and data sharing without knowing that AFQ-Browser is not just a visualization tool.
- Please mention in the paper the license under which the tool is distributed and any pending or obtained patents that would limit its use or redistribution.
- If all AFQ users start uploading their results to GitHub using AFQ-Browser it might be hard to find or aggregate those results. It might be worth considering (and discussing) a centralized index (also hosted on GitHub) of all publicly available AFQ-Browser generated bundles. This index can be automatically updated during the “publish” procedure.
- GitHub is a great resource, but it offers few guarantees in terms of long-term storage. A solution to this would be depositing the bundles into Zenodo, which can be done directly from GitHub. It would be worth implementing and/or discussing this in the manuscript.
- It’s a technical detail, but it took me a little time to figure out why the tool requires the user to spin up a local server (presumably to be able to access CSV and JSON files). It might be worth elaborating.
- Saving the visualization “view” (or “browser state”) seems cumbersome when done via a file. Could the view be encoded in the URL (via GET parameters)? Sharing of such views would be much easier and natural.
- Some example analyses include information about group membership or demographic information such as age. How is such information stored and conveyed to AFQ-Browser? Does it also come as output of AFQ?
- In the manuscript you mention that AFQ-Browser allows users to compare their results with normative distributions. Where do they come from? A central repository (if so, please describe how it is populated), or do users need to provide such distributions themselves?
- It might be worth considering a crowdsourcing scheme such as the one employed in MRIQC Web API (https://mriqc.nimh.nih.gov/) to generate normative distributions of AFQ outputs.
- Is the way you store data in CSV files and their relation to the JSON files (beyond the “tidy” convention) described somewhere in detail? It would be useful for users.
- Please describe the software testing approach you employed in this project.
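On the suggestion above about encoding the browser state in the URL: the round trip is straightforward with standard query-string encoding. A sketch in Python for illustration only (in AFQ-Browser itself this would be JavaScript; the URL and parameter names here are made up):

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical view state of the browser (parameter names are invented).
state = {"bundle": "left_arcuate", "metric": "FA", "subject": "sub-01"}

# Encode the state into the URL so that sharing the link restores the view.
url = "https://example.github.io/afq-browser/?" + urlencode(state)

# Decoding on page load recovers the same state.
decoded = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
print(decoded == state)  # -> True
```

Compared to a downloadable state file, such a link is self-contained and can be pasted into an email or a manuscript.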
Tim van Mourik, Lukas Snoek, Tomas Knapen, David Norris
Review posted on 27th September 2017
Porcupine by van Mourik et al. is an extensible, cross-platform desktop application that allows users to quickly design neuroimaging data workflows via a graphical user interface. A graphical user interface has been a deeply needed feature for Nipype, and Porcupine fills this gap.
Porcupine is designed in a very smart and flexible way, allowing it to be extended with new code generation backends. Furthermore, since the output is the source code of the pipeline, the processing can be customized by editing the code. Reproducibility of the produced pipelines is increased by the generation of Dockerfiles.
It’s hard to overstate this contribution, since Porcupine will expose a large community of researchers who prefer graphical interfaces to reproducible neuroimaging pipelines.
- The manuscript at some point mentions saving MATLAB code, but I don’t believe such a plugin exists yet.
- It might be worth mentioning NIAK as potential output plugin.
- In context of computational clusters it might be worth clarifying that Docker images can be run via singularity.
- “Nypipe” -> “Nipype”
- It’s unclear why the user is required to make modifications to the output Dockerfile – it seems that it should be possible to generate a complete Dockerfile without a need for any modifications.
- “It should be noted that Porcupine is not meant for low-level functionality, such as file handling and direct data operations.” What does that mean? Could you give an example?
- In context of graphical workflow systems: did you mean JIST instead of CBS Tools?
- “providing a direct way of creating this is high on the feature list of Porcupine” –> “planned features list”?
- The license under which Porcupine is distributed is not listed in the manuscript
Jacob Jolij, Els Van Maeckelberghe, Rosalie Koolhoven, and Monicque Lorist
Review posted on 15th September 2017
Jolij and colleagues argue in their paper that it is unethical and soon it will be illegal in the EU to publicly share data describing human participants of academic research experiments. Their perspective is deliberately biased to “spark a debate”. The authors strongly urge researchers not to share data.
There are several issues with the paper:
• The title and the summary are misleadingly broad and suggest a thorough review of the legal status of data sharing around the world. However, the paper only analyzes data sharing under a new, not yet implemented European Union regulation, with a strong emphasis on the legal system in the Netherlands.
• The authors purposefully take the strictest possible interpretation of ethical guidelines. I find this approach of very limited use. For example, the excerpt from the Declaration of Helsinki they quote – “Every precaution must be taken to protect the privacy of research subjects and the confidentiality of their personal information” – in its strictest interpretation would make doing any research impossible. If taken literally (which the authors seem to encourage), all human-derived data – whether anonymized or not – would have to be stored on encrypted, tamper-proof computers. Passwords would have to be entered in prescreened empty rooms to ensure eavesdropping would not be possible. One could even say that displaying the data in a room with windows poses a danger of eavesdropping, so such situations should be eliminated – as a precaution. This is obviously impractical, but it shows how the strictest possible interpretation can be pushed into absurdity, making any research unethical.
• Furthermore, some argue that there is another aspect of the ethics of data sharing – that researchers have the ethical obligation to maximize the contribution of their participants. See Brakewood B, Poldrack RA. The ethics of secondary data analysis: considering the application of Belmont principles to the sharing of neuroimaging data. Neuroimage [Internet]. 2013 Nov 15;82:671–6. Available from: http://dx.doi.org/10.1016/j.neuroimage.2013.02.040
• I am not enough of a scholar of law to judge if the authors’ interpretation of the ‘General Data Protection Regulation’ is correct. It is, however, unclear whether it would also be illegal to share data with other researchers within the same institution, or with institutions outside of the EU. Such an analysis would be useful to the reader.
• I might be mistaken, but judging from the affiliations, none of the authors is experienced in practicing law. If I am not mistaken, adding a collaborator with a law background would strengthen the paper.
• It’s not even clear why the topic of anonymity needs to be discussed: under the “strictest possible interpretation” of the rules, one cannot control the purpose of data processing in the context of public data sharing, which would make data sharing illegal whether or not the data are properly anonymized.
• The section on anonymity is a mixed bag. The point that one can re-identify anyone if equipped with the right information is not very revealing. It is also not clear what the purpose is of the example of the author identifying himself from a public database using information only available to himself. The argument that EEG recordings or fMRI scans greatly increase the chance of re-identification because of their high dimensionality is moot, because acquiring matching data would be very hard for a third party. A date of birth or a zip code, even though it includes less information, is much more useful for re-identification.
• It is not clear if the rulings of the Dutch Council of State are legally binding in all of the EU (I suspect they are not).
• The section about the risk posed by potential re-identification is purely hypothetical and lacks any analysis or example of actual harm that was inflicted due to reidentification of research participants.
• The consent form section is also confusing. Why is the claim that participants don’t always read consent forms a problem only in the context of data sharing? Does the GDPR require researchers to do mandatory consent form comprehension checks? Would the type of consent form used by The Harvard Personal Genome Project make public data sharing legal under the GDPR? Would it be ethical? Was Russ Poldrack’s MyConnectome study ethical?
• The reference cited in support of the “anecdotal (…) sharp drop in willingness to participate in experiments of which data may be published openly” is incorrect. There is no such journal as “Belief, Perception, and Cognition Lab”. I did find this piece on the Winnower - https://thewinnower.com/papers/the-open-data-pitfall-ii-now-with-data. A reader who is not careful enough might miss the fact that this piece (never peer reviewed) describes the first author of the reviewed manuscript asking his own students if they would participate in a study whose data were going to be publicly shared. I have mixed feelings about using this reference. On one side, I appreciate that the author acknowledged its ad hoc nature and lack of scientific merit, but finding those acknowledgements required some effort, and they are not made clear in the currently reviewed manuscript.
• Finally, authors failed to reference the following five analyses of GDPR in context of research data:
Chassang G. The impact of the EU general data protection regulation on scientific research. Ecancermedicalscience [Internet]. 2017 Jan 3;11:709. Available from: http://dx.doi.org/10.3332/ecancer.2017.709
Rumbold JMM, Pierscionek BK. A critique of the regulation of data science in healthcare research in the European Union. BMC Med Ethics [Internet]. 2017 Apr 8;18(1):27. Available from: http://dx.doi.org/10.1186/s12910-017-0184-y
Stevens L. The Proposed Data Protection Regulation and Its Potential Impact on Social Sciences Research in the UK. European Data Protection Law Review [Internet]. 2015;1(2):97–112. Available from: http://edpl.lexxion.eu/article/EDPL/2015/2/4
European Society of Radiology (ESR). The new EU General Data Protection Regulation: what the radiologist should know. Insights Imaging [Internet]. 2017 Jun;8(3):295–9. Available from: http://dx.doi.org/10.1007/s13244-017-0552-7
Rumbold JMM, Pierscionek B. The Effect of the General Data Protection Regulation on Medical Research. J Med Internet Res [Internet]. 2017 Feb 24;19(2):e47. Available from: http://dx.doi.org/10.2196/jmir.7108
• Big plus for sharing the analysis code (in the future I recommend putting it in Zenodo or similar archive for long term preservation).
Overall, the manuscript ends on a recommendation not to share data and a statement that this is, coincidentally, the best thing for one’s scientific career – which implicitly suggests that the ethical and legal reasons (and the strictest interpretation of guidelines) are merely an excuse not to share data and to maintain a competitive edge. I am not sure if this was the intention of the authors, but this is how the manuscript reads now. Independent of the legal and ethical arguments, I am not convinced those are the values we want to foster in science.
I really wish this paper were more constructive in nature and explored how scientists who want to, or are required to, publicly share human data could use consent forms to inform their participants of the risks. In the past, we have recommended ready-to-use text that could be included in consent forms to ethically enable public data sharing: http://open-brain-consent.readthedocs.io/en/latest/ultimate.html. Considering that the new EU law will take effect in May 2018, this is the right time for researchers around the EU to start adding such clauses to their consent forms.
Dongtao Wei, Kaixiang Zhuang, Qunlin Chen, Wenjing Yang, Wei Liu, Kangcheng Wang, Jiang-Zhou Sun, and Jiang Qiu
Review posted on 25th August 2017
Only a small percentage of neuroimaging data is being shared openly. The number of datasets extending beyond Caucasian white populations and spanning a wide range of ages is even smaller. Therefore, this dataset is a valuable contribution to the field and merits publication, conditional on certain improvements.
- I strongly recommend distributing the dataset in the Brain Imaging Data Structure (http://bids.neuroimaging.io) format instead of the current custom file organization. This will greatly increase the ease of reuse and validation of the dataset.
- The dataset should be validated using the bids-validator (https://github.com/INCF/bids-validator) to check for missing scans and consistency of scanning parameters across all subjects.
- It is not clear if the anatomical and resting state scans were acquired during one or two separate sessions.
- The "Image acquisitions" section mentions task data, but no other details are provided and the files are missing. Is the task data supposed to be part of this release?
- Context of the resting state scan should be explained - was it performed after or before a particular task?
- Please share the code/scripts/config files used to perform the analyses
- No "known issues" are reported in the paper. Is it really true that in such a large sample there were no scans that caused concern?
- DPARSF is misspelled as DPARF
- Please provide which version of DPARSF was used
- The "Sex" column in the demographics Excel file appears twice
- I would advise against using the jet colormap in Figure 5, since it is perceptually inaccurate: https://www.youtube.com/watch?v=xAoljeRJ3lU
- Labels on the axes of figures 2 and 3 are unreadable
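To illustrate the kind of consistency check that bids-validator automates, here is a minimal Python sketch (the function name and directory layout are my own illustration, not part of any released tool) that flags subjects missing scans which other subjects have:

```python
from pathlib import Path

def missing_scans(bids_root):
    """Return {subject: scans present in other subjects but absent here}."""
    root = Path(bids_root)
    subjects = sorted(p for p in root.glob("sub-*") if p.is_dir())
    # Collect each subject's imaging files, with the subject label normalized
    # so that e.g. sub-01_T1w.nii.gz and sub-02_T1w.nii.gz compare as equal
    per_sub = {}
    for sub in subjects:
        per_sub[sub.name] = {
            str(f.relative_to(sub)).replace(sub.name, "sub-XX")
            for f in sub.rglob("*.nii.gz")
        }
    # The union across subjects defines the expected set of scans
    expected = set().union(*per_sub.values()) if per_sub else set()
    return {name: expected - files
            for name, files in per_sub.items() if expected - files}
```

Running something like this over the dataset root would surface, for example, a subject whose resting-state run was never exported; bids-validator performs this and many further checks (filename conventions, metadata consistency) automatically.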
Looking forward to reviewing a revised version of this paper.
Gia H. Ngo, Simon B. Eickhoff, Peter T. Fox, R. Nathan Spreng, B. T. Thomas Yeo
Review posted on 04th July 2017
In the manuscript “Beyond Consensus: Embracing Heterogeneity in Neuroimaging Meta-Analysis” Ngo et al. apply a previously published variant of the author-topic model to two new sets of labeled data: peak coordinates aggregated from three previously published meta-analyses related to “self-generated thoughts”, and a subset of peak coordinates from studies overlapping with the IFG.
Even though I found the manuscript interesting and the presented application intriguing, the overall feeling it left me with was one of an “identity crisis”.
On one hand, the reader might think the paper is proposing a new method. This is suggested by the general nature of the title and the fact that the two example applications have very little to do with each other (cognitively or neuroanatomically). However, the authors clearly state that all of the methods used in the paper (the vanilla author-topic model, the coordinate-based author-topic adaptation, and the variational Bayes estimation method) were already presented in previously published papers. What is more, the paper lacks the parts usually present in a methods paper: null simulations/permutations, out-of-sample prediction, comparison with existing methods, etc.
On the other hand, the paper might appear to report new cognitive findings. This perspective is also murky. There is no clear statement of hypotheses, and the combination of studying “self-generated thoughts” and the IFG is not justified in the manuscript. Furthermore, details such as the inclusion criteria for the “self-generated thought” analysis are not included.
To add to the confusion, the manuscript includes 11 pages of mathematical derivations that the authors themselves suggest should have been supplementary materials for their PRNI paper.
I propose two directions to improve the manuscript:
- Route 1: Turn the paper into a full-fledged methods paper. This will require investigating how the model performs when presented with realistic noise (null simulations or permutations), looking at out-of-sample predictions, and evaluating the amount of variance explained. Other ideas include comparison with other factor decomposition methods (PCA, ICA) that do not take labels into account, as well as comparing the maps obtained with a “meta-analysis” subset of coordinates to maps obtained from a model using the full BrainMap database in the previous paper. For this approach, it might be beneficial to pick a brain region that has been previously evaluated using similar methods (for example looking at the insula and comparing with this paper https://academic.oup.com/cercor/article/23/3/739/317372/Decoding-the-Role-of-the-Insula-in-Human-Cognition). This would allow one to contrast and compare the different approaches and highlight the advantages of the author-topic mapping.
- Route 2: Focus on cognitive findings. This would require splitting the two analyses into two manuscripts and focus more on the cognitive implications of the findings. If hypotheses about the resulting maps exist they should be clearly stated – if not the exploratory nature should be noted. Interpretation (reverse inference) of the output maps can be improved by using the neurosynth cognitive decoder. Inclusion criteria for the meta-analysis need to be elucidated in more detail.
- I have performed a very simple reanalysis of the data used for the “self-generated thought” meta-analysis. Taking the average activation maps from the 7 categories (navigation, autobiographic memory, ToM story, ToM non-story, narrative comprehension, and task deactivation) and running ICA on them gave me two components that were spatially very similar to the ones presented in the paper (one for navigation and one for everything else - see https://gist.github.com/chrisfilo/0722b520bc56da8c55aa6bba22eb85aa). This begs the question whether the more complex author-topic model was necessary. What advantages does it provide? Is it more interpretable? More “accurate”? These issues should be discussed in the paper. This insight was only possible because the authors decided to share the data (at least for half of their analyses), for which they should be applauded.
- When describing the author-topic model I would recommend putting “authors” in quotation marks when referring to an entity in the original model rather than to researchers authoring a paper. This should minimize confusion.
- The selection criteria for the meta-analyses and the individual studies are not clearly defined in the “Self-generated thought” section. For example, why were studies labelled as “navigation” included? This needs to be justified, since the selection of studies going into the model can greatly influence the end result.
- Not all studies used in the two example meta-analyses were cited in the paper. Citations are an important way of assigning academic credit – all of the studies used in the paper should be appropriately credited. It is unusual for a paper to cite that many studies, but the work you are doing is cutting edge and requires unusual means to give appropriate credit.
- Only the left inferior frontal gyrus is investigated – this should be a) justified and b) made explicit each time the IFG is mentioned in the abstract, methods and discussion.
- Please add L/R labels to all brain figures.
- “Reading” is listed twice in Table S1.
- The fact that performing the meta-analytic connectivity analysis requires a collaborative agreement with the BrainMap team (and thus the inclusion of a member of the BrainMap project as a collaborator) should be explicitly mentioned in the discussion. Unfortunately, the limited accessibility of this dataset is a limitation of the presented method. Alternatively, the authors might explore using other, more open labelled coordinate datasets such as the Neurosynth dataset.
- Please add a more thorough description of what code and data are available on GitHub.
- Sharing of the estimated spatial component maps: to improve the transparency and reusability of the results presented in your paper, please share the unthresholded spatial maps of the estimated components on ANIMA, BALSA, or NeuroVault (the last will make comparing them to other spatial maps, such as Smith 2009, very easy).
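For readers who want to reproduce the spirit of the ICA check described above, here is a hedged sketch on simulated data (the actual reanalysis is in the gist linked above; the shapes and numbers here are purely illustrative):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_voxels = 2000
# Two latent spatial sources standing in for "navigation" vs "everything else"
sources = rng.standard_normal((2, n_voxels))
mixing = rng.standard_normal((7, 2))            # 7 category maps as mixtures
category_maps = mixing @ sources + 0.01 * rng.standard_normal((7, n_voxels))

# Spatial ICA: voxels are observations, category maps are features
ica = FastICA(n_components=2, random_state=0)
components = ica.fit_transform(category_maps.T).T   # shape (2, n_voxels)
```

When the category maps really are dominated by a couple of spatial sources, this unsupervised decomposition recovers them (up to sign and order), which is why a comparison against such simple baselines would strengthen the case for the author-topic model.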
Minor comments (aka pet peeves):
- Visualizations use a cluster size and cluster-forming threshold. This might (or might not) be obscuring the true pattern. Presenting the unthresholded pattern would be more accurate.
- The use of the jet colormap is unfortunate. It imposes unnatural contrast between some ranges of values, thus introducing another level of perceptual thresholding. Using a luminance-calibrated colormap such as parula will improve the interpretability of your figures.
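The perceptual problem can be demonstrated numerically. The sketch below uses the standard analytic approximation of the jet colormap together with a Rec. 709 luma weighting to show that jet's perceived brightness is non-monotonic, which is what creates spurious visual boundaries in the data:

```python
import numpy as np

def jet_rgb(x):
    """Analytic approximation of the classic jet colormap (x in [0, 1])."""
    x = np.asarray(x, dtype=float)
    r = np.clip(np.minimum(4 * x - 1.5, -4 * x + 4.5), 0, 1)
    g = np.clip(np.minimum(4 * x - 0.5, -4 * x + 3.5), 0, 1)
    b = np.clip(np.minimum(4 * x + 0.5, -4 * x + 2.5), 0, 1)
    return np.stack([r, g, b], axis=-1)

x = np.linspace(0, 1, 256)
# Rec. 709 luma as an approximation of perceived brightness
luma = jet_rgb(x) @ np.array([0.2126, 0.7152, 0.0722])
steps = np.diff(luma)
monotonic = bool(np.all(steps >= 0) or np.all(steps <= 0))
```

Jet starts dark (blue), peaks bright near the middle (cyan/yellow), and ends dark again (red), so equal data steps map to wildly unequal brightness steps. Luminance-calibrated maps such as parula or viridis increase monotonically in luma, avoiding this.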
I applaud the authors for sharing code and data. My only gripe is that I wish it had been done before the manuscript was submitted for review. The same way one would never submit a manuscript with a missing figure, we should try not to submit papers with placeholder links to code and data.
Finally, I was not able to fully evaluate the mathematical derivations in the appendix. I hope another volunteer reviewer will be able to verify their accuracy.
I am looking forward to reviewing a revised version of the manuscript.
Christopher R Madan
Review posted on 11th June 2017
This short commentary introduces the reader to recent advancements in publicly available neuroimaging datasets with a special focus on data useful for evaluating anatomical features. It includes a brief historical perspective and provides an overview of the advantages of using publicly shared data.
The paper makes some unique and important points – for example, that pooling data from multiple sources gives researchers access to previously inaccessible populations, thus extending conclusions beyond the typically studied groups of participants. Even though all of the statements in the manuscript are, to my knowledge, factually accurate, I do find it a bit one-sided. The advantages of using shared data are presented extensively, but the disadvantages are almost completely omitted. It would be worth discussing aspects such as: 1) scientific questions are limited by what data and metadata are available; 2) when combining data from multiple sites, special care needs to be taken to account for scanner/sequence effects, etc. I feel such an addition would make this commentary more balanced.
Matthias Stangl, Jonathan Shine, and Thomas Wolbers
Review posted on 07th February 2017
GridCAT is a much-appreciated attempt to provide computational tools for modeling grid-like patterns in fMRI data. I am by no means an expert in grid cells, but I can provide advice and recommendations with regards to brain imaging software:
- Please mention the license the software is distributed under.
- Please mention the license the data is distributed under. To maximize the impact of this example dataset (fostering future comparisons and benchmarks) I would recommend distributing it under a public domain license (CC0 or PDDL) and putting it on openfmri.org
- I was, unfortunately, unable to run your software because I do not possess a valid MATLAB license. This costly dependency will most likely be the biggest limitation of your tool. There are two ways to deal with this problem: make the toolbox compatible with Octave (a free MATLAB alternative) or provide a standalone MATLAB Runtime executable (see https://www.mathworks.com/products/compiler/mcr.html)
- I would encourage the authors to add support for input event text files formatted according to the Brain Imaging Data Structure standard (see http://bids.neuroimaging.io/bids_spec1.0.0.pdf section 8.5 and Gorgolewski et al. 2016)
- Please describe in the paper how other developers can contribute to your toolbox. I recommend putting it on GitHub and using the excellent Pull Request functionality.
- Please describe in the paper how users can report errors and feature requests. I again would recommend using GitHub or neurostars.org.
- Is there a programmatic API built into your toolbox? In other words, a set of functions that would allow advanced users to script their analyses. If so, please describe it and provide an example.
- Please describe how you approached testing when writing the code. Is there any form of automated testing (unit, smoke or integration tests)? Are you using a continuous integration service to monitor the integrity of your code?
- For the GLM1 modeling step: is it possible to provide nuisance regressors (for example, motion parameters)? If so, do you report information about the collinearity of the fitted model?
- For the ROI feature: it would be useful to show users the location of their ROI on top of the BOLD data. This would provide a sanity check and help avoid using masks that are not properly coregistered.
- It would be beneficial for the paper to include some figures of the GUI from the manual and perhaps list the plethora of analysis options available at different steps in a table.
- Please add error bars to figure 5.
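On the collinearity point: if nuisance regressors are supported, reporting a condition number and variance inflation factors would let users catch degenerate designs. A hypothetical numpy sketch of such diagnostics (this is not GridCAT's API, just an illustration of the computation):

```python
import numpy as np

def design_diagnostics(X):
    """Collinearity diagnostics for a GLM design matrix X (time x regressors)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize columns
    cond = np.linalg.cond(Xs)
    # Variance inflation factor for column j is the j-th diagonal element of
    # the inverse correlation matrix, i.e. 1 / (1 - R^2_j)
    vif = np.diag(np.linalg.inv(np.corrcoef(Xs, rowvar=False)))
    return cond, vif
```

A VIF near 1 means a regressor is nearly orthogonal to the others; values in the double digits signal that its parameter estimate will be unstable.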
Israel Almodóvar-Rivera and Ranjan Maitra
Review posted on 06th February 2017
The authors present an appealing methodological improvement on the Adaptive Segmentation (AS) method. The main improvement is alleviating the need to set input parameters (the bandwidth sequence); those parameters are instead fitted from the data in an optimal way.
Even though the paper has the potential to be a meaningful contribution to the field, it lacks a thorough comparison with the state of the art. The following steps should be considered:
- The selection of patterns used in the simulation seems to be motivated by the nature of fMRI data, which is good, but at the same time it does not highlight the specific issues that FAST solves. Have a look at the simulations included in Polzehl et al. 2010, which show how smoothing across neighboring positive and negative activation areas can cancel the effect out. It would be beneficial to construct simulations that highlight the specific situations in which FAST overcomes the limitations of AS.
- Neuroimaging is leaning strongly towards permutation-based testing methods due to their reduced number of assumptions. I would recommend adding cluster- and voxel-based permutation inferences to your analysis. Please note that permutation-based testing is not the same as finding cluster cutoffs via simulations.
- I would also recommend adding threshold-free cluster enhancement (Smith and Nichols 2009) to the set of compared methods. It is also a multiscale method and has been successfully used in many studies; it works best in combination with permutation tests.
- It would be good to assess the rate of false positive findings in your comparison. This could be done by applying a random boxcar model to resting state data and evaluating how many spurious activations you find (see Eklund et al. 2012).
- Speaking of false positive and false negative voxels: it seems that the evaluation of your method against the state of the art presented in Figure 4 is very sensitive to the threshold (alpha level) chosen for each method. I suspect that AS and CT would perform better if a different alpha level were chosen. To measure the ability to detect signal more accurately, I would recommend varying the alpha level to create a receiver operating characteristic (ROC) curve (based on false positive and false negative voxels rather than Jaccard overlap) and calculating the area under it.
- In the figures, you use the TP11 acronym to denote the adaptive segmentation algorithm, but in the rest of the paper you use AS. It would be good to make this consistent.
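To make the ROC suggestion concrete, here is a hedged sketch on simulated per-voxel p-values (the helper is just the standard rank-based Mann-Whitney formulation of AUC, which is equivalent to sweeping the alpha threshold over all values):

```python
import numpy as np

def auc_from_pvalues(truth, pvals):
    """Area under the ROC curve obtained by sweeping the alpha level.
    Computed as the Mann-Whitney U statistic on scores = -p."""
    scores = -np.asarray(pvals, dtype=float)
    ranks = scores.argsort().argsort()           # 0-based ranks, ascending
    pos = np.asarray(truth, dtype=bool)
    n_pos, n_neg = pos.sum(), (~pos).sum()
    # Shift the sum of positive ranks to the U statistic, then normalize
    u = ranks[pos].sum() - n_pos * (n_pos - 1) / 2
    return u / (n_pos * n_neg)

rng = np.random.default_rng(0)
truth = rng.random(20_000) < 0.1                 # 10% truly active voxels
# Simulated p-values: concentrated near zero where there is signal,
# uniform under the null
pvals = np.where(truth, rng.beta(1, 20, truth.size), rng.random(truth.size))
auc = auc_from_pvalues(truth, pvals)
```

Applying this per method (with the simulated ground-truth activation masks you already have) would summarize detection ability in a single threshold-free number, complementing the Jaccard overlap at a fixed alpha.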