Review for "FMRIPrep: a robust preprocessing pipeline for functional MRI"

Completed on 24 May 2018 by Tim van Mourik.



Significance

The authors introduce FMRIPrep as an automatic preprocessing pipeline for many types of MRI data. They try to combine methods from all existing tools to optimize performance, and have set up an open way of collaborating with a large part of the neuroimaging community. The paper is a clear description of the preprocessing problems that neuroscience is facing, and the way in which fMRIPrep helps to overcome these problems. The preprocessing steps and the vision of FMRIPrep are thoroughly explained. The high quality of the analysis is convincingly shown on adequate data. I truly commend the structured approach of this project, the community building around it, and the amount of software engineering: all well beyond current neuroscientific standards.


Comments to author

Author response.

*Disclaimer*: I was already familiar with the project, but I have not used fMRIPrep hands-on. Some of this is direct feedback on the manuscript, but as this is not a regular journal review, I would also like to put forward some thoughts that the authors may or may not want to incorporate in their work.

The authors introduce FMRIPrep as an automatic preprocessing pipeline for many types of MRI data. They try to combine methods from all existing tools to optimize performance, and have set up an open way of collaborating with a large part of the neuroimaging community. The paper is a clear description of the preprocessing problems that neuroscience is facing, and the way in which fMRIPrep helps to overcome these problems. The preprocessing steps and the vision of FMRIPrep are thoroughly explained. The high quality of the analysis is convincingly shown on adequate data. I truly commend the structured approach of this project, the community building around it, and the amount of software engineering: all well beyond current neuroscientific standards.

We thank the reviewer for all their feedback.

FMRIPrep distinguishes itself primarily by the following features:

- Automatic data type recognition

- Rich support of data types

- Rich visual performance feedback

- Increased performance by a best-of-breed approach

The paper describes these points in detail and is thereby primarily aimed at the users of fMRIPrep. The paper promises to “[...] describe the overall architecture, software engineering principles, [...]” (lines 76-77), but I feel that is rather overpromised. That’s a shame, because it leaves out some key features of fMRIPrep. Deciding to include these aspects would shift the target audience towards developers and not only users, and it is largely a matter of taste (and journal choice) whether the authors want this.

In full agreement with the reviewer, we consider that the overall architecture, software engineering principles, and a comprehensive validation of the tool are crucial to fMRIPrep. We have revised these aspects in the paper and describe them in further detail within the Online Methods. In light of the reviews received, we feel that the organization we chose adapts well to the expected readership of the paper. However, to accommodate the expectations of the most technical audience (potential developers that may join fMRIPrep in the future), we have started a section of the documentation (https://github.com/poldracklab/fmriprep/issues/1215) targeting them, where these elements can be outlined in greater detail.

In the current section on software engineering (lines 195-210), the following points are mentioned, but in my opinion the treatment is too limited. FMRIPrep:

- Is well documented because contributors pledge to do so

  - Usually, contributors do not pledge such a thing voluntarily, so is it a hard requirement? And (how) is it enforced?

- Is collaborative. FMRIPrep does not have a closed model

  - FSL is also a collaborative project. How is this different? The antithesis of the closed model remains unspecified.

- Has continuous integration

As with question OR2-1, further details about these points are given in the Online Methods document. We understand that the additions to the new section “COMMUNITY-CENTERED, PEER-REVIEWED DEVELOPMENT AND USAGE” address the first four bullet points. This new section is reproduced below. For the last point (“Has continuous integration”), we have considered it in combination with the next point, OR2-3.

COMMUNITY-CENTERED, PEER-REVIEWED DEVELOPMENT AND USAGE
Community-building to ensure a sustainable impact
Inspired by the rOpenSci experience [30], the development of fMRIPrep has been driven by open-source principles, with a strong focus on community building from the start. To measure how centralized a software project is, and therefore the risk that it declines and eventually dies, Avelino et al. proposed the so-called bus- or truck-factor [31]. As they define it, the truck-factor “is a measure of concentration of information in individual team members” and it “designates the minimal number of developers that have to be hit by a truck (or quit) before a project is incapacitated” [31], conveying that community-building is fundamental for open-source projects to ensure survival. Over 30 researchers have already contributed code, reported issues, or communicated through the GitHub platform. GitHub is an effective platform for sharing code and information, built atop the Git version-control system. An instance of its effectiveness in discussing and driving the progress of fMRIPrep is presented here [footnote link]. A larger number of researchers have also engaged with the community through the NeuroStars forums [footnote link]. The quick growth of the community also illustrates the success of the “glass-box” principle, as “black-boxes” deter the participation of third parties with their obscurity. Statistics about current adoption and usage of fMRIPrep are given below.

Peer-reviewed development
GitHub provides a powerful tool, called “pull-requests”, to easily screen the quality of code changes by contributors. Pull-requests have a dedicated forum accessible via the web, where collaborators can interact, enabling the necessary conversation and oversight to ensure the quality of contributions. Collaborators are aided by a review template to ensure all relevant aspects of the feature have been properly implemented and documentation has been included. The review template and the process of pull-request review are inspired by those of The Journal of Open Source Software [32]. An example is available here [footnote link].

Current usage of fMRIPrep
Since the inclusion of monitoring code with fMRIPrep version 1.0.12 (May 3, 2018), the tool has been invoked over 15,000 times (∼5,000/month) by researchers around the globe, illustrating its fast adoption. These adoption trends are also suggested by the number of unique, worldwide visitors to the documentation website (see Figure S9) and the more than ten thousand pulls of fMRIPrep’s Docker image [footnote link]. On OpenNeuro.org, fMRIPrep has accumulated more than 240 run requests, accounting for ∼40% of all analyses requested so far on the platform. Some of these requests were executed on data uploaded by users and not found in OpenfMRI.

We have incorporated fMRIPrep into the processing workflow of all datasets in our laboratory. One example of the robust performance of fMRIPrep on idiosyncratic datasets is presented in Figure S10: fMRIPrep performed with high accuracy on a challenging dataset with simultaneous electrocorticography (ECoG) recordings.

- The supplemental figure is useful, but could potentially also describe what types of things are and are not tested.

We have added a new subsection called “Continuous integration, a technique for recurrent, unsupervised assessment of software” to address this comment.

Continuous integration, a technique for recurrent, unsupervised assessment of software. Leveraging the free tier of CircleCI (https://circleci.com) for open source projects, fMRIPrep has an established test and deployment framework. With every iteration of the code development (a bug fix, a new feature), a new test cycle (Figure S5) is triggered automatically. The test cycle first builds the software, assessing correct packaging and containerization. Then, using substantially sub-sampled data, three variants of the fMRIPrep workflow are run. The first variant runs fMRIPrep without surface reconstruction on one participant from DS000005. A second variant runs with surface reconstruction, re-using pre-calculated results, on one participant from DS000054. A third variant runs the workflow without surface reconstruction on a multi-echo dataset (DS000210). These three full executions of fMRIPrep are complemented with a complete rebuild of the documentation pages and some unit tests. By adding code-coverage collection to the continuous integration tests, we expect to identify those sections of fMRIPrep that need more testing. Finally, when a new version of fMRIPrep is being built, the continuous integration workflow finishes with the upload of new Python packages and container images to their corresponding repositories.
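
For concreteness, the sketch below outlines how such a test cycle could invoke the three workflow variants. It is a minimal sketch, not the actual CircleCI configuration: the dataset paths are placeholders and the command-line flags are assumptions based on the BIDS-Apps convention used by fMRIPrep.

# Minimal sketch of a CI smoke test; NOT the actual CircleCI setup.
# Dataset paths are placeholders and the flags are assumed to follow the
# BIDS-Apps convention (bids_dir, output_dir, analysis level).
import subprocess

VARIANTS = [
    ('/data/ds000005', ['--fs-no-reconall']),  # variant 1: no surface reconstruction
    ('/data/ds000054', []),                    # variant 2: reuse pre-computed FreeSurfer results
    ('/data/ds000210', ['--fs-no-reconall']),  # variant 3: multi-echo, no surface reconstruction
]

def run_variant(bids_dir, extra_flags):
    """Run one sub-sampled fMRIPrep workflow variant and fail loudly on error."""
    cmd = ['fmriprep', bids_dir, '/out', 'participant',
           '--participant-label', '01'] + extra_flags
    subprocess.run(cmd, check=True)

if __name__ == '__main__':
    for bids_dir, flags in VARIANTS:
        run_variant(bids_dir, flags)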

There are several software aspects that may be worth reflecting on:

- Dependencies on other software and their consequences. The primary dependency is Nipype, of course. What happens if Nipype updates its API? Could you easily update along? Or could they at some point be so intertwined that they hold each other in an update gridlock? Or is the dependency one-way? Noteworthy secondary dependencies? Maybe this is too detailed, but it’s a line of thought worth pursuing.

As described in lines 211-219, fMRIPrep leverages containers to “freeze” software environments, making sure that a particular fMRIPrep release is always run with a given set of versions of its dependencies. Spot on, the reviewer points out Nipype as a potential failure point because its rapid evolution may lead to API changes. In such an instance, users can “pin” software to specific versions that are known to work (very much as with any other software product). However, Nipype is fundamental to fMRIPrep and that strategy might become unacceptable. On one hand, given the overlap of developers between both projects, Nipype and fMRIPrep have been feeding into each other since the inception of fMRIPrep. On the other hand, when the API of Nipype has not advanced fast enough for fMRIPrep to implement new features, we have used a fork strategy, maintaining an fMRIPrep-”flavored” version of Nipype for internal use by the tool. This technique (including Nipype within fMRIPrep itself) was used for more than a year during the development of fMRIPrep.
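
As an illustration of the pinning strategy mentioned above, a Python package can constrain the Nipype release it installs against in its setup.py. This is a minimal sketch; the version numbers are hypothetical and do not reflect the actual fMRIPrep dependency specification.

# Illustrative only: hypothetical version pins in a setup.py;
# the real fMRIPrep dependency specification may differ.
from setuptools import setup, find_packages

setup(
    name='example-pipeline',
    version='0.1.0',
    packages=find_packages(),
    install_requires=[
        'nipype==1.0.4',   # exact pin to a known-good release (hypothetical number)
        'nibabel>=2.2.1',  # looser lower bound for a less critical dependency
    ],
)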

- It may be too detailed to talk about, e.g., software design patterns when discussing the architecture, but at the moment the programming language in which fMRIPrep is written isn’t even mentioned.

As for OR2-1, we considered that these details are sufficiently explained within the documentation website and that we could omit them from the manuscript. However, we agree with the reviewer on the relevance of these details.

- Is fMRIPrep Nipype-only, or is there also a lot of custom glue code? Any policy/preference for doing it ‘the Nipype way’ or just an ‘anything that works’ way?

FMRIPrep mostly uses “Nipype-like” code; code-style guidelines state the policy for contributions.
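
To make the distinction concrete, here is a generic sketch of “the Nipype way” using the standard Nipype API; it is an illustration only, not code taken from fMRIPrep. Each processing step is a Node wrapping a tool interface, and a Workflow encodes the data flow instead of ad-hoc glue scripts.

# Generic illustration of "the Nipype way" (not fMRIPrep source code):
# steps are Nodes wrapping tool interfaces, and a Workflow encodes the data flow.
from nipype.pipeline import engine as pe
from nipype.interfaces import fsl, utility as niu

wf = pe.Workflow(name='toy_preproc')
inputnode = pe.Node(niu.IdentityInterface(fields=['bold_file']), name='inputnode')
hmc = pe.Node(fsl.MCFLIRT(save_plots=True), name='head_motion_correction')
skullstrip = pe.Node(fsl.BET(mask=True), name='skullstrip')

wf.connect([
    (inputnode, hmc, [('bold_file', 'in_file')]),
    (hmc, skullstrip, [('out_file', 'in_file')]),
])
# wf.run()  # executes the graph once 'bold_file' is set on the inputnode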

- What criteria do you have for contributions?

We have added a dedicated section to the Online Methods in order to address this question.

Peer-reviewed development
GitHub provides a powerful tool, called “pull-requests”, to easily screen the quality of code changes by contributors. Pull-requests have a dedicated forum accessible via the web, where collaborators can interact, enabling the necessary conversation and oversight to ensure the quality of contributions. Collaborators are aided by a review template to ensure all relevant aspects of the feature have been properly implemented and documentation has been included. The review template and the process of pull-request review are inspired by those of The Journal of Open Source Software [32]. An example is available here [https://github.com/poldracklab/fmriprep/pull/1170#pullrequestreview-135163343].

- The paper contains a comprehensive explanation of the components of fMRIPrep. This is useful, but I feel a small side note is warranted. The authors rightfully pride themselves on adopting new insights promptly (lines 203-204), but should thereby acknowledge that the current description may soon be rendered invalid as well. I do not at all see this as a weakness of the tool, but it is worth mentioning.

We have integrated this thoughtful comment by editing lines 202-203 as follows:
This paradigm allows the fast adoption of cutting-edge advances in fMRI preprocessing, which tend to render existing workflows (including fMRIPrep) obsolete.

- The authors claim increased adaptability and flexibility, but with that comes an implied responsibility to keep maintaining the work. How is the future of this project guaranteed? What happens if the lead developers get snatched by, e.g., Google? Or, more pessimistically, what’s the truck factor of this project (https://peerj.com/preprints/1233.pdf)?

The reviewer rightly points to the importance of the sustainability of open-source projects. This issue partially overlaps with the previous question, OR2-2, and thus we think it is covered by the following new paragraph:

Community-building to ensure a sustainable impact
Inspired by the rOpenSci experience [30], the development of fMRIPrep has been driven by open-source principles, with a strong focus on community building from the start. To measure how much a given project depends on just a few individuals, Avelino et al. proposed the so-called bus- or truck-factor [34]. As they define it, the truck-factor “is a measure of concentration of information in individual team members” and it “designates the minimal number of developers that have to be hit by a truck (or quit) before a project is incapacitated” [34], conveying that community-building is fundamental for open-source projects to ensure survival. Over 30 researchers have already contributed code, reported issues, or communicated through the GitHub platform. GitHub is an effective platform for sharing code and information, built atop the Git version-control system. An instance of its effectiveness in discussing and driving the progress of fMRIPrep is presented here [footnote link]. A larger number of researchers have also engaged with the community through the NeuroStars forums [footnote link]. Statistics about current adoption and usage of fMRIPrep are given below.

We calculated a truck factor of 1.0 for this project, although we had to use an abandoned version of the code (https://github.com/yamikuronue/BusFactor), since the current version failed for fMRIPrep and for other projects given as examples to demonstrate the truck-factor calculation code. We also unsuccessfully tried to include fMRIPrep in the gittrends.io database, but that project seems abandoned as well. For these reasons, we did not have enough confidence to include the truck-factor calculation in the paper. However, we now provide figures that support the sustainability of the project and the surprisingly large number of peers that have participated in the community.
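
For readers who want a sense of how such a number can be obtained, the sketch below estimates a simplified truck factor from git history. It is an assumption-laden approximation (each file is attributed to the single author who touched it most often), not the degree-of-authorship algorithm of Avelino et al. [34].

# Simplified truck-factor estimate from git history (a sketch, NOT the
# Avelino et al. degree-of-authorship algorithm): each file is attributed
# to the author who touched it most often, and authors are removed greedily
# until more than half of the files are "orphaned".
import subprocess
from collections import Counter, defaultdict

def file_authors(repo_path='.'):
    """Map each file in the history to a Counter of commit authors that touched it."""
    log = subprocess.run(
        ['git', '-C', repo_path, 'log', '--name-only', '--pretty=format:@%an'],
        capture_output=True, text=True, check=True).stdout
    per_file = defaultdict(Counter)
    current = None
    for line in log.splitlines():
        if line.startswith('@'):
            current = line[1:]          # new commit: remember its author
        elif line.strip() and current:
            per_file[line.strip()][current] += 1
    return per_file

def truck_factor(repo_path='.', orphan_threshold=0.5):
    """Smallest number of top authors whose removal orphans > threshold of files."""
    per_file = file_authors(repo_path)
    owner = {f: counts.most_common(1)[0][0] for f, counts in per_file.items()}
    files_per_author = Counter(owner.values())
    total = len(owner)
    removed, orphaned = 0, 0
    for _author, n_files in files_per_author.most_common():
        if orphaned > orphan_threshold * total:
            break
        removed += 1
        orphaned += n_files
    return removed

if __name__ == '__main__':
    print('Estimated truck factor:', truck_factor('.'))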

I would like to raise a point concerning the glass-box philosophy. A mechanism/tool/script (a ‘box’) can be transparent in function and/or transparent in its output. Now I suppose I got tangled up in my own preconception that you’d mean the former, but the authors define it more or less as the latter in the introduction. This is in line with the extensive set of visual reports that are produced by fMRIPrep. But however clear the output looks, if the tool does not invite the user to inspect the insides of the machinery, the user may still remain in the dark as to what happens under the hood (e.g. FEAT and its spaghetti pipeline, https://timvanmourik.github.io/Porcupine/examples/FEAT-example). I think you’re in the position to claim that you’re also improving this, otherwise it couldn’t have been such a collaborative project, but this does not become clear from the paper.

We acknowledge the concern and have made extensive changes to the concluding remarks of the Discussion. In particular, we attempted to provide a clearer definition of the “glass-box” principles and how fMRIPrep implements them:

The rapid increase in the volume and diversity of data, as well as the evolution of available techniques for processing and analysis, presents an opportunity for significantly advancing research in neuroscience. The drawback resides in the need for progressively complex analysis workflows that rely on decreasingly interpretable models of the data. Such a context encourages “black-box” solutions that efficiently perform a valuable service but do not provide insights into how the tool has transformed the data into the expected outputs. Black-boxes obscure important steps in the inductive process mediating between experimental measurements and reported findings. This way of moving forward risks producing a future generation of cognitive neuroscientists who have become experts in using sophisticated computational methods, but have little to no working knowledge of how their data were transformed through processing. Transparency is often identified as a treatment for these problems. FMRIPrep subscribes to “glass-box” principles, which are defined in opposition to the many different facets or levels at which black-box solutions are opaque.

The visual reports that fMRIPrep generates are a crucial aspect of the glass-box. Their quality-control checkpoints represent the logical flow of preprocessing, allowing scientists to critically inspect and better understand the underlying mechanisms of the workflow. A second transparency element is the citation boilerplate, which formalizes all details of the workflow and provides the versions of all involved tools along with references to the corresponding scientific literature. A third asset for transparency is the thorough documentation, which delivers additional details on each of the building blocks that are represented in the visual reports and described in the boilerplate. Further, fMRIPrep has been open source since its inception: users have access to all the incremental additions to the tool through the history of the version-control system. The use of GitHub (https://github.com/poldracklab/fmriprep) grants access to the discussions held during development, allowing the retrieval of how and why the main design decisions were made. GitHub also provides an excellent platform to foster the community and offers useful tools such as source browsing, code review, bug tracking and reporting, submission of new features and bug fixes through pull-requests, etc. The modular design of fMRIPrep contributes to its flexibility and helps transparency, as the main features of the software are more easily accessible to potential collaborators. In combination with coding-style and contribution guidelines, this modularity has enabled multiple contributions by peers and the creation of a rapidly growing community that would be difficult to nurture behind closed doors. A number of existing tools have implemented elements of the ‘glass-box’ philosophy (for example, visual reports in FEAT, documentation in C-PAC, the open-source community of Nilearn), but the complete package (visual reports, educational documentation, reporting templates, collaborative open-source community) is still rare among scientific software. FMRIPrep’s transparent and accessible development and reporting aim to better equip fMRI practitioners to perform reliable, reproducible statistical analyses with a high-standard, consistent, and adaptive preprocessing instrument.
We also agree with the perception that, by strongly linking transparency (the “glass-box”) to the visual reports, we were dismissing other types of transparency (“... can be transparent in function and/or transparent in its output ...”). For that reason, in addition to rewriting the final part of the Discussion, we have used more precise wording when linking reports and transparency. For example, line 70 stated that “These reports exemplify the “glass-box”...”, and now it reads “These reports contribute to the “glass-box”...”. We removed the modifiers of “maximal transparency” in lines 288-289 and left the precise definition to the new text in the Discussion. Similarly, the aspects contained in lines 299-304 are now included later in the Discussion. We have also rewritten the caption for Figure 2 to reiterate this aspect:

Figure 2. The visual reports ease quality control and help users understand the processing flow. The reports contribute to the transparent operation of the tool because they reflect the multiple steps that the workflow comprises, and how these elements are intertwined.

The reason I’m pointing this out is that, paradoxically, the ease with which one can use a tool may lead to a decreased understanding, much related to Marder et al. 2015 [111]. In this light, it may be worth reflecting on the dangers of making a tool analysis-agnostic. The fact that fMRIPrep automatically discovers the types of scans is very useful, but it raises the question: is preprocessing only a means to an end, or is it desirable that the user knows what’s happening under the hood?

We profoundly agree with the reviewer that ease of use also makes it easy to use the tool as a “black-box”: not because it is opaque from the outside, but because the researcher uses it blindly (without looking inside the box). However, we think this paradox belongs in a higher-level discussion outside the scope of this particular paper, as it addresses one of the most important threats to reproducibility. We touch on that higher-level conversation in the Discussion (lines 359-370 of the revised manuscript). We feel that aiming at the maximal transparency of fMRIPrep (lines 371-389) is the only way we can address this paradox.

The process by which fMRIPrep is updated (illustrated in Fig. 4) is self-reflexive, and it is further convincingly shown and well illustrated that its performance is superior to that of the standard FEAT analysis.

I had some minor comments, but I see that all typos are already addressed in the review of Samuel Nastase. The ones I found are only a subset of his.