Completed on 9 Apr 2017 by David C. Norris. Sourced from http://biorxiv.org/content/early/2017/04/03/123240.

As I understand it, this paper addresses the following problem in current pharmacometrics practice: For models with any substantial degree of complexity, the unavailability of formal theoretical results regarding parameter identifiability necessitates the development of pragmatic, simulation-based approaches for exploring the identifiability properties of models. The paper presents such an approach, represented in Figure 2.

In this Figure, four models are examined *in mean*, through a simulation-based sensitivity analysis procedure. This is a most healthy suggestion. I believe generally that the world would be a much better place if no one ever estimated the parameters of a statistical model without the 'due diligence' of time spent simulating from that model. Certainly, several egregious examples of statistical malpractice I have encountered in recent years (in the social sciences) simply could not have occurred if those guilty had considered their models at the level of depth required to simulate from them.
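To make concrete what I mean by simulating 'in mean' (that is, without residual noise), here is a minimal sketch of a one-at-a-time sensitivity sweep. It is not the author's code: the one-compartment model with Michaelis-Menten elimination, the function names `simulate` and `sensitivity`, and all parameter values are assumptions chosen purely for illustration.

```python
import math

def simulate(dose, vmax, km, t_end=5.0, n_steps=500):
    """Return C(t_end) for dC/dt = -vmax*C/(km + C), C(0) = dose, via classical RK4.

    Hypothetical toy model, not one of the four models in the paper.
    """
    def rate(c):
        return -vmax * c / (km + c)

    dt = t_end / n_steps
    c = dose
    for _ in range(n_steps):
        k1 = rate(c)
        k2 = rate(c + 0.5 * dt * k1)
        k3 = rate(c + 0.5 * dt * k2)
        k4 = rate(c + dt * k3)
        c += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return c

def sensitivity(name, base, factors=(0.1, 10.0)):
    """Largest |log10| change in C(t_end) when one parameter is scaled by each factor."""
    ref = simulate(**base)
    worst = 0.0
    for f in factors:
        params = dict(base)
        params[name] = base[name] * f
        worst = max(worst, abs(math.log10(simulate(**params) / ref)))
    return worst

# Illustrative values only: a dose far above km, so elimination is saturated.
base = {"dose": 100.0, "vmax": 1.0, "km": 0.01}
sens_vmax = sensitivity("vmax", base)
sens_km = sensitivity("km", base)
print(f"vmax: {sens_vmax:.4f}   km: {sens_km:.6f}")
```

In this toy setting the simulated concentration responds clearly to `vmax` but hardly at all to `km`, which is the kind of observation that would flag `km` as a candidate for practical non-identifiability under that design, before any parameter is ever estimated.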

I believe these 4 models examined by Andrew Stein are formally *nested*; that is, each successive model in the sequence (**Full**, **QSS**, **QSS-CT**, **MM**) can be derived from its predecessor by imposing a suitable constraint. With each imposed constraint, a parameter is expunged. Thus, **QSS** is derived from **Full** by imposing the quasi-steady-state constraint given in the (unnumbered) equation at the top of page 5. With the imposition of this constraint, the parameters *k_on* and *k_off* are replaced by a new parameter *Kss := (k_off + k_eCR)/k_on*. (It seems to me that one may view this as formally equivalent to taking the limit *k_off --> 0* and substituting *k_eCR/k_on* for *Kss* everywhere it appears in **QSS**, so that this restriction effectively amounts to zeroing the *k_off* parameter of **Full**.) The passage from **QSS** to **QSS-CT** is indeed described explicitly (at the bottom of page 5) as implementable by a simple equality constraint *k_eCR := k_eR*. In terms of the 'lumped parameters' laid out at the bottom of page 6, this would seem equivalent to the constraint *R_0 := Rtotss*, which explains why *R_0* 'drops out' in the transition from **QSS** to **QSS-CT** in Figure 2. It is not entirely clear to me whether or how **MM** derives from **QSS-CT** in a similar manner, but my intuition is that **MM** amounts to neglecting altogether the presence of receptors that are synthesized and endocytosed. This leads me to suspect that **MM** might be obtained somehow in a limiting process as *Rtotss --> 0*, provided that other quantities are suitably constrained. Is this the case?

If the 4 models are in fact (as I suspect) formally nested, the author should explicitly state this and indicate how this is so. Ideally, the 4 models might be exhibited in equations lined up so as to reveal this nestedness formally. If I am wrong about the nestedness of these models, then this misunderstanding should be warded off explicitly. For example, can an example of **MM** be produced which is demonstrably *not* a special case of **QSS-CT**?
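As a sketch of the kind of lined-up display I have in mind, the chain of constraints as I have read them might be laid out as follows. Only the definition of *Kss* and the constraint *k_eCR := k_eR* are taken from the paper; the second and fourth lines are my own conjectures and should be corrected if wrong.

```latex
\begin{align*}
\textbf{Full} \to \textbf{QSS}:
  &\quad K_{ss} := \frac{k_{off} + k_{eCR}}{k_{on}}
  &&\text{(replaces } k_{on},\ k_{off}\text{)} \\
\text{my reading}:
  &\quad k_{off} \to 0 \;\Rightarrow\; K_{ss} \to \frac{k_{eCR}}{k_{on}}
  &&\text{(conjectured equivalence)} \\
\textbf{QSS} \to \textbf{QSS-CT}:
  &\quad k_{eCR} := k_{eR}
  &&\text{(apparently equivalent to } R_0 := R_{totss}\text{)} \\
\textbf{QSS-CT} \to \textbf{MM}:
  &\quad R_{totss} \to 0 \ ?
  &&\text{(conjectured, other quantities suitably constrained)}
\end{align*}
```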

Whether or not I am correct about the formal nestedness of the 4 models, Figure 2 strongly suggests that the models are *for all practical purposes* nested, since the figures in each row seem to be identical no matter how closely I inspect them. (The sole exception to this observation proves the rule: the slight difference for the **QSS-CT** plot in the *Rtotss* row would, I believe, vanish under a reparametrization that replaced *R_0* with *Rdiff := Rtotss - R_0*. In this case, the passage from **QSS** to **QSS-CT** would be accomplished by a simple 'dropping out' *Rdiff --> 0* rather than the more entangled constraint *R_0 := Rtotss*.) In light of this great redundancy in Figure 2, the 4th column (**Full**) seems to suffice for the whole 2-D array presented. A more compact 1-D presentation of Figure 2 might be preferred. Alternatively, perhaps the 2 available dimensions on the page could be more fully exploited to generate an even more informative 2-D exploration of the models. (I wonder if a 2-D plot array would be more appropriate for comparing a larger set of models where more complex patterns of nesting impose only a partial ordering.) Highly informative arrays of plots are an excellent way to present the inner workings of models; I am a great fan, for example, of the 'partial effects plots' that can be produced from Frank Harrell's R package 'rms'. I encourage the author to press further in this direction to find ever more vivid representations of the pragmatic, simulation-based model exploration approach he advances here. I wonder, in particular, whether some useful inspiration might be found in the 'pairs' plots typically used as MCMC diagnostics, an example of which can be found here http://www.rforscience.com/.... (Certainly I have myself discovered identification problems in my own model specifications from examining such plots.)

Although these (I presume) nested models are presented in the Methods section in order of generality from greatest to least, the parameters employed in the exercise seem to derive (quite appropriately, given the author's intent to nurture intuition) from the simplest models, and bubble upward into the more general ones. In light of (what I presume is) the nested nature of these models, a term like 'shared parameters' might be more intuitive than 'lumped'. (But if 'lumped' is standard terminology in pharmacometrics that I am unfamiliar with, then I certainly yield this point.)

Turning now to some larger conceptual and even epistemologic issues, I would like to question whether the definition Andrew Stein employs for 'practical identifiability' serves him well, or might be replaced with a better one. Stein draws his definition from his citation #5, a 2009 paper by Raue et al. where a *profile likelihood* approach to identifiability is taken. The relatively abstract and arbitrary definition of 'practical identifiability' employed in that paper seems chosen specifically to make this technical approach 'practical'. I am rather inclined to plumb the word 'practical' for its truer meaning in a context where pharmacometrics aims to support *business decisions* in pharmaceutical development -- on the understanding, of course, that these business decisions ultimately anticipate *medical* decision-making. Thus, I would be inclined to regard a parameter as 'practically identified' if we can learn enough about it to make a **decision**. The form of such decisions -- which in general may range from deciding on a design of the next experiment to deciding whether to terminate a drug development program -- need not be prejudged or prefigured in advocating a pragmatic approach such as Stein advocates in this paper. The form of the decision can remain extrinsic to the approach, to be 'plugged in' by the pharmacometrician in each application of the approach. It seems to me that, in certain circumstances, confining log(param) to a half-line -- rather than to a closed interval as required under the Raue et al. notion of 'practical identifiability' -- may well suffice to meaningfully inform a business decision. To me, in fact, the most exciting thing in this paper is the rejection of large steady-state receptor densities by the data's falsification of the (simulation-based) prediction of a terminal phase as shown in the *Rtotss* row of Figure 2.
(What if the upper bound on *Rtotss* thus available tells us that we have to design a more expensive experiment to get more information on the actual receptor density? We have surely learned something, in that case.)
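The distinction between a closed interval and a half-line can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `classify_profile` is hypothetical, the profiles are synthetic rather than taken from the paper, the threshold 3.84 is the usual 95% chi-square quantile with one degree of freedom, and a profile that is still under the threshold at the edge of the grid is treated as unbounded in that direction.

```python
def classify_profile(profile, threshold=3.84):
    """Classify a profile of -2*log-likelihood on an increasing grid of log(param).

    Assumption: if the profile has not risen above `threshold` (relative to its
    minimum) by the edge of the grid, the parameter is treated as unbounded there.
    """
    best = min(profile)
    bounded_below = profile[0] - best > threshold
    bounded_above = profile[-1] - best > threshold
    if bounded_below and bounded_above:
        return "closed interval"
    if bounded_above:
        return "half-line: upper bound only"
    if bounded_below:
        return "half-line: lower bound only"
    return "no bound"

# Synthetic profiles on a grid of log(param) values (illustrative only).
grid = [-3, -2, -1, 0, 1, 2, 3]
two_sided = [4 * x * x for x in grid]          # rises on both sides
upper_only = [4 * max(0, x) ** 2 for x in grid]  # flat to the left, rises to the right
flat = [0.0 for _ in grid]                      # data say nothing about the parameter

labels = [classify_profile(p) for p in (two_sided, upper_only, flat)]
print(labels)
```

Under the closed-interval criterion, only the first profile would count as 'practically identifiable'. The second is the case I find interesting: an upper bound alone (as for *Rtotss* here) may already be enough to support a decision, and whether it is enough depends on the decision, not on the profile.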

This exciting development brings to mind the decision-theoretic/falsificationist *modus operandi* celebrated in Platt's famous piece on 'Strong Inference' http://science.sciencemag.o.... What is commonly called the 'hypothetico-deductive' model of scientific inquiry contrasts with the 'inductivist' spirit evident in much of this paper's language. Andrew Stein discusses pharmacometric modeling in terms that suggest a view of it as an exercise in curve-fitting, by which a model slurps up knowledge from data into its parameters. Karl Popper over his many years formulated many arguments to the effect that inductive inference is a "myth", the most shocking of which is perhaps the Popper-Miller argument http://www.nature.com/natur.... I offer this latter citation mainly to provide a full and proper context for my comments here, and not necessarily as 'suggested reading' for Andrew Stein. But I would very much like to encourage Stein to read the Platt paper, and to consider how he might recast his discussion in language that bears the spirit of 'Strong Inference'. I rather suspect that a strong-inference mindset would prove far more habitable for his proposed simulation-based approach to identifiability, and would better support its further growth and evolution.

I am very glad to have had the opportunity to read this stimulating paper, and I look forward to following its further development!