
In an article in Biometrical Journal, Heinze et al. (2023) discussed phases of methodological research in biostatistics. The authors are all members of the international STRATOS (STRengthening Analytical Thinking for Observational Studies) Initiative, whose goal has been to provide guidance on relevant methodological topics in the design and analysis of observational studies for specialist and non-specialist audiences. Their notion of phases was motivated by drug development, which has used the “phases of research” framework for decades. These phases are: phase I (usually animal models or, as they say, intolerability), phase II (safety and/or efficacy), phase III (efficacy of the drug of interest compared to standard of care or placebo), and phase IV (long-term effectiveness in the real world). Phases have been defined in a similar way for prognostic factor research. This motivated the authors to define “phases with well-defined aims” for building the evidence base in methodological research.

They proposed a four-phase framework for this. They proposed Phase I to be methodological. Methodological Phase II may aim to demonstrate that a method can be used with caution in an applied setting that is not completely identical to the developer’s target setting. In looking at biostatistical journals, they noted that Phase II studies were abundant in the literature. Methodological Phase III was recommended for applying a method across different types of studies to help assess its validity. Methodological Phase IV was seen as working with an established method; at this stage, breakdown scenarios would be settings where the method is suboptimal and that had not turned up before.

It is unusual to see all four phases of research on a method published, so they give two examples. One was Firth’s correction. As the authors stated, Firth’s correction is a bias correction method for maximum likelihood estimators. As a side effect, the correction provides finite estimates of regression coefficients in generalized linear models even for data constellations where the maximum likelihood estimates do not exist. The different phases of this research all took place, but at different points in time. The second example they presented was Predictive Mean Matching (PMM), a type of “hot deck” procedure that multiply imputes each missing value with a “borrowed” observed value. In Phase I, Roderick Little first introduced this procedure, which uses a model to replace missing values only with values that were actually observed. In Phase II, he, along with another author (Heitjan and Little, 1991), used predictive mean matching to multiply impute seatbelt use and blood alcohol content in the Fatal Accident Reporting System database, and they also ran some simulations. In Phase III, Schenker and Taylor (1996) conducted a more extensive simulation study. Finally, in Phase IV, Morris et al. (2014) reviewed the existing literature on PMM and considered how to improve it.
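
To make the “borrowed value” idea behind PMM concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the procedure used in any of the cited papers: it handles a single variable with missing values (marked as None), uses one predictor in a simple linear model, and skips the step in which proper multiple imputation also draws the regression parameters to reflect their uncertainty. The function names pmm_impute and multiply_impute and the parameters k and m are hypothetical.

```python
import random


def pmm_impute(x, y, k=5, rng=None):
    """Fill each None in y with the observed y of a donor that has a similar predicted mean.

    Simplified sketch: assumes one predictor x with no missing values, at least
    two distinct x values among the observed cases, and no parameter draw.
    """
    rng = rng or random.Random()
    observed = [i for i, value in enumerate(y) if value is not None]
    missing = [i for i, value in enumerate(y) if value is None]

    # Fit y = intercept + slope * x by least squares on the observed cases.
    n = len(observed)
    mean_x = sum(x[i] for i in observed) / n
    mean_y = sum(y[i] for i in observed) / n
    sxx = sum((x[i] - mean_x) ** 2 for i in observed)
    sxy = sum((x[i] - mean_x) * (y[i] - mean_y) for i in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Predicted means for every case, observed or missing.
    predicted = [intercept + slope * xi for xi in x]

    imputed = list(y)
    for i in missing:
        # The k observed cases whose predicted means are closest to this case's.
        nearest = sorted(observed, key=lambda j: abs(predicted[j] - predicted[i]))[:k]
        # Borrow the actually observed value of one randomly chosen donor.
        imputed[i] = y[rng.choice(nearest)]
    return imputed


def multiply_impute(x, y, m=5, k=5, seed=0):
    """Create m completed copies of y, each with independently drawn donors."""
    rng = random.Random(seed)
    return [pmm_impute(x, y, k=k, rng=rng) for _ in range(m)]
```

Because every imputed value is copied from a donor, the completed data can only contain values that were actually observed, which is the property described above. Repeating the donor draw m times gives the multiple completed data sets.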

The authors then reviewed a volume from each of four biostatistical journals to see if the pilot phase was actually published. While Biometrika had Phase I studies, Phase II dominated Biometrical Journal, Statistics in Medicine, and Statistical Methods in Medical Research. They also found that only a few papers could be classified as Phase IV. Another issue was that often only a single evaluator was used for the review, so personal judgement could have clouded it. The authors called for greater use of their framework to engender more efficient communication and better peer regulation of reviews. The community at large would also benefit from Phase IV studies, so that statistical folks can understand how well a tool is working in practice. In their conclusion, they stated their hope that their framework can provide a constructive way of thinking about what research would move the development of a method forward.


Written by,


Usha Govindarajulu


Keywords:  biostatistics, research, phases



Heinze G, Boulesteix A-L, Kammer M, Morris TP, White IR, and the Simulation Panel of the STRATOS Initiative (2023). “Phases of methodological research in biostatistics – Building the evidence base for new methods” Biometrical Journal,

Heitjan, D. F., & Little, R. J. A. (1991). Multiple imputation for the fatal accident reporting system. Journal of the Royal Statistical Society, Series C (Applied Statistics), 40(1), 13–29.

Morris, T. P., White, I. R., & Royston, P. (2014). Tuning multiple imputation by predictive mean matching and local residual draws. BMC Medical Research Methodology, 14, 75.

Schenker, N., & Taylor, J. M. G. (1996). Partially parametric techniques for multiple imputation. Computational Statistics & Data Analysis, 22(4), 425–446.