Statistical issues in general (Primary causal effects with unmeasured confounders)

May 20, 2026

Unmeasured confounding is a long-standing methodological challenge for causal inference: such confounding mechanisms can violate the ignorability assumption that is a bedrock of the field. The implications are crystallized in Simpson’s paradox (Simpson, 1951), which demonstrates how unmeasured confounders can systematically distort both outcome modelling and covariate balance, compromising conventional causal analyses. Various strategies to mitigate these issues have been developed, from E-values to instrumental variables. Kong, Chen, and Zhou (2026) look into leveraging the information about latent confounding contained in primary and secondary outcomes to achieve more precise average treatment effect (ATE) estimation. They make two contributions. First, they propose a novel method that improves the accuracy of the estimated treatment effect on the primary outcome by leveraging additional latent information from multiple secondary outcomes. Second, under an ignorability assumption that allows for unmeasured confounding, their constructed proxy variables inherit the critical conditional mean independence property, which ensures both identification and favorable asymptotic properties, namely consistency and asymptotic normality.

In their methodology, they developed a structural equation model that includes all covariates and the unmeasured confounders. Their first key step was to remove from the outcomes the information associated with the known covariates. Factor analysis was then applied to the resulting residual matrix, and the extracted common factors, together with the known confounders, were used as covariates for inverse probability weighting (IPW) to achieve covariate balance, ultimately yielding an IPW-type estimator. The factor scores were estimated using the Bartlett method.
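The steps above can be sketched on simulated data. This is only an illustrative sketch, not the authors' implementation, and everything specific in it is an assumption of mine: the data-generating model, a single latent confounder u, secondary outcomes that are unaffected by treatment, factor extraction from the secondary-outcome residuals only, a principal-component approximation to the factor loadings, and the normalized (Hajek) form of the IPW estimator. The helper names fit_logistic, hajek_ipw, and bartlett_scores are hypothetical.

```python
import numpy as np

def fit_logistic(Z, t, n_iter=25):
    """Newton-Raphson logistic regression; Z must include an intercept column.
    Returns fitted propensity scores."""
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        W = p * (1.0 - p)
        beta += np.linalg.solve(Z.T @ (Z * W[:, None]), Z.T @ (t - p))
    return 1.0 / (1.0 + np.exp(-Z @ beta))

def hajek_ipw(y, t, e):
    """Normalized IPW estimate of the ATE given propensity scores e."""
    w1, w0 = t / e, (1 - t) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def bartlett_scores(R, k=1):
    """Bartlett factor scores from a residual matrix R (n x p).
    Loadings use a principal-component approximation (an assumption here,
    not necessarily the estimator used in the paper)."""
    C = np.cov(R, rowvar=False)
    vals, vecs = np.linalg.eigh(C)
    idx = np.argsort(vals)[::-1][:k]
    L = vecs[:, idx] * np.sqrt(vals[idx])              # p x k loadings
    psi = np.clip(np.diag(C) - np.sum(L**2, axis=1), 1e-6, None)  # uniquenesses
    LtPi = L.T / psi                                   # Lambda' Psi^{-1}
    # Bartlett: (Lambda' Psi^{-1} Lambda)^{-1} Lambda' Psi^{-1} r, per row r of R
    return np.linalg.solve(LtPi @ L, LtPi @ R.T).T     # n x k scores

# Simulated data (all coefficients are arbitrary illustrative choices).
rng = np.random.default_rng(0)
n, tau = 5000, 1.0
x = rng.normal(size=n)                                 # observed covariate
u = rng.normal(size=n)                                 # unmeasured confounder
t = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x + u)))).astype(float)
y = tau * t + x + 2 * u + rng.normal(size=n)           # primary outcome
b = np.array([1.0, 0.8, 1.2, 0.9, 1.1])                # loadings of u
S = 0.5 * x[:, None] + u[:, None] * b + rng.normal(size=(n, 5))  # secondary outcomes

# Step 1: remove the information in the known covariate from the outcomes.
Zx = np.column_stack([np.ones(n), x])
R = S - Zx @ np.linalg.lstsq(Zx, S, rcond=None)[0]

# Step 2: factor analysis on the residual matrix, Bartlett scores as proxies.
F = bartlett_scores(R, k=1)

# Step 3: IPW with and without the extracted factor in the propensity model.
est_naive = hajek_ipw(y, t, fit_logistic(Zx, t))
est_proxy = hajek_ipw(y, t, fit_logistic(np.column_stack([Zx, F]), t))
print(f"true ATE {tau:.2f}, covariates only {est_naive:.2f}, with factor {est_proxy:.2f}")
```

In this toy setup the factor score is a noisy stand-in for u, so adding it to the propensity model should remove much, though not all, of the confounding bias. The sign and scale of the extracted factor are arbitrary, which does not matter for the logistic propensity model.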

A key result that underpins their approach is a theorem stating that the factors extracted through factor analysis play the same role as the original latent variables. Under the independence assumption among variables, the theoretical analysis demonstrated that their model achieved optimal performance when all parameters attained statistical significance, thereby ensuring accurate propensity score estimation under correct model specification.

They also applied a selection process to the secondary outcomes, to ensure that the selected outcomes carry sufficient information on the unmeasured confounders while avoiding an unnecessarily large number of instrumental variables.

Their proposed estimator yielded substantially smaller bias than the one based solely on observed covariates. They stated that this finding validated the effectiveness of their approach in handling nonlinear confounders, and they hypothesized that this is due to the overall additive nature of the model, which allows factor analysis to extract the relevant information effectively.

Their paper introduced a novel causal inference methodology for primary and secondary outcome settings that enables researchers to incorporate auxiliary outcome variables as a supplemental analytical step, enhancing model robustness with minimal additional effort. The methodology rests on applying factor analysis to outcome residuals after covariate adjustment and employing the resulting factor scores as proxies for latent confounders, rather than attempting to recover the confounders directly. The theoretical foundation is established through a general ignorability assumption and an alternative factor structure, demonstrating that the conditional expectation ignorability of the proxy confounders ensures consistency of the inverse probability weighting estimator. Regarding variability, they reported the variance expression for the IPW estimator. However, under their weaker ignorability condition, standard arguments for establishing the semiparametric efficiency bound do not directly apply; accordingly, the resulting variance is not expected to coincide with the semiparametric efficiency lower bound.

Written by,

Usha Govindarajulu

Keywords: causal inference, unmeasured confounding, factor analysis, ignorability, latent variable

References:

Kong D, Chen M, and Zhou Y (2026) “A Novel Secondary-Outcome Approach to Estimating Primary Causal Effects With Unmeasured Confounders” Biometrical Journal,

https://doi.org/10.1002/bimj.70139

https://onlinelibrary.wiley.com/cms/asset/21853089-c4dd-4624-84e1-b2000c5ae667/bimj70139-fig-0004-m.jpg
