Dynamic Causal Effects in a Nonlinear World: The Good, the Bad, and the Ugly
Michal Kolesár, Mikkel Plagborg-Møller
The good news: local projections and VARs still estimate something sensible (a positively weighted average of marginal causal effects) even in the presence of non-linearities in the DGP.
The bad news: purely data-based identification via heteroskedasticity or non-Gaussianity doesn't work in the presence of non-linearities.
The good news has a caveat – you generally need either the actual shock itself or at least an instrument/proxy for it. Identifying the shock via controls only works if the linearly residualized shock is non-linearly unpredictable by the controls, which likely isn't the case in most applications. This seems more damning for identification via controls than the authors emphasize, though I'm sure they understand that:
“we also show that when control variables are needed to isolate a true shock (i.e., recursive or Cholesky identification), then positive weights can only be guaranteed if the linearly residualized shock is nonlinearly unpredictable by the controls, which may be a strong assumption in the absence of detailed institutional knowledge and high-quality data.” (Kolesár and Plagborg-Møller, 2025, p. 3)
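A stylized simulation (my own construction, purely for intuition, not from the paper) shows how linear residualization can fail. Suppose the observed policy variable is x = w² + ε, where w is a control and ε the true shock. Since cov(w², w) = 0 for Gaussian w, linearly residualizing x on w removes essentially nothing, and the resulting "shock" is still contaminated by w². If the outcome also loads on w², the estimand is nonzero even though the true causal effect of ε is exactly zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

w = rng.standard_normal(n)    # control variable
eps = rng.standard_normal(n)  # true shock, independent of w
x = w**2 + eps                # observed policy variable
y = 0.0 * eps + w**2          # outcome: ZERO causal effect of the shock

# Linear residualization of x on w: cov(w**2, w) = 0, so the regression
# slope is ~0 and the residual remains nonlinearly predictable by w.
bw = np.cov(w, x)[0, 1] / np.var(w, ddof=1)
x_resid = x - bw * w
x_resid -= x_resid.mean()

# Slope of y on the residualized "shock": population value is
# Var(w^2) / (Var(w^2) + Var(eps)) = 2/3, despite the zero causal effect.
slope = np.cov(x_resid, y)[0, 1] / np.var(x_resid, ddof=1)
print(slope)  # ≈ 2/3
```

Nonlinear residualization (regressing x on w and w², say) would fix this toy case, but that is precisely the "nonlinearly unpredictable by the controls" requirement: you have to know the right nonlinear functional form, or have controls and data good enough that nothing nonlinear is left over.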
I love this passage:
“linearity-based estimators are useful even when economic theory predicts a nonlinear relationship between the shock and the outcome of interest. For example, if the outcome variable has limited support, such as when it is binary or censored (say, due to a zero lower bound), nonlinearities are inherently present. If one is interested in characterizing the nonlinearities, then it makes sense to model them, and it is of course always a good idea to plot the raw data regardless. However, if one is interested in an overall summary of marginal effects, then linear local projections and VARs are theoretically coherent estimators” (Kolesár and Plagborg-Møller, 2025, p. 3)
I read this as basically saying “even with weird outcome variables, local projections and VARs work, and you don’t need to bother characterizing the non-linearities explicitly.”
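The censored-outcome case is easy to check in a small simulation (again my own sketch, not the paper's). With a Gaussian shock x and outcome y = max(x, 0) + noise, Stein's lemma gives the population OLS slope as E[f′(x)] = P(x > 0) = 0.5, i.e. exactly a positively weighted average of the marginal effects f′(x) ∈ {0, 1}:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

x = rng.standard_normal(n)                             # observed shock
y = np.maximum(x, 0.0) + 0.5 * rng.standard_normal(n)  # censored outcome + noise

# OLS slope of y on x (with intercept): cov(x, y) / var(x).
# By Stein's lemma this converges to E[f'(x)] = P(x > 0) = 0.5.
slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
print(slope)  # ≈ 0.5
```

So the linear regression does not "break" on the kinked DGP; it quietly averages the marginal effects with sensible weights, which is all you need if a summary marginal effect is the target.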
The data-based estimators are basically useless here because they presume linearity of the structural model. If you attempt to fix that issue via a nonparametric approach, you end up with identified sets too wide to draw any conclusions at all:
“When there is a dearth of direct shock measures or proxies, applied researchers frequently resort to identification via heteroskedasticity (Sentana and Fiorentini, 2001; Rigobon, 2003; Lewbel, 2012). Unfortunately, we show that these estimation approaches are sensitive to the assumption that the structural model is linear: the estimand can easily be nonzero when there is no causal effect, or negative when the true shock has a uniformly positive effect on the outcome of interest. Fixing these issues while still delivering informative inference appears difficult, since a natural nonparametric generalization of the identification strategy yields very wide identified sets.” (Kolesár and Plagborg-Møller, 2025, p. 3)
“the nonparametric analogue of the identification assumptions yields an identified set so large that effectively any function of the data can be construed as a “shock”. Intuitively, the mere assumptions that the latent shocks are independent and non-Gaussian are vacuous in a nonparametric context: any collection of random variables can always be represented as some nonlinear function of independent uniformly distributed random variables.” (Kolesár and Plagborg-Møller, 2025, p. 4)
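The representation fact behind that last intuition is standard probability (my gloss, not the paper's notation): any random vector (Y_1, …, Y_n) can be built recursively from independent uniforms via conditional quantile functions,

```latex
Y_1 = F_{Y_1}^{-1}(U_1), \qquad
Y_2 = F_{Y_2 \mid Y_1}^{-1}(U_2 \mid Y_1), \qquad
Y_3 = F_{Y_3 \mid Y_1, Y_2}^{-1}(U_3 \mid Y_1, Y_2), \quad \dots
```

with U_1, …, U_n i.i.d. Uniform(0, 1). Uniforms are independent and non-Gaussian, so "the data are some nonlinear function of independent non-Gaussian shocks" imposes no restriction at all without further structure – any joint distribution qualifies.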