Granular IV

How to calculate granular instrumental variables

  1. Run a regression of the endogenous variable on the exogenous variable(s), controlling for time and unit fixed effects (alternatively, just unit effects).
  2. Run principal component analysis (PCA) on the residuals from the regression in (1) to extract common factors (time x components) and unit-specific loadings (components x units), using time as the rows (samples) and units as the columns (features) of the matrix. Any number of components can be used, though two is usually good enough.
  3. Subtract the dot product of the factors and loadings from the residuals to get the idiosyncratic shocks.
  4. Calculate a weighted (e.g. size-weighted) average of the idiosyncratic shocks, optionally excluding (weighted) shocks below a certain threshold.
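
The four steps above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: it assumes a balanced panel stored as time x unit arrays, a single exogenous variable, fixed effects removed by double demeaning, PCA done through a plain SVD, and a threshold applied to the absolute value of the weighted shocks. The function name `granular_iv`, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np


def granular_iv(y, x, weights, n_components=2, threshold=None):
    """Sketch of steps 1-4 for a balanced panel.

    y, x    : (T, N) arrays, endogenous and exogenous variable (time x unit)
    weights : (N,) size weights that sum to one
    """
    # Step 1: remove unit and time fixed effects by double demeaning
    # (exact for a balanced panel), then fit a pooled OLS slope.
    def demean(a):
        return (a - a.mean(axis=0, keepdims=True)
                  - a.mean(axis=1, keepdims=True) + a.mean())

    y_dm, x_dm = demean(y), demean(x)
    beta = (x_dm * y_dm).sum() / (x_dm ** 2).sum()
    resid = y_dm - beta * x_dm                          # (T, N)

    # Step 2: PCA via SVD with time as rows (samples), units as columns (features).
    centered = resid - resid.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    factors = U[:, :n_components] * s[:n_components]    # (T, k) common factors
    loadings = Vt[:n_components]                        # (k, N) unit loadings

    # Step 3: idiosyncratic shocks = residuals minus the common component.
    idio = centered - factors @ loadings

    # Step 4: weighted average, optionally zeroing out small weighted shocks
    # (assumption: "below threshold" means smaller in absolute value).
    weighted = idio * weights
    if threshold is not None:
        weighted = np.where(np.abs(weighted) >= threshold, weighted, 0.0)
    return weighted.sum(axis=1), idio


# Simulated panel with one common factor and a skewed size distribution.
rng = np.random.default_rng(0)
T, N = 40, 15
x = rng.normal(size=(T, N))
common = rng.normal(size=(T, 1)) * rng.normal(size=(1, N))
y = 0.5 * x + common + rng.normal(size=(T, N))
size = rng.pareto(1.5, size=N) + 1.0
weights = size / size.sum()

z, idio = granular_iv(y, x, weights)
print(z.shape, idio.shape)
```

The returned `z` is one instrument value per period. With unbalanced panels or several regressors, a proper fixed-effects estimator and a library PCA (such as the Scikit-learn or Statsmodels ones listed in the references) would be the more natural choice.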

The reason to run PCA is that the regression in step 1 almost certainly suffers from omitted variables, and PCA does a slightly better job of isolating the truly idiosyncratic component. I use a similar procedure in The Dark Matter of Software Valuations.

Connection to Olley-Pakes decomposition

The granular IV approach of Granular Instrumental Variables is closely related to the decompositions of @melitzDynamicOlleyPakesProductivity2015.

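In granular IV, the equal-weighted average of the shocks is subtracted from the size-weighted average. Writing $u_i$ for the shock of unit $i$ and $S_i$ for its size weight, with $\sum_i S_i = 1$ (notation chosen here for illustration):

$$z = \sum_{i=1}^{N} S_i u_i - \frac{1}{N} \sum_{i=1}^{N} u_i$$

In the Olley-Pakes decomposition, the overall weighted average is decomposed into the equal-weighted average and a covariance term between the weights and the quantity being averaged, with $\bar{u} = \frac{1}{N} \sum_i u_i$ and $\bar{S} = \frac{1}{N}$:

$$\sum_{i=1}^{N} S_i u_i = \bar{u} + \sum_{i=1}^{N} (S_i - \bar{S})(u_i - \bar{u})$$

This can be re-arranged:

$$\sum_{i=1}^{N} S_i u_i - \bar{u} = \sum_{i=1}^{N} (S_i - \bar{S})(u_i - \bar{u})$$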

Thus the difference in shocks from granular IV is effectively the covariance of the shocks and the size weights (scaled by the number of units). This is why the granular IV needs some skew in the size distribution to work: with equal size weights there is no variation in size, so the covariance between size and shocks is zero.
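
A quick numerical check of this identity, using made-up numbers for five units (the shocks and sizes below are purely hypothetical). It compares the size-weighted minus equal-weighted average with the sum of cross-products of demeaned weights and demeaned shocks, and then shows that the difference vanishes when all weights are equal.

```python
# Hypothetical shocks and sizes for five units; one large unit dominates.
shocks = [0.8, -0.3, 0.1, -0.5, 0.2]
sizes = [50.0, 20.0, 15.0, 10.0, 5.0]

n = len(shocks)
total_size = sum(sizes)
weights = [s / total_size for s in sizes]

# Granular IV: size-weighted average minus equal-weighted average.
size_weighted = sum(w * u for w, u in zip(weights, shocks))
equal_weighted = sum(shocks) / n
giv = size_weighted - equal_weighted

# Olley-Pakes covariance term: sum of (weight - mean weight) * (shock - mean shock).
mean_w = 1.0 / n
cov_term = sum((w - mean_w) * (u - equal_weighted)
               for w, u in zip(weights, shocks))

# With equal weights there is no size variation, so the difference is zero.
equal_w = [1.0 / n] * n
giv_equal = sum(w * u for w, u in zip(equal_w, shocks)) - equal_weighted

print(giv, cov_term, giv_equal)
```

The first two printed values agree up to floating-point error, and the third is zero up to floating-point error.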

One minor difference: when dealing with changes over time, granular IV uses the average change, while the Olley-Pakes decomposition uses the change in the average.

References

  - Granular Instrumental Variables
  - In Search of the Origins of Financial Fluctuations: The Inelastic Markets Hypothesis
  - Granular IV code per Romain Lafarguette
  - Scikit-learn PCA
  - Statsmodels PCA
