# Generalized Random Forests

*Susan Athey, Julie Tibshirani, Stefan Wager*

## Abstract

We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations. Following the literature on local maximum likelihood estimation, our method considers a weighted set of nearby training examples; however, instead of using classical kernel weighting functions that are prone to a strong curse of dimensionality, we use an adaptive weighting function derived from a forest designed to express heterogeneity in the specified quantity of interest. We propose a flexible, computationally efficient algorithm for growing generalized random forests, develop a large sample theory for our method showing that our estimates are consistent and asymptotically Gaussian, and provide an estimator for their asymptotic variance that enables valid confidence intervals. We use our approach to develop new methods for three statistical tasks: non-parametric quantile regression, conditional average partial effect estimation, and heterogeneous treatment effect estimation via instrumental variables. A software implementation, grf for R and C++, is available from CRAN.

When you have an estimator with low bias but high variance, averaging multiple noisy instances of that estimator reduces variance, and more so the less correlated the instances are with one another. It’s worth thinking about how to apply this in other arenas, such as local projections, which are known for their low bias but high variance. This logic would suggest estimating multiple impulse responses via local projections and then averaging them to get a lower-variance estimate. The key question is how to generate these estimates. I’d imagine two potential options: (1) estimate the same model on different subsamples of the data, or (2) estimate different models, each with a different set of controls, on the same data:

because individual trees $\hat{\mu}_b(x)$ have low bias but high variance, such averaging meaningfully stabilizes predictions
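To make the two options concrete, here is a minimal sketch under stated assumptions, not a worked-out method. The data are a simulated AR(1) with an observed i.i.d. shock, so the true response at horizon $h$ is `beta * rho**h`. The function names (`simulate`, `lp_design`, `average_over_subsamples`, `average_over_controls`) are hypothetical. Two further choices are mine: contiguous blocks stand in for "subsamples" so that serial dependence is preserved, and different lag lengths stand in for "different controls".

```python
import numpy as np


def simulate(T, rho=0.7, beta=1.0, seed=0):
    """Toy data (an assumption): y_t = rho*y_{t-1} + beta*shock_t + noise_t."""
    rng = np.random.default_rng(seed)
    shock = rng.normal(size=T)
    noise = rng.normal(size=T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = rho * y[t - 1] + beta * shock[t] + noise[t]
    return y, shock


def lp_design(y, shock, h, n_lags=1):
    """Regressors [1, shock_t, y_{t-1}, ..., y_{t-n_lags}] and outcome y_{t+h}."""
    t = np.arange(n_lags, len(y) - h)
    lags = [y[t - k] for k in range(1, n_lags + 1)]
    X = np.column_stack([np.ones(len(t)), shock[t]] + lags)
    return X, y[t + h]


def lp_coef(X, target):
    """OLS coefficient on the shock (column 1 of X)."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef[1]


def average_over_subsamples(y, shock, h, n_draws=100, block_frac=0.5, seed=0):
    """Option (1): same model, re-estimated on random contiguous blocks."""
    rng = np.random.default_rng(seed)
    X, target = lp_design(y, shock, h)
    block = int(block_frac * len(target))
    starts = rng.integers(0, len(target) - block + 1, size=n_draws)
    return np.mean([lp_coef(X[s:s + block], target[s:s + block]) for s in starts])


def average_over_controls(y, shock, h, lag_choices=(1, 2, 3, 4)):
    """Option (2): same data, models that differ in the number of lagged controls."""
    return np.mean([lp_coef(*lp_design(y, shock, h, p)) for p in lag_choices])


if __name__ == "__main__":
    y, shock = simulate(T=500)
    for h in range(4):
        print(h, average_over_subsamples(y, shock, h), average_over_controls(y, shock, h))
```

Whether either average actually beats the full-sample local projection is an open question here. Averaging tends to help unstable estimators like deep trees, and the gain for a linear OLS projection may be small, so this is a scaffold for checking the idea rather than evidence for it.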

This reminds me that random forests would be a natural choice of nuisance-function estimator for double/debiased machine learning (@chernozhukovDoubleDebiasedMachine2017) estimates of local projections. It still surprises me that more people haven’t considered this, especially given all the work on deriving more “causal” impulse responses via instrumental-variable variants of local projections (@stockIdentificationEstimationDynamic2018, @rameyMacroeconomicShocksTheir2016).
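As a rough sketch of what that could look like, and not something taken from the paper: in the partially linear, partialling-out form of double/debiased machine learning, the horizon-$h$ response would be a residual-on-residual regression. The notation is my own assumption. $s_t$ is the shock, $W_t$ the controls, and $\hat{\ell}_h$ and $\hat{m}$ are random-forest fits of $E[y_{t+h} \mid W_t]$ and $E[s_t \mid W_t]$, estimated with cross-fitting.

$$
\hat{\theta}_h = \frac{\sum_t \bigl(s_t - \hat{m}(W_t)\bigr)\bigl(y_{t+h} - \hat{\ell}_h(W_t)\bigr)}{\sum_t \bigl(s_t - \hat{m}(W_t)\bigr)^2}
$$

How to split folds for cross-fitting with serially dependent data is a separate problem that I leave open here.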