# Forecasting levels of log variables in vector autoregressions

*Gunnar Bårdsen, Helmut Lütkepohl*

## Abstract

Sometimes forecasts of the original variable are of interest, even though a variable appears in logarithms (logs) in a system of time series. In that case, converting the forecast for the log of the variable to a naïve forecast of the original variable by simply applying the exponential transformation is not theoretically optimal. A simple expression for the optimal forecast under normality assumptions is derived. However, despite its theoretical advantages, the optimal forecast is shown to be inferior to the naïve forecast if specification and estimation uncertainty are taken into account. Hence, in practice, using the exponential of the log forecast is preferable to using the optimal forecast.

The main point here is that the common recommendation to adjust the predictions of a log-linear model when converting them from logs back to levels may not actually be correct. Rather, you can naively exponentiate and do just fine:

the common practice of forecasting the logs of a variable and then obtaining a forecast of the original variable by applying the exponential function is a useful strategy in practice. … for typical economic variables, gains in forecast precision from using the optimal rather than the naïve forecast are not likely to be substantial. In fact, in practice the optimal forecast may well be inferior to the naïve forecast.

They arrive at this conclusion by first deriving a simple expression for the optimal forecast, then showing in simulation that the naive forecast yields a smaller RMSE than the forecast with the post hoc adjustment:

for variables which have typical features of some economic variables, using the optimal forecast is likely to result in efficiency losses if the forecast precision is measured by the root mean square error (RMSE).

The standard recommended adjustment is the following, which is straightforward to derive if you assume normal forecast errors: $E(\exp(x)) = \exp(\mu + \tfrac{1}{2}\sigma^{2})$
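To make the two candidate forecasts concrete, here is a minimal sketch that computes both and checks the log-normal mean formula by simulation. It assumes the log variable is exactly normal with known mean `mu` and variance `sigma2`; the function names and parameter values are illustrative and do not come from the paper.

```python
import math
import random


def naive_forecast(mu: float) -> float:
    """Exponentiate the log forecast directly: exp(mu)."""
    return math.exp(mu)


def optimal_forecast(mu: float, sigma2: float) -> float:
    """Mean of a log-normal variable: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + 0.5 * sigma2)


def simulated_mean_level(mu: float, sigma2: float,
                         n_draws: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of E[exp(x)] for x ~ N(mu, sigma2)."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    total = sum(math.exp(rng.gauss(mu, sigma)) for _ in range(n_draws))
    return total / n_draws


# Assumed values for illustration only.
mu, sigma2 = 1.0, 0.25
print("naive    :", naive_forecast(mu))
print("optimal  :", optimal_forecast(mu, sigma2))
print("simulated:", simulated_mean_level(mu, sigma2))
```

With known parameters the simulated mean should sit close to the adjusted value and above the naive one. The paper's argument is about what happens once `mu` and `sigma2` have to be estimated.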

There are three main reasons for this counterintuitive result:

- The standard adjustment assumes normal forecast errors, which may not hold in practice
- The adjustment factor depends on the variance of the forecast errors, which must be estimated, and that estimation introduces error of its own
- Stationary variables that are log transformed have forecast error variances that stay bounded even as the horizon goes to infinity, and those errors are small relative to the level of the variable, so the adjustment doesn’t do much
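The last point can be illustrated with a small sketch. It assumes a stationary AR(1) in logs, $x_t = \phi x_{t-1} + u_t$ with $|\phi| < 1$ and innovation variance $\sigma_u^2$, for which the $h$-step forecast error variance is $\sigma_u^2 (1 - \phi^{2h}) / (1 - \phi^2)$. The AR(1) setup and the parameter values are assumptions chosen for illustration, not the paper's VAR design.

```python
import math


def ar1_forecast_error_variance(phi: float, sigma_u2: float, h: int) -> float:
    """h-step forecast error variance of a stationary AR(1) with |phi| < 1."""
    if not abs(phi) < 1:
        raise ValueError("phi must satisfy |phi| < 1 for a stationary AR(1)")
    return sigma_u2 * (1.0 - phi ** (2 * h)) / (1.0 - phi ** 2)


def adjustment_factor(variance: float) -> float:
    """Ratio of the optimal to the naive level forecast: exp(variance / 2)."""
    return math.exp(0.5 * variance)


# Hypothetical parameters: a persistent log series with a 2% innovation std dev.
phi, sigma_u2 = 0.9, 0.02 ** 2
for h in (1, 4, 12, 40):
    var_h = ar1_forecast_error_variance(phi, sigma_u2, h)
    print(h, round(adjustment_factor(var_h), 6))

# As h grows the variance converges to sigma_u2 / (1 - phi^2), so the factor is capped.
limit_var = sigma_u2 / (1.0 - phi ** 2)
print("limit", round(adjustment_factor(limit_var), 6))
```

Under these assumed values the adjustment never exceeds roughly 0.1 percent of the level, at any horizon, so even a modest error in the estimated variance or in the model itself can easily outweigh it.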

While one should be careful with non-stationary variables, errors driven by misspecification / estimation error will tend to drown out everything else, making the naive forecast perform relatively well vs. the “optimal” adjusted forecast:

for integrated variables, the naïve forecasts generally perform better than the optimal forecasts, with the relative gains increasing with the forecast horizon.
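One way to see why the horizon matters for integrated variables is the sketch below. It assumes a pure random walk in logs, for which the $h$-step forecast error variance is $h \sigma_u^2$, so the adjustment factor is $\exp(h \sigma_u^2 / 2)$ and grows without bound. It then shows how an error in the estimated innovation variance feeds into the adjusted forecast. This only illustrates one possible channel under made-up numbers; it is not the paper's simulation.

```python
import math


def random_walk_adjustment(sigma_u2: float, h: int) -> float:
    """Optimal/naive forecast ratio for a log random walk: exp(h * sigma_u^2 / 2)."""
    return math.exp(0.5 * h * sigma_u2)


# Hypothetical values: a true innovation variance and an estimate that is 50% too high.
true_sigma_u2 = 0.05 ** 2
estimated_sigma_u2 = 1.5 * true_sigma_u2

for h in (1, 4, 12, 40):
    true_factor = random_walk_adjustment(true_sigma_u2, h)
    estimated_factor = random_walk_adjustment(estimated_sigma_u2, h)
    # Relative distortion of the adjusted forecast caused only by the variance estimate.
    distortion = estimated_factor / true_factor - 1.0
    print(f"h={h:2d}  true factor={true_factor:.4f}  distortion={distortion:.4%}")
```

The distortion equals $\exp(h(\hat\sigma_u^2 - \sigma_u^2)/2) - 1$, so the same variance error costs more at longer horizons, while the naive forecast does not use the variance estimate at all.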

This seems like a classic case where being too confident about the DGP / model leads to serious issues. Sometimes, simpler is better, especially when operating under substantial uncertainty, which we tend to be most of the time even when we don’t acknowledge it, per Nassim Taleb.