Zillow Home Value Forecast: Methodology

Posted by: Guy Yollin    Posted date: February 8, 2012

When Zillow introduced the Zestimate and Zillow Home Value Index in 2006, one of the first things consumers asked for was an estimate of where home values were headed. While we periodically produced a national forecast of home values for the purposes of analyzing national economic trends, we did not publish a regular national forecast or forecasts for the various metropolitan regions.

With the release of our Q4 data this week, we are happy to correct this state of affairs by announcing the Zillow Home Value Forecast (ZHVF). The ZHVF forecasts the change in the Zillow Home Value Index (ZHVI) over the next twelve months for the 25 top metros and the nation as a whole. The ZHVI itself is a time series tracking the monthly median home value in a particular geographical region, and the methodology behind the index is described in more detail in this research brief.

Below, we will detail the basic approaches we’ve used to construct the ZHVF and its historical performance.

A quick primer on characteristics of home value time series

Since the ZHVI is a time series of home values, it inherits many of the characteristics found in other home price indexes (Canarella, Miller, Pollard, Gupta). These characteristics include:

  • Non-stationarity: Time series of the level of home prices (or home values) can be considered to be I(1) or I(2) (integrated of order 1 or 2).
  • Structural breaks: The housing market goes through changing economic regimes. Examples of these regimes include the economic recession of the early 1990s, the housing bubble of the first half of the 2000s, and the subsequent housing recession from 2007 to the present day.
  • Seasonal effects: Raw time series of home prices typically have a strong seasonal component which must be addressed.
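To make the I(1)/I(2) terminology concrete, here is a minimal sketch of differencing on made-up numbers. The `shocks` values and the helper name `difference` are hypothetical and are not taken from the ZHVI; in practice the order of integration would be assessed with unit-root tests rather than by construction.

```python
from itertools import accumulate


def difference(series, order=1):
    """Apply first differencing `order` times and return the shortened series."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


# Hypothetical stationary shocks (illustrative values only).
shocks = [1, -2, 3, 0, -1, 2, 1, -3]

# Summing the shocks once gives an I(1) series; summing again gives an I(2) series.
i1_series = list(accumulate(shocks))
i2_series = list(accumulate(i1_series))

# Differencing an I(d) series d times recovers the underlying shocks
# (minus the observations lost to differencing).
recovered_from_i1 = difference(i1_series, order=1)
recovered_from_i2 = difference(i2_series, order=2)
```

An I(1) series needs one round of differencing before it looks like the original shocks, while an I(2) series needs two.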

To address these characteristics, the Zillow Research team designed a forecasting approach with the following considerations:

  • The approach should be based on a seasonally adjusted time series (which the ZHVI is).
  • The approach should incorporate multiple forecasting approaches which will have different performance characteristics depending on the regime in which the forecast is being produced.
  • The method for synthesizing these different approaches should be dynamic so as to quickly capture regime changes when the performance of the various approaches may change.

Multiple model approaches

The Zillow Home Value Forecast employs multiple models, each one belonging to either a univariate time series or multivariate economic leading indicator family of models.

Time series models

Time series models utilize the past history of a time series to predict the future evolution of that time series. Implementing a time series model may also include transforming the series (e.g. Box-Cox), differencing or calculating returns on the series, as well as time series decomposition into trend, seasonal, and irregular components. All of these techniques were explored as part of the model research process.

A number of alternative time series models were implemented and evaluated, including ARIMA models (Box-Jenkins), structural time series models (Harvey, Shephard, Durbin, Koopman), and exponential smoothing models (Holt, Winters, Gardner, McKenzie, Hyndman). Ultimately, the Zillow Research team chose a damped-trend exponential smoothing model as one of the core models for the ZHVF.
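For readers unfamiliar with damped-trend exponential smoothing, the following is a minimal sketch of the additive damped-trend recursions in the style of Gardner and McKenzie. The function name, the simple initialization, the sample series, and the parameter values (`alpha`, `beta`, `phi`) are all assumptions for illustration; the post does not state how the production model is initialized or tuned.

```python
def damped_trend_forecast(y, alpha, beta, phi, horizon):
    """Forecast `horizon` steps ahead with additive damped-trend smoothing.

    alpha: level smoothing weight, beta: trend smoothing weight,
    phi: damping factor (phi = 1 reduces to Holt's linear trend method).
    """
    if len(y) < 2:
        raise ValueError("need at least two observations")

    # Simple initialization (an assumption): first value and first change.
    level, trend = y[0], y[1] - y[0]

    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend

    # The h-step forecast adds a geometrically damped share of the trend.
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h
        forecasts.append(level + damp_sum * trend)
    return forecasts


# Hypothetical monthly index values (not real ZHVI data).
index_values = [100.0, 99.0, 98.2, 97.6, 97.1, 96.8]
forecast = damped_trend_forecast(index_values, alpha=0.8, beta=0.2, phi=0.9, horizon=12)
```

Because `phi` is below 1, each additional month contributes a smaller trend increment, so the forecast flattens out rather than extrapolating the recent trend indefinitely.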

Economic leading indicator models

As opposed to a univariate time series model which relies solely on the history of the time series being modeled, an economic leading indicator model attempts to relate a number of economic variables to the time series being modeled. In this context, we developed an economic leading indicator model which utilizes macro-economic variables to forecast future home value changes. Some of the explanatory variables in this model include home sales, monthly supply of inventory, income growth, rent levels and unemployment. The economic leading indicator model is the other core model for the ZHVF.
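As a rough illustration of the idea, the sketch below fits an ordinary least squares regression of a future home value change on two indicators. The linear form, the choice of two predictors, the helper names, and every number in the data are assumptions made for this example; the post names the kinds of variables used but does not give the model specification.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def fit_ols(rows, targets):
    """Fit least squares with an intercept via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Apply fitted coefficients (intercept first) to one row of indicators."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical observations: [months of supply, unemployment rate %] at time t.
indicators = [[4, 5], [6, 6], [8, 9], [10, 8], [5, 7], [7, 10]]
# Hypothetical home value change (%) over the following 12 months.
future_change = [3.5, 1.0, -2.5, -4.0, 1.5, -2.0]

coefs = fit_ols(indicators, future_change)
next_year_change = predict(coefs, [9, 6])
```

The key design point is the timing: the indicators are measured at the forecast date, and the target is the change that follows, so the fitted relationship can be used to look ahead.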

Synthesizing the various model outputs

The ZHVF is a combination of forecasts (Bates, Clemen, Timmermann) based on input from both the time series models and the economic leading indicator models. The weight of each model’s input into the final forecast is determined dynamically as a result of the model’s recent forecast accuracy.
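One common way to weight models by recent accuracy is to make each weight proportional to the inverse of the model's recent mean squared error, in the spirit of Bates and Granger. The sketch below shows that scheme only as an example; the post does not specify the actual weighting rule, and the model names, error values, and forecasts here are hypothetical.

```python
def inverse_error_weights(recent_errors):
    """Map each model to a weight proportional to 1 / (recent mean squared error)."""
    mse = {
        name: sum(e * e for e in errors) / len(errors)
        for name, errors in recent_errors.items()
    }
    # Guard against division by zero if a model happened to be exact.
    inverse = {name: 1.0 / max(value, 1e-12) for name, value in mse.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(forecasts, weights):
    """Weighted average of the individual model forecasts."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# Hypothetical recent forecast errors (percentage points) for each model.
recent_errors = {
    "time_series": [1.0, -1.0],
    "leading_indicator": [2.0, -2.0],
}
# Hypothetical current 12-month forecasts (%) from each model.
forecasts = {"time_series": -2.0, "leading_indicator": -4.0}

weights = inverse_error_weights(recent_errors)
combined_forecast = combine(forecasts, weights)
```

Recomputing the weights over a rolling window of recent errors is what makes the combination dynamic: a model that starts to miss after a regime change loses influence quickly.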

Historical forecast accuracy

The ZHVF forecasts the change in the ZHVI over the next twelve months. To assess the 12-month ahead out-of-sample predictive accuracy of the ZHVF, the Zillow Research team implemented a technique called time series cross validation (Hyndman). Using this procedure, we forecast historically using only data known at the start of a simulated forecast period. In this process, we simulate a forecast made for a future period and then compare this forecast with the actual results over that same period of time.
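The procedure can be sketched as a rolling-origin evaluation. In the example below, the function names, the simple drift forecaster standing in for a real model, and the toy series are all assumptions for illustration; it also scores forecasts of the level rather than the 12-month change, purely to keep the example short.

```python
def rolling_origin_errors(series, forecaster, horizon, min_train):
    """Collect out-of-sample errors by forecasting from successive origins.

    At each origin only the observations before it are passed to the
    forecaster, mimicking what was known at the time of the forecast.
    """
    errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        train = series[:origin]
        predicted = forecaster(train, horizon)
        actual = series[origin + horizon - 1]
        errors.append(predicted - actual)
    return errors


def drift_forecaster(train, horizon):
    """Stand-in model: extrapolate the most recent one-step change."""
    return train[-1] + horizon * (train[-1] - train[-2])


# Hypothetical index with a turning point (not real ZHVI data).
toy_index = [100, 102, 104, 106, 104, 102, 100]

errors = rolling_origin_errors(toy_index, drift_forecaster, horizon=2, min_train=2)
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
```

In this toy run the errors are largest for the forecasts made just before the turning point, which is the same pattern one would expect from any trend-following model around a regime change.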

The following table summarizes the 12-month ahead cross validation forecast accuracy for all regions combined and for just the United States time series itself:

Not surprisingly, one can see that forecast accuracy over the most extreme periods of the home value boom and bust is lower than during the recent period of time when home value trends have been more stable (albeit negative).

Future research and development

Zillow is committed to ongoing research and continuous improvement in ZHVI forecasting accuracy. Some of the research activities planned for the near term include the following:

  • Bayesian model averaging for the combined forecast (Hoogerheide, Kleijn, Ravazzolo).
  • Dynamic sizing of model training window (Pesaran, Pick, Pranovich, Timmermann).
  • Expanding the set of sub-models used in the final modeling synthesis.



References

Bates, J. M. and C. W. J. Granger (1969), Combination of Forecasts, Operational Research Quarterly, 20, 451-468.

Box, G., Jenkins, G. and Reinsel, G., Time series analysis: forecasting and control. Hoboken, N.J.: John Wiley, 2008.

Canarella, G., Miller, S., Pollard, S., 2010. “Unit Roots and Structural Change: An Application to US House-Price Indices,” Working papers 2010-04, University of Connecticut, Department of Economics, revised Dec 2010.

Clemen, R. T. (1989), Combining Forecasts: A Review and Annotated Bibliography, International Journal of Forecasting, 5, 559-583.

Durbin, J. and Koopman, S. J., Time series analysis by state space methods. Oxford New York: Oxford University Press, 2001.

Gardner Jr., E.S. & McKenzie, E. (1985) Forecasting trends in time series, Management Science, 31, 1237-1246.

Greene, W. Econometric analysis. Boston: Prentice Hall, 2012.

Gupta, R., Miller, S., 2009. “The Time-Series Properties of House Prices: A Case Study of the Southern California Market,” Working Papers 0912, University of Nevada, Las Vegas, Department of Economics, revised Dec 2009.

Harvey, A. C. and N. Shephard (1993). Structural time series models. In G. S. Maddala, C. R. Rao, and H. D. Vinod (Eds.), Handbook of Statistics, Volume 11. Amsterdam: Elsevier Science Publishers B V.

Holt, C. (1957) Forecasting trends and seasonals by exponentially weighted moving averages, ONR Research Memorandum, Carnegie Institute of Technology 52.

Hoogerheide, L., Kleijn, R., Ravazzolo, F., Van Dijk, H. K. and Verbeek, M. (2010), Forecast accuracy and economic gains from Bayesian model averaging using time-varying weights. Journal of Forecasting, 29: 251–269. doi: 10.1002/for.1145

Hyndman, R., Koehler, A., Ord, J.K., & Snyder, R.D. (2008) Forecasting with exponential smoothing: The state space approach, Springer-Verlag: Berlin.

Pesaran, M., Pick, A., Pranovich, M., 2011. “Optimal Forecasts in the Presence of Structural Breaks,” DNB Working Papers 327, Netherlands Central Bank, Research Department.

Timmermann, A. (2006), Forecast Combinations, in G. Elliott, C. W. J. Granger, and A. Timmermann (eds.), Handbook of Economic Forecasting, North-Holland.

Winters, P. (1960) Forecasting sales by exponentially weighted moving averages, Management Science 6, 324–342.

 


About the author
Guy Yollin