Housing Data 101: Why Use the Zillow Home Value Index (ZHVI) Instead of a Median Sale Price Series
We recommend using ZHVI to track home values over time for the simple reason that ZHVI represents the whole housing stock and not just the homes that list or sell in a given month.
Update: The ZHVI methodology was updated in December 2019. Zillow has published highlights of the new methodology and a more detailed look under the hood.
Zillow publishes several measures of home values each month, including median list prices, median sale prices, and the Zillow Home Value Index (ZHVI). For almost all use cases, we believe ZHVI is the most representative measure of how home values change over time.
There’s a lot that goes into making the ZHVI. The short explanation is that Zillow takes all estimated home values (Zestimates) for a given region and month, takes the median of these values, and applies adjustments to account for seasonality and for errors in individual home estimates. It then does the same for every month over the past 20 years and for many geography levels (ZIP code, neighborhood, city, county, metro, state, and country). For example, if ZHVI was $400,000 in Seattle one month, that indicates that 50 percent of homes in the area are worth more than $400,000 and 50 percent are worth less, after adjusting for seasonal fluctuations (for example, prices tend to be low in December).
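As a rough illustration of the median step described above, here is a minimal Python sketch. It assumes a hypothetical input of (region, month, estimated value) tuples with made-up numbers, and it leaves out the seasonal and error adjustments entirely, so it should not be read as Zillow's actual methodology.

```python
from collections import defaultdict
from statistics import median

def median_value_index(estimates):
    """Return {(region, month): median estimated value}.

    `estimates` is an iterable of (region, month, estimated_value) tuples,
    one per home per month. Seasonal and error adjustments are omitted.
    """
    grouped = defaultdict(list)
    for region, month, value in estimates:
        grouped[(region, month)].append(value)
    return {key: median(values) for key, values in grouped.items()}

# Hypothetical estimates for five Seattle homes in a single month.
sample = [
    ("Seattle", "2019-06", 310_000),
    ("Seattle", "2019-06", 380_000),
    ("Seattle", "2019-06", 400_000),
    ("Seattle", "2019-06", 450_000),
    ("Seattle", "2019-06", 720_000),
]
index = median_value_index(sample)
print(index[("Seattle", "2019-06")])  # 400000
```

Two of the five homes are worth more than $400,000 and two are worth less, which matches the reading of the index given in the Seattle example.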
We recommend using ZHVI to track home values over time for a very simple reason: ZHVI represents the whole housing stock, not just the homes that list or sell in a given month. Imagine a month in which no homes outside of California listed or sold. A national median sale price series and a national median list price series would both spike. ZHVI, however, would remain a median of all homes across the country and wouldn’t skew toward California any more than in the previous month. ZHVI will always reflect the value of all homes and not just the ones that list or sell in a given month.
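The California hypothetical can be reproduced with a few made-up numbers. In the sketch below, the housing stock, the region labels, and the assumption that every home in a "selling" region sells are all invented for illustration. Home values are held constant, so any change in a series comes purely from which homes happen to sell.

```python
from statistics import median

# Hypothetical housing stock; values do not change between the two months.
stock = {
    "CA": [600_000, 650_000, 700_000],
    "TX": [250_000, 300_000, 350_000],
    "OH": [150_000, 200_000, 250_000],
}

def median_of_sales(sold_regions):
    """Median over only the homes that sold (here, every home in the listed regions)."""
    return median(value for region in sold_regions for value in stock[region])

def median_of_stock():
    """Median over every home, whether or not it sold."""
    return median(value for values in stock.values() for value in values)

month_1_sales = median_of_sales(["CA", "TX", "OH"])  # sales in every region
month_2_sales = median_of_sales(["CA"])              # sales only in California

print(month_1_sales, month_2_sales)  # 300000 650000
print(median_of_stock())             # 300000 in both months
```

The sale-based median more than doubles even though no home changed in value, while the median over the whole stock stays put.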
A more realistic version of the California hypothetical is the housing collapse, which affected different regions and different segments of the population very differently. Non-distressed sales became less frequent because of negative equity and foreclosures, so the hardest-hit regions were underrepresented in median sale price series during the worst years of the collapse. As a result, those series understated the severity of the housing crash.
Or imagine a scenario in which no homes sell in a month. This happens frequently in small regions such as ZIP codes. With no sales, the median sale price is undefined. ZHVI, however, is built on estimates of home values[1], so it can still track the median value of homes.
[1] It might seem strange that Zillow can estimate home values when there are no sales. But if there are no sales in a ZIP code, we will most likely still have sales data from adjacent ZIP codes or other nearby areas. From these sales, the models can estimate how much home values are increasing or decreasing. Similarly, we might not have home sales in a given month, but if we see sales in the surrounding months, we can reasonably interpolate values for the middle month.
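To make the idea of filling in a missing month concrete, here is a small sketch that uses plain linear interpolation between the nearest known months. Linear interpolation is an assumption chosen for simplicity, and the series values are invented. The footnote does not say which technique Zillow's models use.

```python
def fill_gaps(series):
    """Linearly interpolate None entries that lie between two known values.

    Leading or trailing gaps stay None, since there is nothing to interpolate between.
    """
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled

# Hypothetical monthly values for one ZIP code; None marks a month with no sales.
monthly_value = [300_000, None, 306_000, None, None, 315_000]
print(fill_gaps(monthly_value))
# [300000, 303000.0, 306000, 309000.0, 312000.0, 315000]
```

A series with no known values at all is returned unchanged, which mirrors the point in the main text that a pure sale-based median has nothing to report when nothing sells.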