Listing Language: From the Bottom of the Bubble to Now
The U.S. real estate market has changed a lot in the past five years. So have the kinds of terms used in real estate listing descriptions.

In 2011, near the bottom of the housing crash that began in late 2007 and 2008, foreclosure activity was widespread and a number of federal programs were in place to stanch the flow of foreclosed homes and entice homebuyers back into the market. As such, descriptions of properties listed for sale on Zillow were chock-full of phrases indicating the homes were in or near foreclosure, were ripe for federal financing opportunities and/or featured other recession-era buzzwords.
Fast forward a few years, and by the end of 2015 the market was well on the road to recovery. Home values were up, in some cases exceeding pre-recession highs for the first time in years, and foreclosures were way down. Mentions of distress in listing descriptions had fallen to a fraction of their prior levels, and listings were more likely to use traditional descriptors that made note of a property's unique features, spaces and finishes.
In prior research, Zillow explored phrases that were more popular in each state than in the rest of the United States. Here, we’ve cut the data by time and explored which phrases were more or less popular in 2015 (during the economic recovery) than they were in 2011 (in the depths of the mortgage crisis).
At the outset of this research, we expected that phrases indicating financial turmoil would appear more frequently in 2011 than in 2015, and this is exactly what the data show (figure 1).
Zillow listings show a large decrease in the use of the term “HUD” nationwide between 2011 and 2015, from 90 mentions per 100,000 words in 2011 to only 10 mentions per 100,000 in 2015. “HUD” typically referred to HUD-owned homes; that is, homes with FHA-insured mortgages that fell into the possession of HUD after foreclosure. The term often appeared alongside “as is,” indicating that the seller is not on the hook for repairs. The HUD case number was also often included.
As an alternative to the foreclosure process, a short sale occurs when a home is sold for less than the owed mortgage amount, and it’s another signal of financial trouble. In 2015 the phrase “short sale” was mentioned one-sixth as frequently as it was during the depths of the crisis.
The word “mortgage” is a fairly general real estate term, but by 2015 its usage had dropped to a tenth of what it was in 2011. Why? One clue is that in 2011, about 73 percent of the listings containing “mortgage” also mentioned “HomePath,” while in 2015 just 17 percent did. HomePath was a Fannie Mae program that allowed buyers of foreclosed homes to borrow extra money for repairs and renovations. In cases where a mortgage was mentioned but HomePath was not, other phrases indicating economic hardship were often present.
The term “finance(-ing)” is similar: It is often mentioned in conjunction with both “mortgage” and “HomePath,” as in “the property is approved for HomePath mortgage or HomePath Renovation Mortgage Financing.” By 2015, words with the root “financ” had dropped to a fourth of their 2011 frequency.
As the recovery took hold and more technical, finance-related words and phrases faded from listings, more traditional keywords that described the home itself, rather than its financial state, began to crop up more frequently (figure 2).
For example, terms such as “granite” (appearing 34 percent more frequently in 2015 than in 2011), “space” (44 percent more) and “closet” (33 percent more) show off valuable features of a home. Similarly, strong descriptive adjectives like “beautiful,” “large” and “open” were also more common.
This shift could also indicate that homes selling in 2011 were in need of a bit more TLC than those selling in 2015. In the best-selling book Zillow Talk, we found that bottom-tier homes with listing descriptions mentioning “beautiful” (appearing more often in 2015) sell for 2.3 percent above their expected value. Contrast this with the list of terms that have decreased since 2011: The only positive adjective is “great,” which is a weak descriptor and may suggest that the seller doesn’t have specific, positive things to say about the home.[1]
We analyzed more than 6 million listing descriptions for single-family homes listed for sale on Zillow between 2011 and 2015. After cleaning the listings,[2] we calculated the frequency of all 1- and 2-word phrases by year.
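As a rough illustration of this counting step (not the code used for the analysis), the sketch below tallies 1- and 2-word phrases by year and expresses each as mentions per 100,000 words, the unit used earlier for “HUD.” The function names, the toy listings and the choice of total word count as the denominator for both phrase lengths are assumptions made for the example.

```python
from collections import Counter, defaultdict

def ngrams(tokens, n):
    """Return the n-word phrases in a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def phrase_rates_by_year(listings, per=100_000):
    """listings: iterable of (year, tokens) pairs, tokens already cleaned.

    Returns {year: {phrase: mentions per `per` words in that year}}.
    Phrases never span two listings, because each listing is counted alone.
    """
    counts = defaultdict(Counter)   # year -> phrase -> raw count
    words = Counter()               # year -> total word count
    for year, tokens in listings:
        words[year] += len(tokens)
        counts[year].update(ngrams(tokens, 1))
        counts[year].update(ngrams(tokens, 2))
    return {
        year: {p: c * per / words[year] for p, c in counter.items()}
        for year, counter in counts.items()
        if words[year]
    }

# Toy listings, invented for illustration only
listings = [
    (2011, ["hud", "home", "sold"]),
    (2011, ["short", "sale", "hud", "home"]),
    (2015, ["beautiful", "granit", "kitchen", "open", "space"]),
]
rates = phrase_rates_by_year(listings)
print(rates[2011]["hud"], rates[2011]["short sale"])
```

With only a handful of words per year the toy rates are enormous; on millions of descriptions the same arithmetic yields figures on the scale of tens of mentions per 100,000 words.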
Then for each phrase we computed:
Then, we selected the top 50 phrases where the absolute value of (1) minus (2) was largest, since this ranking seemed to return the most interesting results.[3]
For these same 50 phrases, we then computed the frequency metrics by state. Since one phrase may have increased over time in one state but decreased in another, the number of phrases in the figure may differ by chosen state.
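The selection step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two inputs stand in for metrics (1) and (2), which are taken here to be a phrase's frequency in two different years; the function name is hypothetical; and apart from the “HUD” rates quoted earlier, the numbers are invented.

```python
def top_changed_phrases(metric_1, metric_2, k=50):
    """Rank phrases by |metric_1 - metric_2| and return the top k.

    metric_1, metric_2: dicts mapping phrase -> number. A phrase missing
    from one dict is treated as 0 there (an assumption of this sketch).
    Returns a list of (phrase, signed difference) pairs.
    """
    phrases = set(metric_1) | set(metric_2)
    diffs = {p: metric_1.get(p, 0.0) - metric_2.get(p, 0.0) for p in phrases}
    # Largest absolute change first; ties broken alphabetically
    ranked = sorted(diffs.items(), key=lambda kv: (-abs(kv[1]), kv[0]))
    return ranked[:k]

# Mentions per 100,000 words. The "hud" rates come from the article;
# the "granit" and "closet" rates are invented for illustration.
rates_2015 = {"hud": 10.0, "granit": 60.0, "closet": 40.0}
rates_2011 = {"hud": 90.0, "granit": 45.0, "closet": 30.0}

top = top_changed_phrases(rates_2015, rates_2011, k=2)
print(top)
```

Keeping the signed difference alongside each phrase makes it easy to split the result into terms that rose and terms that fell, and the same function can be rerun on state-level metrics.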
[1] This could be similar to the usage of the word “nice.” In Zillow Talk, we found that mentioning “nice” in a listing results in a sale that’s on average 1 percent below the expected sale price.
[2] To clean the listings, we converted everything to lowercase and removed numbers, punctuation and common English words (e.g., “the,” “in”). We also “stemmed” the endings of all words, removing common endings like “-s,” “-ing,” “-ed” and “-ly” so that different forms of the same word could be tallied together.
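A minimal sketch of this kind of cleaning is shown below. It is not the pipeline used for the analysis: the stop-word list is a tiny illustrative sample, and the suffix stripper is a deliberately naive stand-in for a real stemmer. All names are hypothetical.

```python
import re

# Tiny illustrative stop-word sample; a real list would be much longer
STOP_WORDS = {"the", "in", "a", "an", "and", "of", "is", "to", "for", "with"}
# Common endings to strip, checked in this order
SUFFIXES = ("ing", "ed", "ly", "s")

def stem(word):
    """Strip one common ending, keeping at least three leading letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def clean(description):
    """Lowercase, drop digits and punctuation, remove stop words, stem."""
    text = description.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # anything but a-z or whitespace
    return [stem(w) for w in text.split() if w not in STOP_WORDS]

tokens = clean(
    "The property is approved for HomePath Renovation Mortgage Financing in 2011."
)
print(tokens)
```

In this toy version “financing” reduces to “financ” and “approved” to “approv,” so different forms of a word are tallied together.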
[3] We believed the most interesting phrases were those that were frequent (in either the state or in the U.S.) and had a high relative frequency when compared to the rest of the U.S. Using only frequency as our metric of interestingness would have given only the most common real estate terms, while just using relative frequency would have given hyperlocal, low-frequency terms. Perhaps surprisingly, simply calculating (1) minus (2) was more effective at giving us interesting phrases than were any of the hand-tuned indices we tried.