In prior research, we explored phrases that were more popular in each state than in the rest of the United States. In this post, we’ll rehash the analysis with rentals-only data to determine what’s most important to renters and landlords in terms of home style, local amenities, weather and more. If you think you already know every state’s rental characteristic preferences, we challenge you to take our quiz and view our infographic.
Use the interactive tool below to find the most popular rental characteristics and descriptors in each state, based on our analysis of 1.6 million Zillow rental listing descriptions from 2015.
How to use the tool
After you select a state in the tile map, the word cloud will populate with that state’s popular rental listing phrases. The same phrases will appear in the table along with the frequency statistics that determine the size and color of the phrases in the word cloud. You may notice that the ends of some of the words have been cut off or altered. This is because we shortened words to their roots before calculating frequencies so that similar phrases (like “tiled floor,” “tiled flooring” and “tile floors”) would be tallied together as one phrase (“tile floor”).
See the Methodology section for more information on how we cleaned the data and selected the phrases for each state.
Huskies and ferrets need not apply
Rental listings across the United States like to mention pets. If you poke around the tool, you’ll notice that half of the states have at least one rental phrase that includes the word “pet,” “dog” or “cat.” [1]
But landlords in Colorado and Nebraska are particularly specific when defining the creatures that are forbidden from rental units. In these states we see plenty of mentions of American Bulldogs (321 times as likely to be mentioned in Colorado as in the rest of the United States), Huskies (61 times as likely in Nebraska), Saint Bernards (319 times, Nebraska) and Great Danes (55 times, Nebraska).[2] Listings often mention that these are “statistically dangerous breeds” and link directly to the Centers for Disease Control and Prevention website.[3] The relevant statistics seem to be in Table 1 in the section on dog-bite-related fatalities.
Rabbits and ferrets are often prohibited outright from Nebraskan rental units, regardless of how tame or vicious they are.
Notebook, paper and Penn-cil
Since students are much more likely to be renters, landlords are sure to advertise when a property is close to campus. In Pennsylvania, the colleges Drexel, Temple and Penn[4] were surfaced as top phrases, along with main campus and Rittenhouse (referring to Rittenhouse Square, which is close to the campuses of the colleges mentioned earlier). Similarly, rental listings in Connecticut point out when a home is close to Yale. Massachusetts does the same with Harvard Square and Northeastern, Maryland with Johns Hopkins and North Carolina with Duke and UNC.
You may be wondering, is this really a rental-specific phenomenon? It is: College callouts are quite a bit more likely to appear in rental listings than in for-sale listings. For example, in Pennsylvania, Drexel is 8 times as likely to be mentioned in a rental listing as in a for-sale listing. For Yale in Connecticut, it’s 5 times as likely. And for Johns Hopkins in Maryland, 5 times as likely.
Your turn
Each phrase has a story, so feel free to play with the tool and find your own interesting tidbits. If you want to compare rental phrases and frequencies to those of for-sale listings, then click back to our post that includes data on all listings.
Methodology
We analyzed all listing descriptions for rental properties on Zillow in 2015. After cleaning the listings,[5] we calculated the frequency of all 1- and 2-word phrases for each state and for the U.S. as a whole.
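The cleaning and counting step can be illustrated with a minimal Python sketch. It is not the code used for this analysis: the stop-word list, the crude suffix-stripping `stem` function and the names `clean` and `count_phrases` are all assumptions made for illustration, and a real stemmer would merge word forms more reliably than this one does.

```python
import re
from collections import Counter

# Assumption: a tiny stand-in stop-word list; the post does not say which list was used.
STOP_WORDS = {"the", "in", "a", "an", "and", "of", "to", "is", "with", "for", "on"}

# Assumption: a crude suffix stripper standing in for a proper stemmer.
SUFFIXES = ("ing", "ed", "ly", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean(description):
    """Lowercase, drop numbers and punctuation, remove stop words, then stem."""
    text = re.sub(r"[^a-z\s]", " ", description.lower())
    return [stem(w) for w in text.split() if w not in STOP_WORDS]


def count_phrases(descriptions):
    """Return (phrase counts, total word count) for a list of descriptions."""
    counts = Counter()
    total_words = 0
    for description in descriptions:
        tokens = clean(description)
        total_words += len(tokens)
        counts.update(tokens)  # 1-word phrases
        counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))  # 2-word phrases
    return counts, total_words
```

Running `count_phrases` once per state and once on all listings would give the per-state and national tallies that the following steps compare.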
Then, for each phrase and for each state, we computed:
- (1) The frequency of the phrase in the state, divided by the total number of words used in all listings in the state.
- (2) The frequency of the phrase in the rest of the U.S., divided by the total number of words in the rest of the U.S.
Then, in each state, we selected the top 35 phrases where (1) minus (2) was largest, since this seemed to return results that were most interesting.[6]
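As a rough illustration of this selection step, the sketch below scores each phrase by (1) minus (2) and keeps the highest-scoring ones. The function name `top_phrases` and its argument layout are hypothetical, and it assumes that “rest of the U.S.” counts can be obtained by subtracting the state’s counts from the national counts.

```python
def top_phrases(state_counts, state_total, us_counts, us_total, k=35):
    """Rank a state's phrases by share in the state minus share in the rest of the U.S.

    state_counts / us_counts map phrase -> frequency; the totals are word counts.
    Assumption: national figures include the state, so the rest of the U.S.
    is obtained by subtraction.
    """
    rest_total = us_total - state_total
    scores = {}
    for phrase, n_state in state_counts.items():
        n_rest = us_counts.get(phrase, 0) - n_state
        share_state = n_state / state_total                      # quantity (1)
        share_rest = n_rest / rest_total if rest_total else 0.0  # quantity (2)
        scores[phrase] = share_state - share_rest
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With toy numbers, a phrase used 10 times per 100 words in a state but only 2 times per 1,000 words elsewhere scores 0.098, while a phrase that is common everywhere scores much lower even if its raw frequency is higher.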
We did our best to manually exclude legal and transactional terms (e.g., “copyright,” “qualif” and “approv”) and the names of cities, counties, neighborhoods and terrain types (e.g., “river” and “mountain”). We also made sure that the relative popularity of the phrase in a state, meaning (1) divided by (2), was at least twice the phrase’s relative popularity in any other state. If a 1-word phrase within a state (e.g., “pane”) appeared in at least one 2-word phrase (e.g., “dual pane” and “pane window”), then we removed the 1-word phrase and kept the 2-word phrase with the highest frequency.[7]
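The two automatic filters can be sketched as follows. Both function names and data layouts are hypothetical, and the second function encodes one possible reading of the rule: when a 1-word phrase appears inside several 2-word phrases, only the most frequent of those 2-word phrases is kept.

```python
def passes_ratio_filter(phrase, state, ratios, factor=2.0):
    """Check that a state's ratio is at least `factor` times every other state's.

    ratios[state][phrase] is assumed to hold (1) divided by (2) for that state.
    """
    own = ratios[state].get(phrase, 0.0)
    others = [r[phrase] for s, r in ratios.items() if s != state and phrase in r]
    return all(own >= factor * other for other in others)


def prefer_two_word(counts):
    """Drop 1-word phrases that occur inside a 2-word phrase.

    counts maps one state's candidate phrases to their frequencies.
    Assumption: of the 2-word phrases containing the word, only the most
    frequent one is kept.
    """
    kept = dict(counts)
    for word in [p for p in counts if " " not in p]:
        containing = [p for p in counts if " " in p and word in p.split()]
        if containing:
            best = max(containing, key=counts.get)
            kept.pop(word, None)
            for p in containing:
                if p != best:
                    kept.pop(p, None)
    return kept
```

Because a phrase must beat every other state by the chosen factor, at most one state can claim any given phrase, which is the “winner take all” behavior mentioned in the footnotes.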
[1] Twenty-five out of the fifty states plus the District of Columbia.
[2] Sample size is admittedly low in Nebraska, but these phrases very likely would have appeared in the top phrases for Colorado had they not been mentioned in Nebraska, given the “winner take all” system for assigning phrases to states. See Methodology for more information.
[3] www.cdc.gov
[4] “Penn” encompasses both the University of Pennsylvania (commonly “Penn” or “UPenn”) and Penn State.
[5] To clean the listings, we converted everything to lowercase and removed numbers, punctuation and common English words (e.g., “the” or “in”). We also “stemmed” the endings of all words, removing common endings like “-s,” “-ing,” “-ed” and “-ly” so that different forms of the same word could be tallied together.
[6] We believed the most interesting phrases were those that were frequent (in either the state or in the U.S.) and had a high relative frequency when compared to the rest of the U.S. Using only frequency as our metric of interestingness would have given common real estate terms like “bathroom” and “home,” while just using relative frequency would have given hyperlocal, low-frequency terms. Perhaps surprisingly, simply calculating (1) minus (2) was more effective at giving us interesting phrases than were any of the hand-tuned indices we tried.
[7] In these situations, we prefer 2-word phrases over 1-word phrases because they often offer crucial context. For example, “dual pane” offers much more information than “dual” or “pane” alone, in that we know the phrase is referring to windows.