Figure 1: Home Insights Collections are lists of thematically coherent home recommendations featuring highly relevant aspects of homes such as pools, views, remodeled kitchens, open floor plans and many others.
At Zillow Group, we strive to delight each and every user with the most personalized home discovery experience. Some users might focus their search on the number of bedrooms, square footage, or location. Others might dream of that newly remodeled kitchen with quartz countertops and a chef stove, or a large patio area to entertain friends and cook amazing barbecue.
In this blog post, we’ll take a closer look at “Home Insights Collections,” a new feature that uses natural language processing (NLP) models to categorize these kinds of home highlights and let users browse properties matched to their preferences (e.g., “Homes with a pool in Boca Raton, FL”, as shown in Figure 1).
One of our core NLP pipelines extracts, filters, and ranks “home insights” from property listing descriptions (1). These keyphrases highlight features of a property that may incentivize users to take a closer look at a home.
We’ll begin with a methodology for grouping these keyphrases under broader categories (e.g., “swimming pool” under “pool”), then show how we infer each user’s preferences from their recent activity on Zillow, and finally combine the two to generate personalized collections of listings.
By combining these building blocks, we create Collections of homes that fit not only a user’s primary preferences (e.g., price, square footage) but also the features of their dream home. Let’s discuss these building blocks for Home Insights Collections one at a time:
Figure 2: Our real estate taxonomy organizes home attribute terms in a hierarchical structure, which increases the overall number of listings with shared features during candidate selection (e.g., “Homes with a Pool” may include homes with several types of swimming pools).
One of the biggest challenges in using home insights from property listing descriptions is grouping them into useful, coherent categories. We need to understand these terms well enough to build a taxonomy that organizes the thousands of unique home insights into a useful hierarchical structure. The taxonomy then allows us to categorize niche features (e.g., “olympic-sized swimming pool”) under broader categories (“pool”). This becomes especially useful when building coherent Home Insights Collections containing listings with diverse, related recommendations for each user.
A taxonomy is a tree that organizes related concepts in a hierarchical structure, where nodes are home insights and directed edges represent hypernym relationships. Parent nodes represent broader concepts than their respective child nodes. Let’s say the following relationship chain is a leaf to root path in the taxonomy:
‘Wolf Stainless Steel Appliances’ → ‘Stainless Steel Appliances’ → ‘Appliances’ → ‘Kitchen’ → ‘Interior Features’ → ‘Thing’
In this example, “Interior Features” is a higher-level category that subsumes “Kitchen”-related home insights, just as “Appliances” subsumes “Stainless Steel Appliances” and so on.
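To make the tree structure concrete, here is a minimal sketch that stores the example chain as a child-to-parent map and walks it from leaf to root. The `PARENT` dictionary and the helper names `path_to_root` and `is_subsumed_by` are assumptions made for illustration, not part of our production taxonomy code.

```python
# Hypothetical child -> parent map containing only the example chain from the text.
PARENT = {
    "Wolf Stainless Steel Appliances": "Stainless Steel Appliances",
    "Stainless Steel Appliances": "Appliances",
    "Appliances": "Kitchen",
    "Kitchen": "Interior Features",
    "Interior Features": "Thing",
}


def path_to_root(node, parent=PARENT):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def is_subsumed_by(node, ancestor, parent=PARENT):
    """Return True if `ancestor` is a strict ancestor (hypernym) of `node`."""
    return ancestor in path_to_root(node, parent)[1:]


chain = path_to_root("Wolf Stainless Steel Appliances")
print(" -> ".join(chain))
print(is_subsumed_by("Stainless Steel Appliances", "Kitchen"))  # True
print(is_subsumed_by("Kitchen", "Appliances"))                  # False
```

Storing only parent pointers is enough for leaf-to-root lookups like these; a real taxonomy would also need child pointers to enumerate everything under a category.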
Figure 3: A high-level visualization of selecting the most relevant parent category for a given keyword. F is a binary classifier that is trained on hypernym pairs to predict the likelihood of a hierarchical relationship between a keyword and a node from the taxonomy (T).
Let’s summarize a semi-supervised approach for building this taxonomic structure:
By combining automated and human-in-the-loop (HITL) methods, we build a multi-level taxonomy that enables recommendations of listings with home insight attributes at different levels of abstraction.
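The parent-selection step pictured in Figure 3 can be sketched as scoring a keyword against every candidate node and keeping the best match. In the sketch below, `token_overlap_score` is a deliberately simple stand-in for the trained classifier F (it is not the model we use), and the `threshold` parameter and function names are assumptions for illustration.

```python
def token_overlap_score(keyword, node):
    """Stand-in for F: fraction of the node's tokens that appear in the keyword.

    A real system would use a classifier trained on hypernym pairs instead.
    """
    keyword_tokens = set(keyword.lower().split())
    node_tokens = set(node.lower().split())
    if not node_tokens:
        return 0.0
    return len(keyword_tokens & node_tokens) / len(node_tokens)


def select_parent(keyword, taxonomy_nodes, score=token_overlap_score, threshold=0.5):
    """Return the highest-scoring node, or None if no node clears `threshold`."""
    best = max(taxonomy_nodes, key=lambda node: score(keyword, node), default=None)
    if best is None or score(keyword, best) < threshold:
        return None
    return best


nodes = ["pool", "kitchen", "patio"]  # hypothetical slice of the taxonomy T
print(select_parent("olympic-sized swimming pool", nodes))  # pool
print(select_parent("granite countertops", nodes))          # None
```

Returning `None` below the threshold leaves room for a human reviewer to place keywords the scorer cannot confidently attach.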
We encourage readers who are interested in learning more about taxonomy construction and recommendations to read our team’s recent publication at RecSys 2022, “Taxonomic Recommendations of Real Estate Properties with Textual Attribute Information.” (4)
Understanding our users’ needs and interests is a foundation of all personalization solutions we deliver, and there are many ways of capturing user interest. For personalization tasks, user interest is often captured through histograms that approximately represent a user’s preferences about specific aspects of a home. As an illustrative example, take a user who has viewed 40 listings over the past week. We take key attributes of these listings and compute a distribution for each feature. If the user has viewed 20 homes with 3 bedrooms, 15 homes with 2 bedrooms, and 5 homes with 1 bedroom, we use this data to quantify the relative preference for number of bedrooms.
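As a minimal sketch, the bedroom example above reduces to counting attribute values across viewed listings and normalizing the counts. The function name `preference_distribution` is an assumption for illustration; the view counts are the ones from the example.

```python
from collections import Counter


def preference_distribution(values):
    """Turn a list of observed attribute values into a normalized histogram."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}


# 40 viewed listings: 20 with 3 bedrooms, 15 with 2 bedrooms, 5 with 1 bedroom.
viewed_bedrooms = [3] * 20 + [2] * 15 + [1] * 5
bedroom_preference = preference_distribution(viewed_bedrooms)
print(bedroom_preference)  # {3: 0.5, 2: 0.375, 1: 0.125}
```

The resulting weights (50% for 3 bedrooms, 37.5% for 2, 12.5% for 1) quantify the user's relative preference for each bedroom count.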
Figure 4: A listing card provides a concise snapshot of key elements of a home, which typically include a picture of the home, price, number of bedrooms, bathrooms, square footage and address. A click on a listing card will open up the Home Detail Page, a one stop shop for all of the listing’s data, including home insights.
The user’s typical search or discovery experience will start with a series of listing cards, each card representing a concise snapshot of a home including a picture and a few key data elements (Figure 4).
Figure 5: User clicks on homes are used to derive the user’s preference distributions for a variety of home features, such as number of bedrooms and bathrooms, price range, and location. The Home Detail Page (HDP) also displays Home Insights, as shown in the lower middle right corner of the screenshot.
A click on a listing card will take the user to the listing’s Home Detail Page (HDP), containing a rich set of information about the home (Figure 5). We aggregate a subset of the home attributes to build the user profile.
This same approach is adopted to create histograms of Home Insights. We infer a user’s preferences toward certain tags based on their interactions with listings and the tags displayed on the HDPs of those listings. Our exact approach, which also involves the taxonomy, will be covered in more detail later on.
A Collection is a list of thematically coherent home recommendations. A theme could be as simple as a city, or a more complex combination of different aspects of a home. For example, “Recently listed homes with 2 bedrooms in Seattle, WA” captures 1) recency, 2) an interior feature like number of bedrooms, and 3) the city.
Figure 6: Collections are lists of thematically coherent home recommendations. In this figure we showcase various city-based Collections, a couple “Home Insights” Collections featuring AC and Water Views, and a “More Spacious Homes” Collection designed for users looking to upgrade their space.
Currently, Collections are found on the zillow.com homepage and on the Zillow app’s Updates tab, as illustrated in Figure 6. Here we’ll focus on the Updates tab, but the same concepts extend across all experiences. The Updates tab is currently the second tab in the app for both iOS and Android and serves as a highly customized space where users access their saved searches and personalized Collections.
The Updates tab is structured with recommendations organized by Collections as thematically coherent rows presented in a two-dimensional layout. A user can scroll down to see more Collections with different themes, or scroll right to see more recommendations in a Collection of interest. A click on any listing will bring the user to the listing’s Home Detail Page.
The inventory of homes for sale on any given day includes far more homes than we can display on a single page, and each user comes with their own unique set of interests. We therefore use different AI solutions to identify the top Collections for each user based on their profile, and then populate each Collection with the most relevant recommendations, ranked according to the user’s personal preferences. As a result, the most relevant recommendations are displayed in the top left corner of the Updates tab, with relevance decreasing as the user scrolls down or right.
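The two-stage ranking described above can be sketched as follows: first order Collections by a per-user score, then order the listings inside each Collection. All names, scores, and row and column limits here are hypothetical placeholders; the scoring models themselves are outside the scope of this sketch.

```python
def build_feed(collections, collection_score, listing_score, max_rows=3, max_cols=4):
    """Lay out Collections as rows (best first) with listings ranked left to right.

    collections: dict mapping a Collection title to a list of listing ids.
    collection_score / listing_score: callables returning a relevance score.
    """
    top_titles = sorted(collections, key=collection_score, reverse=True)[:max_rows]
    return [
        (title, sorted(collections[title], key=listing_score, reverse=True)[:max_cols])
        for title in top_titles
    ]


# Hypothetical candidates and scores for a single user.
collections = {
    "Homes with a Pool": ["a", "b", "c"],
    "Homes with Remodeled Kitchens": ["d", "e"],
    "Homes with Water Views": ["f"],
}
collection_scores = {
    "Homes with a Pool": 0.9,
    "Homes with Remodeled Kitchens": 0.4,
    "Homes with Water Views": 0.7,
}
listing_scores = {"a": 0.2, "b": 0.8, "c": 0.5, "d": 0.6, "e": 0.1, "f": 0.3}

feed = build_feed(collections, collection_scores.get, listing_scores.get,
                  max_rows=2, max_cols=2)
for title, listings in feed:
    print(title, listings)
```

With these placeholder scores, the first row is “Homes with a Pool” with listing `b` in the top left position, which mirrors the layout where relevance decreases going down or right.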
Home Insights Collections are designed to highlight a variety of aspects of a home that are highly relevant and would otherwise be left to the user to uncover by reading the listings’ descriptions and browsing pictures one Home Detail Page at a time. As elaborated above, Home Insights Collections come to the rescue by first learning user interest for specific home insights, and then compiling the most relevant home recommendations in a Collection. When users interested in “Remodeled Kitchens” access their personalized feed, a brand new “Homes with Remodeled Kitchens” Collection will be waiting for them.
There are two key aspects to creating this highly personalized experience:
Measuring user preference for specific home insights builds on the approach discussed in the Building Blocks sections, but requires some tuning. Imagine for example a user clicking on a home primarily because of its price, and only after having browsed its pictures discovering that the home actually has that great bonus room they were looking for. The bonus room might be the reason the user decides to ultimately tour the home, but it was not the reason the user clicked on the home. Clicks are usually a good source of signal for aspects of the home that are clearly visible to the user at click time. Home insights are usually not, and are often only fully accessible on the HDP.
To account for that weaker signal of a click, we took a two-pronged approach:
The main goal of all Collections is to facilitate the user’s discovery process by compiling lists of relevant homes. As mentioned before, the Updates tab offers a finite amount of space to display Collections. From that perspective, a great Collection will not only be relevant to the user, but will also provide plenty of listings to explore.
While our home insights can be very detailed, that level of detail also means that there might not be many homes for sale in a specific geography with that very specific home insight feature.
That’s where the ZG Real Estate Ontology comes into play. The taxonomy provides a flexible way of rolling up the granular home insights into more general categories. This allows us to account for inventory, user interest, and granularity jointly, ensuring optimal levels of exposure.
Using both offline and online analysis, we leveraged the taxonomy to model increasingly nuanced relationships and continue optimizing home insights relevance. For example, we have learned that the difference between patios and covered patios is very relevant to certain users, whereas the exact granite style is of secondary importance compared to whether the kitchen is updated or not. With this approach we serve the most relevant and granular Home Insights Collections while also ensuring an appropriate number of recommendations in each Collection.
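One simple way to picture the roll-up is to walk from a granular home insight toward the root and stop at the first category with enough listings to fill a Collection. The sketch below covers only the inventory side of the trade-off (it ignores user interest), and the `parent` map, the listing counts, and the `min_listings` threshold are all hypothetical values chosen for illustration.

```python
def roll_up(insight, parent, inventory, min_listings):
    """Return the most granular category on the leaf-to-root path of `insight`
    that has at least `min_listings` listings, or None if none qualifies.

    inventory: dict mapping a category to the number of local listings that
    carry it or any of its descendants.
    """
    node = insight
    while node is not None:
        if inventory.get(node, 0) >= min_listings:
            return node
        node = parent.get(node)  # None once we step past the root
    return None


# Hypothetical taxonomy slice and local listing counts.
parent = {
    "olympic-sized swimming pool": "swimming pool",
    "swimming pool": "pool",
}
inventory = {"olympic-sized swimming pool": 2, "swimming pool": 14, "pool": 31}

print(roll_up("olympic-sized swimming pool", parent, inventory, min_listings=10))
print(roll_up("olympic-sized swimming pool", parent, inventory, min_listings=20))
```

With a threshold of 10 the sketch settles on “swimming pool”, and with 20 it rolls up one more level to “pool”, trading granularity for a fuller Collection.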
We have now run multiple A/B tests to evaluate the impact of Home Insights Collections. These tests uniformly show that Home Insights Collections have the highest click-through rate (CTR) among city-based Collections (+3.2% compared to our second highest CTR Collection).
Not only do Home Insights Collections generate more clicks per shown Collection than most other Collection types, but that impact is even higher on a per-recommendation basis, meaning that we get more clicks out of fewer recommendations compared to other Collections. Home Insights Collections also generate more Saves (+0.6% relative lift) and a higher proportion of engagement through scrolling right (a proxy for interest in exploring that Collection further).
While the initial results are promising, we also acknowledge how early we still are in this space. There are many opportunities to keep improving Collections in general, and Home Insights Collections in particular, to help our users find their dream home. We look forward to discussing some exciting new developments soon.
Kudos to the team who applied machine learning and AI to bring this amazing feature to life for Zillow customers: Kelsey Juraschka, Siddhi Vakil, Saeid Balaneshin, Ondrej Linda.