The static-all Split – Dividing Zillow’s JavaScript Monolith Repository

Zillow has grown immensely since its inception over a decade ago. What was once a few folks working on a small site has transformed into dozens of teams supporting a home shopping platform serving over 80 million people each month (comScore).
As the company grew over time, the codebase did as well. Our original monolithic application, which housed all of the code for our entire site, needed to be broken up into more manageable components. For the front end, this meant liberating our static resources from that monolith in 2012, a project that is internally known as “The Great Schism”. The result of this project was a repository called “static-all” containing all of the JavaScript, stylesheets, images, templates, and other resources that your browser consumes every day.
As the team continued to grow, the number of static resources did as well, until it reached a point where it too needed to evolve. After four years, there were over 30 product teams at Zillow, and nearly all of them had front end code in static-all and needed to make changes regularly. This led to a multitude of issues.
In just over nine years, the size of the codebase itself had ballooned:
April 2007 | May 2016 |
---|---|
9 JavaScript files (1,119 lines) | 1,386 JavaScript files (291,632 lines) |
923 images | 6,071 images |
22 CSS stylesheets | 1,078 LESS stylesheets |
The figures above count only code written internally by Zillow engineers, not third-party libraries and other external code.
With weekly releases, you can imagine the headaches this caused with our integrations. Teams would try to integrate changes from their feature branches into develop at the same time, causing scheduling lockups and merge conflicts.
Since all the code was deployed as one giant package out of one repository, there was no feature isolation to speak of – any issues in one small feature required a rollback of the entire front end codebase.
That’s not even addressing the pain of making a change in the first place. Since these resources were originally pulled from our Java-based application repository, we were still using Java-based tools such as Maven to build our static code. Builds took more than 30 minutes and produced a package that was several hundred megabytes. A ton of decade-old legacy code was still in use. Change history was severely damaged by source control migration and years of git merges and refactors.
Indeed it was time to do something. Our technical debt had reached critical mass, and developers needed the ability to work faster on the front end. We all wanted the flexibility to work differently, on our own integration schedules, writing more organized code with better tools that built faster.
We set out to ease all of these pains, and came up with a list of goals for the project. By the end of the project, we wanted developers to be able to:
To achieve these goals, we decided on a solution to split the static-all repository into smaller static bundles. The requirements of the bundles were simple:
The idea of these bundles was relatively simple, but how do you go into a repository containing nearly 10,000 static files and make sense of it all? We had a lot of work to do before we could assign feature teams to create these bundles.
We needed to take stock of every file in the repo. This was done in a not-so-elegant way: a spreadsheet we called the “Static-all Ownership Map”. Every file in static-all was listed in the spreadsheet; teams who were responsible for the migration were listed next to the file path.
Many files had a clear owner, and were already grouped by feature in the existing file system. Other files had no clear owner, so files that looked like they were related were grouped and assigned to a team together. Teams were assigned files in an attempt to spread the workload evenly. This was especially challenging for poorly documented legacy code that nobody had knowledge of. If a file was investigated and found to belong to a different package, the assigned team needed to negotiate with other teams responsible for the relevant feature.
We decided to leave dormant, unused code in static-all. Some of these files are used to fuel long tail SEO pages that haven’t been developed for years, but are still live and serve small (yet important) amounts of traffic. There was also legacy code for features that were scheduled to be refactored before the static split project started (or immediately after). In order to not interrupt the refactoring process, this code was flagged to be moved once that process was complete.
By the time the product teams were ready to begin their migrations, several static bundles had been created and an archetype had been built. Due to this upfront work, the actual process of creating additional bundles was relatively straightforward:
References to static assets (images, stylesheets, etc.) needed to be changed. The one exception to this requirement is JavaScript files. Our JavaScript loader service holds a reference to each JavaScript module (module names are unique across all of the bundles), regardless of which bundle it lives in. That means that pages can reference JavaScript files by module name, not specific file path. The loader takes care of mapping the module name to the specific path so it can serve the actual content.
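As a rough illustration of that lookup (not the actual loader), the sketch below keeps a registry keyed by module name, which works because module names are unique across bundles. The registry contents, the `resolveModule` function, and the URL layout are all hypothetical.

```javascript
// Hypothetical registry: module name -> the bundle and file that provide it.
// Every name and path here is illustrative, not taken from the real loader.
const registry = new Map([
  ["search-map", { bundle: "bundle-a", path: "js/search-map.js" }],
  ["photo-carousel", { bundle: "bundle-b", path: "js/photo-carousel.js" }],
]);

// Resolve a module name to a servable path, regardless of which bundle owns it.
function resolveModule(name, modules = registry) {
  const entry = modules.get(name);
  if (!entry) {
    throw new Error(`Unknown module: ${name}`);
  }
  return `/${entry.bundle}/${entry.path}`;
}
```

With this shape, moving a module to another bundle only changes its registry entry; a page that asks for `search-map` does not need to change.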
With so many code changes taking place at once, we needed to put some automation in place to sanity check our work and make sure we weren’t stepping on each other’s toes. We added new tests to our internal testing framework to help protect us from the most common mistakes.
We added logic in our JavaScript loader to protect against duplicate module names. These tests failed when a team copied their JavaScript files from static-all to a new bundle, but didn’t delete the original file. One of our tests simply hit a verification endpoint on the loader that returned an appropriate response code and relevant details in the body.
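A duplicate-name check of this kind could look roughly like the sketch below. It assumes the loader can list the module names each bundle defines; the input shape and the `findDuplicateModules` name are hypothetical.

```javascript
// Hypothetical input: bundle name -> list of module names that bundle defines.
function findDuplicateModules(bundles) {
  const owners = new Map(); // module name -> bundles that define it

  for (const [bundle, modules] of Object.entries(bundles)) {
    for (const name of modules) {
      if (!owners.has(name)) owners.set(name, []);
      owners.get(name).push(bundle);
    }
  }

  // Keep only the names that are defined in more than one bundle.
  const duplicates = {};
  for (const [name, definedIn] of owners) {
    if (definedIn.length > 1) duplicates[name] = definedIn;
  }
  return duplicates;
}
```

For example, if a team copies `photo-carousel` into `bundle-a` without deleting it from `static-all`, the result would name the module together with both bundles, which is the kind of detail a verification response could carry in its body.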
With files being pulled into different repositories, module dependencies now often spanned multiple repositories. It was fairly common to move a file to Bundle A without realizing that its dependencies had already been moved to Bundle B. Because our loader was also tracking these relationships, we could cover this as part of the same verification endpoint described above.
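One way such a check might work is sketched below, assuming the loader knows each module's bundle and declared dependencies. The input shape, the `checkDependencies` name, and the choice to report both unresolved and cross-bundle dependencies are assumptions, not a description of the real endpoint.

```javascript
// Hypothetical input: module name -> { bundle, deps } as tracked by a loader.
function checkDependencies(modules) {
  const registry = new Map(Object.entries(modules));
  const unresolved = [];  // dependency is not registered in any bundle
  const crossBundle = []; // dependency lives in a different bundle

  for (const [name, info] of registry) {
    for (const dep of info.deps) {
      const target = registry.get(dep);
      if (!target) {
        unresolved.push({ module: name, dependency: dep });
      } else if (target.bundle !== info.bundle) {
        crossBundle.push({
          module: name,
          from: info.bundle,
          dependency: dep,
          to: target.bundle,
        });
      }
    }
  }
  return { unresolved, crossBundle };
}
```

A test could then fail on any unresolved entry and surface the cross-bundle list so that teams can see which bundles their modules now depend on.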
Since all of our JavaScript modules lived in one bundle, we had no build time dependencies. Each module always had the latest version of its dependencies, since they were always being built together. As we started to split the files up, however, we quickly realized how many of these interdependencies we had:
The image above is a screenshot of a tool we developed during one of our hack weeks to help us visualize these paths. Dependency information came from the JavaScript loader and was output visually. This was a simple example of one such circular dependency. As you can see, the tool brought to light some more complex cases:
We wrote another test that leveraged this tool to uncover any new circular dependencies. Part of a team’s tasks when migrating files out of static-all would be to untangle any new circular paths that appeared.
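Detecting a circular path in a dependency graph is commonly done with a depth-first search that tracks which nodes are on the current path. The sketch below shows that general technique; it is not the hack-week tool itself, and the graph shape and `findCycle` name are hypothetical.

```javascript
// Hypothetical input: module name -> list of module names it depends on.
// Returns one circular path (e.g. ["a", "b", "c", "a"]) or null if none exists.
function findCycle(graph) {
  const edges = new Map(Object.entries(graph));
  const IN_PROGRESS = 1;
  const DONE = 2;
  const state = new Map(); // node -> IN_PROGRESS | DONE (absent = unvisited)
  const path = [];         // nodes on the current depth-first path

  function visit(node) {
    state.set(node, IN_PROGRESS);
    path.push(node);
    for (const dep of edges.get(node) ?? []) {
      if (state.get(dep) === IN_PROGRESS) {
        // dep is already on the current path, so we have closed a loop.
        return path.slice(path.indexOf(dep)).concat(dep);
      }
      if (!state.has(dep)) {
        const cycle = visit(dep);
        if (cycle) return cycle;
      }
    }
    path.pop();
    state.set(node, DONE);
    return null;
  }

  for (const node of edges.keys()) {
    if (!state.has(node)) {
      const cycle = visit(node);
      if (cycle) return cycle;
    }
  }
  return null;
}
```

A test built on this idea could compare the cycles found before and after a migration and fail only when a new one appears.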
This project required lots of communication between everybody involved. We split all the product teams into batches based on their existing product roadmap, and each batch did their migrations at the same time so that no one team would be doing work in isolation. Teams needed to move files quickly so that ongoing product work wouldn’t make changes to files that were moved in feature branches. Since the changes affected so many developers, we set up several communication channels to make sure people knew what was going on:
Over the span of several months, each team spent time moving their files out of static-all and into new bundles. The bulk of the work is now complete, and nearly all product teams do all of their front end work outside static-all. To break old habits, we’ve restricted permissions on the static-all repository itself to a few “gatekeepers” and limited changes to deletes. We anticipate that at some point, there will be a feature deprecation in a browser or a third party dependency that will necessitate a change in the legacy code; thankfully, we have not run into that situation yet.
We are already reaping the benefits of what we set out to achieve with this project:
Metric | Start of Project | End of Project |
---|---|---|
Number of static bundles | 1 | 60 (and counting) |
Build time of static-all | 30 minutes | 8 minutes |
Number of code changes per month in static-all | 300 | 12 |
Number of people per month committing code in static-all | 75 | 3 |
We’ve greatly increased the productivity and quality of life for our front end developers, but this is only the beginning. With all of the different bundles in existence now, there are still pain points in the development process. Through this process, we’ve also uncovered a bunch of low-hanging fruit with regard to performance. And, of course, we will continue to remove legacy code and look forward to someday getting rid of static-all entirely.
The static-all split required the coordination of dozens of developers across all of our teams. Our core value of “Zillow Group is a Team Sport” rang loudly throughout the project. Everybody is enjoying the benefits and feedback has been overwhelmingly positive. It’s definitely an exciting time to be at Zillow and see projects like this progress!