Debugging Capybara Failures

At NakedApartments we use capybara to write our behavior-driven, end-to-end integration tests, aka “feature specs.” Capybara is a powerful tool that sits on top of a web driver. It offers you, the developer, a DSL for writing actions that closely mimic how an actual user would interact with your website, such as clicking links and submitting forms. All the while you assert that the content you expect to be visible on a given page is actually there.
In theory capybara is amazing, useful, and time saving. It catches embarrassing bugs before they can ever reach the sensitive eyes of your users. It allows you to say, with confidence, “A user can log in to my site and create a blog post.” And for the most part this is true. It is amazing, useful, and time saving.
However, anyone who has worked with capybara also knows that when things go wrong it can be very difficult to figure out why. A friend of mine, at the height of a feature spec fueled frustration, once described capybara as “a Rube Goldberg machine of broken dreams.” Failures can be random and silent, and since by their nature they involve your entire web stack you can never be quite sure if the problem is with your code, capybara, or something inside your architecture. (It’s probably your code, I’m sorry to say.)
But even if it is your code, it can be hard to figure out what exactly is wrong with it, especially if capybara is being silent and not giving you much to go on.
So what causes a silent capybara spec failure? I will preface this by saying the issue will only occur in tests that use a real browser driver, like selenium, chromedriver, or capybara-webkit. If you use RackTest, which is the default driver but doesn’t support any JavaScript, the problem won’t surface.
Now, let’s look at an example of what we mean by “silent.”
Let’s say you have a view template that looks like this:
    <h1>Manhattan</h1>
    <% ['Greenwich Village', 'Upper East Side'].each do |neighborhood| %>
      <%= neighborhood %>
And a capybara spec that looks like this:
    require 'spec_helper'

    feature 'Manhattan', js: true do
      before do
        visit manhattan_path
      end

      it 'Contains header' do
        expect(page).to have_content('Manhattan')
      end
    end
If you run this spec, it will fail and capybara will output the following:
expected to find text "Manhattan" in ""
That is incredibly unhelpful. If we return to our view code, we can quickly spot the real issue: our each block is missing a closing end.
In the real world this would raise a SyntaxError. Why here do we instead get a vague failure telling us that our view rendered an empty string, with no details as to where or why?
The answer lies in how capybara attempts to bubble exceptions up from the thread that the browser is running in to the current test runner thread. This is why the silent exception issue only occurs when using a driver like selenium, where the browser and test runner are in separate threads; when using RackTest as the driver, both run in the same thread and therefore it’s easy for capybara to propagate exceptions.
When the two are separated capybara has to find a way to save the error messages that occur in the browser thread, then display them back in the test runner thread. It does this by wrapping the application in a bit of middleware that records the exception.
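The idea can be sketched in plain Ruby without capybara at all. This is only an illustration, not capybara's actual implementation: the names ErrorRecordingMiddleware, raise_recorded_error!, and failing_app are hypothetical, and the "server" is just a thread calling a lambda.

```ruby
# Hypothetical stand-in for the middleware idea: wrap an app, remember the
# first StandardError it raises, and let the test thread re-raise it later.
class ErrorRecordingMiddleware
  attr_reader :error

  def initialize(app)
    @app = app
    @error = nil
  end

  def call(env)
    @app.call(env)
  rescue StandardError => e
    @error ||= e # keep only the first recorded error
    raise e
  end

  # Meant to be called from the test thread once the request has finished.
  def raise_recorded_error!
    raise @error if @error
  end
end

failing_app = ->(_env) { raise ArgumentError, 'boom in the server thread' }
middleware = ErrorRecordingMiddleware.new(failing_app)

# Simulate the server handling a request in its own thread.
server_thread = Thread.new do
  begin
    middleware.call({})
  rescue StandardError
    # A real server would turn this into a 500 response for the browser.
  end
end
server_thread.join

# Back in the "test runner" thread, surface whatever was recorded.
begin
  middleware.raise_recorded_error!
rescue ArgumentError => e
  puts "Test thread saw: #{e.class}: #{e.message}"
end
```

The test thread never sees the ArgumentError directly. It only learns about it because the middleware recorded it and handed it over afterwards.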
In most cases this works great, but as we have seen there is an exception to the exception, and the answer is on line 18 of lib/capybara/server.rb, in the above linked GitHub commit:
    rescue StandardError => e
      @error = e unless @error
      raise e
    end
Capybara only attempts to catch and reraise errors that inherit from StandardError. SyntaxError, on the other hand, inherits from ScriptError, which in turn inherits directly from Exception. Rescuing only StandardError is a best practice in the Ruby world, so the capybara devs haven’t done anything wrong. StandardError exceptions are a subset of all exceptions in Ruby, and the default behavior of rescue is actually to catch only StandardErrors:
    begin
      # code that might raise goes here
    rescue
      puts 'This will only catch StandardError'
    end
Using rescue Exception => e would actually expand the type of exceptions caught, including things like Interrupt, which is raised when you hit ctrl+c to kill an application. Other common types of Exception include NoMemoryError and, of course, SyntaxError, which is the one that bit us here.
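You can check this hierarchy yourself in plain Ruby. The two helper methods below, exception_chain and bare_rescue_outcome, are hypothetical names written for this demonstration; the outer rescue Exception is there only so the script can report what slipped past the bare rescue.

```ruby
# Walk up the superclass chain until we reach Exception.
def exception_chain(klass)
  chain = []
  while klass
    chain << klass
    break if klass == Exception
    klass = klass.superclass
  end
  chain
end

# Returns :rescued if a bare `rescue` catches the error raised by the block,
# :escaped if the error gets past it, and :no_error if nothing was raised.
def bare_rescue_outcome
  begin
    yield
    :no_error
  rescue
    :rescued
  end
rescue Exception # demonstration only; avoid rescuing Exception in real code
  :escaped
end

p exception_chain(ArgumentError) # [ArgumentError, StandardError, Exception]
p exception_chain(SyntaxError)   # [SyntaxError, ScriptError, Exception]

p bare_rescue_outcome { raise ArgumentError, 'oops' } # :rescued
p bare_rescue_outcome { eval('[1, 2].each do |n|') }  # :escaped
```

The last line evaluates a block with a missing end, the same mistake as in our view, and the resulting SyntaxError sails straight past the bare rescue.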
So what can we do to make these kinds of exceptions easier to debug?
The answer, I believe, is in how exceptions are handled when Rails runs in a development environment. In this case, exceptions that occur when rendering or fetching a view don’t “bubble up”; instead, their stack trace is rendered as HTML in an easy-to-read format to help the developer debug the issue.
We can turn this behavior on in our test environment, so that instead of reraising exceptions, they will be rendered to the view.
In config/environments/test.rb, we change this value from false to true:
config.action_dispatch.show_exceptions = true
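Now exceptions will be rendered instead of being raised. Our spec will still fail since the real view won’t be rendered, but instead of complaining that our content doesn’t exist within an empty string, capybara will display the full stack trace of the exception. However, there are two drawbacks to this approach. First, the stack trace is rendered into the page as HTML, which makes it awkward to read from the spec’s failure output. Second, the setting applies to every spec, including request specs where we still want exceptions to be raised.
Fortunately, there are workarounds to these drawbacks, too.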
capybara-screenshot is a Ruby gem that takes a screenshot of the current page whenever a capybara spec fails. This is extremely useful even if you aren’t trying to debug a silent exception, but in our case, where we are rendering the exceptions to the view instead of raising them, it is almost essential. We need a user friendly way to read and parse these exceptions when they happen.
However, let’s assume that our failing spec only happens randomly, or only seems to happen during our continuous integration (CI) builds. How do we view our screenshots in that case? At NakedApartments we use Codeship Pro as our CI solution, which uses Docker containers to run our tests. By their nature containers are ephemeral, so there’s no place for us to store artifacts, like screenshots, generated by the build.
Thankfully, capybara-screenshot includes a feature that allows you to upload your screenshots to an Amazon S3 bucket.
We can create a Rails initializer that configures this, and only performs the uploads during test runs on our Codeship builds (CI_BUILD_ID is set automatically by Codeship):
    if Rails.env.test? && ENV['CI_BUILD_ID'].present?
      Capybara::Screenshot.s3_configuration = {
        s3_client_credentials: {
          access_key_id: ENV['AWS_ACCESS_KEY_ID'],
          secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
        },
        bucket_name: 'screenshots'
      }
    end
Now whenever a feature spec fails in our CI build, we can download the screenshot from S3 and examine the error.
ConfigurableExceptions is another Ruby gem that allows us to toggle the show_exceptions value programmatically, at run time, during our test execution.
We simply add the following to our spec_helper.rb file, so that exceptions will render for feature specs but not for request specs:
    config.around(:example, type: :feature) do |example|
      ConfigurableExceptions.show_exceptions = true
      example.run
      ConfigurableExceptions.show_exceptions = false
    end
With this in place we should leave the value of show_exceptions in our test.rb config file at its default value of false.
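To see why a run-time switch is useful, here is a rough sketch of the general idea in plain Ruby. It is not how ConfigurableExceptions is implemented, and ExceptionSwitch, SwitchableExceptionMiddleware, and with_rendered_exceptions are hypothetical names invented for this illustration. The sketch rescues Exception on purpose, so that errors outside StandardError, like our SyntaxError, can be rendered too.

```ruby
# Hypothetical global switch, standing in for a run-time show_exceptions flag.
module ExceptionSwitch
  class << self
    attr_accessor :show_exceptions
  end
  self.show_exceptions = false
end

# Rack-style middleware: render the error as a response when the switch is on,
# re-raise it when the switch is off.
class SwitchableExceptionMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue Exception => e # deliberate: also covers SyntaxError and friends
    raise unless ExceptionSwitch.show_exceptions

    [500, { 'content-type' => 'text/plain' }, ["#{e.class}: #{e.message}"]]
  end
end

# Mirrors the around hook: turn rendering on for one block, then restore
# the previous value even if the block raises.
def with_rendered_exceptions
  previous = ExceptionSwitch.show_exceptions
  ExceptionSwitch.show_exceptions = true
  yield
ensure
  ExceptionSwitch.show_exceptions = previous
end

app = SwitchableExceptionMiddleware.new(->(_env) { raise 'view blew up' })

status, _headers, body = with_rendered_exceptions { app.call({}) }
puts status     # 500
puts body.first # RuntimeError: view blew up
```

Outside the with_rendered_exceptions block the switch is back to false, so the same app.call({}) would raise the RuntimeError instead of rendering it, which is the behavior we want to keep for request specs.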
Now we are all set to quickly debug previously silent exceptions, using a combination of rendered stacktraces and screenshots!
There is one final gotcha to this approach to watch out for: since we are rendering exceptions to the view, we expect our specs to fail since they won’t be able to match a have_content, or another matcher, with the rendered stack trace. If the content you are matching on appears somewhere in the stack trace, your test would pass even though it should fail.
Therefore it is important to be specific about what you are matching on, and not to use short, generic phrases. This is a best practice when writing feature specs regardless, but it’s important to remember that a passing spec that should fail is even worse than a failing spec that’s hard to debug.
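Our own example spec is a good illustration of the risk. The snippet below is a plain Ruby approximation: RENDERED_ERROR_PAGE is a made-up error page (a real one would look different), and page_contains? is a hypothetical stand-in for a text matcher that simply checks for a substring.

```ruby
# Hypothetical text of a rendered error page; a real page would differ.
RENDERED_ERROR_PAGE = <<~TEXT
  SyntaxError in Manhattan#index
  app/views/manhattan/index.html.erb:4: syntax error, unexpected end-of-input
TEXT

# Stand-in for a text matcher: true when the page text contains the phrase.
def page_contains?(page_text, phrase)
  page_text.include?(phrase)
end

# A short, generic phrase also appears in the error page: a false positive.
puts page_contains?(RENDERED_ERROR_PAGE, 'Manhattan')         # true

# A phrase that only the real view would render correctly fails to match.
puts page_contains?(RENDERED_ERROR_PAGE, 'Greenwich Village') # false
```

Matching on 'Manhattan' alone would let the broken view pass, because the word can show up in the controller or template name inside the error output.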