
Debugging Capybara Failures

[Image: a wild capybara in its natural habitat]

At NakedApartments we use Capybara to write our behavior-driven, end-to-end integration tests, aka “feature specs.” Capybara is a powerful tool that sits on top of a web driver. It offers you, the developer, a DSL for writing actions that closely mimic how an actual user would interact with your website: clicking links, submitting forms, and so on. All the while, you assert that the content you expect to be visible on a given page is actually there.
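For a flavor of the DSL, here is a minimal, hypothetical login interaction as it might appear inside a feature spec (the path and form labels are made up for illustration):

# Inside a feature spec; path and labels are hypothetical.
visit new_session_path
fill_in 'Email', with: 'user@example.com'
fill_in 'Password', with: 'secret'
click_button 'Log in'
expect(page).to have_content('Welcome back!')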

In theory, Capybara is amazing, useful, and time saving. It catches embarrassing bugs before they can ever reach the sensitive eyes of your users. It allows you to say, with confidence, “A user can log in to my site and create a blog post.” And for the most part this is true. It is amazing, useful, and time saving.

However, anyone who has worked with Capybara also knows that when things go wrong it can be very difficult to figure out why. A friend of mine, at the height of a feature-spec-fueled frustration, once described Capybara as “a Rube Goldberg machine of broken dreams.” Failures can be random and silent, and since by their nature they involve your entire web stack, you can never be quite sure if the problem is with your code, Capybara, or something inside your architecture. (It’s probably your code, I’m sorry to say.)

But even if it is your code, it can be hard to figure out what exactly is wrong with it, especially if Capybara stays silent and gives you little to go on.

Origin of the Silent Error

So what causes a silent Capybara spec failure? I will preface this by saying the issue only occurs in tests that use a real browser driver, like Selenium, ChromeDriver, or capybara-webkit. If you use RackTest, which is the default driver but doesn’t support any JavaScript, the problem won’t surface.
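For context, a typical driver setup looks something like this (Selenium is just one option; any JavaScript-capable driver behaves the same for our purposes):

# spec_helper.rb
require 'capybara/rspec'

Capybara.default_driver    = :rack_test  # fast, but no JavaScript support
Capybara.javascript_driver = :selenium   # used for specs tagged js: true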

Now, let’s look at an example of what we mean by “silent.”

Let’s say you have a view template that looks like this:

<h1>Manhattan</h1>
<% ['Greenwich Village', 'Upper East Side'].each do |neighborhood| %>
    <%= neighborhood %>

And a capybara spec that looks like this:

require 'spec_helper'

feature 'Manhattan', js: true do
  before do
    visit manhattan_path
  end

  it 'Contains header' do
    expect(page).to have_content('Manhattan')
  end
end

If you run this spec, it will fail and capybara will output the following:

expected to find text "Manhattan" in ""

That is incredibly unhelpful. If we return to our view code, we can quickly spot the real issue: our each block is missing a closing end.
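For reference, the corrected template simply closes the block:

<h1>Manhattan</h1>
<% ['Greenwich Village', 'Upper East Side'].each do |neighborhood| %>
    <%= neighborhood %>
<% end %>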

In the real world this would raise a SyntaxError. So why do we instead get a vague failure telling us that our view rendered an empty string, with no details as to where or why?

Capybara’s Exception Handling

The answer lies in how Capybara attempts to bubble exceptions up from the thread the browser is running in to the test runner’s thread. This is why the silent-exception issue only occurs when using a driver like Selenium, where the browser and the test runner are in separate threads; when using RackTest, both run in the same thread, and it’s therefore easy for Capybara to propagate exceptions.

When the two are separated, Capybara has to find a way to save the error messages that occur in the browser thread and then display them back in the test runner thread. It does this by wrapping the application in a bit of Rack middleware that records the exception.
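Here is a simplified sketch of the idea; this is illustrative only, not Capybara’s actual implementation:

# Illustrative only: Rack middleware that records the first exception
# the app raises, so another thread can inspect it later.
class ErrorRecordingMiddleware
  attr_reader :error

  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue StandardError => e
    @error ||= e  # remember only the first failure
    raise e       # re-raise so the request still fails
  end
end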

In most cases this works great, but as we have seen there is an exception to the exception, and the answer is on line 18 of lib/capybara/server.rb:

rescue StandardError => e
  @error = e unless @error
  raise e
end

Capybara only attempts to catch and re-raise errors that inherit from StandardError. SyntaxError, on the other hand, inherits from Exception. This is a best practice in the Ruby world; the Capybara devs haven’t done anything wrong. StandardError exceptions are a subset of all exceptions in Ruby, and the default behavior of a bare rescue is to catch only StandardError and its subclasses:

begin
  # some risky operation
rescue => e
  puts 'This will only catch StandardError and its subclasses'
end

Using rescue Exception => e would actually expand the set of exceptions caught to include things like Interrupt, which is raised when you hit Ctrl+C to kill an application. Other common subclasses of Exception include NoMemoryError and, of course, SyntaxError, which is the one that bit us here.
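You can see the difference for yourself in a quick experiment:

# SyntaxError inherits from Exception (via ScriptError), not StandardError,
# so a bare rescue lets it fly straight past.
begin
  eval('def broken')  # raises a SyntaxError at parse time
rescue => e
  puts "never reached: #{e.class}"
rescue Exception => e
  puts "caught: #{e.class}"  # prints "caught: SyntaxError"
end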

Workaround

So what can we do to make these kinds of exceptions easier to debug?

The answer, I believe, lies in how exceptions are handled when Rails runs in the development environment. There, exceptions that occur while rendering or fetching a view don’t “bubble up”; instead, their stack traces are rendered as HTML in an easy-to-read format to help the developer debug the issue.

We can turn this behavior on in our test environment, so that instead of being re-raised, exceptions will be rendered to the view.

In config/environments/test.rb, we change this value from false to true:

config.action_dispatch.show_exceptions = true

Now exceptions will be rendered instead of being raised. Our spec will still fail, since the real view won’t be rendered, but instead of complaining that our content doesn’t exist within an empty string, Capybara will display the full stack trace of the exception. However, there are two drawbacks to this approach:

  1. The exceptions are difficult to read when they are printed out in our test runner.
  2. This also impacts request specs, which aren’t run via Capybara and don’t involve a browser at all. We’d rather exceptions continue to be raised there.

Fortunately, there are workarounds to these drawbacks, too.

Screenshots

capybara-screenshot is a Ruby gem that takes a screenshot of the current page whenever a Capybara spec fails. This is extremely useful even if you aren’t trying to debug a silent exception, but in our case, where we are rendering exceptions to the view instead of raising them, it is almost essential: we need a user-friendly way to read and parse these exceptions when they happen.
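Setup is minimal; per the gem’s README, you add it to your test group and require its RSpec integration:

# Gemfile
group :test do
  gem 'capybara-screenshot'
end

# spec_helper.rb
require 'capybara/rspec'
require 'capybara-screenshot/rspec'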

However, let’s assume that our failing spec only fails intermittently, or only seems to fail during our continuous integration (CI) builds. How do we view our screenshots in that case? At NakedApartments we use Codeship Pro as our CI solution, which uses Docker containers to run our tests. By their nature, containers are ephemeral, so there’s no place for us to store artifacts, like screenshots, generated by the build.

Thankfully, capybara-screenshot includes a feature that allows you to upload your screenshots to an Amazon S3 bucket.

We can create a Rails initializer that configures this, and only performs the uploads during test runs on our Codeship builds (CI_BUILD_ID is set automatically by Codeship):

# config/initializers/capybara_screenshot.rb
# Upload screenshots to S3, but only during CI test runs.
if Rails.env.test? && ENV['CI_BUILD_ID'].present?
  Capybara::Screenshot.s3_configuration = {
    s3_client_credentials: {
      access_key_id: ENV['AWS_ACCESS_KEY_ID'],
      secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
    },
    bucket_name: 'screenshots'
  }
end

Now whenever a feature spec fails in our CI build, we can download the screenshot from S3 and examine the error.
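For example, with the AWS CLI installed (the bucket name comes from the initializer above; the exact file names depend on capybara-screenshot’s naming scheme):

aws s3 ls s3://screenshots/
aws s3 cp s3://screenshots/<screenshot-name>.png .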

Configurable Exceptions

ConfigurableExceptions is another Ruby gem that allows us to toggle the show_exceptions value programmatically, at run time, during our test execution.

We simply add the following to our spec_helper.rb file, so that exceptions will render for feature specs but not for request specs:

# spec_helper.rb
# Render exceptions to the page for feature specs only; request specs
# keep the default raising behavior.
config.around(:example, type: :feature) do |example|
  ConfigurableExceptions.show_exceptions = true
  example.run
  ConfigurableExceptions.show_exceptions = false
end

With this in place, we should leave the value of show_exceptions in our test.rb config file at its default value of false.

Conclusion

Now we are all set to quickly debug previously silent exceptions, using a combination of rendered stacktraces and screenshots!

There is one final gotcha to watch out for with this approach: since we are rendering exceptions to the view, we expect our specs to fail because have_content (or whatever matcher we’re using) won’t match against the rendered stack trace. But if the content you are matching on happens to appear somewhere in the stack trace, your test will pass even though it should fail.

Therefore it is important to be specific about what you are matching on, and not to use short, generic phrases. This is a best practice when writing feature specs regardless, but it’s important to remember that a passing spec that should fail is even worse than a failing spec that’s hard to debug.
