TestProf: a good doctor for slow Ruby tests

August 11, 2024

Topics

Vladimir Dementyev
Principal Backend Engineer

Translations

JapaneseTestProf: Ruby/Railsの遅いテストを診断するgem
ChineseRuby慢测试的“良医圣手”

Writing tests is a significant part of the development process, especially in the Ruby and Rails community. But it seems we don’t care about test suite performance until we find ourselves wasting too much time waiting for the green light. Sounds familiar? Learn how to identify and fix Rails test performance bottlenecks using a specialized testing profiler, TestProf!

Other parts:

TestProf: a good doctor for slow Ruby tests
TestProf II: Factory therapy for your Ruby tests
TestProf III: guided and automated Ruby test profiling

NOTE: This post was updated in mid-2024, to reflect changes in TestProf usage. Even though almost 7 years have passed from the original release, the problems and solutions described in the post are still relevant!

Since my first year at Evil Martians, I’ve worked on many Rails applications: monoliths and moduliths (modular monoliths), API-only and Hotwired. All of them shared one common trait: test suites that left much to be improved in terms of speed. I started collecting tips and tricks on making tests faster, and eventually bundled them all into a meta-gem called TestProf, the Ruby test profiling toolbox.

Before jumping into the “hows” of tests profiling and optimization, let’s talk a bit about “whys”.

The motivation

Slow tests waste your time by making you less productive.

You’re probably asking yourself: “Why does a test suite’s performance matter?” Let me show you some statistics before proposing any arguments about that.

Irina Nazarova CEO at Evil Martians

Book a call

In 2017, I conducted a short survey where I asked Ruby developers about their testing preferences.

First of all—good news, everyone—it turned out that most Ruby developers do write tests (and frankly speaking, I wasn’t surprised). That’s what I like about Ruby community the most, by the way.

According to the survey, only a quarter of all test suites runs for more than 10 minutes—and about a half of them run for less than 5 minutes.

**Question.** For how long does your whole suite run?

**Question.** For how long does your whole suite run?

Seems like the situation isn’t so bad, doesn’t it? Let’s look at only hte test suites that have more than 1000 tests.

**Question.** How long does your entire suite run? (with more than 1000 tests)

How long does your whole suite run? — **Question.** How long does your entire suite run? (with more than 1000 tests)

This looks much worse now: about half of all suites runs for more than 10 minutes, and almost 30% for more than 20 minutes.

(By the way, a typical Rails application I’ve worked on had 10K to 20K tests.)

Of course, you don’t have to run the whole suite every time you make a change. Usually, I run about 100 examples before each commit while working on a mid-size feature, and it only takes about a minute. But even that “minute” affects my feedback loop (see Justin Searls talk) and wastes my time.

Nevertheless, we must run all the tests during our deployment cycle, using CI services. How would you like waiting dozens of minutes (or even hours if there are loads of builds in the queue) to deploy a hotfix? I doubt very much.

Concurrent builds may help, but they come at a price. Take a look at the chart below:

**Question.** How many parallel jobs do you run on your CI?

**Question.** How many parallel jobs do you run on your CI?

For example, during my first test optimization project at Evil Martians for Fountain, we had 5x parallelism and our average (per job) RSpec time was 2.That’s 5 minutes for ~2k tests. That means our TPM (tests per minute) was about 800.

Before any optimizations, 750 tests ran for ~5 minutes—that is only 150TPM! So, even though the absolute speed increase was about 3-4 minutes per build, the relative difference was huge—tests became 5x faster! We were prepared for the rapid growth in the number of tests, and avoided a catastrophical situation of long CI feedback loops—can you just imagine waiting, say, 30 minutes for each build? What about hours? Well, we’ve seen projects like that, too.

To sum up, slow tests make you less productive by increasing local and CI feedback loops. The good news is that, in most cases, speeding up tests doesn’t require a huge amount of time, only the right tool for the job. Speaking of which, let’s take a closer look at TestProf.

The toolbox

TestProf helps identify the bottlenecks in your test suite and provides you with recipes to fix them.

Let me demonstrate how I use it to analyze and optimize test suites.

General profiling

Usually, the most noticeable improvement comes from optimizing the most frequently-used features (methods, patterns, etc.). To find these frequent suspects, we start by collecting some high-level information about the test suite, without looking into test files themselves.

Try to answer the following questions first:

Where does your suite spend more time: in the controllers, models, services, or jobs?
What is the most time-consuming module/method?

Not so easy to guess, right? TestProf can spare you any guessing games and just give the answers instead.

To answer the first question, you can use Tag Profiler from TestProf; it allows you to collect statistics grouped by a particular tag value. For instance, RSpec automatically adds a type tag, so we can use it where relevant:

$ TAG_PROF=type rspec

[TEST PROF INFO] TagProf report for type

          type          time   total  %total   %time           avg

    controller     08:02.721    2822   39.04   34.29     00:00.171
       service     05:56.686    1363   18.86   25.34     00:00.261
         model     04:26.176    1711   23.67   18.91     00:00.155
           job     01:58.766     327    4.52    8.44     00:00.363
       request     01:06.716     227    3.14    4.74     00:00.293
          form     00:37.212     218    3.02    2.64     00:00.170
         query     00:19.186      75    1.04    1.36     00:00.255
        facade     00:18.415      95    1.31    1.31     00:00.193
    serializer     00:10.201      19    0.26    0.72     00:00.536
        policy     00:06.023      65    0.90    0.43     00:00.092
     presenter     00:05.593      42    0.58    0.40     00:00.133
        mailer     00:04.974      41    0.57    0.35     00:00.121
        ...

This means you can focus only on the most time consuming test types when searching for bottlenecks while ignoring everything else—after all, it doesn’t make sense to optimize test types that are contributing much less to the overall test suite run time.

Okay, what about the most consuming methods? Here we can use the existing, or generic, Ruby profilers, such as Stackprof, RubyProf, or the newer Vernier.

TestProf helps you run them against your test suite without any tweaks—just specify the corresponding environment variable and follow the instructions in the output:

$ TEST_RUBY_PROF=1 bundle exec rake test

...
[TEST PROF INFO] StackProf report generated: tmp/test_prof/stack-prof-report-wall-raw-boot.dump
[TEST PROF INFO] StackProf JSON report generated: tmp/test_prof/stack-prof-report-wall-raw-boot.json

The reports generated by these profilers can help you identify your hottest stack paths and, hence, answer the second question above.

Unfortunately, this type of profiling requires a lot of resources and slows down your already not-so-fast test suite even more. You need to run it against a small portion of tests, but how to choose this selection? Well, randomly!

TestProf includes a special patch which allows you to run a random subset of RSpec example groups or Minitest suites:

SAMPLE=10 bundle exec rspec

To illustrate, here’s what you may find in the Stackprof profile:

%self     calls  name
20.85       721   <Class::BCrypt::Engine>#__bc_crypt
 2.31      4690  *ActiveSupport::Notifications::Instrumenter#instrument
 1.12     47489   Arel::Visitors::Visitor#dispatch
 1.04    205208   String#to_s
 0.87    531377   Module#===
 0.87    117109  *Class#new

It turns out that the encryption settings for Sorcery, an authentication library we used, were similar across all environments, i.e., as strong in tests as in production—but why be so serious about encryption in tests? Well, the fix was just a single line of code, but gave us ~20% performance boost!

Other typical methods that you may find in generic profiler reports usually include many Active Record interactions (e.g., ActiveRecord::PostgreSQLAdapter#exec_no_cache, and similar ones).

Let’s go deeper and see how our tests use the database.

Database interactions

Do you know how much time your test suite spends in the database? Again, skip the guessing, and just use TestProf to give your number.

We’ve already expanded on instrumentation in Rails (ActiveSupport Notification and Instrumentation features), so let’s skip the basics and introduce the Event Profiler.

EventProf collects instrumentation metrics during your test suite run and provides a report with general information and top-N slowest groups and examples concerning the specified metrics.

At present, it only supports ActiveSupport::Notifications out-of-the-box, but it can be easily integrated with your own solution.

To get information about the database usage, we can use the sql.active_record event. Then, the report will look like this (very similar to rspec --profile):

$ EVENT_PROF=sql.active_record rspec

[TEST PROF INFO] EventProf results for sql.active_record

Total time: 00:05.045 00:24.512 (21.00%)
Total events: 6322

Top 5 slowest suites (by time):

MessagesController (./spec/controllers/messages_controller_spec.rb:3) – 00:03.253 (4058 / 100)
UsersController (./spec/controllers/users_controller_spec.rb:3) – 00:01.508 (1574 / 58)
Confirm (./spec/services/confirm_spec.rb:3) – 00:01.255 (1530 / 8)
RequestJob (./spec/jobs/request_job_spec.rb:3) – 00:00.311 (437 / 3)
ApplyForm (./spec/forms/apply_form_spec.rb:3) – 00:00.118 (153 / 5)

For the project under consideration, the amount of time spent for database queries is about 20%—how to interpret this number?

There is no single good value for this metric that fits every project. It’s actually highly dependent on your testing style: that is, whether you write more unit tests or integration tests. Although, from our experience, the sql.active_record is usually around 10-15%, so 20% looks suspicious.

What are some typical reasons for this high database time? It’s impossible to name them all, but let me n name some of them:

Useless data persistence
Heavy setup phase
Factory cascades

The first one is the famous User.new vs. User.create problem (or build_stubbed vs. create in case of FactoryBot)—you likely don’t need to hit the database to unit-test your models. So don’t do that, right?

But what if I’m already doing that? How to figure out which tests don’t need persistence? Here comes Factory Doctor!

FactoryDoctor tells you when you’re creating unnecessary data:

$ FDOC=1 rspec

[TEST PROF INFO] FactoryDoctor report

Total (potentially) bad examples: 2
Total wasted time: 00:13.165

User (./spec/models/user_spec.rb:3)
  validates name (./spec/user_spec.rb:8)–1 record created, 00:00.114
  validates email (./spec/user_spec.rb:8)–2 records created, 00:00.514

Unfortunately, FactoryDoctor isn’t some sort of all-powerful wizard doctor (it’s still learning) and it sometimes produces false negatives. In any case, we still recommend giving it a try and seeing if you can save some time by changing to a lighter data generation strategy.

The second problem, heavy setups_, is a bit more tricky. Consider this example:

describe BeatleSearchQuery do
  # We want to test our searching feature,
  # so we need some data for every example
  let!(:paul) { create(:beatle, name: 'Paul') }
  let!(:ringo) { create(:beatle, name: 'Ringo') }
  let!(:george) { create(:beatle, name: 'George') }
  let!(:john) { create(:beatle, name: 'John') }

  # and about 15 examples here
end

You might think: “Just use fixtures here.” And that’s a good point, unless you’re working on a pretty big project with dozens of models changing every day.

Another option is to use a before(:all) hook and generate data once. But there is a caveat here—we have to cleanup the database manually, because before(:all) runs outside of the transaction.

Alternativley, we could manually wrap the whole group in a transaction! And that is exactly what TestProf’s before_all helper does:

describe BeatleSearchQuery do
  before_all do
    @paul = create(:beatle, name: 'Paul')
    @ringo = create(:beatle, name: 'Ringo')
    @george = create(:beatle, name: 'George')
    @john = create(:beatle, name: 'John')
  end

  # and about 15 examples here
end

If you want to share context between different groups (files), consider using Any Fixture, which allows you to generate fixtures from code (e.g. using factories).

Factory cascades

Factory cascades are a very common, but rarely addressed problem that can bring your entire test suite to a crawl. In short, this is the uncontrollable process of generating excess data through nested factory invocations. But TestProf knows how to deal with this, and we’ve dedicated the next post in this series to this subject alone—make sure you read it.

Background jobs

Let’s take a look at one more common bottleneck in Rails application tests and demonstrate how TestProf can help you refactor you tests for the sake of speed without even touching them (manually, that is).

There’s a common practice to inline background jobs in tests (e.g. Sidekiq::Testing.inline!).

Usually, we put some heavy stuff in the background, so running all jobs unconditionally slows down the run time.

TestProf supports profiling time spent in background jobs. For Sidekiq, you can use EventProf with the special sidekiq.inline event:

$ EVENT_PROF=sidekiq.inline bundle exec rspec spec/requests

[TEST PROF INFO] EventProf results for sidekiq.inline

Total time: 00:03.515 00:14.512 (24.00%)
Total events: 851
...

Now, once you know the exact time wasted, what should you then do? Simply turning off inline mode would likely break a lot of tests—too many to fix quickly.

The solution is to turn off inlining globally and use it only when necessary. If you use RSpec, you can do the following:

RSpec.configure do |config
  config.before(:each) do |ex|
    if ex.metadata[:sidekiq] == :inline
      Sidekiq::Testing.inline!
    else
      Sidekiq::Testing.fake!
    end
  end
end

# And use the tag to enable the inline mode
it "do some bg", sidekiq: :inline do
  # ...
end

But, you still have to add the tags to every failing example …don’t you? Actually, thanks to TestProf, you don’t.

Instead, there’s a special tool which is part of the toolbox called RSpec Stamp. It automagically adds the specified tags:

$ RSTAMP=sidekiq:inline bundle exec rspec

...

Total patches: 62
Total files: 30

RSpec Stamp uses Ripper under the hood to parse source files and correctly insert tags. You can find the complete manual on how to migrate from inline! to fake! in our guides.

TestProf is on GitHub and rubygems.org, ready for you to plug into your application and get some help improving your test suite performance. You can also check out our comprehensive documentation.

All that said, this post is only an introduction to TestProf and doesn’t cover all the features. Jump onward to the next part of this series: TestProf II: Factory therapy for your Ruby tests to learn more about TestProf’s doctor bag of tools that can make your tests snappier, your TDD feedback loop shorter, and can help you become a happier Ruby developer.