Railing against time: tools and techniques that got us 5X faster tests
Slow tests are a drag for any project, but they can be especially painful for enterprise-level projects operating at a large scale: CI starts lagging, deployments get slower, and the developer experience becomes just plain boring. Evil Martians recently helped a client optimize their CI pipeline so they could test and change their code faster, ultimately making their test suite five times faster! In this post, we’ll talk about some of the techniques and tools that made it happen; perhaps you’ll be inspired to do the same!
First, some background on our client: Power Home Remodeling (“POWER”) is the nation’s largest exterior home remodeler, specializing in exterior retrofits, including windows, siding, roofing, doors, attic insulation, and solar roofing panels. Both inside and out, it’s clear the company has high standards, so we set out to meet them.
Technically speaking, the project is built on Ruby on Rails and follows a component-based architecture (aka a modular monolith), with dozens of components.
A bird’s eye view of our optimization
To achieve this performance boost, we used a combination of RSpec hackery, profiling mastery (for instance, via Stackprof), and, of course, our own TestProf toolkit.
First, we conducted a preliminary examination of the whole test suite using Stackprof. That helped us identify some general things to improve: factories, logging, API calls, and so on. Then, after general optimizations, we switched to local test optimizations, for example, analyzing and refactoring specific test files.
Overall, we submitted (and merged) three PRs, each of them bringing test suite improvements at different levels. In this post, we’ll walk you through these three optimization milestones.
The three pillars of the testing optimization mindset
So that’s the story in brief, but before we get too deep: to achieve significant improvements, it’s important to first understand the mindset that test optimization requires. This mindset relies on the following three pillars.
First, search for the most valuable things to optimize. For instance, optimizing the users factory, which is used across all tests, will give much more value than optimizing a single monthly_report_spec.rb file (which is rarely changed or used).
Second, find the low-hanging fruit and don’t over-optimize. It’s easy to get stuck on some method, and if it takes four hours to shave one second off the test run, you should reconsider your priorities.
Third, measure timings, and always compare them before and after optimizations. An “optimization” can surprisingly make things slower, and that’s not really what we want.
Understanding the situation
First, we had to look at different kinds of metrics to get a clear overview and pinpoint any bottlenecks. Briefly, these metrics are: the total test time, the slowest methods in the call stack, and the most-used (and slowest) factories.
Setting up the tools
To start, we installed and configured Stackprof and TestProf. Stackprof is a call-stack profiler for Ruby, and TestProf is an advanced toolbox for profiling and optimizing tests (which, again, was created by Evil Martians!)
Remember: we need to keep records of our measurements. For us, the starting test suite run time was 53 minutes. How low do you think we can bring this number? Try to guess, and read on to find the answer (warning: you might be surprised)!
Let’s get a high-level picture of “why it’s so slow” by looking at call stacks collected via Stackprof—and by the way, that’s the first thing to do when you’re dealing with a slow test suite.
Learning from the call stacks
The best way to read an application’s call stacks is to draw them as a flamegraph. To do so, we can load the result of the TEST_STACK_PROF=1 SAMPLE=1000 bin/rspec command into Speedscope, a flamegraph viewer that supports Stackprof and other profilers (that is, not only Ruby ones).
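TestProf wires Stackprof up for us here (both the TEST_STACK_PROF and SAMPLE variables come from TestProf), but to demystify what’s happening, here is a hand-rolled sketch of the same idea; the dump path and sampling interval below are our own assumptions:

# An illustrative, hand-rolled equivalent of TestProf's Stackprof integration
require "stackprof"

RSpec.configure do |config|
  config.before(:suite) do
    # :raw samples are required to build flamegraphs later
    StackProf.start(mode: :wall, raw: true, interval: 1000) if ENV["TEST_STACK_PROF"]
  end

  config.after(:suite) do
    next unless ENV["TEST_STACK_PROF"]

    StackProf.stop
    # Write the profile to disk so it can be loaded into Speedscope
    StackProf.results("tmp/stackprof-test-suite.dump")
  end
end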
While actually mastering flamegraphs requires some time, the “Sandwich” view in Speedscope can give you some clues right away. In our case, we identified a couple of suspicious Active Record callbacks:

- History#auto_add_history
- NitroSearch::Elastic::Indexing
We can also see a lot of low-level Active Record calls (#exec_prepare, etc.) and factory-related calls (FactoryBot#create). Let’s take a close look at the factories using TestProf’s FactoryProf profiler.
Analyzing factories with FactoryProf
FactoryProf allows you to see which factories are most used and time-consuming (in our case, the project and estimate factories contributed to the test time the most).
There is also a very useful flamegraph-like profiler for factories (FPROF=flamegraph bin/rspec) that can highlight factory cascades:
In the image above, you can notice one of the cascades: each time you call create(:candidate_interview), a lot of dependent records are created.
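The actual factory isn’t shown in the post, but a cascading factory typically has the following shape (all the associations below are hypothetical):

FactoryBot.define do
  # One create(:candidate_interview) call transitively builds every
  # record in this graph (and each association may cascade further)
  factory :candidate_interview do
    association :candidate  # -> creates a user, a resume, ...
    association :project    # -> creates a home, an owner, ...
    association :estimate   # -> creates yet another project, and so on
  end
end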
And we can also see that these cascades are caused by the repetitive creation of the project and estimate factories.
With this profiling information at hand, we started applying optimization patches one by one.
PR #1: General optimizations
We discovered that several Active Record callbacks (e.g., History#auto_add_history) were being called frequently throughout the test run. Yet, the corresponding side effects are needed only in ~1% of all tests. Thus, to improve the test time, we can disable them in tests by default and only enable them when needed.
To do that, we applied the testability pattern: adding a special Testing module for the feature to control the invocation of the callbacks in tests:
# models/history.rb
module Testing
  class << self
    def fake? = @fake
    def fake! = @fake = true
    def real! = @fake = false
  end
end

# ...

def auto_add_history
  return if History::Testing.fake?
  # ...
end
Here, we’re following the popular Sidekiq::Testing interface familiar to many Ruby developers. Now, in our rails_helper.rb, we can do the following:
# spec/rails_helper.rb
# ...
History::Testing.fake!

RSpec.configure do |config|
  config.before(:each) do |example|
    if example.metadata[:auto_history] == true
      History::Testing.real!
    else
      History::Testing.fake!
    end
  end
end
And here, we use the RSpec tagging feature to conditionally toggle the callback behavior. Whenever we need to test some functionality that relies on the auto-history feature, we add the :auto_history tag to the corresponding test example:
# spec/models/history_spec.rb
describe History do
  let(:fp) { create(:finance_project) }

  it "automatically records history on creation", :auto_history do
    h = fp.histories.last

    expect(h.owner_changes["roofing_sold"]).to eq([nil, true])
    expect(h.activity).to eq("Created the Finance Project")
  end

  # ...
end
After we applied a similar technique to the NitroSearch::Elastic::Indexing module, we cut the total test time from 53 minutes down to 33 minutes. And that’s just the beginning!
PR #2: Optimizing factories
We found that two factories, project and estimate, were the slowest (according to the FactoryProf reports). We took a look at their source code and found that they have a common denominator: they both create home and owner associations by default:
factory :estimate do
  # ...
  home
  association :owner, factory: :user
end
What to do if you see something like this? Try commenting out the associations and see how many tests fail! In our case, the number wasn’t that big (~50 out of thousands of tests), so we proceeded to move association creation into the corresponding traits and use them when needed:
factory :estimate do
  # ...
  trait :with_home do
    home
  end

  trait :with_owner do
    association :owner, factory: :user
  end
end
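Now only the call sites that actually rely on these records opt in explicitly. For example:

create(:estimate)                           # no home or owner is created
create(:estimate, :with_home, :with_owner)  # builds the full graph where needed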
And this is how the factories flamegraph looked after we removed unnecessary associations: the stacks got much lower (and the overall test time was reduced by another ~15%):
Fixturizing nitro_user
We also noticed that for almost every test, a default user was being created (User.current). The common idiom we found in tests was User.current || create(:nitro_user). Creating a user also involved creating some other factories, so the overhead of re-creating it all the time was quite noticeable.
We decided to move the default user creation into a fixture using TestProf’s AnyFixture tool:
# spec/nitro_user.rb
require "test_prof/recipes/rspec/any_fixture"

using TestProf::AnyFixture::DSL

RSpec.shared_context "fixture:nitro_user" do
  let(:nitro_user) { fixture(:nitro_user) }
end

RSpec.configure do |config|
  # Create a Nitro user for the whole test run
  config.before(:suite) do
    fixture(:nitro_user) { FactoryBot.create(:nitro_user) }
  end

  config.include_context "fixture:nitro_user"
end
After introducing the fixture, we removed all the explicit creation of User.current in tests in order to get another performance boost.
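In individual tests, the change looked roughly like this (an illustrative sketch):

# Before: every test file paid for creating the user and its dependencies
let(:user) { User.current || create(:nitro_user) }

# After: the suite-wide AnyFixture record is reused everywhere
let(:user) { nitro_user }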
On the database state and its cleanness
While we were working on the nitro_user fixture introduction, we found that the database wasn’t exactly left in pristine condition after our test runs, even though transactional fixtures were used.
A quick search revealed a few before(:all) usages in tests, which we’ve replaced with before_all, a transactional version of the original RSpec hook.
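The migration itself is mechanical; here is a minimal sketch (the company factory is hypothetical):

require "test_prof/recipes/rspec/before_all"

# Before: records created in before(:all) escape the per-example
# transactions and leak into the shared database state
before(:all) { @company = create(:company) }

# After: before_all opens an outer transaction around the whole group
# and rolls it back once all the examples have run
before_all { @company = create(:company) }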
We also introduced a convenient (and conditional) hook to reset the database state, so developers working on branches that hadn’t yet picked up these fixes could easily recover from global state issues:
RSpec.configure do |config|
  config.prepend_before(:suite) do
    # Truncate all tables to make sure we start with a clean slate
    ActiveRecord::Tasks::DatabaseTasks.truncate_all if ENV["CLEAN_DB"]
  end
end
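So, whenever a branch suffers from leftover records, a clean run is one environment variable away:

$ CLEAN_DB=1 bin/rspec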
Let’s also note that we’re using plain Active Record, with no third-party database cleaners involved.
In the end, our second PR resulted in a further test time reduction, down to 21 minutes (as a reminder, after step one we were at 33 minutes, and we started at 53!).
But we didn’t stop there… we continued with more granular optimizations.
PR #3: Optimizing specific test files
We already introduced one of TestProf’s killer features, before_all, during the previous step, but the real power comes from its sibling: let_it_be. The let_it_be helper works similarly to let! (from the usage point of view) but with a big performance difference: the data is created once and then re-used by all the tests in the file (or within an RSpec context). So, we can see let_it_be as a local fixture.
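Here is a minimal sketch of let_it_be in action (the example group below is illustrative):

require "test_prof/recipes/rspec/let_it_be"

describe Estimate do
  # Created once before all the examples in this group and reused by each;
  # with let!, the record would be re-created for every single example
  let_it_be(:estimate) { create(:estimate) }

  it { expect(estimate).to be_persisted }
  it { expect(Estimate.count).to eq(1) }
end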
Migrating to let_it_be must be done file by file. How do you figure out where to start this migration, and which file to choose first?
For that, TestProf provides a specific profiler called RSpecDissect. It shows how much time tests spend in let/let! and before(:each) declarations:
Top 10 slowest suites (by `let` time):
HiringRequisition (./spec/models/hiring_requisition_spec.rb:5) – 00:04.126 (378 / 59) of 00:05.019 (82.21%)
...
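To produce a report like this yourself, RSpecDissect is enabled via an env var (this invocation follows TestProf’s documented interface):

$ RD_PROF=1 bin/rspec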
The example report above shows that over 80% of the run time of the spec/models/hiring_requisition_spec.rb file is occupied by let expressions. Let’s see how we refactored it!
…but before doing any refactoring, we recorded the current state of the test file by capturing its factory profile via FactoryProf; the initial statistics were as follows:
$ FPROF=1 rspec spec/models/hiring_requisition_spec.rb

Total time: 00:03.368 (out of 00:07.064)

name              total   top-level   total time
user                237         109      3.8929s
office_location      86          41      2.3305s
...
Now, without going into a deep discussion of what this information told us, let’s see how we refactored the test:
- let(:office_location) { create(:office_location, territory: create(:territory)) }
+ let_it_be(:default_user) { create_default(:user) }
+ let_it_be(:office_location) { create(:office_location, territory: create(:territory)) }
What changed? First of all, we replaced let with let_it_be to ensure that an office_location record is only created once for the whole file. Additionally, we applied the create_default optimization, which allowed us to reuse the user record in all factory-generated objects that needed user associations.
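For context, create_default comes from TestProf’s FactoryDefault recipe. Here is a sketch of the mechanics, assuming the office_location factory (or one of its cascades) declares a user association:

require "test_prof/recipes/rspec/factory_default"

describe HiringRequisition do
  # Registers this record as the default for every `user` association,
  # so the factories below stop creating their own users
  let_it_be(:default_user) { create_default(:user) }

  # Everything office_location builds now reuses default_user
  let_it_be(:office_location) { create(:office_location, territory: create(:territory)) }
end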
To measure the effect of the change, we ran FactoryProf again and got the following results:
$ FPROF=1 rspec spec/models/hiring_requisition_spec.rb

Total time: 00:01.039 (out of 00:04.341)

name              total   top-level   total time
office_location      11          11      0.1609s
...
As you can see, the total time decreased by almost 3 seconds (~40%), and the factories time dropped from ~3.4s to ~1s. Not bad for a single-file change!
We continued applying the let_it_be / create_default refactoring to other files, and after covering a few dozen files (out of hundreds), we reached a satisfying 11 minutes.
To sum up, our three optimization steps helped us speed up the test suite roughly five times (from 53 to 11 minutes) in a matter of two working weeks. How do you like that? Feel free to reach out to us if you need help with slow tests!