Squash N+1 queries early with n_plus_one_control test matchers for Ruby and Rails

October 27, 2020

Topics

Vladimir Dementyev
Principal Backend Engineer

Translations

Chinese在Rails中尽早碾碎N+1查询

Discover the test-oriented alternative to detecting the N+1 query problem in your Rails and pure Ruby applications even before the redundant database calls manifest in your development. The n_plus_one_control gem works differently from well-known tools such as Bullet and makes sure that extra SQL queries never go unnoticed, independent of your ORM tool of choice.

Every backend Ruby developer who works with databases through object-relational mappers like Active Record or rom knows the drill: you need to be careful with your database queries so as not to generate too many of them. Even though the N+1 problem is well-known, it is still impossible to outsmart ORMs every single time: we need an automated way to signal the problem.

For the past three years, I’ve been using my own little tool that helps me to detect N+1 at the earliest stage possible: while running RSpec or Minitest on my code. My n_plus_one_control gem has been slowly maturing and has recently hit the 0.5.0 version, so I feel it’s time to show you what it’s all about and how my approach is different from more popular tools out there.

Biting the bullet

You might wonder why I’ve built yet another library to detect N+1 issues when we have a ~~silver~~ Bullet gem? Let me share a little story.

I’ve been successfully using Bullet for a few years: first, in development and staging environments, as most of us. But at some point, I realized that the test environment is a better fit for detecting and preventing N+1 problems. With some tinkering, it was possible to make Bullet work in a testing environment. The code example below creates a bulletify helper that can be used in the specs.

# spec/shared_contexts/bulletify.rb
RSpec.shared_context "bullet" do
  before(:each) do
    Bullet.enable = true
    Bullet.bullet_logger = true
    Bullet.raise = true # raise an error if N+1 query occurs
    Bullet.start_request
  end

  after(:each) do
    Bullet.perform_out_of_channel_notifications if Bullet.notification?
    Bullet.end_request
    Bullet.enable = false
    Bullet.bullet_logger = false
    Bullet.raise = false
  end
end

RSpec.configure do |config|
  config.include_context "bullet", bullet: true
  config.alias_example_to :bulletify, bullet: true
end

# spec/controllers/my_controller_spec.rb
context "N+1" do
  bulletify { get :index }
end

While working on multiple mature Rails applications, I discovered that Bullet fails to find issues when database interactions go beyond Active Record and its associations. False positives are also pretty common.

Here is an example that uses the bulletify helper from above:

# user.rb
class User < ApplicationRecord
  has_many :teams

  def supports?(team_name)
    teams.where(name: team_name).exists?
  end
end

<!-- users/index.html.erb -->
<% @users.each do |user| %>
  <li class="<%= user.supports?("FC Spartak Moscow") && "red-white" %>">
    <%= user.name %>
  </li>
<% end %>

# users_controller_spec.rb
describe "GET #index" do
  render_views

  let!(:users) { create_pair(:user, :with_teams) }

  bulletify { get :index }
end

Even though we definitely have N+1 queries here (from the #supports? method), the test would still be green.

Running and counting

I’ve started thinking about a more robust way of detecting N+1 queries, independent of Active Record internals.

And I came up with the following idea: what if we run the same code twice with a different number of records and compare the number of executed SQL queries?

If this number doesn’t depend on the size of a test set, we don’t have a problem. Otherwise, we’re likely to have an XN+Y.

This exact idea was implemented in the n_plus_one_control gem that adds N+1 detecting DSL to your Ruby testing framework of choice. Here’s the example for RSpec:

context "N+1", :n_plus_one do
  # Populate block is called multiple times
  # with different values of n (2 and 3 by default)
  populate { |n| create_list(:user, n, :with_teams) }

  specify do
    # The example body is executed
    #after each `populate` call
    expect { get :index }.to perform_constant_number_of_queries
  end
end

This test would fail with the following message:

Expected to make the same number of queries, but got:
  3 for N=2
  4 for N=3

Search and destroy N+1 violations

In the latest release, I focused on developer experience to help not only detect N+1 but also to easily find the offending code in a large application.

When working on an optimization task, I realized that simply having a test that fails due to N+1 is not enough (I knew that problem existed thanks to performance monitoring systems). So, I’ve started adding new features to the gem and came up with the following plan for finding N+1 problems:

Step 1: Write a test that fails

Creating a good performance test can be tricky. We need our background data, our setup, to cover most of the code paths, so we do not miss any unwanted database interactions. For example:

# This example uses Minitest
class PerformanceTest < ApplicationIntegrationTestCase
  def populate(scale_factor)
    scale_factor.times do
      # Here, I'm not using create_list but introducing some
      # randomness instead.
      # That would make our setup less deterministic
      # (and, thus, test more valuable)
      create(:resource, :with_tags, tags_num: [0, 1, 2].sample)
    end
    create_list(:document, scale_factor)
  end

  test "should not produce N+1 queries" do
    assert_perform_constant_number_of_queries do
      get :index
    end
  end
end

The output could look like this:

Expected to make the same number of queries, but got:
  10 for N=2
  11 for N=3
Unmatched query numbers by tables:
  resources (SELECT): 2 != 3
  permissions (SELECT): 4 != 6

NOTE: Even if we do not see the documents table listed, we shouldn’t remove this part of the setup: N+1 problems can occur in the future, so we should be prepared.

Step 2: Localize the problem

From the error message above, we can see that there are two tables affected. Let’s re-run our tests for each of them individually by using the query filtering feature. We can also enable the verbose output mode to get the list of collected queries and where they came from:

$ NPLUSONE_VERBOSE=1 \
  NPLUSONE_FILTER=resources \
  bundle exec rails test

Expected to make the same number of queries, but got:
  2 for N=2
  3 for N=3
Unmatched query numbers by tables:
  resources (SELECT): 2 != 3
Queries for N=2
   SELECT "resources".* FROM "resources" WHERE "resources"."deleted_at" IS NULL
   ↳ app/controllers/resources_controller.rb:32:in `index'
   ...
Queries for N=3
   ...

And repeat for the second table:

$ NPLUSONE_VERBOSE=1 \
  NPLUSONE_FILTER=permissions \
  bundle exec rails test

Expected to make the same number of queries, but got:
  4 for N=2
  6 for N=3
Unmatched query numbers by tables:
  permissions (SELECT): 4 != 6
Queries for N=2
  SELECT "permissions".* FROM "permissions" WHERE "permissions.user_id" = 42 AND "permissions.resource_id" = 15 AND "permissions.grants" @> '{manage}'
  ↳ app/policies/resource_policy.rb:41:in `update?'
  SELECT "permissions".* FROM "permissions" WHERE "permissions.user_id" = 42 AND "permissions.resource_id" = 15 AND "permissions.grants" @> '{invite}'
  ↳ app/policies/resource_policy.rb:56:in `invite?'
  ...
Queries for N=3
  ...

Sometimes queries could be too long, making our output harder to read. We can only display the first N characters of the query by setting the NPLUSONE_TRUNCATE environment variable.

Also, showing just a single line of the backtrace might not be very useful. No worries! You can increase the number of stack trace lines displayed through the NPLUSONE_BACKTRACE variable.

Thus, the final command could look like this:

NPLUSONE_VERBOSE=1 \
NPLUSONE_FILTER=permissions \
NPLUSONE_TRUNCATE=100 \
NPLUSONE_BACKTRACE=5 \
bundle exec rails test

To sum up, with n_plus_one_control, you can quickly identify the root causes of N+1 queries of any kind by writing a test and running it multiple times with different parameters. And after you fix all the problems, this check would become a regression test—the one preventing your code from having issues in the future!