Custom "cops" for RuboCop: an emergency service for your Ruby code

June 22, 2021

Dmitry Tsepelev
Sr. Backend Engineer

It is hard to find a Ruby developer who has never heard of RuboCop. In all recent Evil Martians projects, RuboCop has been part of the CI pipeline from the early days. However, sometimes it helps to write additional rules by hand to enforce project-specific practices. By the end of the article, you’ll be in a great position to cope with custom cops, add them to your application, or even extract them into a separate gem. Let’s get started!

In our experience, most well-maintained, large, and cutting-edge Ruby applications use RuboCop, a tool for checking and maintaining Ruby code style. Just include RuboCop in your CI suite, and it will save the time and effort otherwise spent reviewing pull requests for basic style violations.

At some point, you realize that the built-in rules don’t quite cut it for your application, so you want to add some more. Another perk of custom cops is how helpful they can be when refactoring a mature application: add a rule that stops some unfortunate practice, generate a TODO config to make RuboCop pass, and refactor the occurrences one by one. Alternatively, you can implement autocorrection, and RuboCop will take care of the rest! Check out some cops that combat legacy code and enforce good practices:

a cop that restricts accessing ENV;
a cop that prevents a project-specific bad practice and knows how to autocorrect it;
a cop that requires the queue to be explicitly defined in a Sidekiq job and makes sure it exists in sidekiq.yml (thanks to @stem from epicery);
some custom cops by charitywater;
dozens of custom cops from Airbnb;
a cop that enforces single-line block braces from TestDouble;
cops for Sorbet (a static type checker).

Writing cops 101

Let’s write a simple cop to make sure all arguments in your GraphQL mutations are in snake case:

# good
class BanUser < BaseMutation
  argument :user_id, ID, required: true
end

# bad
class BanUser < BaseMutation
  argument :userId, ID, required: true
end

Here it goes:

class ArgumentName < RuboCop::Cop::Cop
  def_node_matcher :argument_name, <<~PATTERN
    (send nil? :argument (:sym $_) ...)
  PATTERN

  MSG = "Use snake_case for argument names".freeze
  SNAKE_CASE = /^[\da-z_]+[!?=]?$/.freeze

  def on_send(node)
    argument_name(node) do |name|
      next if name.match?(SNAKE_CASE)

      add_offense(node)
    end
  end
end

Let’s explore the code line by line. We start by inheriting our class from RuboCop::Cop::Cop, the base class for cops in this example (RuboCop 1.0 and later also ship RuboCop::Cop::Base, which the gem examples later in this article inherit from). You can also define your own base class (inheriting from RuboCop::Cop::Cop) if you want.

Heads up! This section is a brief recap of the brilliant official guide. Please refer to it if you need more details on AST and node patterns.

Then we define a helper called argument_name, which determines whether a given node is an argument definition. The second argument passed to def_node_matcher is a pattern describing the node we are looking for. Since RuboCop is built on the parser gem, we can use its ruby-parse tool to inspect the AST of any piece of code:

ruby-parse -e 'argument :user_id, ID, required: true'

(send nil :argument
  (sym :user_id)
  (const nil :ID)
  (kwargs
    (pair
      (sym :required)
      (true))))

Can you see that the pattern we passed to def_node_matcher is similar to the AST above? We look for the :argument method call (send) whose first argument is a :sym node and use ... to tell RuboCop that we don’t care about the rest of the arguments. We also use the nil? predicate to specify that the argument method is called without an explicit receiver. $_ is a capture: if the passed node matches the pattern, RuboCop yields the capture to the block or returns it from the method call.

Let’s move to the on_send method. It’s one of the most extensively used callbacks; RuboCop calls it whenever it finds a send node in the AST. Now we can use our argument_name helper to get the argument name (if a given node does not match the pattern, the block is never called). All that is left is to perform additional checks (in our case, check whether the argument name is in snake case) and add the offense. By default, add_offense uses the MSG constant defined in the current cop class as the message.
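The snake-case check itself can be tried in plain Ruby, without RuboCop. The sketch below reuses the SNAKE_CASE expression from the cop; the sample names and the STRICT_SNAKE_CASE constant are made up for illustration and are not part of the original cop.

```ruby
SNAKE_CASE = /^[\da-z_]+[!?=]?$/.freeze

# Symbol#match? matches against the symbol's string form.
results = %i[user_id userId user2 UserID valid?].to_h do |name|
  [name, name.match?(SNAKE_CASE)]
end
# results => { user_id: true, userId: false, user2: true, UserID: false, valid?: true }

# ^ and $ anchor to lines, so a (very unusual) multi-line name slips through.
# \A and \z anchor to the whole string. STRICT_SNAKE_CASE is a hypothetical alternative.
STRICT_SNAKE_CASE = /\A[\da-z_]+[!?=]?\z/

multiline = :"ok\nNotOk"
loose  = multiline.match?(SNAKE_CASE)        # true
strict = multiline.match?(STRICT_SNAKE_CASE) # false
```

For typical argument names both expressions behave the same; the stricter anchors only matter for names that contain line breaks.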

Fancy trying it in a real project? Let’s move our cop to the lib/custom_cops folder, wrap it into the CustomCops module, and require it in .rubocop.yml:

# .rubocop.yml
require:
  - ./lib/custom_cops/argument_name.rb

At this point, RuboCop should add our cop to its list and run it during the regular rubocop checks! 🥳

Let’s write another cop to learn more tricks. Imagine that you want scopes inside Rails models to be alphabetically sorted, for instance:

# bad
class User < ApplicationRecord
  scope :inactive, -> { where(active: false) }
  scope :active, -> { where(active: true) }
end

# good
class User < ApplicationRecord
  scope :active, -> { where(active: true) }
  scope :inactive, -> { where(active: false) }
end

Here is the cop:

class OrderedScopes < RuboCop::Cop::Cop
  def_node_search :scope_declarations, <<~PATTERN
    (send nil? :scope (:sym _) ...)
  PATTERN

  MSG = "Scopes should be sorted in an alphabetical order. " \
        "Scope `%<current>s` should appear before `%<previous>s`.".freeze

  def on_class(node)
    scope_declarations(node).each_cons(2) do |previous, current|
      next if scope_name(current) > scope_name(previous)

      register_offense(previous, current)
    end
  end

  private

  def register_offense(previous, current)
    message = format(
      MSG,
      previous: scope_name(previous),
      current: scope_name(current)
    )
    add_offense(current, message: message)
  end

  def scope_name(node)
    node.first_argument.value.to_s
  end
end

At the very beginning, we used a new helper called def_node_search. How does it differ from def_node_matcher? First, it walks the node’s descendants (while def_node_matcher checks only the node it is given), and second, what it returns depends on the method name. If the name ends with ?, the generated method checks whether there is at least one matching node and returns true or false; otherwise, it returns all the matching nodes. In our case, we look for all the scope calls.

Earlier, we used the on_send callback to check all the send nodes independently, but now we have a new challenge: we need to compare scopes placed one after another inside the class. So we use the on_class hook. Inside the method, we find all scope declarations (using the helper generated by def_node_search), combine them into pairs, ensure they are properly ordered, and add an offense when the ordering is incorrect.

Since we want to include the names of the scopes in the offense message, we have to insert them into MSG and pass the resulting message to add_offense explicitly.
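The pairwise comparison and the message formatting can be sketched in plain Ruby, with an array of strings standing in for the scope nodes. The sample names and the shortened MSG are assumptions made for this illustration.

```ruby
MSG = "Scope `%<current>s` should appear before `%<previous>s`."

# Scope names in the order they appear in the class (sample data).
names = %w[inactive active banned]

# each_cons(2) yields every pair of neighbours: [inactive, active], [active, banned].
messages = names.each_cons(2).filter_map do |previous, current|
  next if current > previous # the pair is already in order

  format(MSG, previous: previous, current: current)
end
# messages => ["Scope `active` should appear before `inactive`."]
```

Only neighbouring declarations are compared, so each misplaced pair produces exactly one message.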

⚠️ Optional homework: what should we change to check the ordering of scopes in groups only, separated by empty lines or other code? For instance, the following code should be valid:

class User < ApplicationRecord
  # the first group starts
  scope :banned, -> { where(status: STATUS_BANNED) }

  # the second group starts
  scope :active, -> { where(active: true) }
  scope :inactive, -> { where(active: false) }
end

Auto-correct

In some cases, a rule is so simple that we can automatically fix the violating code by changing a couple of lines. It’s the perfect case for auto-correct, which is run with rubocop --auto-correct-all (spelled --autocorrect-all in newer RuboCop releases). Supporting auto-correct takes nothing more than adding the AutoCorrector mixin to the cop and using the corrector object. Let’s implement auto-correct for our ArgumentName rule:

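# The block-based correction API requires RuboCop::Cop::Base rather than the legacy Cop class
class ArgumentName < RuboCop::Cop::Base
  # AutoCorrector lives in RuboCop::Cop, so it is referenced with its full namespace
  extend RuboCop::Cop::AutoCorrector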

  def_node_matcher :argument_name, <<~PATTERN
    (send nil? :argument (:sym $_) ...)
  PATTERN

  MSG = "Use snake_case for argument names".freeze
  SNAKE_CASE = /^[\da-z_]+[!?=]?$/.freeze

  def on_send(node)
    argument_name(node) do |name|
      next if name.match?(SNAKE_CASE)

      add_offense(node) do |corrector|
        # gsub (not gsub!) always returns a string; gsub! returns nil when nothing is replaced
        downcased_name = name.to_s.gsub(/(.)([A-Z])/, '\1_\2').downcase
        corrector.replace(node, node.source.sub(name.to_s, downcased_name))
      end
    end
  end
end

What’s new here? First, a cop that supports auto-correct should extend the AutoCorrector module. Second, we should pass a block to add_offense; the block receives a corrector object. This object allows us to perform various operations on the source: in our case, we take the camel-cased argument name, convert it to snake case, and replace it in the node.
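The name conversion at the heart of the correction can be tried in isolation. Below is a minimal sketch in plain Ruby; to_snake_case is a hypothetical helper name, and the sample names are made up.

```ruby
# Hypothetical helper: insert an underscore before every capital letter
# that follows another character, then downcase the result.
def to_snake_case(name)
  name.to_s.gsub(/(.)([A-Z])/, '\1_\2').downcase
end

simple = to_snake_case(:userId)        # "user_id"
longer = to_snake_case(:bannedUserId)  # "banned_user_id"

# One detail worth knowing: the bang variant returns nil when nothing was
# replaced, so chaining .downcase onto gsub! can raise NoMethodError.
unchanged = "user_id".dup.gsub!(/(.)([A-Z])/, '\1_\2') # nil
```

This simple expression does not try to handle acronyms such as HTTPServer gracefully, which is usually acceptable for argument names.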

Extracting cops into a separate gem

One day you might decide that you want to use your custom cops in a different project, or you might want to include some of the best practices in your open source project. No matter how you get here, you now want to move your cops to a separate gem.

You might expect such a common task to have a generator, and it does! It is called rubocop-extension-generator:

$ gem install rubocop-extension-generator
$ rubocop-extension-generator rubocop-custom

The result of this command is a gem with the rubocop dependency and a folder for cops (in our case, lib/rubocop/cop). By the way, it also generates a special Inject hack:

module RuboCop
  module Custom
    module Inject
      def self.defaults!
        path = CONFIG_DEFAULT.to_s
        hash = ConfigLoader.send(:load_yaml_configuration, path)
        config = Config.new(hash, path).tap(&:make_excludes_absolute)
        puts "configuration from #{path}" if ConfigLoader.debug?
        config = ConfigLoader.merge_with_default(config, path)
        ConfigLoader.instance_variable_set(:@default_configuration, config)
      end
    end
  end
end

This file adds your gem’s default configuration to RuboCop, since RuboCop does not yet support extension configs natively.

⚠️ Currently, the generator uses an old version of RuboCop (0.8). If you want to run the examples from this article, update the version in the Gemfile to “1.1”.

Of course, there is a generator to create new cops:

$ bundle exec rake 'new_cop[Custom/MaxScopes]'

Custom is a namespace (our gem is called rubocop-custom), while MaxScopes is the new cop’s name. The generator does three things: it creates a file for the new cop, requires it in lib/rubocop/cop/custom_cops, and adds a config stub to config/default.yml. Let’s implement the cop, which limits the number of scopes in the model:

module RuboCop
  module Cop
    module Custom
      class MaxScopes < Base
        def_node_search :scope_declarations, <<~PATTERN
          (send nil? :scope (:sym _) ...)
        PATTERN

        MSG = "Class should not have more than %<max_scopes>s scopes.".freeze

        def on_class(node)
          scope_count = scope_declarations(node).count
          return if scope_count <= max_scopes

          register_offense(node)
        end

        private

        def register_offense(node)
          message = format(MSG, max_scopes: max_scopes)
          add_offense(node, message: message)
        end

        def max_scopes
          cop_config["MaxScopes"]
        end
      end
    end
  end
end

We use the on_class callback to get the root node of the class, and the helper generated by def_node_search to find all the scope declarations inside it. If the number of scopes exceeds the configured maximum, an offense is added to the class. To read the configuration, we use the cop_config object. There will be an error if nothing is configured, so let’s set up a default value in config/default.yml:

Custom/MaxScopes:
  Description: "Limit scope count in the Rails model"
  MaxScopes: 2
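To see why the default matters, here is a plain-Ruby sketch of the comparison the cop performs. Ordinary hashes stand in for cop_config (an assumption made for illustration; the real object is provided by RuboCop).

```ruby
# Plain hashes standing in for cop_config with and without a configured value.
unconfigured = {}
configured   = { "MaxScopes" => 2 }

scope_count = 3

# Without a value, the lookup returns nil and the comparison raises.
begin
  within_limit = scope_count <= unconfigured["MaxScopes"]
rescue ArgumentError => e
  error_message = e.message
end

# With the default in place, the comparison works as expected.
over_limit = scope_count > configured["MaxScopes"] # true
```

Shipping a default in config/default.yml means users of the gem never hit the failing branch, even if they do not mention the cop in their own configuration.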

Perhaps you noticed that we used the same pattern for detecting scopes earlier. So far, we have defined def_node_matcher and def_node_search helpers right inside cops. But we can also make them shareable: move them to a module and include that module wherever we need it. For the macros to be available inside the module, it has to extend NodePattern::Macros.

Writing specs

When it comes to testing our cops, we should always be prepared for dozens of edge cases. Luckily, writing specs for cops is a no-brainer: you just need to pass a string containing sample code to the expect_offense or expect_no_offenses helper:

RSpec.describe RuboCop::Cop::Custom::MaxScopes do
  subject(:cop) { described_class.new(config) }

  let(:config) { RuboCop::Config.new("Custom/MaxScopes" => { "MaxScopes" => 2 }) }

  context "when class has more than MaxScopes scopes" do
    it 'registers an offense' do
      expect_offense(<<~RUBY)
        class User < ApplicationRecord
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Class should not have more than 2 scopes.
          scope :active, -> { where(active: true) }
          scope :inactive, -> { where(active: false) }
          scope :banned, -> { where(status: "banned") }
        end
      RUBY
    end
  end

  context "when class has scope count equal to MaxScopes" do
    it 'does not register an offense' do
      expect_no_offenses(<<~RUBY)
        class User < ApplicationRecord
          scope :active, -> { where(active: true) }
          scope :inactive, -> { where(active: false) }
        end
      RUBY
    end
  end
end

You can initialize config with any options to test different behaviors.

Advanced cop configuration

Sometimes we need to run cops against a specific set of files:

rubocop-rspec cares only about *_spec.rb files;
rubocop-graphql checks files inside the graphql folder only.

In this case, we add a default top-level configuration:

GraphQL:
  Include:
    - "**/graphql/**/*"
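To get an intuition for which files such a glob selects, it can be approximated in plain Ruby with File.fnmatch?. This is only an approximation for illustration: RuboCop's own path matching may differ in details, and the included? helper and the sample paths are made up.

```ruby
INCLUDE_PATTERN = "**/graphql/**/*"

# Hypothetical helper: FNM_PATHNAME makes "**/" match whole directories.
def included?(path)
  File.fnmatch?(INCLUDE_PATTERN, path, File::FNM_PATHNAME)
end

mutation = included?("app/graphql/mutations/ban_user.rb") # true
model    = included?("app/models/user.rb")                # false
```

Only paths with a graphql directory somewhere in the middle are selected, which is what keeps GraphQL cops away from models and controllers.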

If you ever explored various cops, you are likely to have noticed many similarities in their configurations. For instance, when you need to have a configurable constant (like that one in MaxScopes), there is a handy helper ExcludeLimit. Let’s add it to our cop:

module RuboCop
  module Cop
    module Custom
      class MaxScopes < Base
        extend RuboCop::ExcludeLimit

        exclude_limit 'MaxScopes'

        def_node_search :scope_declarations, <<~PATTERN
          (send nil? :scope (:sym _) ...)
        PATTERN

        MSG = "Class should not have more than %<max_scopes>s scopes.".freeze

        def on_class(node)
          scope_count = scope_declarations(node).count
          return if scope_count <= max_scopes

          register_offense(node)
        end

        private

        def register_offense(node)
          message = format(MSG, max_scopes: max_scopes)
          add_offense(node, message: message)
        end

        def max_scopes
          cop_config["MaxScopes"]
        end
      end
    end
  end
end

Why do we prefer the module over the good old cop_config["MaxScopes"] sometimes? ExcludeLimit helps when RuboCop is used with the --auto-gen-config option, which generates a TODO config that allows iterative codebase migration to new rules.

In some cases, a cop supports completely different, configurable behaviors. For instance, the Style/NilComparison cop lets you choose whether you prefer x.nil? or x == nil. The built-in ConfigurableEnforcedStyle module is responsible for configuring cop styles: you just need to include the module, and the style method will manage the rest. Here is an example of ConfigurableEnforcedStyle usage.

Another helpful mixin is the RuboCop::Cop::CodeLength that helps to check the length of a code segment. It supports the following configuration options:

Max—maximum number of lines;
CountComments—option to exclude comments from count;
CountAsOne—allows counting arrays, hashes, and heredocs as one line.

When CodeLength is included in the cop, all you need to do is pass the node to the check_code_length method. Here is an example from the Metrics/MethodLength cop:

module RuboCop
  module Cop
    module Metrics
      class MethodLength < Base
        include CodeLength

        def on_def(node)
          return if ignored_method?(node.method_name)

          check_code_length(node)
        end

        def on_block(node)
          return unless node.send_node.method?(:define_method)

          check_code_length(node)
        end
      end
    end
  end
end

Thank you for reading! We hope you can use this little guide as a reference to tidy up your codebase even further with custom rules that fit your unique project needs. Feel free to give us a shout if you want to talk about the current tooling at your company or need Evil Martians to help you set up and automatically enforce better development practices.