Custom "cops" for RuboCop: an emergency service for your Ruby code
It is hard to find a Ruby developer who has never heard of RuboCop. In all the recent Evil Martians projects, RuboCop has been included in the CI pipeline from the early days. However, sometimes it may be helpful to write more rules manually and to enforce some project-specific practices. By the end of the article, you’ll be in a great position to cope with custom cops, add them to your application, or even extract them into a separate gem. Let’s get started!
In our experience, most well-maintained, large, and cutting-edge Ruby applications use RuboCop—a tool for checking and maintaining Ruby code style. Just include RuboCop into your CI suite, and it will save time and effort when reviewing the pull requests for basic style violations.
At some point, you realize that the built-in rules don’t quite cut it for your application, so you want to add some more. Another perk of using the custom cops is how helpful they can be when refactoring a mature application: add a rule that stops some unfortunate practice, generate a TODO config to make RuboCop pass, and refactor all the occurrences one by one. Alternatively, you can implement autocorrection, and RuboCop will take care of the rest! Check out some cops to combat legacy code and enforce some practices:
- a cop that restricts accessing ENV;
- a cop that prevents a project-specific bad practice and knows how to autocorrect it;
- a cop that requires the queue to be explicitly defined in a Sidekiq job and makes sure it exists in sidekiq.yml (thanks to @stem from epicery);
- some custom cops by charitywater;
- dozens of custom cops from Airbnb;
- a cop that enforces single-line block braces from TestDouble;
- cops for Sorbet (a static type checker).
Writing cops 101
Let’s write a simple cop to make sure all arguments in your GraphQL mutations are in snake case:
# good
class BanUser < BaseMutation
  argument :user_id, ID, required: true
end

# bad
class BanUser < BaseMutation
  argument :userId, ID, required: true
end
Here it goes:
class ArgumentName < RuboCop::Cop::Cop
  def_node_matcher :argument_name, <<~PATTERN
    (send nil? :argument (:sym $_) ...)
  PATTERN

  MSG = "Use snake_case for argument names".freeze
  SNAKE_CASE = /^[\da-z_]+[!?=]?$/.freeze

  def on_send(node)
    argument_name(node) do |name|
      next if name.match?(SNAKE_CASE)

      add_offense(node)
    end
  end
end
Let’s explore the code line by line. We start by inheriting our class from RuboCop::Cop::Cop, which serves as a base class for all cops. You can also define your own base class (which inherits from RuboCop::Cop::Cop) if you want.
Heads up! This section is a brief recap of the brilliant official guide. Please refer to it if you need more details on AST and node patterns.
Then we define a helper called argument_name, which can determine if a given node is an argument definition or not. The second argument passed to the def_node_matcher helper is a pattern, which describes the node we want to look for. Since RuboCop is built on the parser gem, we can use its ruby-parse utility to inspect the AST of any code:
$ ruby-parse -e 'argument :user_id, ID, required: true'
(send nil :argument
  (sym :user_id)
  (const nil :ID)
  (kwargs
    (pair
      (sym :required)
      (true))))
Can you see that the pattern we passed to def_node_matcher is similar to the AST above? We look for the :argument method call (send) with the :sym argument and use ... to tell RuboCop that we don’t care about the rest of the arguments. We also use the nil? predicate to specify that the argument method is not called on any object. $_ is a capture: if RuboCop decides that the passed node matches a given pattern, it yields the capture to the block or returns it from the method call.
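To build some intuition for what the pattern does, here is a rough plain-Ruby analogy that does not use RuboCop at all. It assumes the AST is written as nested arrays (a simplification of the nodes parser produces), the sample ASTs are made up, and Ruby's own case/in pattern matching stands in for the node pattern. Only the method name argument_name mirrors the helper above.

```ruby
# A plain-Ruby analogy, NOT RuboCop's implementation: the AST is modeled
# as nested arrays instead of parser nodes.
AST_ARGUMENT = [:send, nil, :argument, [:sym, :user_id], [:const, nil, :ID]].freeze
AST_OTHER_CALL = [:send, nil, :field, [:sym, :name]].freeze
AST_WITH_RECEIVER = [:send, [:lvar, :builder], :argument, [:sym, :user_id]].freeze

# Mirrors `(send nil? :argument (:sym $_) ...)`:
#   nil  - the call has no receiver (like `nil?`)
#   name - the captured value (like `$_`)
#   *    - any remaining arguments (like `...`)
def argument_name(node)
  case node
  in [:send, nil, :argument, [:sym, name], *]
    # Yield the capture when a block is given, return it otherwise.
    block_given? ? yield(name) : name
  else
    nil
  end
end

argument_name(AST_ARGUMENT) { |name| puts "captured #{name.inspect}" }
```

Nodes that call another method, or call argument on a receiver, simply do not match, so the block is never invoked for them.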
Let’s move to the on_send method. It’s one of the most extensively used callback methods; it’s called when RuboCop finds a send node in the AST. Now we can use our argument_name helper to get the argument name (if a given node does not match the pattern, it won’t even yield). The only thing left to do is to perform additional checks (in our case, check that the argument name is in snake case) and add the offense. By default, add_offense uses the MSG constant defined in the current cop class as the message.
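The snake case check itself is easy to try outside of RuboCop. The sketch below reuses the SNAKE_CASE regular expression from the cop and runs it against a few made-up argument names; SAMPLE_NAMES and offending_names are hypothetical and exist only for this illustration.

```ruby
SNAKE_CASE = /^[\da-z_]+[!?=]?$/.freeze

# Hypothetical argument names, used only for illustration.
SAMPLE_NAMES = %i[user_id userId UserId id2 banned?].freeze

# Returns the names the cop would flag: everything that is not snake case.
def offending_names(names)
  names.reject { |name| name.match?(SNAKE_CASE) }
end

puts offending_names(SAMPLE_NAMES).inspect
```

For these sample names, only :userId and :UserId are reported.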
Fancy trying it on a real project? Let’s move our cop to the lib/custom_cops folder, wrap it into the CustomCops module, and require our cop inside .rubocop.yml:
# .rubocop.yml
require:
  - ./lib/custom_cops/argument_name.rb
At this point, RuboCop should add our cop to its list and run it during the regular rubocop checks! 🥳
Let’s write another cop to learn more tricks. Imagine that you want scopes inside Rails models to be alphabetically sorted, for instance:
# bad
class User < ApplicationRecord
  scope :inactive, -> { where(active: false) }
  scope :active, -> { where(active: true) }
end

# good
class User < ApplicationRecord
  scope :active, -> { where(active: true) }
  scope :inactive, -> { where(active: false) }
end
Here is the cop:
class OrderedScopes < RuboCop::Cop::Cop
  def_node_search :scope_declarations, <<~PATTERN
    (send nil? :scope (:sym _) ...)
  PATTERN

  MSG = "Scopes should be sorted in an alphabetical order. " \
        "Scope `%<current>s` should appear before `%<previous>s`.".freeze

  def on_class(node)
    scope_declarations(node).each_cons(2) do |previous, current|
      next if scope_name(current) > scope_name(previous)

      register_offense(previous, current)
    end
  end

  private

  def register_offense(previous, current)
    message = format(
      MSG,
      previous: scope_name(previous),
      current: scope_name(current)
    )
    add_offense(current, message: message)
  end

  def scope_name(node)
    node.first_argument.value.to_s
  end
end
At the very beginning, we used a new helper called def_node_search. How does it differ from def_node_matcher? First, it searches through the child nodes (while def_node_matcher checks only the current node), and second, its return value depends on the passed method name. If the name ends with ?, the generated method checks whether there is at least one matching child node and returns true or false; otherwise, it returns all the matching nodes. In our case, we look for all the scope calls.
Earlier, we used the on_send callback to check all the send nodes independently, but now we have a new challenge: we need to compare scopes placed one after another inside the class. So we use the on_class hook. Inside the method, we find all scope declarations (using the helper generated by def_node_search), combine them into pairs, ensure they are properly ordered, and add an offense when the ordering is incorrect.
Since we want to include the names of the scopes in the offense message, we have to insert them into MSG and pass the resulting message to add_offense explicitly.
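The pairwise comparison and the message formatting can be tried in isolation. The sketch below works on plain strings instead of AST nodes, so ordering_messages and the scope names in the usage line are hypothetical; only MSG, each_cons(2), and format come from the cop above.

```ruby
MSG = "Scopes should be sorted in an alphabetical order. " \
      "Scope `%<current>s` should appear before `%<previous>s`."

# Compares each name with its neighbor and builds a message for every
# pair that is out of order. Plain strings stand in for scope nodes.
def ordering_messages(scope_names)
  scope_names.each_cons(2).map do |previous, current|
    next if current > previous

    format(MSG, previous: previous, current: current)
  end.compact
end

puts ordering_messages(%w[inactive active banned])
```

Note that each_cons(2) only compares neighbors: in the usage line, active is reported because it follows inactive, while banned is compared with active alone and passes.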
⚠️ Optional homework: what should we change to check the ordering of scopes in groups only, separated by empty lines or other code? For instance, the following code should be valid:
class User < ApplicationRecord
  # the first group starts
  scope :banned, -> { where(status: STATUS_BANNED) }

  # the second group starts
  scope :active, -> { where(active: true) }
  scope :inactive, -> { where(active: false) }
end
Auto-correct
In some cases, a rule is so simple that we can automatically fix the violating code by changing a couple of lines. It’s the perfect case for auto-correct, which is run with rubocop --auto-correct-all. Supporting auto-correct takes nothing more than adding the AutoCorrector mixin to the cop and using the corrector object. Let’s implement auto-correct for our ArgumentName rule:
class ArgumentName < RuboCop::Cop::Cop
  extend AutoCorrector

  def_node_matcher :argument_name, <<~PATTERN
    (send nil? :argument (:sym $_) ...)
  PATTERN

  MSG = "Use snake_case for argument names".freeze
  SNAKE_CASE = /^[\da-z_]+[!?=]?$/.freeze

  def on_send(node)
    argument_name(node) do |name|
      next if name.match?(SNAKE_CASE)

      add_offense(node) do |corrector|
        # gsub (not gsub!) is used because gsub! returns nil when nothing is replaced
        downcased_name = name.to_s.gsub(/(.)([A-Z])/, '\1_\2').downcase
        corrector.replace(node, node.source.sub(name.to_s, downcased_name))
      end
    end
  end
end
What’s new here? First, a cop that supports auto-correct should extend the AutoCorrector module. Second, we should pass a block to add_offense, which receives the corrector argument. This object allows us to perform various operations on the node: in our case, we take the camel-cased argument name, convert it to snake case, and replace it in the node.
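The string manipulation inside the corrector block can be checked without RuboCop. In the sketch below, to_snake_case and corrected_source are hypothetical helper names, and a plain string stands in for node.source. It uses the non-destructive gsub, since gsub! returns nil when nothing is replaced.

```ruby
# Converts a camelCase name to snake_case by inserting an underscore
# before every capital letter that follows another character.
def to_snake_case(name)
  name.to_s.gsub(/(.)([A-Z])/, '\1_\2').downcase
end

# Replaces the first occurrence of the name in the source line,
# similar to what the cop passes to corrector.replace.
def corrected_source(source, name)
  source.sub(name.to_s, to_snake_case(name))
end

puts corrected_source("argument :userId, ID, required: true", :userId)
```

A name like :Userid shows why the non-bang method matters: the pattern finds nothing to replace there, so gsub! would return nil and the following downcase call would fail.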
Extracting cops into a separate gem
One day you might decide that you want to use your custom cops in a different project, or you might want to include some of the best practices in your open source project. No matter how you get here, you now want to move your cops to a separate gem.
You might think that this task is common enough that there should be a generator, and that’s true! The generator is called rubocop-extension-generator:
$ gem install rubocop-extension-generator
$ rubocop-extension-generator rubocop-custom
The result of this command is a gem with the rubocop dependency and a folder for cops (in our case, lib/rubocop/cop). By the way, it also generates a special Inject hack:
module RuboCop
  module Custom
    module Inject
      def self.defaults!
        path = CONFIG_DEFAULT.to_s
        hash = ConfigLoader.send(:load_yaml_configuration, path)
        config = Config.new(hash, path).tap(&:make_excludes_absolute)
        puts "configuration from #{path}" if ConfigLoader.debug?
        config = ConfigLoader.merge_with_default(config, path)
        ConfigLoader.instance_variable_set(:@default_configuration, config)
      end
    end
  end
end
This file adds your gem’s default configuration to RuboCop, since RuboCop does not yet support extension configs natively.
⚠️ Currently, the generator uses the old version of RuboCop (0.8). If you want to run examples from this article, update the version in the Gemfile to “1.1”.
Of course, there is a generator to create new cops:
$ bundle exec rake 'new_cop[Custom/MaxScopes]'
Custom is a namespace (our gem is called rubocop-custom), while MaxScopes is the new cop’s name. The generator did three things: it created a file for the new cop, required it in lib/rubocop/cop/custom_cops, and added a config stub to config/default.yml. Let’s implement the cop, which limits the number of scopes in the model:
module RuboCop
  module Cop
    module Custom
      class MaxScopes < Base
        def_node_search :scope_declarations, <<~PATTERN
          (send nil? :scope (:sym _) ...)
        PATTERN

        MSG = "Class should not have more than %<max_scopes>s scopes.".freeze

        def on_class(node)
          scope_count = scope_declarations(node).count
          return if scope_count <= max_scopes

          register_offense(node)
        end

        private

        def register_offense(node)
          message = format(MSG, max_scopes: max_scopes)
          add_offense(node, message: message)
        end

        def max_scopes
          cop_config["MaxScopes"]
        end
      end
    end
  end
end
We use the on_class callback to get the class root node, then use the helper generated by def_node_search to find all the scope declarations; if the number of scopes exceeds the configured maximum, an offense is added to the class. To read the configuration, we use the cop_config object. There will be an error if nothing is configured, so let’s set up a default value in config/default.yml:
Custom/MaxScopes:
  Description: "Limit scope count in the Rails model"
  MaxScopes: 2
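To see why the default matters, here is a small plain-Ruby sketch of the comparison the cop performs. Plain hashes stand in for cop_config (an assumption made for illustration), and within_limit? and within_limit_with_default? are hypothetical helpers, not part of RuboCop.

```ruby
# Plain hashes stand in for cop_config here (an assumption for illustration).
CONFIGURED = { "MaxScopes" => 2 }.freeze
UNCONFIGURED = {}.freeze

# Same comparison as in the cop: is the class within the limit?
def within_limit?(cop_config, scope_count)
  scope_count <= cop_config["MaxScopes"]
end

# Hypothetical variant that falls back to a default instead of failing.
def within_limit_with_default?(cop_config, scope_count, default: 2)
  scope_count <= cop_config.fetch("MaxScopes", default)
end

puts within_limit?(CONFIGURED, 3)

begin
  within_limit?(UNCONFIGURED, 3)
rescue ArgumentError => e
  # Without a configured value the lookup returns nil,
  # and comparing an Integer with nil raises.
  puts "missing config: #{e.class}"
end
```

Shipping the value in config/default.yml keeps the cop code free of such fallbacks, which is why we prefer it here.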
Perhaps you noticed that earlier we used the same pattern for detecting scopes. So far, we have defined def_node_matcher and def_node_search right inside cops. But we can also make them shareable: move them to a module and include it wherever we want. In this case, we add extend NodePattern::Macros to the module, so that the def_node_matcher and def_node_search helpers are available in the module body.
Writing specs
When it comes to testing our cops, we should always be prepared for dozens of edge cases. Luckily, writing specs for cops is a no-brainer: you just need to pass a string containing sample code to the expect_offense or expect_no_offenses helper:
RSpec.describe RuboCop::Cop::Custom::MaxScopes do
  subject(:cop) { described_class.new(config) }

  let(:config) { RuboCop::Config.new("Custom/MaxScopes" => { "MaxScopes" => 2 }) }

  context "when class has more than MaxScopes scopes" do
    it "registers an offense" do
      expect_offense(<<~RUBY)
        class User < ApplicationRecord
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Class should not have more than 2 scopes.
          scope :active, -> { where(active: true) }
          scope :inactive, -> { where(active: false) }
          scope :banned, -> { where(status: "banned") }
        end
      RUBY
    end
  end

  context "when class has scope count equal to MaxScopes" do
    it "does not register an offense" do
      expect_no_offenses(<<~RUBY)
        class User < ApplicationRecord
          scope :active, -> { where(active: true) }
          scope :inactive, -> { where(active: false) }
        end
      RUBY
    end
  end
end
You can initialize config with any options to test different behaviors.
Advanced cop configuration
Sometimes we need to run cops against a specific set of files:
- rubocop-rspec cares only about *_spec.rb files;
- rubocop-graphql checks files inside the graphql folder only.
In this case, we add a default top-level configuration:
GraphQL:
  Include:
    - "**/graphql/**/*"
If you have ever explored various cops, you have likely noticed many similarities in their configurations. For instance, when you need to have a configurable constant (like the one in MaxScopes), there is a handy helper ExcludeLimit. Let’s add it to our cop:
module RuboCop
  module Cop
    module Custom
      class MaxScopes < Base
        extend RuboCop::ExcludeLimit

        exclude_limit 'MaxScopes'

        def_node_search :scope_declarations, <<~PATTERN
          (send nil? :scope (:sym _) ...)
        PATTERN

        MSG = "Class should not have more than %<max_scopes>s scopes.".freeze

        def on_class(node)
          scope_count = scope_declarations(node).count
          # a class with exactly MaxScopes scopes is still allowed
          return if scope_count <= max_scopes

          register_offense(node)
        end

        private

        def register_offense(node)
          message = format(MSG, max_scopes: max_scopes)
          add_offense(node, message: message)
        end

        def max_scopes
          cop_config["MaxScopes"]
        end
      end
    end
  end
end
Why do we sometimes prefer the module over the good old cop_config["MaxScopes"]? ExcludeLimit helps when RuboCop is used with the --auto-gen-config option, which generates a TODO config that allows iterative codebase migration to new rules.
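As far as we understand it, the idea is that a cop reports each offending value and the largest one is remembered, so the generated TODO config can temporarily raise the limit instead of excluding whole files. The toy model below illustrates only that bookkeeping in plain Ruby: LimitTracker, record, and the sample scope counts are made-up names and data, not RuboCop API.

```ruby
# A toy model, NOT RuboCop's code: remember the largest offending value
# so that a generated TODO config could allow everything that exists today.
class LimitTracker
  attr_reader :limit

  def record(value)
    # Keep the maximum of the previous limit (if any) and the new value.
    @limit = [@limit, value].compact.max
  end
end

# Builds a hash shaped like a TODO config entry for the given counts.
def todo_entry(scope_counts)
  tracker = LimitTracker.new
  scope_counts.each { |count| tracker.record(count) }
  { "Custom/MaxScopes" => { "MaxScopes" => tracker.limit } }
end

# Hypothetical scope counts of classes that currently break the rule.
puts todo_entry([3, 5, 4]).inspect
```

With such an entry in place, the existing classes pass, and the limit can be lowered step by step as they are refactored.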
In some cases, cops support completely different configurable behaviors. For instance, the Style/NilComparison cop lets you choose whether you like x.nil? or x == nil. There is a built-in module ConfigurableEnforcedStyle responsible for cop styles configuration: you just need to include the module, and the style method will manage the rest. Here is an example of ConfigurableEnforcedStyle usage.
Another helpful mixin is RuboCop::Cop::CodeLength, which helps to check the length of a code segment. It supports the following configuration options:
- Max: maximum number of lines;
- CountComments: option to exclude comments from the count;
- CountAsOne: allows counting arrays, hashes, and heredocs as one line.
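To make the first two options more concrete, here is a simplified plain-Ruby line counter. It is only a sketch of the idea, not how CodeLength is implemented: code_length and SAMPLE_BODY are hypothetical, blank lines are skipped, comment detection is a naive prefix check, and CountAsOne is left out entirely.

```ruby
# A hypothetical method body used as sample input.
SAMPLE_BODY = <<~RUBY
  # build the query
  users = User.active

  users.order(:name)
RUBY

# Counts the lines of a code fragment, skipping blank lines and,
# unless count_comments is true, lines that contain only a comment.
def code_length(source, count_comments: false)
  source.each_line.count do |line|
    stripped = line.strip
    next false if stripped.empty?
    next false if !count_comments && stripped.start_with?("#")

    true
  end
end

puts code_length(SAMPLE_BODY)
puts code_length(SAMPLE_BODY, count_comments: true)
```

A cop would then compare the resulting number with Max and register an offense when it is exceeded.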
When CodeLength is included in the cop, all you need to do is to pass the node to the check_code_length method. Here is an example from the Metrics/MethodLength cop:
module RuboCop
  module Cop
    module Metrics
      class MethodLength < Base
        include CodeLength

        def on_def(node)
          return if ignored_method?(node.method_name)

          check_code_length(node)
        end

        def on_block(node)
          return unless node.send_node.method?(:define_method)

          check_code_length(node)
        end
      end
    end
  end
end
Thank you for reading! We hope you can use this little guide as a reference to tidy up your codebase even further through custom rules that fit your unique project needs. Feel free to give us a shout if you want to talk about the current tooling at your company or need Evil Martians to help you set up and automatically enforce better development practices.