Lefthook, Crystalball, and git magic for a smooth development experience
In this step-by-step tutorial, you will learn how to set up the Lefthook git hooks manager and the Crystalball test selection library, and how to automatically install missing gems and migrate your database when you switch to a feature branch and back to master.
Every project should have CI set up. But CI builds can queue up, and you have to wait for the notification. Is there a way to shorten the “implement-test-fix” cycle and save keystrokes and mouse clicks without letting broken code into the project repository? Yes, it has existed for a long time and is well known: git hooks, plain executable scripts in your local .git/hooks/ directory. But they are bothersome to set up, to update, and to sync with your collaborators, because you can’t commit them to the repository itself.
Setting up Lefthook
Lefthook is our own git hook manager written in Go. Single dependency-free binary. Fast, reliable, feature-rich, language-agnostic.
- Install it via gem, npm, or from source or your OS package manager.
- Define your hooks in the config file lefthook.yml:
pre-push:
  parallel: true
  commands:
    rubocop:
      tags: backend
      run: bundle exec rubocop
    rspec:
      tags: rspec backend
      run: bundle exec rspec --fail-fast
- And run lefthook install.
And voilà: from now on, RSpec and Rubocop will run in parallel on every git push, and the push will be aborted if they find any issues.
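To picture what "run in parallel and abort on any failure" means, here is a minimal Ruby sketch. It is not how Lefthook is implemented (Lefthook is written in Go); the helper name `failed_checks` is hypothetical, and the two commands are harmless stand-ins for `bundle exec rubocop` and `bundle exec rspec --fail-fast`.

```ruby
require "rbconfig"

# Hypothetical helper: runs every command in its own thread and
# returns the names of the commands that did not exit with status 0.
def failed_checks(commands)
  threads = commands.map do |name, argv|
    Thread.new { [name, system(*argv)] }
  end
  threads.map(&:value).reject { |_name, ok| ok }.map(&:first)
end

# Stand-ins for the real linters and tests (assumption for the demo)
commands = {
  "rubocop" => [RbConfig.ruby, "-e", "exit 0"],
  "rspec"   => [RbConfig.ruby, "-e", "exit 1"]
}

failures = failed_checks(commands)
puts failures.empty? ? "push allowed" : "push aborted by: #{failures.join(', ')}"
```

A real hook would finish with a non-zero exit status when `failures` is not empty, which is what makes git abort the push.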
But why choose pre-push over pre-commit?
First of all, during refactoring you sometimes want to make a lot of small commits locally. Most of them won’t pass linters. Why execute all this machinery if you know that it won’t pass? Afterward, you will squash and reorder them with git rebase --interactive and push clean code to the repo.
More importantly, some checks are not fast. Waiting a minute on every git commit is meh. You lose speed. So why not move long operations to the much rarer push event?
However, running the whole test suite may take too long (and we have CI exactly for that anyway), so we need a way to run only the specs that our changes might break.
Setting up Crystalball
Crystalball is a Ruby library by Toptal that implements a Regression Test Selection mechanism. Its main purpose is to select a minimal subset of your test suite that should be run to ensure your changes didn’t break anything. It is a tricky problem in Ruby applications in general and especially in Rails applications because of the Ruby on Rails constant autoloading mechanism.
Crystalball solves this problem by tracing code execution and tracking dependencies between files while your test suite is running. Using this profiling data, it can tell which files affect which. Take a look at the slides about Crystalball from the talk at RubyKaigi 2019.
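The idea of test selection from an execution map can be sketched in a few lines of plain Ruby. This is a toy model, not Crystalball's actual data format or API: the map contents, the file names, and the `specs_to_run` helper are all invented for illustration.

```ruby
# Toy execution map: which source files each spec touched while running.
# All paths are made up for this example.
EXECUTION_MAP = {
  "spec/models/user_spec.rb"  => ["app/models/user.rb"],
  "spec/models/order_spec.rb" => ["app/models/order.rb", "app/models/user.rb"],
  "spec/lib/price_spec.rb"    => ["lib/price.rb"]
}.freeze

# Select every spec that either changed itself or exercised a changed file.
def specs_to_run(execution_map, changed_files)
  execution_map
    .select { |spec, used| changed_files.include?(spec) || (used & changed_files).any? }
    .keys
    .sort
end

puts specs_to_run(EXECUTION_MAP, ["app/models/user.rb"])
```

With only `app/models/user.rb` changed, the price spec is left out, which is where the time savings come from.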
- Install it
# Gemfile
gem "crystalball"
- Configure for our pre-push case (by default crystalball is configured to be used in pre-commit hooks)
# config/crystalball.yml
---
map_expiration_period: 604800 # 1 week
diff_from: origin/master
- Set up your test suite to collect code coverage information:
# spec/spec_helper.rb
if ENV['CRYSTALBALL'] == 'true'
  require 'crystalball'
  require 'crystalball/rails'
  Crystalball::MapGenerator.start! do |config|
    config.register Crystalball::MapGenerator::CoverageStrategy.new
    config.register Crystalball::Rails::MapGenerator::I18nStrategy.new
    config.register Crystalball::MapGenerator::DescribedClassStrategy.new
  end
end
- Generate code execution maps:
CRYSTALBALL=true bundle exec rspec
- Replace RSpec with crystalball in lefthook.yml:
- run: bundle exec rspec --fail-fast
+ run: bundle exec crystalball --fail-fast
From now on, every push will be dramatically faster if your changes are small.
But crystalball needs up-to-date code execution maps to work correctly. Can we automate refreshing these maps, too? Sure, we can!
Keeping Crystalball up-to-date
Git’s post-checkout hook fits this purpose very well. We can use it to run the logic that updates crystalball data. “Logic” implies complexity, as there is no single command for that. To cover such cases, Lefthook supports separate executable script files. We can put our logic into the .lefthook/post-checkout/crystalball-update file, make it executable, and declare it in the lefthook configuration like this:
# lefthook.yml
post-checkout:
  scripts:
    crystalball-update:
      tags: rspec backend
And there, in the crystalball-update script, we need a bit of magic.
First of all, we don’t need to do anything when a developer uses the git checkout -- path command to discard changes in the working tree, because this is not a switch between commits and the repository may be “dirty.” Yes, the Git CLI can sometimes feel weird, as the checkout command is used for two different tasks.
Per the git docs, git always passes three arguments to the post-checkout hook: the previous HEAD commit identifier (SHA-1 sum), the current HEAD commit identifier, and a flag indicating whether it was a checkout between branches (1) or a file checkout to the state of another commit (0). Lefthook catches these arguments and carefully passes them to every managed script.
#!/usr/bin/env ruby
_prev_head, _curr_head, branch_change, * = ARGV
exit if branch_change == "0" # Don't run on file checkouts
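The same guard can be written as a small predicate, which is easier to try out in isolation. The helper name `branch_switch?` is hypothetical, and as an extra assumption it also treats a checkout where HEAD did not move as nothing to do.

```ruby
# Hypothetical predicate over the three post-checkout arguments:
# previous HEAD, current HEAD, and the branch flag ("1" or "0").
def branch_switch?(argv)
  prev_head, curr_head, branch_flag = argv
  branch_flag == "1" && prev_head != curr_head
end

p branch_switch?(%w[abc123 def456 1]) # branch checkout between two commits
p branch_switch?(%w[abc123 abc123 0]) # file checkout, HEAD did not move
```

In a hook script this would read `exit unless branch_switch?(ARGV)`.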
Next, we want to update crystalball profiling data only on the master branch, as recommended in the Crystalball docs. To do so, we need to ask git which branch we’ve checked out:
# Rails.root if we look from .lefthook/post-checkout dir
app_dir = File.expand_path("../..", __dir__)
ENV["BUNDLE_GEMFILE"] ||= File.join(app_dir, "Gemfile")
require "bundler/setup"
require "git"
exit unless Git.open(app_dir).current_branch == "master"
The git gem is a dependency of Crystalball, so we don’t have to install it separately.
And finally, we need to do the heaviest part: ask Crystalball, “Is your profiling data up to date?”
require "crystalball"
config = Crystalball::RSpec::Runner.config
prediction_builder = Crystalball::RSpec::Runner.prediction_builder
exit if File.exist?(config["execution_map_path"]) && !prediction_builder.expired_map?
And if it is not fresh, we need to run the whole test suite with a special environment variable set:
puts "Crystalball Ruby code execution maps are out of date. Performing full test suite to update them…"
ENV["CRYSTALBALL"] = "true"
RSpec::Core::Runner.run([app_dir])
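The freshness decision itself boils down to comparing the age of the map with the configured expiration period. The sketch below is only a model of that idea, assuming the map's creation time is known; it is not Crystalball's implementation, and `map_stale?` is a hypothetical name.

```ruby
# One week in seconds, the same value as map_expiration_period in the config
ONE_WEEK = 7 * 24 * 60 * 60

# Hypothetical check: a missing map or one older than the period is stale.
def map_stale?(map_created_at, now: Time.now, expiration_period: ONE_WEEK)
  return true if map_created_at.nil? # no map has been generated yet

  now - map_created_at > expiration_period
end

p map_stale?(Time.now - 3 * 24 * 60 * 60)  # generated three days ago
p map_stale?(Time.now - 10 * 24 * 60 * 60) # generated ten days ago
```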
And we’re done. Are we?
Automate other routine tasks
But running specs requires that we have:
- installed gems, and
- an up-to-date database state.
In an actively developed application, gems are frequently updated, added, and removed, and the database schema can sometimes change several times a day in different branches. It is typical to pull fresh master in the morning and get updated gems and new database migrations. In that case, RSpec would fail, and Crystalball execution maps won’t be complete. So we need to ensure beforehand that our specs can always run.
Install missing gems on a git checkout
This task is quite simple and can be achieved with a simple bash script. Most of it consists of checks that avoid calling bundler when it’s not needed, because Bundler is quite heavy and takes a noticeable time to run.
The first two of these checks repeat what we did before, just rewritten in shell: is this a branch checkout? Did we actually move between commits?
#!/bin/bash
BRANCH_CHANGE=$3
[[ $BRANCH_CHANGE -eq 0 ]] && exit
PREV_HEAD=$1
CURR_HEAD=$2
[[ "$PREV_HEAD" == "$CURR_HEAD" ]] && exit
The next one is trickier:
# Don't run bundler if there were no changes in gems
git diff --quiet --exit-code "$PREV_HEAD" "$CURR_HEAD" -- Gemfile.lock && exit
Here we’re asking, “Did the set of required gems change between commits?” If Gemfile.lock was changed, we need to check whether we have all of the gems installed by invoking bundler.
bundle check || bundle install
Again, if your gems are up to date (and for most checkouts they will be), only bundle check will be executed.
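For readers who prefer Ruby to shell, the same decision chain can be sketched as one function. The name `ensure_gems` and the injectable `runner` are assumptions made for this example; by default the runner simply shells out with `system`, and the fake runner below lets the logic be tried without git or bundler.

```ruby
# Hypothetical helper: returns a symbol describing what happened.
# `runner` executes a command and returns true on exit status 0.
def ensure_gems(prev_head, curr_head, runner: ->(*cmd) { system(*cmd) })
  # git diff --quiet exits with 0 when Gemfile.lock is identical in both commits
  lock_unchanged = runner.call("git", "diff", "--quiet", prev_head, curr_head, "--", "Gemfile.lock")
  return :lockfile_unchanged if lock_unchanged
  return :gems_present if runner.call("bundle", "check")

  runner.call("bundle", "install") ? :installed : :install_failed
end

# Fake runner for the demo: Gemfile.lock differs and a gem is missing,
# so only `bundle install` succeeds.
fake_runner = ->(*cmd) { cmd == ["bundle", "install"] }
p ensure_gems("abc123", "def456", runner: fake_runner)
```

The early returns mirror the `&& exit` and `||` short-circuits of the shell version: the expensive command runs only when every cheaper check says it is needed.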
Automatically roll back and apply database migrations
The next task is much more interesting.
When we switch from branch A to branch B, we need to ensure that the database schema is actually compatible with our specs. To do so, we must roll back every migration that exists in branch A but not in branch B, and then apply every migration that exists in B and isn’t applied yet. The rollback part is required because migrations that remove or rename columns and tables are not backward compatible.
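The rule "roll back what exists only in A, then apply what exists only in B" can be modeled with plain set operations on migration versions. This is a sketch under simplifying assumptions: the file names are invented, `VERSION_PATTERN` is a simplified pattern rather than ActiveRecord's own regexp, and the model ignores which migrations the database has already applied.

```ruby
# Simplified pattern for names like "20190801120000_add_orders.rb"
# (an assumption for this example, not ActiveRecord's regexp).
VERSION_PATTERN = /\A(\d+)_.+\.rb\z/

def migration_version(path)
  File.basename(path)[VERSION_PATTERN, 1]
end

# Hypothetical planner: versions to roll back (newest first)
# and versions to migrate (oldest first).
def migration_plan(files_in_a, files_in_b)
  versions_a = files_in_a.map { |file| migration_version(file) }.compact
  versions_b = files_in_b.map { |file| migration_version(file) }.compact
  {
    rollback: (versions_a - versions_b).sort.reverse,
    migrate:  (versions_b - versions_a).sort
  }
end

branch_a = %w[
  db/migrate/20190101000000_create_users.rb
  db/migrate/20190301000000_add_phone_to_users.rb
]
branch_b = %w[
  db/migrate/20190101000000_create_users.rb
  db/migrate/20190201000000_create_orders.rb
]
p migration_plan(branch_a, branch_b)
```

Rolling back newest first matters because later migrations may depend on the schema produced by earlier ones.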
The problem here is that there is no pre-checkout hook in git, only a post-checkout one. And after checkout, the migration files that existed only in the branch we switched from are gone. How do we roll them back?
But this is git! The files are still out there. Why not just take them from git itself?
To do so programmatically, let’s use the git gem to access our git repository. Crystalball already uses it under the hood, so there will be no new dependency, but it is a good idea to add it to the Gemfile explicitly.
Let’s start with a check that we really have any migrations to run (either up or down):
require "git"
# Arguments passed by git to the post-checkout hook
prev_head, curr_head, _branch_change, * = ARGV
# Rails.root if we look from .lefthook/post-checkout dir
app_dir = File.expand_path("../..", __dir__)
git = Git.open(app_dir)
# Don't run if there were no database changes between revisions
diff = git.diff(prev_head, curr_head).path("db/migrate")
exit if diff.size.zero?
Then, to be able to use migrations, we need to load our Rails application and connect to the database:
require File.expand_path("config/boot", app_dir)
require File.expand_path("config/application", app_dir)
require "rake"
Rails.application.load_tasks
Rake::Task["db:load_config"].invoke
Then we can take the files from git and save them to a temporary directory:
# migrations added in prev_head (deleted in curr_head) that we need to rollback
rollback_migration_files = diff.select { |file| file.type == "deleted" }
if rollback_migration_files.any?
  require "tmpdir"
  MigrationFilenameRegexp = ActiveRecord::Migration::MigrationFilenameRegexp
  versions = []
  Dir.mktmpdir do |directory|
    rollback_migration_files.each do |diff_file|
      filename = File.basename(diff_file.path)
      contents = git.gblob("#{prev_head}:#{diff_file.path}").contents
      File.write(File.join(directory, filename), contents)
      version = filename.scan(MigrationFilenameRegexp).first&.first
      versions.push(version) if version
    end
    # Now that we have the files for migrations that need to be rolled back, we can roll them back
    begin
      # dup so that the push below doesn't also modify the saved copy
      old_migration_paths = ActiveRecord::Migrator.migrations_paths.dup
      ActiveRecord::Migrator.migrations_paths.push(directory)
      versions.sort.reverse_each do |version|
        ENV["VERSION"] = version
        Rake::Task["db:migrate:down"].execute
      end
    ensure
      ENV.delete("VERSION")
      ActiveRecord::Migrator.migrations_paths = old_migration_paths
    end
  end
end
Here we’re adding our temporary directory with the other branch’s migrations to ActiveRecord’s migrations_paths. This setting has been available since Rails 5 but is not widely known. Now ActiveRecord can see our ghost migration files, and we can simply invoke rake db:migrate:down VERSION=number for every migration to roll it back.
After that, we can just run the migrations that are not yet applied:
Rake::Task["db:migrate"].invoke
And that’s it!
Composing it together
Now we only need to invoke these scripts in the right order: install gems, run migrations, and run specs (if required). To do so, we need to name the files in alphabetical order, place them in the .lefthook/post-checkout directory, and declare them in lefthook.yml:
post-checkout:
  piped: true
  scripts:
    01-bundle-checkinstall:
      tags: backend
    02-db-migrate:
      tags: backend
    03-crystalball-update:
      tags: rspec backend
The piped option aborts the remaining commands if a preceding command fails. For example, if you forget to launch a database server, the second step will fail, and lefthook will skip the third step altogether.
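The behavior described for piped can be modeled in a few lines: run the steps in name order and skip everything after the first failure. This is only an illustration of the semantics, not Lefthook's code; `run_piped` is a hypothetical helper and the lambdas stand in for the real scripts.

```ruby
# Hypothetical model of piped execution: steps run in alphabetical order,
# and everything after the first failure is skipped.
def run_piped(scripts)
  results = {}
  failed = false
  scripts.sort_by { |name, _step| name }.each do |name, step|
    if failed
      results[name] = :skipped
    else
      ok = step.call
      results[name] = ok ? :ok : :failed
      failed = !ok
    end
  end
  results
end

scripts = {
  "03-crystalball-update"  => -> { true },
  "01-bundle-checkinstall" => -> { true },
  "02-db-migrate"          => -> { false } # e.g. the database server is down
}
p run_piped(scripts)
```

The numeric prefixes in the file names are what make the alphabetical order match the intended order.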
Frontend developers who are not interested in running RSpec can exclude the rspec tag in their lefthook.local.yml, and they will still get installed gems and a migrated database. Automagically.
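Tag exclusion can be pictured as a simple filter over the scripts declared above. The sketch below only models the effect described in the text; `scripts_to_run` is a hypothetical helper, and the real filtering happens inside Lefthook.

```ruby
# Script names and their space-separated tags, as declared in lefthook.yml
SCRIPTS = {
  "01-bundle-checkinstall" => "backend",
  "02-db-migrate"          => "backend",
  "03-crystalball-update"  => "rspec backend"
}.freeze

# Hypothetical filter: drop every script that carries an excluded tag.
def scripts_to_run(scripts, excluded_tags)
  scripts.reject { |_name, tags| (tags.split & excluded_tags).any? }.keys
end

puts scripts_to_run(SCRIPTS, ["rspec"])
```

With the rspec tag excluded, only the gem installation and migration scripts remain.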
Any gotchas?
From now on, you always have to write reversible migrations. Look at the example application to learn how to do it.
Conclusion
Now we not only have checks that prevent us from pushing broken code to the repository (and that run really fast) but also, as a side effect, we will always have installed gems and a migrated database. You will forget about bundle install (unless you are the one who updates gems). And no more “Okay, now I need to roll back X and Y first, and only then check out master again.”
Check out the experiment with an example application published on GitHub: lefthook-crystalball-example.
Happy coding!