Pulling the trigger: How to update counter caches in your Rails app without Active Record callbacks
In this article, we experiment with triggers as a tool for keeping aggregated data consistent when using Active Record and your favorite SQL database. Instead of reaching for sophisticated tools such as Elasticsearch for filtering and searching, we will demonstrate a simple approach that achieves the same result with out-of-the-box database features. As a bonus, learn how to avoid nasty race conditions!
Sometimes you need to sort and filter records in the database by aggregated values. For instance, you might be building a paginated list of users in an admin panel, and you want to filter by the number of orders and the total amount users have spent on them. Tools like Elasticsearch are good at filtering by aggregates, but setting up a massive search engine and all the required infrastructure just to process a couple of columns sounds like overkill. Let's find a more straightforward way!
Trigger finger
Imagine the following data model:
ActiveRecord::Schema.define do
  create_table "orders", force: :cascade do |t|
    t.bigint "user_id", null: false
    t.decimal "amount"
    t.datetime "created_at", precision: 6, null: false
    t.datetime "updated_at", precision: 6, null: false
    t.index ["user_id"], name: "index_orders_on_user_id"
  end

  create_table "users", force: :cascade do |t|
    t.datetime "created_at", precision: 6, null: false
    t.datetime "updated_at", precision: 6, null: false
  end

  add_foreign_key "orders", "users"
end

class User < ActiveRecord::Base
  has_many :orders
end

class Order < ActiveRecord::Base
  belongs_to :user
end
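For the record, the filter we are about to build can also be expressed through the Active Record query interface; here is a rough equivalent (below we use raw SQL instead, so that we can wrap it in EXPLAIN):

```ruby
# Users with more than one order totaling over 100, cheapest spenders first.
# Arel.sql marks the ORDER BY expression as deliberately raw SQL.
User
  .joins(:orders)
  .group("users.id")
  .having("SUM(orders.amount) > ? AND COUNT(orders.id) > ?", 100, 1)
  .order(Arel.sql("SUM(orders.amount)"))
  .limit(50)
```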
Let's see how we can filter and paginate users by their order totals. We can easily achieve our goal with a vanilla SQL statement, but we will immediately run into performance issues. To demonstrate, let's fill the database with 10,000 users and 100,000 orders and use EXPLAIN:
User.insert_all(10_000.times.map { { created_at: Time.now, updated_at: Time.now } })

Order.insert_all(
  100_000.times.map do
    {
      user_id: rand(1...1000),
      amount: rand(1000) / 10.0,
      created_at: Time.now,
      updated_at: Time.now
    }
  end
)
ActiveRecord::Base.connection.execute <<~SQL
  EXPLAIN ANALYZE SELECT users.id, SUM(orders.amount), COUNT(orders.id)
  FROM users JOIN orders ON orders.user_id = users.id
  GROUP BY users.id
  HAVING SUM(orders.amount) > 100 AND COUNT(orders.id) > 1
  ORDER BY SUM(orders.amount)
  LIMIT 50
SQL
This is the result you might see:
Limit  (cost=3206.16..3206.29 rows=50 width=48) (actual time=59.737..59.746 rows=50 loops=1)
  ->  Sort  (cost=3206.16..3208.95 rows=1116 width=48) (actual time=59.736..59.739 rows=50 loops=1)
        Sort Key: (sum(orders.amount))
        Sort Method: top-N heapsort  Memory: 31kB
        ->  HashAggregate  (cost=2968.13..3169.09 rows=1116 width=48) (actual time=59.103..59.452 rows=1000 loops=1)
              Group Key: users.id
              Filter: ((sum(orders.amount) > '100'::numeric) AND (count(orders.id) > 1))
              ->  Hash Join  (cost=290.08..2050.73 rows=73392 width=48) (actual time=2.793..37.022 rows=100000 loops=1)
                    Hash Cond: (orders.user_id = users.id)
                    ->  Seq Scan on orders  (cost=0.00..1567.92 rows=73392 width=48) (actual time=0.011..11.650 rows=100000 loops=1)
                    ->  Hash  (cost=164.48..164.48 rows=10048 width=8) (actual time=2.760..2.760 rows=10000 loops=1)
                          Buckets: 16384  Batches: 1  Memory Usage: 519kB
                          ->  Seq Scan on users  (cost=0.00..164.48 rows=10048 width=8) (actual time=0.006..1.220 rows=10000 loops=1)
Planning Time: 0.237 ms
Execution Time: 64.151 ms
With a bigger database, it will take even more time! We need to find a better solution, one that scales. Let's denormalize our database and store orders_amount and orders_count in a separate user_stats table:
class CreateUserStats < ActiveRecord::Migration[6.0]
  def change
    create_table :user_stats do |t|
      t.references :user, null: false, foreign_key: true, index: { unique: true }
      t.decimal :orders_amount
      t.integer :orders_count
    end
  end
end
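The code below also assumes a model backed by this table; a minimal definition would be:

```ruby
# Read-only from the app's point of view: the database keeps it up to date.
class UserStat < ActiveRecord::Base
  belongs_to :user
end
```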
Now we should decide how to keep orders_count and orders_amount in sync. Active Record callbacks do not look like the proper place to handle such operations, because we want our stats updated even when data is changed with plain SQL (e.g., in a migration). There is a built-in counter_cache option for the belongs_to association, but it cannot help us with orders_amount. Triggers to the rescue!
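For reference, the built-in option we just ruled out looks like this; it assumes an integer orders_count column on the users table:

```ruby
class Order < ActiveRecord::Base
  # Rails increments/decrements users.orders_count on create and destroy,
  # but only for changes that go through Active Record, and it cannot
  # maintain a sum such as orders_amount.
  belongs_to :user, counter_cache: true
end
```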
A trigger is a function that is automatically invoked when INSERT, UPDATE, or DELETE is performed on the table.
To work with triggers from our Rails app, we can use gems like hair_trigger or fx, or even write the migrations by hand. In this example, we use hair_trigger, which can generate migrations for trigger updates using only the latest version of the SQL procedure.
Let's add our trigger to the Order model. We want to perform an UPSERT: if there is no row with a matching user_id in the user_stats table, we add a new row; otherwise, we update the existing one (make sure there is a unique constraint on the user_id column):
class Order < ActiveRecord::Base
  belongs_to :user

  trigger.after(:insert) do
    <<~SQL
      INSERT INTO user_stats (user_id, orders_amount, orders_count)
      SELECT
        NEW.user_id as user_id,
        SUM(orders.amount) as orders_amount,
        COUNT(orders.id) as orders_count
      FROM orders WHERE orders.user_id = NEW.user_id
      ON CONFLICT (user_id) DO UPDATE
      SET
        orders_amount = EXCLUDED.orders_amount,
        orders_count = EXCLUDED.orders_count;
    SQL
  end
end
Now we should generate the migration with rake db:generate_trigger_migration, run it with rails db:migrate, and start the application.
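For the curious, the generated migration looks roughly like the sketch below. The trigger and class names here are illustrative guesses (the gem generates its own); the SQL body is copied from the model:

```ruby
class CreateTriggerOrdersAfterInsertRow < ActiveRecord::Migration[6.0]
  def up
    create_trigger("orders_after_insert_row_tr", generated: true, compatibility: 1)
      .on("orders")
      .after(:insert) do
      "INSERT INTO user_stats (user_id, orders_amount, orders_count) ..."
    end
  end

  def down
    drop_trigger("orders_after_insert_row_tr", "orders", generated: true)
  end
end
```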
Off to the races
It might seem to be working, but what if we try to insert multiple orders in parallel? (You can run the following code as a rake task or check my implementation here.)
user = User.create

threads = []
4.times do
  threads << Thread.new(user.id) do |user_id|
    thread_user = User.find(user_id)
    thread_user.orders.create(amount: rand(1000) / 10.0)
  end
end
threads.each(&:join)

inconsistent_stats = UserStat.joins(user: :orders)
                             .where(user_id: user.id)
                             .having("user_stats.orders_amount <> SUM(orders.amount)")
                             .group("user_stats.id")

if inconsistent_stats.any?
  calculated_amount = UserStat.find_by(user: user).orders_amount
  real_amount = Order.where(user: user).sum(:amount).to_f

  puts
  puts "Race condition detected:"
  puts "calculated amount: #{calculated_amount}"
  puts "real amount: #{real_amount}."
else
  puts
  puts "Data is consistent."
end
There is a huge chance that a race condition shows up. But why? The problem is that the trigger runs inside the current transaction, and the default isolation level is READ COMMITTED, which does not prevent this anomaly: two concurrent transactions each compute SUM over the orders they can see, neither sees the other's uncommitted row, and the last UPSERT to commit overwrites the first with a stale total.
PostgreSQL supports four levels of transaction isolation—READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ and SERIALIZABLE
The obvious solution is to use the stricter SERIALIZABLE isolation level; but, unfortunately, the isolation level cannot be changed inside a running transaction, and creating a new explicit transaction every time we work with orders does not sound right either. So let's try another approach to make sure our triggers are always executed in sequence: advisory locks.
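Before applying the fix, here is the anomaly as a toy model in plain Ruby (my illustration, not the actual database mechanics): each "transaction" reads the current total, recomputes it, and writes it back, just like our trigger does under READ COMMITTED.

```ruby
# Unsafe: read-compute-write with no coordination loses updates.
stats = { orders_amount: 0.0 }

unsafe = 4.times.map do
  Thread.new do
    current = stats[:orders_amount]          # read
    sleep 0.01                               # other threads interleave here
    stats[:orders_amount] = current + 10.0   # write back a possibly stale total
  end
end
unsafe.each(&:join)
# stats[:orders_amount] often ends up below 40.0: some updates were lost.

# pg_advisory_xact_lock(NEW.user_id) plays the role of this Mutex:
# it forces the read-compute-write cycles for one user to run one at a time.
lock = Mutex.new
safe_stats = { orders_amount: 0.0 }

safe = 4.times.map do
  Thread.new do
    lock.synchronize do
      current = safe_stats[:orders_amount]
      sleep 0.01
      safe_stats[:orders_amount] = current + 10.0
    end
  end
end
safe.each(&:join)
# safe_stats[:orders_amount] is exactly 40.0 every time.
```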
The only thing we need to change is to add the lock statement PERFORM pg_advisory_xact_lock(NEW.user_id); at the beginning of our procedure code, so concurrent triggers for the same user wait for each other until the transaction holding the lock commits:
class Order < ActiveRecord::Base
  belongs_to :user

  trigger.after(:insert) do
    <<~SQL
      PERFORM pg_advisory_xact_lock(NEW.user_id);

      INSERT INTO user_stats (user_id, orders_amount, orders_count)
      SELECT
        NEW.user_id as user_id,
        SUM(orders.amount) as orders_amount,
        COUNT(orders.id) as orders_count
      FROM orders WHERE orders.user_id = NEW.user_id
      ON CONFLICT (user_id) DO UPDATE
      SET
        orders_amount = EXCLUDED.orders_amount,
        orders_count = EXCLUDED.orders_count;
    SQL
  end
end
The updated version of the code is here; if you run it, you'll see that the race condition is gone, and the app can handle parallel requests. Let's add an index to the orders_amount column in the user_stats table, change the query, and compare the performance:

EXPLAIN ANALYZE SELECT user_id, orders_amount, orders_count
FROM user_stats
WHERE orders_amount > 100 AND orders_count > 1
ORDER BY orders_amount
LIMIT 50

Limit  (cost=0.29..22.99 rows=50 width=40) (actual time=0.059..11.241 rows=50 loops=1)
  ->  Index Scan using index_user_stats_on_orders_amount on user_stats  (cost=0.29..3438.69 rows=7573 width=40) (actual time=0.058..11.2 rows=50 loops=1)
        Index Cond: (orders_amount > '100'::numeric)
        Filter: (orders_count > 1)
Planning Time: 0.105 ms
Execution Time: 11.272 ms

Around 11 ms instead of 64: it's way faster!
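On the application side, the same query becomes a plain relation on UserStat (a sketch, using the same thresholds as the SQL above):

```ruby
# No joins, no aggregates: the hard work was done at write time.
UserStat
  .where("orders_amount > ? AND orders_count > ?", 100, 1)
  .order(:orders_amount)
  .limit(50)
```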
Lock-free alternative
There is a way (suggested by Sergey Ponomarev) to achieve the same result without locks, and to make it faster as well: use deltas.
class Order < ActiveRecord::Base
  belongs_to :user

  trigger.after(:insert) do
    <<~SQL
      INSERT INTO user_stats (user_id, orders_amount, orders_count)
      SELECT
        NEW.user_id as user_id,
        NEW.amount as orders_amount,
        1 as orders_count
      ON CONFLICT (user_id) DO UPDATE
      SET
        orders_amount = user_stats.orders_amount + EXCLUDED.orders_amount,
        orders_count = user_stats.orders_count + EXCLUDED.orders_count;
    SQL
  end
end
The trick here is that the trigger no longer reads the current totals in a separate query: each insert applies its own delta in a single atomic statement, so there is no window for a race. As a bonus, you get better performance when inserting new records, since the procedure does not re-aggregate all of the user's orders every time. This approach might come in handy for simple cases like the one described in this article, but when you are dealing with more complex logic, you might want to resort to locks (imagine that orders have statuses, we need to cache the count of orders in each status, and orders can be updated).
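If orders can also change after creation, the delta approach needs matching triggers. Here is a sketch of the update case, under two assumptions of mine that are not from the original example: amount is the only tracked column, and user_id never changes.

```ruby
class Order < ActiveRecord::Base
  belongs_to :user

  trigger.after(:update) do
    <<~SQL
      -- Apply only the difference between the old and the new amount;
      -- a DELETE trigger would subtract OLD.amount and decrement orders_count.
      UPDATE user_stats
      SET orders_amount = orders_amount - OLD.amount + NEW.amount
      WHERE user_id = OLD.user_id;
    SQL
  end
end
```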
Loop instead of UPSERT
In the previous examples, we used UPSERT, which was introduced in PostgreSQL 9.5, but what if we are stuck on an older version? Let's review how the trigger works again: it tries to insert a new row into the user_stats table and, if a conflict happens, it updates the existing row. In a real-world application, there will be conflicts most of the time (to be precise, an insert happens only once per user). We can use this fact and rewrite our trigger the other way around:
class Order < ActiveRecord::Base
  belongs_to :user

  trigger.after(:insert) do
    <<~SQL
      <<insert_update>>
      LOOP
        UPDATE user_stats
        SET orders_count = orders_count + 1,
            orders_amount = orders_amount + NEW.amount
        WHERE user_id = NEW.user_id;

        EXIT insert_update WHEN FOUND;

        BEGIN
          INSERT INTO user_stats (
            user_id, orders_amount, orders_count
          ) VALUES (
            NEW.user_id, NEW.amount, 1
          );

          EXIT insert_update;
        EXCEPTION
          WHEN UNIQUE_VIOLATION THEN
            -- do nothing, loop again and retry the UPDATE
        END;
      END LOOP insert_update;
    SQL
  end
end
In this case, we have inverted our logic: the trigger first tries to update the existing row, and only if there is nothing to update does it insert a new one.
Working with aggregated data is hard: when you have a lot of counter (and other kinds of) caches, it makes sense to use a special tool for that. However, for simple cases, we can stay with good old database triggers: when configured properly, they are quite performant!