5 tips for dealing with heavy ActiveRecord dashboards in Rails

January 12, 2016

Kir Shatrov
Sr. Backend Engineer and a Rails hacker

In a Rails application, a heavy view is often a dashboard or an analytics page. It usually starts as a simple page and then grows into a complex one, with dozens of partials and a query chain in each of them.

I’ve spent a lot of time optimizing such views in client applications. As is common in Rails consulting, we had a huge application that was tightly coupled to ActiveRecord models. Good performance is certainly possible with ActiveRecord. For dashboards, however, the trick is that fetching data with ActiveRecord doesn’t mean you have to work with ActiveRecord objects in the views.

Performance was not the only reason we avoided ActiveRecord objects in views. We also wanted to divide the work among the project’s developers so that a new developer making a small change to a view could not cause a performance regression.

1. Build data structures

By data structures, I mean that instead of initializing a list of instance variables in the controller, you build a hash that describes the state of the whole page:

{
  stages: [
    {
      id: 1,
      title: "Initial",
      slug: "initial"
    },
    {
      id: 2,
      title: "Interview",
      slug: "interview"
    }
  ],
  users: [
    { id: 1, visible: false },
    { id: 2, visible: true }
  ],
  posts: [
    { id: 1, tags: [] }
  ]
}

First of all, it provides a clear structure. Once you need to inspect the data and check its format, you realize how important that is. It is also very convenient to have a view represented as an immutable structure.

When it comes to passing data to Rails views, it’s also much cleaner to pass a single data structure (a hash) than to use dozens of instance variables.

It’s also handy in the longer term. For instance, if you later render on the client side with something like React, you can simply serialize the same data structure to JSON and use it for rendering.
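As a rough sketch of this idea, the controller could assemble the page state in one place, freeze it, and hand the same hash to either a server-rendered view or a JSON response. The `Stage` and `User` structs and the `build_dashboard_state` and `deep_freeze` helpers are hypothetical stand-ins written for this example, not code from the original application.

```ruby
require "json"

# Hypothetical stand-ins for ActiveRecord models, so the sketch runs without Rails.
Stage = Struct.new(:id, :title, :slug)
User  = Struct.new(:id, :visible)

# Recursively freezes hashes, arrays and their values so the view cannot mutate the state.
def deep_freeze(value)
  case value
  when Hash  then value.each_value { |v| deep_freeze(v) }
  when Array then value.each { |v| deep_freeze(v) }
  end
  value.freeze
end

# Builds the whole page state from already-loaded records.
def build_dashboard_state(stages:, users:)
  deep_freeze({
    stages: stages.map { |s| { id: s.id, title: s.title, slug: s.slug } },
    users:  users.map  { |u| { id: u.id, visible: u.visible } }
  })
end

state = build_dashboard_state(
  stages: [Stage.new(1, "Initial", "initial"), Stage.new(2, "Interview", "interview")],
  users:  [User.new(1, false), User.new(2, true)]
)

# The same structure can be serialized for a client-side renderer.
json = JSON.generate(state)
```

Because the hash is frozen, a partial that tries to modify it fails loudly instead of silently changing what other partials see.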

2. Limit the API

Working with ActiveRecord objects in a view gives you an unlimited API for fetching data (user.posts.first(5).map(&:title).join(", ")). However, the opportunities to completely break performance are just as unlimited.

When developing a large application, we have to limit the API to prevent cases where a junior developer modifies the view in a way that causes an unexpected N+1 query.

By using the data structure approach described above, you can limit the API and avoid destructive database calls from your views.
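One way to enforce such a limit, sketched here under the assumption that the view needs only a handful of fields, is to convert records into small read-only value objects before they reach the template. `UserRow` and `to_view_rows` are hypothetical names, and the `Record` struct only stands in for an ActiveRecord model.

```ruby
# Hypothetical stand-in for an ActiveRecord model with a posts association.
Record = Struct.new(:id, :name, :posts)

# Read-only value object exposing only what the view is allowed to use.
UserRow = Struct.new(:id, :name, :post_titles, keyword_init: true)

# Converts loaded records into frozen rows. All data access happens here,
# in one place, instead of being scattered across partials.
def to_view_rows(records)
  records.map do |r|
    UserRow.new(
      id: r.id,
      name: r.name,
      post_titles: r.posts.first(5).map { |p| p[:title] }.join(", ").freeze
    ).freeze
  end.freeze
end

rows = to_view_rows([
  Record.new(1, "Ann", [{ title: "Hello" }, { title: "Rails" }])
])

row = rows.first
titles = row.post_titles                  # precomputed string, no query involved
can_reach_posts = row.respond_to?(:posts) # => false, the association is not exposed
```

With this shape, a change in the view can only rearrange data that was already fetched; adding a new field requires touching the one method that builds the rows.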

3. Preload records

Before iterating over a model relation, always check if it’s preloaded to avoid N+1 queries.

# Model
class User < ActiveRecord::Base
  has_many :posts
end

# View
@users.each do |user|
  user.posts.each(&:title) # one extra query per user unless posts are preloaded
end

This causes an N+1 query problem unless includes(:posts) is added to the query that builds @users.

As a preventive measure, I found it useful to perform an additional check before iterating over has_many collections:

@users = User.all.includes(:posts)

@users.each do |user|
  raise "association is not preloaded" unless user.association(:posts).loaded?

  user.posts.each(&:title)
end

Now, even if someone accidentally removes includes(:posts) from the query chain, you will immediately know about it.
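If the same check is needed in several places, it could be extracted into a small helper. The sketch below assumes records respond to `association(name).loaded?`, as ActiveRecord models do; `ensure_preloaded!` is a hypothetical helper name, and `FakeAssociation` and `FakeUser` are stand-ins that exist only so the example runs without a database.

```ruby
# Raises unless the named association is already loaded on every record.
# `association(name).loaded?` is the ActiveRecord call; the rest is plain Ruby.
def ensure_preloaded!(records, name)
  records.each do |record|
    next if record.association(name).loaded?

    raise ArgumentError, "#{name} is not preloaded for #{record.class}"
  end
  records
end

# Minimal stand-ins for a model and its association (illustration only).
FakeAssociation = Struct.new(:loaded) do
  def loaded?
    loaded
  end
end

class FakeUser
  def initialize(loaded)
    @posts = FakeAssociation.new(loaded)
  end

  def association(_name)
    @posts
  end
end

ensure_preloaded!([FakeUser.new(true)], :posts) # passes silently

begin
  ensure_preloaded!([FakeUser.new(false)], :posts)
rescue ArgumentError => e
  message = e.message
end
```

Returning the records from the helper makes it possible to wrap the collection right where the iteration starts.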

4. Load only the database columns you use

Selecting all table columns is not efficient. You’re unlikely to use all of them, yet they still consume extra memory.

Calling model.accessed_fields at the end of a controller action shows which fields were actually read, so you can then pass exactly these fields to select.

5. Denormalize the data

For a dashboard or analytics page, it can be useful to rely on native database features such as views.

PostgreSQL supports materialized views: the result of a complex query is stored on disk like a table, and you bring it up to date by refreshing the view (REFRESH MATERIALIZED VIEW). It works just like a cached database view. MySQL doesn’t provide native support for materialized views, but it’s possible to emulate them with SQL functions and triggers.

You can also use a document store like Elasticsearch and push denormalized data from ActiveRecord models into Elasticsearch indexes, which let you run aggregations and calculations on it.

Generally, you can achieve high performance with ActiveRecord in your codebase. However, for performance-sensitive parts of your application, it may be a good idea to avoid ActiveRecord objects and use plain SQL mapped to POROs (plain old Ruby objects).
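To illustrate the last point, here is a minimal sketch of mapping raw rows to a PORO. In a Rails app, the rows would typically come from something like ActiveRecord::Base.connection.select_all(sql), which yields rows as string-keyed hashes. The sample rows, the `StageStat` class, and the column names below are made up for illustration.

```ruby
# Plain old Ruby object for one dashboard row; it has no way to reach the database.
class StageStat
  attr_reader :slug, :candidates_count

  def initialize(slug:, candidates_count:)
    @slug = slug
    # Raw drivers may return numbers as strings, so normalize here.
    @candidates_count = Integer(candidates_count)
    freeze
  end

  # Builds an object from a string-keyed row hash.
  def self.from_row(row)
    new(slug: row.fetch("slug"), candidates_count: row.fetch("candidates_count"))
  end
end

# Sample rows in the shape a raw SQL query would return (values are illustrative).
rows = [
  { "slug" => "initial",   "candidates_count" => "12" },
  { "slug" => "interview", "candidates_count" => 4 }
]

stats = rows.map { |row| StageStat.from_row(row) }
total = stats.sum(&:candidates_count)
```

Using fetch instead of [] means a renamed or missing column raises immediately rather than producing nil values in the view.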

As the “Limit the API” and “Preload records” sections show, we not only optimized the existing code but also added constraints that help developers prevent performance regressions in the future.

After all these improvements were introduced to the dashboard, we received a lot of positive feedback. Customers appreciate faster applications, so treating speed as a feature pays off: the faster your application is, the more people will use it.