Freezolite: the magic gem for keeping Ruby literals safely frozen

November 14, 2023

Once upon a time, a question was raised in the #backend channel of the Evil Martians Slack. “Is there a way to magically set frozen_string_literal to true for an entire project to silence that annoying comment in every Ruby file? It’s seriously frustrating, and RuboCop keeps bugging me as I’m always forgetting it somewhere!” Sound familiar? Read on and learn about the Freezolite gem, a concoction crafted by Evil ~~Wizards~~ Martians to solve this problem.

In Ruby, everything is mutable by default, including strings. This model fits Ruby’s dynamic nature well but may have noticeable downsides at scale, primarily performance-wise. Why? Mutability means we cannot safely reuse the same object and have to opt into copying it over to avoid race conditions; that’s what Ruby does with literal strings.

A literal string is defined in the source code, i.e., created statically, not dynamically. Let’s consider an example:

class Cat
  def say = "meow"
end

Cat.new.say #=> "meow"

In the snippet above, "meow" is a literal string. Even though it’s the same for all the Cat objects we create, every time we invoke the #say method, a new String object is created (or allocated); the same holds true when calling #say multiple times for the same cat. Here is a quick snippet that proves the statement:

class Cat
  def say = "meow"
end

cat = Cat.new

require "objspace"

was_alloc = ObjectSpace.count_objects[:T_STRING]
100.times { cat.say }
new_alloc = ObjectSpace.count_objects[:T_STRING]

puts "Allocated strings: #{new_alloc - was_alloc}"

Run this code via the ruby command (not IRB or another console, as that would add extra allocations). You should see something like Allocated strings: 128 (we’ll leave it up to the reader to figure out where the additional 28 allocations came from 😁). Changing the method source code to return "meow".freeze would result in the same number, minus 100 (i.e., 28 for me).
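If counting heap slots feels too indirect, here is a minimal sketch that compares the two variants side by side using GC.stat(:total_allocated_objects) and an object identity check. The method names say_mutable and say_frozen and the allocations helper are made up for this illustration, and the script assumes it runs as a plain file without the magic comment or the global frozen-literal option.

```ruby
class Cat
  # A plain literal: a fresh String is handed out on every call.
  def say_mutable
    "meow"
  end

  # An explicitly frozen literal: Ruby can reuse one shared String.
  def say_frozen
    "meow".freeze
  end
end

# Returns the number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

cat = Cat.new

mutable_allocs = allocations { 1_000.times { cat.say_mutable } }
frozen_allocs = allocations { 1_000.times { cat.say_frozen } }

puts "Mutable literal: #{mutable_allocs} allocations"
puts "Frozen literal:  #{frozen_allocs} allocations"

# Identity check: do two calls return the very same object?
puts "Same object (mutable): #{cat.say_mutable.equal?(cat.say_mutable)}"
puts "Same object (frozen):  #{cat.say_frozen.equal?(cat.say_frozen)}"
```

The exact counts may vary slightly between Ruby versions, but the mutable variant should allocate at least one string per call, while the frozen one keeps returning the same object.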

Now, imagine you generate thousands of cat meows in a unit of work in your project (e.g., when processing a web request). That might put massive stress on Ruby’s garbage collector and, as a result, lead to noticeable performance degradation. This may sound like a made-up problem, but believe it or not, I’ve seen many Ruby on Rails applications that do not treat strings carefully in their presentation layer (serializers, presenters, decorators, etc.), making response times slower for no reason.

The positive effect of frozen strings caught the Ruby Core team’s attention a while ago. As a result, a new magic comment (or pragma) was introduced in Ruby 2.3 to freeze string literals automatically on a per-file basis: the infamous # frozen_string_literal: true comment you might have seen pretty much everywhere these days. So, instead of adding .freeze manually to all of the strings in a file, you just add a single line at the very beginning of it:

# frozen_string_literal: true

class Cat
  def say = "meow"
end

Cat.new.say.frozen? #=> true

This magic comment has made it possible to introduce this new behavior without breaking the existing code. There was a plan to make literals immutable (or frozen) by default in Ruby 3.0, but… it didn’t work out. Nonetheless, we still have to deal with this “magic”. So, let’s see our options.

Living with magic… comments

Today, Ruby developers have the following alternatives when it comes to frozen strings.

The first solution is to just not care about them—no magic comments, no #freeze—just good old Ruby code. Yeah, you lose the potential performance benefits, but that may not bother you at all. And that’s fine; I don’t insist on optimization just for the sake of optimization.

And from there, a theoretical option exists to enable frozen strings globally via the Ruby command line option: --enable=frozen-string-literal. I’ve never seen it used in the wild, though. Why? Because you may still have code depending on mutable string literals, and that code would break if you make them immutable. Lucky you if you can catch this problem before hitting the production environment.

Try to run your project’s tests with this option enabled as an experiment. The result for one of my pet Rails projects is below:

$ RUBYOPT="--enable=frozen-string-literal" bin/rails test

...

/ruby/3.2.0/gems/websocket-driver-0.7.5/lib/websocket/driver/hybi.rb:221:in `send_frame':
  can't modify frozen String: "C2" (FrozenError)
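The failure above follows a typical pattern: some code treats a string literal as a mutable buffer. Here is a minimal sketch of that pattern and two common ways out. An explicit .freeze stands in for the effect of the global option, and the HEADER constant and build_frame_* helpers are invented for this example (they are not taken from websocket-driver).

```ruby
# Stand-in for a literal compiled with frozen string literals enabled.
HEADER = "C2".freeze

# Breaks: << mutates its receiver, and the receiver is frozen.
def build_frame_broken(payload)
  HEADER << payload
end

# Fix 1: unary plus returns a mutable copy when the string is frozen.
def build_frame_with_copy(payload)
  (+HEADER) << payload
end

# Fix 2: build a new string instead of mutating an existing one.
def build_frame_with_interpolation(payload)
  "#{HEADER}#{payload}"
end

begin
  build_frame_broken("!")
rescue FrozenError => e
  puts "Broken version: #{e.message}"
end

puts build_frame_with_copy("!")
puts build_frame_with_interpolation("!")
puts "Original literal is untouched: #{HEADER}"
```

Fixes like these are easy in your own code; the hard part is that the global flag also applies to every gem you load, which you usually cannot patch.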

Finally, the most popular approach for large projects these days is relying on RuboCop and its Style/FrozenStringLiteralComment rule, which enforces adding the magic comment to every Ruby file. You either get used to starting every file with the # frozen_string_literal: true spell or follow the linter’s instructions.

Although the RuboCop approach is robust and straightforward, it’s still a bit annoying. Why on Mars should I keep this tiny low-level detail in my mind? Why is there no way to tell Ruby, “Please, make all the strings in my project immutable by default”?

That’s how this story started, with a simple question. And now I’m ready to reveal the answer.

Freeze my project’s literals, please

After the question was raised, we (the Evil Martians backend team) met to brainstorm possible solutions. Below, I present edited logs from this meeting to demonstrate how we reached the final result.

We started by grepping the Ruby codebase in search of frozen_string_literal mentions. The goal was to understand how the magic comment works—if you want to hack something, you must first master it. Here is what we found:

Here the compiler checks if the frozen_string_literal option has been set for the instruction sequence (ISEQ_COMPILE_DATA(iseq)->option->frozen_string_literal). Hmm, compile options? Sounds interesting; let’s learn more about them…
The list of available compile options and their default values gives us a hint: there is something called VM::CompileOption: “You can change these options at runtime by VM::CompileOption”. Awesome! That’s exactly what we want, to change the frozen_string_literal option at runtime!
After digging, we found the API: RubyVM::InstructionSequence.compile_option = {frozen_string_literal: true}.
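Besides the global setter, compile options can also be passed to a single RubyVM::InstructionSequence.compile call, which is handy for poking at the behavior without touching global state. A small sketch, assuming CRuby (RubyVM is not available on other implementations); the source string is just an example:

```ruby
source = '"meow"'

# Compile the same source twice, overriding the option per call.
# Positional arguments are: source, file, path, line.
frozen_iseq = RubyVM::InstructionSequence.compile(
  source, nil, nil, 1, frozen_string_literal: true
)
mutable_iseq = RubyVM::InstructionSequence.compile(
  source, nil, nil, 1, frozen_string_literal: false
)

# Evaluating each instruction sequence returns the string literal itself.
puts "frozen_string_literal: true  -> frozen? #{frozen_iseq.eval.frozen?}"
puts "frozen_string_literal: false -> frozen? #{mutable_iseq.eval.frozen?}"
```

This only affects the code compiled by that one call. To influence files loaded via require or load, we need the global option.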

Let’s see how changing compile options at runtime works. Assuming we have a cat.rb file with the Cat class defined above, try to run the following code in the IRB console:

$ irb

irb> load "./cat.rb"
irb> Cat.new.say.sub!("ow", "-wow")
me-wow

irb> RubyVM::InstructionSequence.compile_option = {frozen_string_literal: true}
irb> load "./cat.rb"
irb> Cat.new.say.sub!("ow", "-wow")

can't modify frozen String

irb> RubyVM::InstructionSequence.compile_option = {frozen_string_literal: false}
irb> load "./cat.rb"
irb> Cat.new.say.sub!("me", "w")
wow

Nice! We’ve just proven that enabling string immutability at runtime is possible. All that’s left is to inject this logic when loading Ruby files so we can automatically set the frozen_string_literal option to true for the project’s files.

How can one hijack the process of loading source code in Ruby? One option is to monkey-patch Kernel.require (and its siblings, such as Kernel.load). That may sound scary, but many projects do that, such as Zeitwerk and Ruby Next, to name a couple.

Luckily, in MRI, we have a better option: the RubyVM::InstructionSequence.load_iseq callback. It’s an official mechanism for intervening in the default loading workflow. You only need to define this method (it’s undefined by default), and Ruby will call it whenever a source file is loaded. So, the very first working prototype we had looked like this:

# patch.rb
class RubyVM::InstructionSequence
  def self.load_iseq(path)
    # Simple file filtering logic, just for the sake of the test
    return nil unless path.match?(%r{cat\.rb$})

    frozen_string_literal = RubyVM::InstructionSequence.compile_option[:frozen_string_literal]
    RubyVM::InstructionSequence.compile_option = {frozen_string_literal: true}
    RubyVM::InstructionSequence.compile_file(path)
  ensure
    RubyVM::InstructionSequence.compile_option = {frozen_string_literal:}
  end
end

We set the frozen_string_literal option to true for every file path we’re interested in and restored it to the previous value after that. You can see it in action by running the following command:

$ ruby -r ./patch.rb -r ./cat.rb -e "puts Cat.new.say.frozen?"

true

And that might be all—just a few lines of code to eliminate tons of # frozen_string_literal: true comments in your codebase. Unfortunately, things turned out a bit more complicated, and a comprehensive solution deserved to be promoted to a gem.

Introducing Freezolite

First, relying on the .load_iseq hook has a limitation in that it can only be defined once. But what if you have two libraries trying to hijack the loading process? That’s what would likely happen in your Rails application due to the presence of the Bootsnap gem.

Bootsnap defines its own .load_iseq method to perform compiled instruction sequence caching. So, we must preserve its functionality, and thus, we have to rewrite our patch a bit:

module FrozenInstructionSequence
  def load_iseq(path)
    frozen_string_literal = RubyVM::InstructionSequence.compile_option[:frozen_string_literal]

    if path.match?(%r{name\.rb$})
      RubyVM::InstructionSequence.compile_option = {frozen_string_literal: true}

      defined?(super) ? super : RubyVM::InstructionSequence.compile_file(path)
    else
      defined?(super) ? super : nil
    end
  ensure
    RubyVM::InstructionSequence.compile_option = {frozen_string_literal:}
  end
end

RubyVM::InstructionSequence.singleton_class.prepend(FrozenInstructionSequence)

The essential bits of the snippet above are defined?(super) and Module#prepend. This way, we make our patch aware of other potential patches. Moreover, Bootsnap is smart enough to take compile options into account when calculating cache keys. Thus, there is no need to invalidate the cache manually when introducing our patch. Awesome!
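To see how defined?(super) and Module#prepend cooperate, here is a toy sketch with no RubyVM involved. LoaderWithHook and LoaderWithoutHook are hypothetical stand-ins for RubyVM::InstructionSequence with and without a hook installed by another library, and the returned strings merely mimic compiled code.

```ruby
# Stand-in for a VM where another library (think Bootsnap) already owns the hook.
class LoaderWithHook
  def self.load_iseq(path)
    "cached iseq for #{path}"
  end
end

# Stand-in for a VM where nobody has defined the hook.
class LoaderWithoutHook
end

module FreezingHook
  def load_iseq(path)
    if defined?(super)
      # Someone else defined the hook: wrap it instead of replacing it.
      "frozen(#{super})"
    else
      # Nobody else is listening: do the work ourselves.
      "frozen(fresh iseq for #{path})"
    end
  end
end

# Prepending to the singleton class puts our method in front of the class-level one.
LoaderWithHook.singleton_class.prepend(FreezingHook)
LoaderWithoutHook.singleton_class.prepend(FreezingHook)

puts LoaderWithHook.load_iseq("cat.rb")    # prints "frozen(cached iseq for cat.rb)"
puts LoaderWithoutHook.load_iseq("cat.rb") # prints "frozen(fresh iseq for cat.rb)"
```

Because the module is prepended rather than the method being redefined, the original hook stays reachable through super.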

After adding some high-level interface, we packed this solution into a gem: Freezolite. You can add it to your project and configure automatic string literal freezing with just a single line of code in your application bootstrap file. For example, in a Rails application, that would be the config/application.rb file:

# config/application.rb

#...

Bundler.require(*Rails.groups)

require "freezolite/auto"

# ...

That’s it! No more # frozen_string_literal: true and RuboCop failures.

Bonus: require-hooks for more experiments

Although Freezolite only targets CRuby, I’ve found that having a generic interface to interact with Ruby code loading might be useful. We already had this functionality implemented in Ruby Next, so we’d just been waiting for another project to have similar problems so we could extract it. And the day has come: let me introduce require-hooks.

Require Hooks is a library that provides a common interface for injecting custom logic into source file loading in all major Rubies: CRuby, JRuby, TruffleRuby. Depending on the platform, it picks the most suitable strategy: installing the InstructionSequence.load_iseq callback, patching Kernel.require, or even patching Bootsnap (because in some cases, we only need to operate on new, non-cached code). The library passes all Kernel.require tests from ruby/spec, so it’s safe to use.

The final version of Freezolite using Require Hooks became very lightweight:

module Freezolite
  class << self
    def setup(patterns:, exclude_patterns: nil)
      require "require-hooks/setup"

      ::RequireHooks.around_load(patterns: patterns, exclude_patterns: exclude_patterns) do |path, &block|
        was_frozen_string_literal = ::RubyVM::InstructionSequence.compile_option[:frozen_string_literal]
        ::RubyVM::InstructionSequence.compile_option = {frozen_string_literal: true}
        block.call
      ensure
        ::RubyVM::InstructionSequence.compile_option = {frozen_string_literal: was_frozen_string_literal}
      end
    end
  end
end

We use the .around_load hook here, which wraps the code loading process. Another example of using it is implementing custom syntax error handling:

RequireHooks.around_load do |path, &block|
  block.call
rescue SyntaxError => e
  raise "Oops, your Ruby is not Ruby: #{e.message}"
end

Compare this to the monkey-patches in the syntax_suggest gem doing pretty much the same thing. Can you tell the difference?

There are also .source_transform and .hijack_load to perform source-to-source transformations or take full control over the loading process, respectively.

Maybe this is the tool you’ve been looking for to bring your weird Ruby ideas to life!
