Exploring Active Agent, or can we build AI features the Rails way?


In a fast-paced world with AI oozing out of every pore, frameworks and ecosystems must adjust quickly to rapidly evolving business demands. Everyone needs smart features in their apps. The easier it is to add AI-driven features to a project built with Some Framework, the more likely that framework will be the tool of choice. Let’s assume that “Some Framework” is “Ruby on Rails” and explore this ecosystem’s AI readiness!

The Ruby and Rails AI ecosystem took off as soon as the AI revolution sparked. It evolved from unofficial LLM provider SDKs like ruby-openai to specialized libraries such as Raix and RubyLLM. Each offers a different approach to integrating large language models into Ruby applications, but none of them feels like a missing Rails abstraction for AI to me.

I prefer abstractions that follow Rails conventions and design principles and bring a familiar user experience. Both Raix and RubyLLM focus on providing a common interface for chatting with AI (and they do that very well), but do not go above that (in the layered architecture sense, they cover mainly the infrastructure layer).


Active Agent, on the other hand, promises to fill the gap in the Rails abstraction stack and bring LLM-driven functionality the Rails way.

That’s what hooked me when I heard about Active Agent from my friend Justin. Coincidentally, that was at the time I started pondering the idea of the second edition of the “Layered Design for Ruby on Rails Applications” book, and the concept behind Active Agent sounded precisely like the one I kept in mind for the corresponding new chapter. So, I decided to become an early adopter and champion Active Agent at Evil Martians.

Does Active Agent deliver on its promise of being a Rails-native AI library? Let’s try to answer this question through real-world examples and explore what the future might hold for AI in Rails.

Disclaimer: This post raises more questions than it answers (though we have some answers, too) and should be seen as a reflection on the state of the ecosystem and a particular tool, as well as an invitation to join the discussion about the future of Rails-native AI abstractions. This discussion is meant to pave the way towards Active Agent 1.0, which we want to build together with the community.

Active Agent in a nutshell

The Active Agent gem brings a new abstraction to Rails: agents (what a surprise!). Agents are meant to encapsulate AI-backed logic and glue it together with the rest of the framework by using familiar Rails patterns: action-driven objects (like controllers, channels, and mailers), callbacks (of course!), and prompt rendering backed by Action View. While controllers turn HTTP requests into responses and mailers compose and send emails, an agent’s primary purpose is to trigger AI generations of any kind.

Okay, time to stop talking (I hear you yelling “Show me the code!”). Let’s see Active Agent in action by trying to implement the “hello world” of AI—a “tell me a story” agent. Although, in our case, it’s gonna be “tell me a joke”.

Here’s the code for the JokerAgent class:

class JokerAgent < ApplicationAgent
  after_generation :notify_user

  def dad_joke(topic = "beer")
    prompt(body: "Generate a dad joke for the topic: #{topic}")
  end

  def nerd_joke(topic = "Apple keynote")
    prompt(body: "Generate a nerd joke for the topic: #{topic}")
  end

  private

  def notify_user
    return unless params[:user]

    UserMailer.with(user: params[:user]).joke_generated(response.message.content).deliver_later
  end
end

For Rails developers, this code looks pretty familiar and requires little explanation about what it does. Next, this is how we can use it:

result = JokerAgent.nerd_joke("Ruby on Rails").generate_now
puts result.message.content

#=> Why do Ruby on Rails developers always carry a pencil?
#=> Because they want to draw their routes!

user = User.find_by(email: "palkan@evl.ms")
JokerAgent.with(user:).dad_joke("Ruby on Rails").generate_later

# Now, in my Imbox:

# Congrats! A new dad joke has landed:
#
# Why did the developer break up with Ruby on Rails?
# Because they just couldn't handle the "active" relationship!

As you can see, two modes are available: you can get a generation result right away (synchronously) or move the generation to a background job (though in this case, you have to use a callback to do something with the result). Add the usage of a prompt object as an LLM request representation, and you’ll see how closely this resembles Action Mailer, with “generate_” substituted for “deliver_”. As we will see later, this mailer-ness has its downsides.

Before we move to the real-world examples, I’d like to reveal one more feature of Active Agent hidden from the snippet above.

In the agent code above, the prompts are very basic; in real life, they would contain dozens of lines of instructions with guardrails and other tricks to keep jokes under control and within the law. With Active Agent, you don’t need to touch the agent class code to improve the instructions; you can put them into a separate prompt template file:

<!-- app/agents/joker_agent/instructions.text -->
You are an AI joke generator. Your task is to create short, clever jokes that are appropriate and entertaining.

GENERAL GUIDELINES:
- Keep jokes concise and punchy (typically 1-2 sentences)
- Focus on wordplay, puns, unexpected twists, or clever observations
- Ensure jokes are family-friendly and appropriate for all audiences
- ... and so on

This kind of setup requires a bit of configuration, and that’s exactly why we have a base class for agents:

class ApplicationAgent < ActiveAgent::Base
  generate_with :gpt

  # Keep system instructions in the <agent>/instructions.<format> template
  default instructions: {template: :instructions}

  # Keep prompt views next to agents
  prepend_view_path Rails.root.join("app/agents")

  # Just a shortcut
  delegate :response, to: :generation_provider
end

Similarly, you can store per-action prompts in template files (i.e., joker_agent/dad_joke.text.erb, etc.). Since under the hood, Active Agent uses Action View, all the familiar view layer features are available to you: partials, formats, and so on. Imagine writing a JSON-formatted prompt with Jbuilder!
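
For instance, the #dad_joke action could drop the inline body and rely on a template instead. Here’s a minimal sketch (passing the topic via an instance variable is our own convention here, not something Active Agent prescribes):

def dad_joke(topic = "beer")
  @topic = topic
  prompt
end

<%# app/agents/joker_agent/dad_joke.text.erb %>
Generate a dad joke for the topic: <%= @topic %>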

To sum up, an agent is responsible for preparing and performing AI generation requests for a given business domain (so, you may have multiple generation actions sharing the same instruction set and, probably, peripheral logic, like callbacks).

Now, let’s talk about how we battle-tested Active Agent!

Battle #1: Twitter-like on-demand translations

I got the first chance to test out Active Agent just after its announcement. I was working on an internal Martian team-building project (soon to be redprinted) that featured the typical post-comment-reply conversation hierarchy.

Our team is multi-lingual, and to keep this project “warm and fuzzy”, I wanted to let anyone express their thoughts and consume others’ thoughts in their native language. Thus, we needed a translation feature, and I decided to go with an on-demand one. (Think Twitter’s translate button, but for a Rails application where users can instantly translate any user-generated content.)

Here’s the translation agent code:

class TranslateAgent < ApplicationAgent
  after_generation :update_record

  def translate(content, locale)
    @locale = locale
    @content = content

    prompt
  end

  private

  def update_record
    return unless params[:record]

    record = params[:record]
    result = response.message.content

    # The translation result has a form of: <from>-><to>: <content>
    # (We had no structured output support back then)
    _, original_locale, locale, content = result.split(/^(\w{2})->(\w{2}):\s*/)

    return unless original_locale.present? && locale.present? && content.present?

    record.translations << Translation.new(locale:, content:) unless locale == original_locale
    record.original_locale = original_locale
    record.save!
  end
end
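
A quick aside on the parsing trick above: String#split keeps regex capture groups in the resulting array, and the text before the match (an empty string in our case) comes first, hence the leading underscore in the destructuring:

"ru->en: Blood type on the sleeve".split(/^(\w{2})->(\w{2}):\s*/)
#=> ["", "ru", "en", "Blood type on the sleeve"]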

Let’s omit the prompt as it’s not of any interest to us. We’re discussing software design questions here, so looking at how this code is being used is more important. In our case, we invoked it directly from a controller. Here’s a simplified version of the controller:

class TranslationsController < ApplicationController
  def create
    comment = Comment.find(params[:id])
    unless comment.translated_to?(params[:locale])
      TranslateAgent.with(record: comment)
        .translate(comment.body, params[:locale]).generate_now
    end

    render json: comment
  end
end

This example raises an interesting architectural question: should agents encapsulate the entire operation, including database updates, or should they only produce structured outputs for other abstractions to handle? In other words, is this #update_record method an architectural smell? Does it follow common design principles, like separation of concerns? My answers would be “likely” and “not really”.

The #update_record method introduces indirection and implicitness to the whole operation. The calling code must know about the side effect of updating the record’s state by the agent—it’s not well communicated by the #generate_now method.

Additionally, the agent knows too much about the target model implementation (how exactly the translation is stored). The latter could be minimized by reorganizing the code a bit and leaving model-level knowledge within the model abstraction layer:

# app/agents/translate_agent.rb

 return unless original_locale.present? && locale.present? && content.present?

-record.translations << Translation.new(locale:, content:) unless locale == original_locale
-record.original_locale = original_locale
-record.save!
+record.translated!(original_locale, locale, content)

# app/controllers/translations_controller.rb
 def create
   comment = Comment.find(params[:id])
-  unless comment.translated_to?(params[:locale])
-    TranslateAgent.with(record: comment)
-      .translate(comment.body, params[:locale]).generate_now
-  end
+  comment.generate_translation!(params[:locale])

# app/models/concerns/localizable.rb
+ def generate_translation!(locale)
+   return if translated_to?(locale)
+
+   TranslateAgent.with(record: self).translate(translatable_content.to_s, locale).generate_now
+ end

+ def translated!(from, to, content)
+   self.original_locale = from
+   translations << Translation.new(locale: to, content:) unless original_locale == to
+   save!
+ end

Still, we trigger the model update from the agent’s callback. Why not move the #translated! call to the #generate_translation! method?

First, we want (and must) keep the response parsing in the agent: the model should not care about how exactly the AI responds with the requested data. And #generate_now returns only the raw response.

Then, let’s think about another aspect of this example: we invoke the LLM right within a web request, synchronously. That was fine in our case (low application load), and it could be totally okay if you go with async Ruby and Falcon. But in general, performing long-running HTTP requests should happen outside of the request-response loop. If we decide to go this way, all we need is to replace #generate_now with #generate_later and probably add an ActionCable.server.broadcast(...) call to the agent’s callback to notify clients of a newly generated translation.
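
For illustration, such a callback could look like this (a sketch: the stream name and payload shape are made up; callbacks run in definition order, so the translation is already persisted by the time we broadcast):

class TranslateAgent < ApplicationAgent
  after_generation :update_record
  after_generation :broadcast_translation

  private

  def broadcast_translation
    return unless params[:record]

    # Hypothetical stream name; clients re-fetch translations on receive
    ActionCable.server.broadcast(
      "translations_#{params[:record].to_gid_param}",
      {locale: @locale}
    )
  end
end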

We used this background generation mode in our application to regenerate translations whenever the original content is updated:

def regenerate_translations
  return if translations.empty?

  translations.each do
    TranslateAgent.with(record: self)
      .translate(translatable_content.to_s, it.locale)
      .generate_later
  end
end

To extract persistence logic from the agent class, we would have to introduce a custom job class, a generation result parsing method, and, likely, a service object or a concern (so the model class is not polluted with this peripheral logic). Would this increased complexity be worth it? I doubt it. The current implementation looks like a good compromise to me. The only thing I would love to see is a better place for the post-generation logic than a callback and/or an ability to return post-processed results.

Testability is key

One of the key properties of maintainable code is testability. How good is the developer experience of writing and maintaining tests for the object itself, as well as for the objects interacting with it?

For example, in the case of TranslateAgent, the questions will be rephrased as follows: how do we test its own logic (like, parsing an LLM response and updating the record), and how do we test the TranslationsController’s #create action?

Active Agent doesn’t give us any testing tools out of the box. Surely, you can use WebMock or VCR to stub HTTP requests to LLM APIs, but that doesn’t sound right to me. Actual HTTP requests must be an implementation detail of the library, not of my code. Everything from slightly tuned LLM parameters to switching to another provider would require updated mock data; the potential churn rate is unacceptable.

Luckily, Active Agent adapterizes generation providers in a similar fashion to how Active Storage adapterizes storage services or Action Mailer adapterizes delivery mechanisms. The only missing bit (for now) is a generation provider suitable for the test environment. No worries, we can add it ourselves!

This is our version of the fake_llm provider to be used in tests:

require "active_agent/generation_provider/response"

module ActiveAgent
  module GenerationProvider
    class FakeLLMProvider < Base
      attr_reader :response

      class << self
        def generate_response_content
          raise NotImplementedError, "Must be stubbed via: allow(ActiveAgent::GenerationProvider::FakeLLMProvider).to receive(:generate_response_content).and_return('...')"
        end

        def generations
          Thread.current[:generations] ||= []
        end
      end

      def initialize(*)
      end

      def generate(prompt)
        @prompt = prompt

        raw_response = prompt_parameters

        message = ActiveAgent::ActionPrompt::Message.new(
          content: self.class.generate_response_content,
          role: "assistant"
        )

        # Keep track of executed prompts to verify their contents
        self.class.generations << prompt

        @response = ActiveAgent::GenerationProvider::Response.new(prompt:, message:, raw_response:)
      end

      private

      def prompt_parameters
        {
          messages: @prompt.messages.map(&:to_h),
          tools: @prompt.actions
        }
      end
    end
  end
end

Put it, for example, into the lib/active_agent/generation_provider/fake_llm_provider.rb file (to match Active Agent conventions) and activate it in the configuration file as follows:

# config/active_agent.yml
# ...
test:
  gpt:
    service: fake_llm

Spice it with the following helpers (for RSpec):

module LLMHelpers
  def stub_llm_response(content)
    allow(ActiveAgent::GenerationProvider::FakeLLMProvider).to receive(:generate_response_content).and_return(content)
  end

  def assert_llm_has_been_called(times: 1)
    expect(ActiveAgent::GenerationProvider::FakeLLMProvider).to have_received(:generate_response_content).exactly(times).times
  end

  def assert_llm_has_not_been_called = assert_llm_has_been_called(times: 0)

  def llm_generations = ActiveAgent::GenerationProvider::FakeLLMProvider.generations
end

RSpec.configure do |config|
  config.include LLMHelpers

  config.after { llm_generations.clear }
end

And write your agents and agent-dependent tests as follows:

describe TranslateAgent do
  describe "#translate" do
    before do
      stub_llm_response("ru->en: Blood type on the sleeve")
    end

    specify do
      agent = described_class.translate("Группа крови на рукаве", "en")

      result = agent.generate_now

      expect(result.message.content).to eq("ru->en: Blood type on the sleeve")

      prompt = llm_generations.last

      # Verify that the correct instructions have been used
      expect(prompt.instructions).to include("You are an experienced translator knowing many languages")
    end
  end
end

This is a basic testing functionality required for an agentic framework. We expect something like this to land in Active Agent soon. Stay tuned!

Battle #2: AI reviewer for Redprints CFP

Our second example comes from extending our Redprints CFP application with an AI-powered proposal reviewer. This agent evaluates conference proposals, provides scores, and offers constructive feedback to help organizers make informed decisions.

The ReviewAgent we’ve experimented with was tailored to the needs of the SF Ruby conference. We employed three criteria to get the initial score for each proposal: novelty, relevance, and quality (of the proposal itself). Thus, we needed to teach our agent how to assess each criterion.

Tools integration for looking up talk novelty

One of the evaluation criteria was novelty: we wanted to score new talks and topics higher. Is an LLM capable of knowing which talks and topics have already been worn out? Maybe. However, “maybe” is not the level of non-determinism we could accept. So, we decided to give our agent the capability to search through a database of Ruby talks and base the novelty score of the proposal under review on the results.

First, we prepared a search index of Ruby and Rails talks from the selected conferences in recent years. We took the RubyEvents “database” (YAML files) and turned them into a Trieve dataset.

Then, we defined the #search_talks tool for ReviewAgent. I think it’s time to show its source code:

class ReviewAgent < ApplicationAgent
  def review(proposal)
    @proposal = proposal

    prompt(output_schema: "review_schema")
  end

  def search_talks
    query = params[:query]
    results = TrieveClient.search(query)

    prompt(instructions: "") do |format|
      format.html { render partial: "search_talks_results", locals: {results:} }
    end
  end
end

A tool is just a method, similar to other actions. The tool arguments are accessed via the #params Hash. You can use templates to render the results.

To make a tool visible to an LLM provider, we must define its schema in a separate template file (e.g., search_talks.json). You can even use Jbuilder for that!

json.type :function
json.function do
  json.name :search_talks
  json.description "This action takes a query parameter and returns a list of talks with their abstracts from the Ruby Events database for the previous few years."
  json.parameters do
    json.type :object
    json.properties do
      json.query do
        json.type :string
        json.description "A search query (specify a term or two)"
      end
    end
  end
end

However, in my opinion, this is where we deviate from the beauty and convention-ness of Rails: one should not manually write code for machines; it must be inferred somehow.

In my other Ruby vs. AI journey, I proposed an elegant (IMO) way of defining tools using inlined Ruby Type Signatures (RBS):

# This action takes a query parameter and returns a list of talks with their abstracts
# from the Ruby Events database for the previous few years.
# @rbs (query: String) -> ActiveAgent::Prompt
tool def search_talks(query:)
  results = TrieveClient.search(query)
  prompt(instructions: "") do |format|
    format.html { render partial: "search_talks_results", locals: {results:} }
  end
end

This approach not only spares you from writing JSON schemas (we can generate one from the types), but it also provides a clear indication that the method is a tool and moves its arguments to the method parameters. Moreover, when using RBS, we can use its runtime testing capabilities to ensure that the code matches the schema.

RBS is not the only option. It could be YARD, or a custom DSL (we also experimented with ruby_llm-schema for that and it played nicely with Active Agent), or whatever. It just shouldn’t be explicit schema generation. This is not the Rails way.

More structure in the output

In the previous snippet, you might have noticed the output_schema: "review_schema" parameter we pass to the prompt. That’s how we instruct the LLM to generate the final response using the provided JSON schema. As a result, we can skip the text parsing part and get structured output:

result = ReviewAgent.review(proposal).generate_now

JSON.parse(result.message.content)

#=> {
#     "scores": {"novelty":4,"relevance":4,"quality":5},
#     "feedback": "This one is a good fit for the conference",
#     "notes": "I've searched for the topic and couldn't find a lot of talks like this one"
#   }

Pretty cool, yeah? Well, not enough for me.

First, we have the same problem with manually defining a schema through a template as with tools.

Then, why do I need to call JSON.parse? I’ve already specified that the output must be a JSON string when I used the output_schema parameter. I wonder if we can achieve the following experience:

class ReviewAgent < ApplicationAgent
  Scores = Data.define(
    :novelty, #: Integer
    :relevance, #: Integer
    :quality #: Integer
  )
  ReviewResult = Data.define(
    :scores, #: Scores
    :feedback, #: String
    :notes #: String
  )

  def review(proposal)
    @proposal = proposal
    prompt(output_object: ReviewResult)
  end
end

result = ReviewAgent.review(proposal).generate_now
result.message.data #=> #<data ReviewResult scores=#<data Scores ...>, ...>

Yes, I’m suggesting RBS again, but I’m sure you got the main point: less boilerplate for machines.

Few-shot RAG with proposal examples

As a part of the instructions for this agent, we include examples of well-crafted and not-so-well-crafted proposals (you can find a lot of examples at speakerline.io; mostly well-crafted though):

# review_agent/instructions.text.erb

...

#### Proposal example: accepted

<%= render partial: "proposal_example_accepted" %>

#### Proposal example: rejected

<%= render partial: "proposal_example_rejected" %>

As you can see, we use partials here to organize the instructions content. This is especially useful when the instructions are very detailed.

However, this raises another question: How can we move from static, view-based prompts to dynamic ones stored in the database? How can we pick different proposals and measure their efficiency as examples? How can we iterate on the prompt itself without needing to touch the codebase every time we do that?

Let me stop here and switch to blitz mode to talk about the features Active Agent currently misses and how we’ve tried to find workarounds.

Future battles: what more Rails AI needs

Our experience with Active Agent has revealed that while it provides Rails-like conventions for AI interactions, the AI application landscape demands much more sophisticated abstractions. More precisely, given the ever-evolving nature of modern AI, we need a flexible and extensible framework that would enable quicker adaptation to changes.

Here are some areas where additional work may be needed when building AI-driven features for Rails applications.

Usage tracking and AI credits

Every AI application needs usage monitoring and limits. Users should have AI credits, and tenants should have budget controls. The AI engine must provide hooks to (a) track usage and (b) prevent using AI when no credits are available. Here’s an example of how we’ve implemented this with Active Agent:

class ApplicationAgent < ActiveAgent::Base
  before_generation :ensure_has_credits
  before_generation :track_start_time
  after_generation :track_usage

  private

  def identity = (params[:account] || Current.account)&.ai_identity

  def ensure_has_credits
    return unless identity

    raise "No credits available" unless identity.has_credits?
  end

  def track_start_time = @start_time = Time.current

  def track_usage
    return unless response

    if identity
      identity.generations.create!(
        purpose: [agent_name.underscore, action_name].join("/"),
        time_spent: (Time.current - @start_time).to_i,
        tokens_used: response.tokens_used,
        model: generation_provider.config["model"],
        provider: generation_provider.config["service"]
      )
    end
  end
end

Agent callbacks work perfectly for that. The account.ai_identity is a custom model that contains all the AI-relevant information for a given account (tenant): usage, credits, and more. It has a “generations” association containing detailed usage information (mostly for auditing purposes). This is not a part of the framework (yet), but it could be generalized and provided as a plugin someday.
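
For reference, a bare-bones version of such a model could look like this (this is application-level code; neither AiIdentity nor Generation comes from Active Agent):

class AiIdentity < ApplicationRecord
  belongs_to :account
  has_many :generations, dependent: :destroy

  # The credits accounting scheme is up to the application
  def has_credits? = credits_remaining > 0
end

class Generation < ApplicationRecord
  belongs_to :ai_identity

  # The purpose, time_spent, tokens_used, model, and provider
  # columns match the #track_usage callback above
end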

Dynamic credentials and private LLMs

Users should be able to use their own API keys or private model deployments. We can do this using the same account.ai_identity model:

class ApplicationAgent < ActiveAgent::Base
  generate_with :default

  # ...

  private

  # Override the default generation_provider behaviour
  def generation_provider
    @generation_provider ||= identity&.provider_for(self.class._generation_provider_name) || super
  end
end

Again, with Active Agent, it’s just a matter of a single method override in the base class. Note that we use semantic names for the configured generation providers (“default”, “mini”, etc.), not service-specific ones (“openai”, “grok”, and so on).
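
In the configuration file, that could look as follows (the model names are just examples):

# config/active_agent.yml
production:
  default:
    service: openai
    model: gpt-4o
  mini:
    service: openai
    model: gpt-4o-mini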

Dynamic prompts

Hardcoded prompts may work for smaller applications or basic use cases. But the appetite comes with eating: the more you use AI-driven logic, the more sophisticated it becomes, and the more polishing it requires.

At some point, you may even decide to let users provide custom prompts (well, I hope not complete prompts, but some parts of them). All of this requires the ability to load prompts dynamically from somewhere.

Luckily, the Rails AI ecosystem was recently replenished with a prompt library engine: PromptEngine. There should be a way to integrate it with Active Agent, like this:

class ApplicationAgent < ActiveAgent::Base
  default body: proc {
    slug = ["agent", agent_name.underscore, action_name].join("/")
    PromptEngine.find_by(slug:)&.render_in(self)
  }

  # ...
end

(The code above is an example, though I believe we can make it real.)

We should not try to implement everything within a single library/framework. Integrating existing tools is beneficial for all parties.

Guards and evals

Have you heard of prompt injections? Are you sure your prompt generates what you expect and not some hallucinations?

Security has always been a top priority for Rails. An AI framework must encourage users to build safe AI features. User input must be validated, prompts must include guardrails, and outputs must be evaluated, too.

Technically speaking, we need input and output processors, or middleware. Callbacks can serve as a first approximation, but a proper AI generation pipeline would work better.
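
Here’s what a callback-based first pass could look like (PromptGuard and ResponseEval are hypothetical objects, not parts of any existing library):

class ApplicationAgent < ActiveAgent::Base
  before_generation :check_input_safety
  after_generation :evaluate_output

  private

  def check_input_safety
    input = params[:user_input]
    return unless input

    # PromptGuard is a hypothetical input validator
    raise "Potential prompt injection detected" if PromptGuard.injection?(input)
  end

  def evaluate_output
    # ResponseEval is a hypothetical evaluator scoring generations asynchronously
    ResponseEval.score_later(agent_name, action_name, response.message.content)
  end
end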

Agentic workflows

Although I wouldn’t reach for an agentic workflow (i.e., a workflow where the task orchestration is controlled by AI and, thus, non-deterministic), it might be a good option for some features. An agentic workflow can be modeled as an entrypoint orchestrator agent that delegates work to other agents when needed.

For delegation, it would be nice to have an ability to connect an agent as a tool via some Rails-ish syntax sugar, for example:

class ReviewAgent < ApplicationAgent
  # Adds a `check_grammar` tool that invokes `GrammarCriticAgent.check_grammar(message)`
  has_agent :grammar_critic, through: :check_grammar
end

Workflows may require suspension or human-in-the-loop; that would be impossible without adding some durability or memory to agents.

Memory and context persistence

Agents often need to remember previous interactions or maintain context across conversations. Memory could be static (fully included in the context) or dynamic (requestable by the LLM via tools), short-term or long-term, compressed or lossless. All of that deserves a dedicated library that could be integrated into others.
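
The dynamic flavor, at least, maps naturally onto the existing tool mechanism. Here’s a hypothetical sketch (the Memento model, its #search scope, and the partial are all made up):

class AssistantAgent < ApplicationAgent
  # A long-term memory lookup exposed to the LLM as a tool
  def recall
    memories = Memento.where(user: params[:user]).search(params[:query])

    prompt(instructions: "") do |format|
      format.html { render partial: "recall_results", locals: {memories:} }
    end
  end
end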

Context engineering and vectorization

Active Agent focuses only on generations, not context engineering (that’s the next-gen term for RAG). It would be nice to have abstractions such as vectorizer, extractor, and chunker in addition to the generation provider.

Why so many? Each abstraction reflects a stage in the process of turning a document into a searchable piece of context. First, we extract a (usually textual) representation from the document (of any kind). Then, we split it into chunks (some documents are large), and, finally, we generate embeddings so we can search through the chunks later. Surely, not every step is required for every application. However, some applications may need even more steps.

For example, when dealing with knowledge bases, we may want to summarize large documents before chunking them or extract propositions as a special kind of chunks.
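
To make the stages concrete, here’s a hypothetical sketch of such a pipeline (none of these classes exist in Active Agent today):

# A hypothetical document-to-context indexing pipeline
class KnowledgeIndexer
  def index(document)
    # Extract a textual representation (PDF, HTML, etc.)
    text = Extractor.for(document.content_type).extract(document)

    # Split large documents into searchable pieces
    chunks = Chunker.split(text, max_tokens: 512)

    # Generate embeddings for similarity search
    embeddings = Vectorizer.embed(chunks)

    chunks.zip(embeddings).each do |content, embedding|
      document.chunks.create!(content:, embedding:)
    end
  end
end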

The path forward

The number of open questions is huge, and most Rails AI libraries (including Active Agent) are still too young to answer all of them. However, we should expect these libraries to demonstrate an understanding of everyday needs and to design themselves in an extensible way so that others can solve particular problems via plugins.

Active Agent shows promise as a Rails-like foundation for AI features. Still, the real test will be whether it can evolve to support the sophisticated patterns that production AI applications demand. The framework’s success will ultimately depend on its ability to provide extension points for usage tracking, advanced instrumentation, complex workflows, and the myriad other requirements that emerge when AI moves from prototype to production.

As Rails developers, we’re still in the early days of understanding how AI fits into our beloved framework. But with libraries like Active Agent leading the charge toward Rails conventions, we’re optimistic about building AI-powered applications that feel as natural as the Rails applications we know and love.
