Exploring Active Agent, or can we build AI features the Rails way?

In a fast-paced world with AI oozing out of every pore, frameworks and ecosystems must adjust quickly to rapidly evolving business demands. Everyone needs smart features in their apps. The easier it is to add AI-driven features to a project built with Some Framework, the more likely it will become the tool of choice. Let’s assume that “Some Framework” is “Ruby on Rails” and explore this ecosystem’s AI readiness!
The Ruby and Rails AI ecosystem took off as soon as the AI revolution sparked. It evolved from unofficial LLM provider SDKs like ruby-openai to specialized libraries such as Raix and RubyLLM. Each offers a different approach to integrating large language models into Ruby applications, but none of them feels like the missing Rails abstraction for AI to me.
I prefer abstractions that follow Rails conventions and design principles and bring a familiar user experience. Both Raix and RubyLLM focus on providing a common interface for chatting with AI (and they do that very well), but do not go above that (in the layered architecture sense, they cover mainly the infrastructure layer).
Active Agent, on the other hand, promises to fill the gap in the Rails abstraction stack and bring LLM-driven functionality the Rails way.
That’s what hooked me when I heard about Active Agent from my friend Justin. Coincidentally, that was at the time I started pondering the idea of the second edition of the “Layered Design for Ruby on Rails Applications” book, and the concept behind Active Agent sounded precisely like the one I kept in mind for the corresponding new chapter. So, I decided to become an early adopter and champion Active Agent at Evil Martians.
Does Active Agent deliver on its promise of being a Rails-native AI library? Let’s try to answer this question through real-world examples and explore what the future might hold for AI in Rails.
Disclaimer: This post raises more questions than answers (though we have some answers, too) and should be seen as a reflection on the state of the ecosystem and a particular tool, as well as an invitation to join the discussion about the future of Rails-native AI abstractions. This discussion is meant to pave the way towards Active Agent 1.0, which we want to build together with the community.
Active Agent in a nutshell
The Active Agent gem brings a new abstraction to Rails: agents (what a surprise!). Agents are meant to encapsulate AI-backed logic and glue it together with the rest of the framework by using familiar Rails patterns: action-driven objects (like controllers, channels, and mailers), callbacks (of course!), and prompt rendering backed by Action View. While controllers are responsible for turning HTTP requests into responses, and mailers for writing and sending emails, an agent’s primary purpose is to trigger AI generations of any kind.
Okay, time to stop talking (I hear you yelling “Show me the code!”). Let’s see Active Agent in action by trying to implement the “hello world” of AI—a “tell me a story” agent. Although, in our case, it’s gonna be “tell me a joke”.
Here’s the code for the JokerAgent class:
class JokerAgent < ApplicationAgent
  after_generation :notify_user

  def dad_joke(topic = "beer")
    prompt(body: "Generate a dad joke for the topic: #{topic}")
  end

  def nerd_joke(topic = "Apple keynote")
    prompt(body: "Generate a nerd joke for the topic: #{topic}")
  end

  private

  def notify_user
    return unless params[:user]

    UserMailer.with(user: params[:user]).joke_generated(response.message.content).deliver_later
  end
end
For Rails developers, this code looks pretty familiar and requires little explanation about what it does. Next, this is how we can use it:
result = JokerAgent.nerd_joke("Ruby on Rails").generate_now
puts result.message.content
#=> Why do Ruby on Rails developers always carry a pencil?
#=> Because they want to draw their routes!
user = User.find_by(email: "palkan@evl.ms")
JokerAgent.with(user:).dad_joke("Ruby on Rails").generate_later
# Now, in my Imbox:
# Congrats! A new dad joke has landed:
#
# Why did the developer break up with Ruby on Rails?
# Because they just couldn't handle the "active" relationship!
As you can see, two modes are available: you can get a generation result right away (synchronously) or move the generation to a background job (though in that case, you have to use a callback to do something with the result). Add the usage of a prompt object as an LLM request representation, and you’ll see how closely it resembles Action Mailer with “deliver_” replaced by “generate_”. As we will see later, this mailer-ness has its downsides.
Before we move to the real-world examples, I’d like to reveal one more feature of Active Agent hidden from the snippet above.
In the agent code above, the prompts are very basic; in real life, they would contain dozens of lines of instructions with guardrails and other tricks to keep jokes under control and within the law. With Active Agent, you don’t need to touch the agent class code to improve the instructions; you can put them into a separate prompt template file:
<!-- app/agents/joker_agent/instructions.text -->
You are an AI joke generator. Your task is to create short, clever jokes that are appropriate and entertaining.
GENERAL GUIDELINES:
- Keep jokes concise and punchy (typically 1-2 sentences)
- Focus on wordplay, puns, unexpected twists, or clever observations
- Ensure jokes are family-friendly and appropriate for all audiences
- ... and so on
This kind of setup requires a bit of configuration, and that’s exactly why we have a base class for agents:
class ApplicationAgent < ActiveAgent::Base
  generate_with :gpt

  # Keep system instructions in the <agent>/instructions.<format> template
  default instructions: {template: :instructions}

  # Keep prompt views next to agents
  prepend_view_path Rails.root.join("app/agents")

  # Just a shortcut
  delegate :response, to: :generation_provider
end
Similarly, you can store per-action prompts in template files (i.e., joker_agent/dad_joke.text.erb, etc.). Since Active Agent uses Action View under the hood, all the familiar view layer features are available to you: partials, formats, and so on. Imagine writing a JSON-formatted prompt with Jbuilder!
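For instance, a per-action prompt template could look like this (a hypothetical sketch; it assumes the action assigns @topic instead of passing body: to prompt):
<%# app/agents/joker_agent/dad_joke.text.erb %>
Generate a dad joke for the topic: <%= @topic %>

Keep it to one or two sentences and make it groan-worthy.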
To sum up, an agent is responsible for preparing and performing AI generation requests for a given business domain (so, you may have multiple generation actions sharing the same instruction set and, probably, peripheral logic, like callbacks).
Now, let’s talk about how we battle-tested Active Agent!
Battle #1: Twitter-like on-demand translations
I got the first chance to test out Active Agent just after its announcement. I was working on an internal Martian team-building project (soon to be redprinted) featuring the typical post-comment-reply conversation hierarchy.
Our team is multi-lingual, and to keep this project “warm and fuzzy”, I wanted to let anyone express their thoughts and consume others’ thoughts in their native language. Thus, we needed a translation feature, and I decided to go with an on-demand one. (Think Twitter’s translate button, but for a Rails application where users can instantly translate any user-generated content.)
Here’s the translation agent code:
class TranslateAgent < ApplicationAgent
  after_generation :update_record

  def translate(content, locale)
    @locale = locale
    @content = content

    prompt
  end

  private

  def update_record
    return unless params[:record]

    record = params[:record]
    result = response.message.content

    # The translation result has a form of: <from>-><to>: <content>
    # (We had no structured output support back then)
    _, original_locale, locale, content = result.split(/^(\w{2})->(\w{2}):\s*/)

    return unless original_locale.present? && locale.present? && content.present?

    record.translations << Translation.new(locale:, content:) unless locale == original_locale
    record.original_locale = original_locale
    record.save!
  end
end
Let’s omit the prompt as it’s not of any interest to us. We’re discussing software design questions here, so looking at how this code is being used is more important. In our case, we invoked it directly from a controller. Here’s a simplified version of the controller:
class TranslationsController < ApplicationController
  def create
    comment = Comment.find(params[:id])

    unless comment.translated_to?(params[:locale])
      TranslateAgent.with(record: comment)
        .translate(comment.body, params[:locale]).generate_now
    end

    render json: comment
  end
end
This example raises an interesting architectural question: should agents encapsulate the entire operation, including database updates, or should they only produce structured outputs for other abstractions to handle? In other words, is this #update_record method an architectural smell? Does it follow common design principles, such as separation of concerns? My answers would be “likely” and “not really”.
The #update_record method introduces indirection and implicitness to the whole operation. The calling code must know that the agent updates the record’s state as a side effect; this is not well communicated by the #generate_now method.
Additionally, the agent knows too much about the target model implementation (how exactly the translation is stored). The latter could be minimized by reorganizing the code a bit and leaving model-level knowledge within the model abstraction layer:
# app/agents/translate_agent.rb
  return unless original_locale.present? && locale.present? && content.present?

- record.translations << Translation.new(locale:, content:) unless locale == original_locale
- record.original_locale = original_locale
- record.save!
+ record.translated!(original_locale, locale, content)

# app/controllers/translations_controller.rb
  def create
    comment = Comment.find(params[:id])

-   unless comment.translated_to?(params[:locale])
-     TranslateAgent.with(record: comment)
-       .translate(comment.body, params[:locale]).generate_now
-   end
+   comment.generate_translation!(params[:locale])

# app/models/concerns/localizable.rb
+ def generate_translation!(locale)
+   return if translated_into?(locale)
+
+   TranslateAgent.with(record: self).translate(translatable_content.to_s, locale).generate_now
+ end
+
+ def translated!(from, to, content)
+   self.original_locale = from
+   translations << Translation.new(locale: to, content:) unless original_locale == to
+   save!
+ end
Still, we trigger the model update from the agent’s callback. Why not move the #translated! call to the #generate_translation! method?
First, we want (and must) keep the response parsing in the agent: the model should not care about how exactly the AI responds with the requested data. The #generate_now method returns the raw response only.
Then, let’s think about another aspect of this example: we invoke the LLM right within a web request, synchronously. That was fine in our case (low application load); it could also be totally okay if you go with async Ruby and Falcon. But in general, long-running HTTP requests should happen outside of the request-response loop. If we decide to go this way, all we need is to replace #generate_now with #generate_later and probably add an ActionCable.server.broadcast(...) call to the agent’s callback to notify clients of a newly generated translation.
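Roughly, that could look like this (a sketch, not our production code; the stream name and payload shape are made up for illustration):
class TranslateAgent < ApplicationAgent
  after_generation :update_record
  after_generation :broadcast_translation

  private

  def broadcast_translation
    return unless params[:record]

    # Both the stream name and the payload are hypothetical
    ActionCable.server.broadcast(
      "translations:#{params[:record].to_gid_param}",
      {content: response.message.content}
    )
  end
end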
We used this background generation mode in our application to regenerate translations in case the original content has been updated:
def regenerate_translations
  return if translations.empty?

  translations.each do
    TranslateAgent.with(record: self)
      .translate(translatable_content.to_s, it.locale)
      .generate_later
  end
end
To extract persistence logic from the agent class, we would have to introduce a custom job class, a generation result parsing method, and, likely, a service object or a concern (so the model class is not polluted with this peripheral logic). Would this increased complexity be worth it? I doubt it. The current implementation looks like a good compromise to me. The only thing I would love to see is a better place for the post-generation logic than a callback and/or an ability to return post-processed results.
Testability is the key
One of the key properties of maintainable code is testability. How good is the developer experience of writing and maintaining tests for the object itself, as well as for the objects interacting with it?
For example, in the case of TranslateAgent, the questions can be rephrased as follows: how do we test its own logic (like parsing an LLM response and updating the record), and how do we test the TranslationsController’s #create action?
Active Agent doesn’t give us any testing tools out of the box. Surely, you can use WebMock or VCR to stub HTTP requests to LLM APIs, but that doesn’t sound right to me. Actual HTTP requests must be an implementation detail of the library, not my code. Everything from slightly tuned LLM parameters to switching to another provider would require updating the mock data; the potential churn rate is unacceptable.
Luckily, Active Agent adapterizes generation providers in a similar fashion to how Active Storage adapterizes storage services or Action Mailer adapterizes delivery mechanisms. The only missing bit (for now) is a generation provider suitable for the test environment. No worries, we can add it ourselves!
This is our version of the fake_llm provider to be used in tests:
require "active_agent/generation_provider/response"
module ActiveAgent
module GenerationProvider
class FakeLLMProvider < Base
attr_reader :response
class << self
def generate_response_content
raise NotImplementedError, "Must be stubbed via: allow(ActiveAgent::FakeLLM).to receive(:generate_response_content).and_return('...')"
end
def generations
Thread.current[:generations] ||= []
end
end
def initialize(*)
end
def generate(prompt)
@prompt = prompt
raw_response = prompt_parameters
message = ActiveAgent::ActionPrompt::Message.new(
content: self.class.generate_response_content,
role: "assistant"
)
# Keep track of executed prompts to verify their contents
self.class.generations << prompt
@response = ActiveAgent::GenerationProvider::Response.new(prompt:, message:, raw_response:)
end
private
def prompt_parameters
{
messages: @prompt.messages.map(&:to_h),
tools: @prompt.actions
}
end
end
end
end
Put it, for example, into the lib/active_agent/generation_provider/fake_llm_provider.rb file (to match Active Agent conventions) and activate it in the configuration file as follows:
# config/active_agent.yml
# ...
test:
  gpt:
    service: fake_llm
Spice it up with the following helpers (for RSpec):
module LLMHelpers
  def stub_llm_response(content)
    allow(ActiveAgent::GenerationProvider::FakeLLMProvider).to receive(:generate_response_content).and_return(content)
  end

  def assert_llm_has_been_called(times: 1)
    expect(ActiveAgent::GenerationProvider::FakeLLMProvider).to have_received(:generate_response_content).exactly(times).times
  end

  def assert_llm_has_not_been_called = assert_llm_has_been_called(times: 0)

  def llm_generations = ActiveAgent::GenerationProvider::FakeLLMProvider.generations
end

RSpec.configure do |config|
  config.include LLMHelpers

  config.after { llm_generations.clear }
end
And write your agents and agent-dependent tests as follows:
describe TranslateAgent do
  describe "#translate" do
    before do
      stub_llm_response("ru->en: Blood type on the sleeve")
    end

    specify do
      agent = described_class.translate("Группа крови на рукаве", "en")
      result = agent.generate_now

      expect(result.message.content).to eq("ru->en: Blood type on the sleeve")

      prompt = llm_generations.last

      # Verify that the correct instructions have been used
      expect(prompt.instructions).to include("You are an experienced translator knowing many languages")
    end
  end
end
This is the kind of basic testing functionality an agentic framework should provide. We expect something like this to land in Active Agent soon. Stay tuned!
Battle #2: AI reviewer for Redprints CFP
Our second example comes from extending our Redprints CFP application with an AI-powered proposal reviewer. This agent evaluates conference proposals, provides scores, and offers constructive feedback to help organizers make informed decisions.
The ReviewAgent we’ve experimented with was tailored to the needs of the SF Ruby conference. We employed three criteria to get the initial score for each proposal: novelty, relevance, and quality (of the proposal itself). Thus, we needed to teach our agent how to assess each criterion.
Tools integration for looking up talk novelty
One of the evaluation criteria was novelty: we wanted to score new talks and topics higher. Is an LLM capable of knowing which talks and topics have already been worn out? Maybe. However, “maybe” is not the level of non-determinism we could accept. So, we decided to give our agent the capability to search through a database of Ruby talks and base the novelty score of the proposal under review on the results.
First, we prepared a search index of Ruby and Rails talks from the selected conferences in recent years. We took the RubyEvents “database” (YAML files) and turned them into a Trieve dataset.
Then, we defined the #search_talks tool for ReviewAgent. I think it’s time to show its source code:
class ReviewAgent < ApplicationAgent
  def review(proposal)
    @proposal = proposal

    prompt(output_schema: "review_schema")
  end

  def search_talks
    query = params[:query]

    results = TrieveClient.search(query)

    prompt(instructions: "") do |format|
      format.html { render partial: "search_talks_results", locals: {results:} }
    end
  end
end
A tool is just a method, similar to other actions. The tool arguments are accessed via the #params Hash. You can use templates to render the results.
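For illustration, the results partial could be as simple as the following (a hypothetical sketch; the actual markup depends on what TrieveClient.search returns):
<%# app/agents/review_agent/_search_talks_results.html.erb (hypothetical) %>
<ul>
  <% results.each do |talk| %>
    <li><%= talk[:title] %>: <%= talk[:abstract] %></li>
  <% end %>
</ul>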
To make a tool visible to an LLM provider, we must define its schema in a separate template file (e.g., search_talks.json). You can even use Jbuilder for that!
json.type :function
json.function do
  json.name :search_talks
  json.description "This action takes a query parameter and returns a list of talks with their abstracts from the Ruby Events database for the previous few years."

  json.parameters do
    json.type :object

    json.properties do
      json.query do
        json.type :string
        json.description "A search query (specify a term or two)"
      end
    end
  end
end
However, in my opinion, this is where we deviate from the beauty and convention-ness of Rails: one should not manually write code for machines; it should be inferred somehow.
In my other Ruby vs. AI journey, I proposed an elegant (IMO) way of defining tools using inlined Ruby Type Signatures (RBS):
# This action takes a query parameter and returns a list of talks with their abstracts
# from the Ruby Events database for the previous few years.
# @rbs (query: String) -> ActiveAgent::Prompt
tool def search_talks(query:)
  results = TrieveClient.search(query)

  prompt(instructions: "") do |format|
    format.html { render partial: "search_talks_results", locals: {results:} }
  end
end
This approach not only spares you from writing JSON schemas (we can generate one from types), but it also provides a clear indication that the method is a tool and moves arguments into method parameters. Moreover, when using RBS, we can use its runtime testing capabilities to ensure that the code matches the schema.
RBS is not the only option. It could be YARD, a custom DSL (we also experimented with ruby_llm-schema, and it played nicely with Active Agent), or whatever else. It just shouldn’t be hand-written schemas. This is not the Rails way.
More structure in the output
In the previous snippet, you might have noticed the output_schema: "review_schema" parameter we pass to the prompt. That’s how we instruct the LLM to generate the final response using the provided JSON schema. As a result, we can skip the text parsing part and get structured output:
result = ReviewAgent.review(proposal).generate_now
JSON.parse(result.message.content)
#=> {
# "scores": {"novelty":4,"relevance":4,"quality":5},
# "feedback": "This one is a good fit for the conference",
# "notes": "I've searched for the topic and couldn't find a lot of talks like this one"
# }
Pretty cool, yeah? Well, not enough for me.
First, we have the same problem with manually defining a schema through a template as with tools.
Then, why do I need to call JSON.parse? I’ve already specified that the output must be a JSON string when I used the output_schema parameter. I wonder if we can achieve the following experience:
class ReviewAgent < ApplicationAgent
  Scores = Data.define(
    :novelty, #: Integer
    :relevance, #: Integer
    :quality #: Integer
  )

  ReviewResult = Data.define(
    :scores, #: Scores
    :feedback, #: String
    :notes #: String
  )

  def review(proposal)
    @proposal = proposal

    prompt(output_object: ReviewResult)
  end
end

result = ReviewAgent.review(proposal).generate_now
result.message.data #=> #<data ReviewResult scores=#<data Scores ...>, ...>
Yes, I’m suggesting RBS again, but I’m sure you got the main point: less boilerplate for machines.
Few-shot RAG with proposal examples
As a part of the instructions for this agent, we include examples of well-crafted and not-so-well-crafted proposals (you can find a lot of examples at speakerline.io; mostly well-crafted though):
# review_agent/instructions.text.erb
...
#### Proposal example: accepted
<%= render partial: "proposal_example_accepted" %>
#### Proposal example: rejected
<%= render partial: "proposal_example_rejected" %>
As you can see, we use partials here to organize the instructions content. This is especially useful when the instructions are very detailed.
However, this raises another question: How can we move from static, view-based prompts to dynamic ones stored in the database? How can we pick different proposals and measure their efficiency as examples? How can we iterate on the prompt itself without needing to touch the codebase every time we do that?
Let me stop here and switch to blitz mode: a quick rundown of the features Active Agent currently misses and the workarounds we’ve tried to find.
Future battles: what more Rails AI needs
Our experience with Active Agent has revealed that while it provides Rails-like conventions for AI interactions, the AI application landscape demands much more sophisticated abstractions. More precisely, given the ever-evolving nature of modern AI, we need a flexible and extensible framework that would enable quicker adaptation to changes.
Here are some areas where additional work may be needed when building AI-driven features for Rails applications.
Usage tracking and AI credits
Every AI application needs usage monitoring and limits. Users should have AI credits, and tenants should have budget controls. The AI engine must provide hooks to a) track usage and b) prevent AI usage when no credits are available. Here’s an example of how we’ve implemented this with Active Agent:
class ApplicationAgent < ActiveAgent::Base
  before_generation :ensure_has_credits
  before_generation :track_start_time
  after_generation :track_usage

  private

  def identity = (params[:account] || Current.account)&.ai_identity

  def ensure_has_credits
    return unless identity

    raise "No credits available" unless identity.has_credits?
  end

  def track_start_time = @start_time = Time.current

  def track_usage
    return unless response

    if identity
      identity.generations.create!(
        purpose: [agent_name.underscore, action_name].join("/"),
        time_spent: (Time.current - @start_time).to_i,
        tokens_used: response.tokens_used,
        model: generation_provider.config["model"],
        provider: generation_provider.config["service"]
      )
    end
  end
end
Agent callbacks work perfectly for that. The account.ai_identity is a custom model that contains all the AI-relevant information for a given account (tenant): usage, credits, and more. It has a “generations” association containing detailed usage information (mostly for auditing purposes). This is not a part of the framework (yet), but it could be generalized and provided as a plugin someday.
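For reference, here is a minimal sketch of what such a model could look like (the class name, columns, and associations are assumptions, not framework code):
# app/models/ai_identity.rb (hypothetical)
class AiIdentity < ApplicationRecord
  belongs_to :account

  # Detailed usage log: one row per AI generation
  has_many :generations, class_name: "AiGeneration", dependent: :destroy

  # Assumes a `credits` integer column that is decremented elsewhere
  def has_credits? = credits.positive?
end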
Dynamic credentials and private LLMs
Users should be able to use their own API keys or private model deployments. We can do this using the same account.ai_identity model:
class ApplicationAgent < ActiveAgent::Base
  generate_with :default

  # ...

  private

  # Override the default generation_provider behaviour
  def generation_provider
    @generation_provider ||= identity&.provider_for(self.class._generation_provider_name) || super
  end
end
Again, with Active Agent, it’s just a matter of a single method override in the base class. Note that we use semantic names for the configured generation providers (“default”, “mini”, etc.), not service-specific ones (“openai”, “grok”, and so on).
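In config/active_agent.yml terms, that could look roughly like this (a sketch following the structure of the test configuration above; the service and model values are illustrative):
# config/active_agent.yml
production:
  default:
    service: openai
    model: gpt-4.1
  mini:
    service: openai
    model: gpt-4.1-mini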
Dynamic prompts
Hardcoded prompts may work for smaller applications or basic use cases. But the appetite comes with eating: the more you use AI-driven logic, the more sophisticated it becomes, and the more polishing it requires.
At some point, you may even decide to let users provide custom prompts (well, hopefully not complete prompts, but some parts of them). All of this requires an ability to load a prompt from somewhere dynamically.
Luckily, the Rails AI ecosystem was recently replenished with a prompt library engine, PromptEngine. There should be a way to integrate it with Active Agent, like this:
class ApplicationAgent < ActiveAgent::Base
  default body: proc {
    slug = ["agent", agent_name.underscore, action_name].join("/")

    PromptEngine.find_by(slug:)&.render_in(self)
  }

  # ...
end
(The code above is an example, though I believe we can make it real.)
We should not try to implement everything within a single library/framework. Integrating existing tools is beneficial for all parties.
Guards and evals
Have you heard of prompt injections? Are you sure your prompt generates what you expect and not some hallucinations?
Security has always been one of Rails’ top priorities. An AI framework must encourage users to build safe AI features. User input must be validated, prompts must include guardrails, and outputs must be evaluated, too.
Technically speaking, we need input and output processors, or middleware. Callbacks can serve as a first approximation, but a proper AI generation pipeline would work better.
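As a stopgap, callbacks can host simple input guards today. Here’s a deliberately naive sketch (the params key and the detection heuristic are made up for illustration):
class ApplicationAgent < ActiveAgent::Base
  before_generation :check_user_input

  private

  def check_user_input
    user_input = params[:user_input].to_s

    # A toy heuristic; real guardrails need much more than a regexp
    raise "Potential prompt injection detected" if user_input.match?(/ignore (all|previous) instructions/i)
  end
end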
Agentic workflows
Although I wouldn’t reach for an agentic workflow by default (i.e., a workflow where the task orchestration is controlled by AI and is, thus, non-deterministic), it might be a good option for some features. An agentic workflow can be modeled as an entrypoint orchestrator agent that delegates work to other agents when needed.
For delegation, it would be nice to have an ability to connect an agent as a tool via some Rails-ish syntax sugar, for example:
class ReviewAgent < ApplicationAgent
  # Adds a `check_grammar` tool that invokes `GrammarCriticAgent.check_grammar(message)`
  has_agent :grammar_critic, through: :check_grammar
end
Workflows may require suspension or human-in-the-loop; that would be impossible without adding some durability or memory to agents.
Memory and context persistence
Agents often need to remember previous interactions or maintain context across conversations. Memory could be static (fully included into the context) or dynamic (requestable by the LLM via tools), short-term or long-term, compressed or lossless. All of that deserves a dedicated library that could be integrated with others.
Context engineering and vectorization
Active Agent focuses only on generations, not context engineering (that’s a next-gen term for RAG). It would be nice to have abstractions such as vectorizer, extractor, and chunker in addition to the generation provider.
Why so many? Each abstraction reflects a stage in the process of turning a document into a searchable piece of context. First, we extract a (usually textual) representation from the document (of any kind). We split it into chunks (some documents are large), and then we generate embeddings so we can search through the chunks later. Surely, not every step is required for every application; however, some may need more.
For example, when dealing with knowledge bases, we may want to summarize large documents before chunking them, or extract propositions as a special kind of chunk.
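To make this concrete, here’s a purely speculative sketch of how such abstractions could compose (none of these classes exist in Active Agent today):
# A hypothetical indexing pipeline built from the abstractions above
class DocumentIndexer
  def initialize(extractor:, chunker:, vectorizer:)
    @extractor = extractor
    @chunker = chunker
    @vectorizer = vectorizer
  end

  # Returns [chunk, embedding] pairs ready to be stored in a vector index
  def index(document)
    text = @extractor.extract(document)

    @chunker.chunk(text).map { |chunk| [chunk, @vectorizer.embed(chunk)] }
  end
end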
The path forward
The number of open questions is huge, and most Rails AI libraries, including Active Agent, are still too young to answer all of them. However, we should expect these libraries to demonstrate an understanding of everyday needs and to be designed in an extensible way so that others can solve particular problems via plugins.
Active Agent shows promise as a Rails-like foundation for AI features. Still, the real test will be whether it can evolve to support the sophisticated patterns that production AI applications demand. The framework’s success will ultimately depend on its ability to provide extension points for usage tracking, advanced instrumentation, complex workflows, and the myriad other requirements that emerge when AI moves from prototype to production.
As Rails developers, we’re still in the early days of understanding how AI fits into our beloved framework. But with libraries like Active Agent leading the charge toward Rails conventions, we’re optimistic about building AI-powered applications that feel as natural as the Rails applications we know and love.