Factories or fixtures? Give me both!

August 21, 2017

Andrew Kozin
Backend Engineer

Everyone loves tests that are clear, understandable, and real. But sometimes reality throws us a mess in the form of huge data structures sent over from external APIs. As soon as you have wrapped your head around a complicated response, you need to do it again to account for it in your tests. What can we do to ease this pain? Let’s find out!

For our tests to be 100% bulletproof, we need to cover both API responses and the way we process them in our application. Given that developers of remote APIs are prone to changing interfaces at will, we need to be able to quickly reflect these changes in our app (and tests!) too.

Long story short, we need tests that are:

readable,
interconnected,
sustainable.

We already have many good instruments like factories and fakers (faker, ffaker), not to mention good old fixtures. So the question is: how do we mix these tools correctly and avoid the inevitable problems of dealing with massive data structures over which we have no control?

A good fat example of a real-world response comes from the eBay Trading API with its endless structures of product items.

Imagine we have a service object called LoadProductFromEbay. It should take a complex XML describing an eBay product and store it with references to account and category that both exist in our app. The first difficulty is data preparation. Our data is big (I mean, B-I-G). If we place it directly in the spec, we lose the very logic of the test. Obviously, there is a widely known solution: extract data into a fixture.

<!-- spec/fixtures/ebay_item.xml -->
<?xml version="1.0" encoding="UTF-8"?>
<GetItemResponse xmlns="urn:ebay:apis:eBLBaseComponents">
  <Timestamp>2011-01-06T08:08:13.025Z</Timestamp>
  <Ack>Success</Ack>
  <CorrelationID>476457080</CorrelationID>
  <Version>699</Version>
  <Build>E699_CORE_BUNDLED_12457306_R1</Build>
  <NotificationEventName>ItemListed</NotificationEventName>
  <RecipientUserID>testuser</RecipientUserID>
  <EIASToken>testtoken</EIASToken>
  <Item>
    <AutoPay>false</AutoPay>
    <BuyerProtection>ItemIneligible</BuyerProtection>
    <!-- ... hundreds of lines go here -->

We can also write (thanks @darthsim and @gzigzigzeo) a helper method to load our fixtures:

# spec/support/fixture_helpers.rb

def fixture_file_path(filename)
  Rails.root.join("spec/fixtures/#{filename}").to_s
end

def read_fixture_file(filename)
  File.read fixture_file_path(filename)
end

def yaml_fixture_file(filename)
  YAML.load_file(fixture_file_path(filename))
end

# ... whatever loaders you need

Now our spec looks much nicer (in fact, it used to be so ugly I chose not to show it):

# spec/services/load_product_from_ebay_spec.rb
RSpec.describe LoadProductFromEbay do
  subject(:service) { described_class.call source }

  let(:source)      { read_fixture_file("ebay_item.xml") }
  let!(:account)    { create :account }
  let!(:category)   { create :category }
  let(:new_product) { Product.last }

  it "creates the product" do
    expect { subject }.to change { Product.count }.by 1
  end

  it "maps new product to a corresponding account" do
    subject
    expect(new_product.account).to eq account
  end

  it "maps new product to a corresponding category" do
    subject
    expect(new_product.category).to eq category
  end
end

But now we are bumping into another problem: we need to synchronize our fixture with other chunks of data under testing. In the example above we prepare them using factories :account and :category. By writing "it maps new product to a corresponding account" we raise a natural question: “Corresponding to what?” Obviously, results should correspond to some part(s) of XML (user name, or ID), which in turn must relate to the generated object.

There are two ways we can bring our fixture and our instance in sync. The first is to update factory data with some values from a fixture directly in a spec:

let!(:account) { create :account, remote_id: "80290759834298" }

Hmm, this seems far too magical! Why on Earth do we use exactly this value? Is it related to the fixture, or maybe to another factory (think of richer specs than our oversimplified example)? Who knows?

The second way, the one we choose, is to update the data loaded from the fixture. For instance:

# spec/services/load_product_from_ebay_spec.rb
# ...
before do
  source.gsub!(/<UserID>\w+/, "<UserID>#{account.remote_id}")
  source.gsub!(/<CategoryID>\w+/, "<CategoryID>#{category.remote_id}")
end

Not the best decision, is it? Should we ever complicate the spec even slightly, we are lost in the heap of these substitutions. The devil himself would not understand what we tested for, and why.

Another problem with this solution is its dependency on the XML structure, one we do not control. Suppose a new version of the remote API returns "<UserID type="string">...</UserID>" instead of the current "<UserID>...</UserID>". The substitution will silently stop matching, our test will fail, and we will have a hard time figuring out the reason. That is bad news.
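The failure mode described above can be reproduced in isolation. Here is a minimal sketch, where `substitute_user_id` is a hypothetical helper wrapping the same regular expression and both XML snippets are shortened stand-ins for the real response:

```ruby
# Hypothetical helper that mirrors the substitution used in the spec.
def substitute_user_id(xml, remote_id)
  xml.gsub(/<UserID>\w+/, "<UserID>#{remote_id}")
end

# Shortened stand-ins for the current and the changed API response.
current_xml = "<Seller><UserID>test_seller</UserID></Seller>"
changed_xml = '<Seller><UserID type="string">test_seller</UserID></Seller>'

patched_current = substitute_user_id(current_xml, "account_42")
patched_changed = substitute_user_id(changed_xml, "account_42")

puts patched_current
puts patched_changed
puts "substitution silently skipped" if patched_changed == changed_xml
```

Nothing raises in the second case: the string simply comes back untouched, so the failure only surfaces later as a confusing mismatch in the expectations.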

But there is also good news. The very existence of such a difficulty hints at an architectural problem in our spec (and, possibly, in the implementation too). Here, we are biting off more data than we can chew; let’s cut it in half and see what happens. We will break the LoadProductFromEbay service into two parts: one to map source data, another to process the result.

I will omit the mapping for now; we will get back to it later. Just suppose it maps incoming XML into a Ruby hash. Now we can move our hash into another fixture, so substitutions become a bit simpler:

# spec/fixtures/ebay_items/raw.yml
# ...
Item:
  AutoPay: false
  BuyerProtection: ItemIneligible
  Seller:
    UserID: test_seller
# ...

# spec/services/load_product_from_ebay_spec.rb
require "spec_helper"

RSpec.describe LoadProductFromEbay do
  subject(:service) { described_class.call source }

  let(:source)      { yaml_fixture_file("ebay_items/raw.yml") }
  let!(:account)    { create :account }
  let!(:category)   { create :category }
  let(:new_product) { Product.last }

  before do
    # YAML.load_file returns string keys, nested the same way as in the fixture
    source["Item"]["Seller"]["UserID"] = account.remote_id
    source["Item"]["PrimaryCategory"]["CategoryID"] = category.remote_id
  end

  # ...
end
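To see what such a substitution operates on, here is a small sketch outside of RSpec. It assumes the fixture is plain YAML, which Ruby loads with string keys by default; the inline document and the `"account_42"` value are stand-ins for the fixture file and for the factory-generated ID:

```ruby
require "yaml"

# Inline stand-in for the YAML fixture file.
raw = <<~YAML
  Item:
    AutoPay: false
    BuyerProtection: ItemIneligible
    Seller:
      UserID: test_seller
YAML

source = YAML.load(raw)

# fetch fails loudly if the fixture structure has changed.
source.fetch("Item").fetch("Seller")["UserID"] = "account_42"

puts source["Item"]["Seller"]["UserID"]
puts source[:Item].inspect # symbol keys are not present after a plain load
```

Using `fetch` for the parent keys is a design choice of this sketch: a renamed or missing parent key raises a `KeyError` instead of a `NoMethodError` on `nil`.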

Still, not ideal! The change is too low-level; it gives the reader too much information about how the source data is implemented, instead of telling them: “Let’s take data conforming to this account and this category”. How could we express this? By using a factory:

# let(:source) { yaml_fixture_file("ebay_items/raw.yml") }
let!(:source) { create :ebay_item, account: account, category: category }
let(:account) { create :account }
let(:category) { create :category }

Bingo! The declaration above correctly reflects the intended behavior, and whatever presumptions we had made. To implement it, we could abandon the fixture in favor of the following factory:

# spec/factories/ebay/items.rb
FactoryGirl.define do
  factory :ebay_item, class: Hash do
    transient do # declare transient parameters for a build/create
      account  nil
      category nil
    end

    skip_create # build === create
    initialize_with do
      {
        Item: {
          AutoPay: false,
          BuyerProtection: "ItemIneligible",
          Seller: {
            UserID: account&.id || "test"
            # ...
          }
        }
      }
    end
  end
end

Well, we have fixed the spec, but also got ourselves two new problems. First: the factory above looks more complicated than the abandoned fixture. To recognize the second one (and the more important one), we need to take a step back and re-think what we did when we separated a mapper from a service. We did not only split the code but also replaced a sort-of-integration spec with two unit tests.

The question is: does this change give us the same confidence in the correctness of the whole suite? Unfortunately, the answer is no.

And here’s why: our spec for the mapper ensures that the source XML is converted into a target hash correctly:

# spec/mappers/ebay_item_mapper_spec.rb
RSpec.describe EbayItemMapper do
  let(:source) { read_fixture_file "ebay_items/raw.xml" } # raw XML is read as a string
  let(:target) { yaml_fixture_file "ebay_items/raw.yml" } # helpers already prepend spec/fixtures

  subject(:mapper) { described_class.call source }
  it { is_expected.to eq target }
end

Another spec tests that a service takes a hash (not XML) and processes it in the expected way:

# spec/services/load_product_from_ebay_spec.rb
RSpec.describe LoadProductFromEbay do
  subject(:service) { described_class.call source }

  let(:source)      { create :ebay_item, account: account, category: category }
  let!(:account)    { create :account }
  let!(:category)   { create :category }
  let(:new_product) { Product.last }

  it "creates the product" do
    expect { subject }.to change { Product.count }.by 1
  end

  it "maps new product to the corresponding account" do
    subject
    expect(new_product.account).to eq account
  end

  it "maps new product to the corresponding category" do
    subject
    expect(new_product.category).to eq category
  end
end

How interconnected are these specs? To be more precise, how does the target of the mapper spec connect with the source of the service spec? Who knows?

Are we confident that even after the API changes this integration will raise no problems? No! In response to a possible change, we could modify the mapper spec (via its fixtures), but forget about the service, whose source became stale. That would render our test suite incomplete; and, worst of all, this incompleteness would not be evident.
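The drift is easy to picture with two plain hashes. In this sketch the rename of `UserID` to `UserName` is a purely hypothetical API change, and `key_paths` is an illustrative helper, not part of any library:

```ruby
# What the mapper spec expects after the (hypothetical) API change.
mapper_target = { "Item" => { "Seller" => { "UserName" => "test_seller" } } }

# What a hand-written factory still produces for the service spec.
factory_source = { "Item" => { "Seller" => { "UserID" => "test_seller" } } }

# Illustrative helper: lists every path from the root to a leaf value.
def key_paths(hash, prefix = [])
  hash.flat_map do |key, value|
    path = prefix + [key]
    value.is_a?(Hash) ? key_paths(value, path) : [path]
  end
end

stale_paths = key_paths(factory_source) - key_paths(mapper_target)
puts stale_paths.map { |path| path.join(".") }
```

Both specs would still pass on their own; only a direct comparison like this one reveals that the service is being tested against data the mapper no longer produces.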

A quick recap: we solved the problem we had at the start, but sacrificed both connectivity and sustainability. Not a good deal!

Can we fix this? Yes, we can.

All we need is to abandon the whole “fixtures vs. factories” dichotomy and use both tools at once. No one prevents us from using the fixture inside the factory.

It is as simple as this:

# spec/factories/ebay/items.rb
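# Replaces an existing value in the fixture and refuses to add new keys.
# Defined at the top level, like the fixture helpers, so that the
# initialize_with block can call it.
def insert(fixture, *keys, key, to:)
  current = fixture.dig(*keys, key)

  raise "the factory looks stale" unless current.is_a?(String)

  fixture.dig(*keys)[key] = to
end

FactoryGirl.define do
  factory :ebay_item, class: Hash do
    transient do # declare transient parameters for a build/create
      account  nil
      category nil
    end

    skip_create # build === create
    initialize_with do
      fixture = yaml_fixture_file "ebay_items/raw.yml"
      insert(fixture, "Item", "Seller", "UserID", to: account.id.to_s) if account
      insert(fixture, "Item", "PrimaryCategory", "CategoryID", to: category.id.to_s) if category
      fixture # the factory returns the whole hash, not the last inserted value
    end
  end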
end
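The guard at the heart of this factory can be tried without FactoryGirl at all. Here is a minimal plain-Ruby sketch, in which `replace_existing` is a hypothetical stand-in for the factory's helper and the inline YAML is a stand-in for the fixture file:

```ruby
require "yaml"

# Replaces the value at an existing path and refuses to create new keys.
def replace_existing(data, *path, to:)
  *parents, key = path
  container = parents.empty? ? data : data.dig(*parents)

  unless container.is_a?(Hash) && container.key?(key)
    raise KeyError, "fixture has no #{path.join('.')}; the factory looks stale"
  end

  container[key] = to
  data
end

# Inline stand-in for the mapped YAML fixture.
fixture = YAML.load(<<~YAML)
  Item:
    Seller:
      UserID: test_seller
YAML

replace_existing(fixture, "Item", "Seller", "UserID", to: "42")
puts fixture["Item"]["Seller"]["UserID"]
```

If the fixture loses a key after an API change, the call raises instead of quietly adding the key back, which is what keeps the factory from drifting away from the fixture.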

Let’s summarize what we have in the end:

Two fixtures: one for the raw XML (spec/fixtures/ebay_items/raw.xml), another for the mapped data (spec/fixtures/ebay_items/raw.yml). They are closely connected through the mapper spec; changing one (in response to a new version of the API) will affect the other.
The second fixture is linked to the :ebay_item factory; any change in the fixture will affect the factory. I have deliberately added the insert method to ensure the factory will not add any new keys, only replace existing ones.
A service spec that is tied to the mapper spec. The fixture-factory link provides the connection between both specs. Note how readable they have become!

Our conclusion is obvious: do not pit factories against fixtures. Bring them all and in the darkness bind them to rule your test suites mightily.
