WebSocket Director: scenario-based integration tests for realtime apps

Black-box testing (also known as end-to-end testing or, in Ruby on Rails terms, system testing) is just the tip of the testing iceberg (or diamond, if you prefer). Still, it plays a significant role in achieving software stability. However, writing and maintaining high-level integration tests is no piece of cake, especially when you go beyond the classic, synchronous HTTP request-response model. In my case, that means all the flavors of WebSockets. To help myself with everyday testing tasks, I’ve built a tool, WebSocket Director, which I’m happy to introduce in this post.

Modern software development, especially the web part, pays much more attention to engineer productivity than to compiler speed, language runtimes, or whatever. Today, humans are the bottleneck. And to eliminate this bottleneck, we need developer tools.

Have you ever tried counting the number of dev tools you use regularly? It doesn’t matter; it’s never enough. There’s always a routine task which you’d prefer some robot to do for you.

I’d like to share the story of one such dev tool, WebSocket Director, and demonstrate how you can use it to increase your productivity.

From fruitless search to Shakespeare

When I started working on AnyCable back in 2016, I struggled to find a black-box testing tool for WebSockets. I wanted something human-readable and human-editable, which I could repeatedly run in my terminal to simulate typical real-time activities. Either I wasn’t good at searching, or my taste was too picky (yeah, productivity is subjective), but I couldn’t find anything to satisfy my needs.

So, what exactly was I looking for? Here are some scenarios I wanted to automate:

  • Connection handshake (create an HTTP connection, wait for handshake confirmation from a server).
  • Subscribing to channels (or the connect - subscribe - confirm loop).
  • Performing Action Cable actions.

To my Rubyish mind, YAML seemed to be the best way to formalize such scenarios. For example, the Action Cable scenario I initially sketched looked like this:

- subscribe:
    channel: "EchoChannel"
- perform:
    channel: "EchoChannel"
    data:
      text: "Hey!"
- receive:
    channel: "EchoChannel"
    data:
      response: "Hey!"

Readable? Yes. Functional? Sure. That’s good enough for a basic smoke test when you work on a custom server speaking the Action Cable protocol.
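To make the scenario a bit more tangible, here is a rough sketch of the JSON frames that the subscribe and perform steps correspond to on the wire, following the Action Cable protocol (a "command" plus a JSON-encoded "identifier", with the payload JSON-encoded once more under "data"). The CableFrames module and its method names are made up for illustration; this is not how WS Director itself is implemented.

```ruby
require "json"

# Builds the JSON frames an Action Cable client sends over the socket.
# Hypothetical helper for illustration; not part of WS Director.
module CableFrames
  module_function

  # Action Cable identifies a subscription by a JSON-encoded string
  def identifier(channel, params = {})
    {channel: channel}.merge(params).to_json
  end

  def subscribe(channel, params = {})
    {command: "subscribe", identifier: identifier(channel, params)}.to_json
  end

  # The payload is JSON-encoded a second time inside the "data" field
  def perform(channel, data, params = {})
    {
      command: "message",
      identifier: identifier(channel, params),
      data: data.to_json
    }.to_json
  end
end

subscribe_frame = CableFrames.subscribe("EchoChannel")
perform_frame = CableFrames.perform("EchoChannel", {text: "Hey!"})

puts subscribe_frame
puts perform_frame
```

The receive step is the mirror image: the client waits for a frame carrying the same identifier and compares its "message" field with the expected data.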

I continued iterating on possible use cases and the corresponding scenarios without writing any code. That was intentional: I didn’t want code decisions to affect the script’s format; I wanted it to stay tool-agnostic.

For real-time applications, it’s crucial to perform multi-user tests. I borrowed the scale factor concept from pgbench and extended the original file structure to support multiple client groups. The resulting multi-user scenario looked like this:

- client:
    multiplier: ":scale"
    name: publishers
    actions:
      - subscribe:
          channel: ChatChannel
          params:
            room_id: "42"
      - wait_all # a special synchronization directive
      - perform:
          channel: ChatChannel
          params:
            room_id: "42"
          action: "speak"
          data:
            message: "test"
- client:
    multiplier: ":scale * 2"
    name: listeners
    actions:
      - subscribe:
          channel: ChatChannel
          params:
            room_id: "42"
      - wait_all
      - receive:
          multiplier: ":scale"
          channel: ChatChannel
          params:
            room_id: "42"
          data:
            message: "test"

Before jumping into implementation, the final touch was to come up with a good name for the tool that would execute these scenarios. The phrase “executing scenarios” naturally led to “directing scripts” and… bingo! The name WebSocket Director was born. And the developer using it becomes a scriptwriter. Writing automation scenarios is a creative process, so why not name the tool accordingly?

All the world’s a server, and all the men and women merely clients.

And the rest is history (you can see it on GitHub 😉). Today, I only need to run gem install wsdirector-cli to perform WebSocket tests (and even mini-benchmarks). After installation, I can direct the above script like this:

$ wsdirector chat.yml localhost:8080/cable -s 10

Group publishers: 10 clients, 0 failures
Group listeners: 20 clients, 0 failures
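The group sizes in this output follow directly from the multiplier expressions and the -s 10 scale factor. Below is a minimal sketch of that arithmetic. The resolve_multiplier helper is hypothetical and supports only products of :scale and integers; the real tool may parse these expressions differently.

```ruby
# Hypothetical helper: resolves a multiplier expression such as ":scale * 2"
# for a given scale factor. Not the actual WS Director implementation.
def resolve_multiplier(expression, scale:)
  expr = expression.to_s
  # Accept only products of `:scale` tokens and integer literals
  format = /\A\s*(?::scale|\d+)(?:\s*\*\s*(?::scale|\d+))*\s*\z/
  raise ArgumentError, "unsupported multiplier: #{expr.inspect}" unless expr.match?(format)

  # Multiply all factors, substituting the scale for each `:scale` token
  expr.split("*").map { |factor|
    factor = factor.strip
    factor == ":scale" ? scale : Integer(factor, 10)
  }.reduce(1, :*)
end

scale = 10
publishers = resolve_multiplier(":scale", scale: scale)
listeners = resolve_multiplier(":scale * 2", scale: scale)
messages_per_listener = resolve_multiplier(":scale", scale: scale)

puts "publishers: #{publishers}, listeners: #{listeners}"
puts "each listener expects #{messages_per_listener} messages"
```

With a scale of 10, every one of the 20 listeners waits for 10 messages, one per publisher, which is why the receive step carries its own multiplier.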

But, despite what you might think, developing AnyCable is not the only use case for WebSocket Director. Let me share a couple of stories where WebSocket Director really clicked into place.

WebSocket Director vs. Twilio Voice

Twilio is a powerful platform for building communication tools of any kind, from SMS to voice messages. Their Programmable Voice service provides a unique piece of functionality, which I call WebSocket hooks (by analogy with HTTP webhooks). You can configure your application to consume audio packets through a WebSocket connection in real time (see, for example, this tutorial).

That could be useful, for instance, to perform some data analysis or to build AI auto-responders. But anyhow, the possibilities that Twilio offers are not the subject of this article. Today, we’re talking about developer happiness (or the lack of it).

Imagine a development flow when working on a feature that involves processing voice streams:

  • You start a web server (say, rails s).
  • You start a tunneling service to expose your local server to the Internet (for example, Ngrok).
  • You open a Twilio console and make a phone call using a web phone simulator.
  • And repeat until the end of the workday.

Unlike HTTP webhooks, which you could just simulate with curl, WebSocket communication is not that easy to reproduce locally. It takes quite a bit of time to perform all the necessary steps; this kind of feedback loop contains more chores than actual work.

While working on such a project, I quickly found this development process counterproductive and started looking for optimizations. Then, I recalled WS Director: “What if we could create a fixture script for a Twilio call?”

All I needed was to capture the incoming WebSocket frames and transform them into WS Director steps: each message could be represented as a send step, and the intervals between messages could be emulated via the sleep steps. I’m not going to share the original implementation of the WebSockets VCR here; instead, let me demonstrate how to capture the incoming WebSocket stream with the latest version of WS Director and a slightly patched Action Cable server:

module ApplicationCable
  class Connection < ActionCable::Connection::Base
    # We patch connection callbacks to simply consume messages
    # without entering the channels layer at all
    def handle_open
      # Snapshot is a container for frames,
      # which takes care of inserting `sleep`-s between messages
      @snapshot = WSDirector::Snapshot.new

      # This line is required to start receiving messages
      message_buffer.process!
    end

    def dispatch_websocket_message(data)
      @snapshot << decode(data)
    end

    def handle_close
      File.write("tmp/call.yml", @snapshot.to_yml)
    end
  end
end

As a result, I got the call.yml script which looked like this:

---
- send:
    data:
      event: connected
      protocol: Call
      version: 0.2.0
- sleep:
    time: 0.022
- send:
    data:
      event: start
      sequenceNumber: '1'
      start:
        accountSid: AC41149b360
        streamSid: MZ50f966ef8a
        callSid: CAa910aa778
        tracks:
        - inbound
        mediaFormat:
          encoding: audio/x-mulaw
          sampleRate: 8000
          channels: 1
        customParameters:
          from: client:Anonymous
          to: ''
      streamSid: MZ50f966ef8a
- sleep:
    time: 0.021
# ...
- send:
    data:
      event: media
      sequenceNumber: '139'
      media:
        track: inbound
        chunk: '138'
        timestamp: '2891'
        payload: "//9+/v3+/f1+fv5+fX5+/3+e37+//7+/w=="
      streamSid: MZ50f966ef8a
# ... thousands of lines omitted
- send:
    data:
      event: stop
      sequenceNumber: '1107'
      streamSid: MZ50f966ef8a
      stop:
        accountSid: AC41149b360
        callSid: CAa910aa778
- sleep:
    time: 0.02
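To show how frames and timing can end up as alternating send and sleep steps like these, here is a minimal sketch of a snapshot-like recorder. FrameRecorder and its injectable clock are assumptions made for illustration; this is a simplified stand-in, not the actual WSDirector::Snapshot implementation.

```ruby
require "yaml"

# A simplified stand-in for a snapshot container (hypothetical; not the
# actual WSDirector::Snapshot implementation).
class FrameRecorder
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @steps = []
    @last_at = nil
  end

  # Records a frame as a `send` step, preceded by a `sleep` step
  # covering the time elapsed since the previous frame.
  def <<(frame)
    now = @clock.call
    if @last_at
      delay = (now - @last_at).round(3)
      @steps << {"sleep" => {"time" => delay}} if delay.positive?
    end
    @steps << {"send" => {"data" => frame}}
    @last_at = now
    self
  end

  def to_yml
    YAML.dump(@steps)
  end
end

# Usage with a fake clock so that the recorded delays are deterministic
ticks = [0.0, 0.022, 0.043].each
recorder = FrameRecorder.new(clock: -> { ticks.next })
recorder << {"event" => "connected", "protocol" => "Call"}
recorder << {"event" => "start", "sequenceNumber" => "1"}
recorder << {"event" => "media", "sequenceNumber" => "2"}

puts recorder.to_yml
```

Injecting the clock keeps the sketch testable; in a real capture, the default monotonic clock would supply the timestamps.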

To emulate a phone call, I just ran the following command without leaving my IDE:

wsdirector fixtures/wsdirector/call.yml ws://localhost:9010/twilio

This saved me an incredible amount of time, and as such, I could actually focus on the real problem. However, I can’t say this application of WebSocket Director was crucial to the application’s development. Rather, it was about pure developer experience.

Let’s move on to the next story, in which WebSocket Director became a part of the application lifecycle.

WS Director vs. system testing

Writing integration tests for WebSockets-driven functionality is not rocket science. We can use our beloved Capybara and Rails system tests to emulate user interactions with our application via a web browser. However, not every interaction can be reproduced in a browser: for example, if we have a mobile application in our “play”.

So, here is a user scenario I had to transform into a system test:

  • User C (driver) uses a mobile application, which tracks their GPS location and sends it to the tracker server.
  • User A (admin) opens a product dashboard, which contains the active drivers list with their real-time locations.
  • We want to ensure that whenever a driver’s mobile application sends location updates, an admin can see them immediately.

How do we write such a test? We can’t run a mobile app simulator and control it with Capybara. You might think of triggering broadcasts manually in a test, e.g., ActionCable.server.broadcast "locations", data. Yeah, that could be a workaround (although, in my opinion, calling internal code makes our test escape from the black box). In our case, we used a standalone WebSocket server, which was created specifically for this task and had no public APIs to perform broadcasts from the outside. The only way to emulate the GPS update broadcast was to communicate over a WebSocket connection.

The tracking server was written in Elixir and used Phoenix Channels under the hood. Did I mention that WS Director is protocol-agnostic? Although it was built with Action Cable in mind, it could be used to interact with any WebSockets server by using the primitive actions: send, receive, sleep, etc.

To emulate a mobile application activity, we created the following script:

---
- send:
    data:
      topic: tenant:<%= ENV.fetch('TENANT', 'rspec') %>
      event: phx_join
      payload: {}
      ref: '1'
- receive:
    data:
      topic: tenant:<%= ENV.fetch('TENANT', 'rspec') %>
      ref: '1'
      payload:
        status: ok
        response: {}
      event: phx_reply
- send:
    data:
      topic: tenant:<%= ENV.fetch('TENANT', 'rspec') %>
      event: update_location
      payload:
        position:
          latitude: <%= ENV.fetch('LAT', 32.84019785216758).to_f %>
          longitude: <%= ENV.fetch('LON', -97.06401083105213).to_f %>
        technicianId: <%= ENV['ID'] %>
      ref: '2'

NOTE: The tracking app was built with an older version of Phoenix and used an outdated version of the Channels protocol.

That worked perfectly in development. But what about tests? How do we invoke WebSocket Director from a Rails system test?

One option could be to “sell out” and simply call the wsdirector CLI. However, since the tool is written in Ruby, why not use it directly? The final test example code looked like this:

it "sees location updates in real-time" do
  within "#tech-#{tech.id} .location" do
    expect(page).to have_text "Bronx, NY"
  end

  run_websocket_scenario(
    "tracker/update_location.yml",
    token: jwt_token,
    env: {
      "ID" => tech.id,
      "LAT" => 32.9846003797191,
      "LON" => -97.0647746830336
    }
  )

  within "#tech-#{tech.id} .location" do
    expect(page).to have_text "Dallas, TX"
  end
end

What’s hidden inside the #run_websocket_scenario method? Well, I’d prefer not to show the original contents: they didn’t look great. The reason is that the gem wasn’t initially designed for programmatic usage; it was just a CLI tool. For example, it relied on a global configuration object, and the only way to add dynamic values was by using environment variables.
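To illustrate what passing dynamic values through environment variables involves, here is a small sketch (not the original helper). The with_env method is hypothetical: it sets the variables for the duration of a block, renders an ERB template that reads them, and restores the previous values afterwards.

```ruby
require "erb"
require "yaml"

# Hypothetical helper: temporarily sets environment variables for the
# duration of the block and restores the previous values afterwards.
def with_env(vars)
  vars = vars.transform_keys(&:to_s).transform_values { |v| v&.to_s }
  previous = vars.keys.to_h { |key| [key, ENV[key]] }
  vars.each { |key, value| ENV[key] = value }
  yield
ensure
  # Assigning nil removes a variable that was not set before
  previous&.each { |key, value| ENV[key] = value }
end

# A fragment of a scenario template that reads its values from ENV
template = <<~YAML
  - send:
      data:
        event: update_location
        payload:
          position:
            latitude: <%= ENV.fetch('LAT', 32.84019785216758).to_f %>
            longitude: <%= ENV.fetch('LON', -97.06401083105213).to_f %>
          technicianId: <%= ENV['ID'] %>
YAML

rendered = with_env("ID" => 7, "LAT" => 32.9846003797191, "LON" => -97.0647746830336) do
  ERB.new(template).result
end

puts rendered
```

Even in this small form, the drawbacks are visible: every value has to be stringified, and the test process’s global state is mutated on each run.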

Using WS Director in commercial projects, I collected enough feedback to realize that, first of all, the gem is more generic than just a dev tool for Action Cable or AnyCable developers; and second, some refactoring is needed. This revelation led to the first major release of WS Director. Allow me to introduce WebSocket Director 1.0!

Introducing WS Director 1.0

I doubt many readers were familiar with WebSocket Director prior to this post, so there’s no need to describe the actual changes (but you can check the release notes, if you’re so inclined). Instead, I’d rather show off what you can do with WS Director 1.0.

First, it now provides a much better experience when running from Ruby. For instance, the source code of run_websocket_scenario now looks like this:

def run_websocket_scenario(path, token:, url: Rails.configuration.tracker_url, **options)
  url = "#{url}?token=#{token}"
  scenario = Rails.root.join("spec", "fixtures", "wsdirector", path)

  WSDirector.run(scenario, url:, **options)
end

# Usage in tests
run_websocket_scenario(
  "tracker/update_location.yml",
  token: jwt_token,
  locals: {
    id: tech.id,
    tenant: tech.tenant,
    lat: 32.9846003797191,
    lon: -97.0647746830336
  }
)

Note that we replaced the env vars with locals. That’s because you can now pass local variables directly to scenarios! Here’s our updated scenario:

- client:
    protocol: phoenix
    actions:
      - join:
          topic: tenant:<%= tenant %>
      - send:
          topic: tenant:<%= tenant %>
          event: update_location
          data:
            position:
              latitude: <%= lat %>
              longitude: <%= lon %>
            technicianId: <%= id %>

Oh, did you notice that the script itself has become much more concise? That’s because WS Director now supports the Phoenix Channels protocol out of the box.
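To see what rendering a scenario with locals amounts to, here is a sketch that uses only Ruby’s standard library. It relies on ERB#result_with_hash, which exposes each hash key as a local variable inside the template. Whether WS Director renders templates in exactly this way is an assumption; the values are sample data.

```ruby
require "erb"
require "yaml"

template = <<~YAML
  - client:
      protocol: phoenix
      actions:
        - join:
            topic: tenant:<%= tenant %>
        - send:
            topic: tenant:<%= tenant %>
            event: update_location
            data:
              position:
                latitude: <%= lat %>
                longitude: <%= lon %>
              technicianId: <%= id %>
YAML

# Sample values standing in for the ones passed from the test
locals = {id: 7, tenant: "rspec", lat: 32.9846003797191, lon: -97.0647746830336}

# Each key of the hash becomes a local variable in the template
rendered = ERB.new(template).result_with_hash(locals)
scenario = YAML.safe_load(rendered)

actions = scenario.first.dig("client", "actions")
puts actions.map { |action| action.keys.first }.inspect
```

Compared with environment variables, locals keep their Ruby types until rendering and leave no global state behind.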

Speaking of protocols, it’s also possible now to load custom protocols dynamically when using the wsdirector CLI:

wsdirector -r ./my_protocol.rb -u localhost:3030/ws -i '[{"client":{"protocol":"MyProtocol"}}]'

And last but not least, I’d like to mention one more feature: the verbose output mode. No words needed; just watch the video:

Running wsdirector with verbose (and colorful) logs

Developer tools make our lives better, even if you’ve made one just to satisfy your own particular needs. But even if you’ve built something tailored specifically for what you want, it doesn’t mean it won’t be up there one day, standing on the stage in front of the bright lights for all the world to see…

And, by the way, if your project has any backstage issues to be solved, or if it needs a hand getting ready for a big stage debut, give Evil Martians a buzz!
