Fullstaq Ruby: First impressions, and how to migrate your Docker/Kubernetes Ruby apps today
What is Fullstaq Ruby?
Fullstaq Ruby is a custom build of the standard MRI Ruby interpreter with the memory allocator replaced, security patches applied, and more goodies on the way.
If any old-timers are here, they may remember REE (Ruby Enterprise Edition) from the ancient times of Ruby 1.8.7 and Ruby on Rails 2.2 (almost ten years ago!). Ah, good ol’ times! You could install it via RVM or Rbenv, and some legacy applications still run on it or were migrated only recently. REE had about a dozen patches on top of Ruby 1.8.7 to improve performance, reduce memory consumption, adjust obsolete security settings, and so on.
In MRI 1.9.x, most of these problems were solved, and as it gained adoption, REE became obsolete. But even modern “vanilla” MRI still has quirks that can be fixed relatively easily. The most annoying of them is memory bloat caused by memory fragmentation.
So it is not at all surprising that Hongli Lai, the creator of REE, has released Fullstaq Ruby.
REE is dead, long live Fullstaq Ruby!
Why do we need it?
On one of our projects at Evil Martians, we were experiencing severe memory bloat. Our application does a lot of I/O, and we run many Sidekiq processes with a high concurrency setting (20 threads per process). This setting is optimal from the performance point of view because the workers are mostly making requests to different remote APIs, our own database, and caches. But such a high level of concurrency also leads to high memory fragmentation: our Sidekiq processes eat several gigabytes of RAM each.
Read more about choosing the Sidekiq concurrency setting in Sidekiq in Practice, part 1, by Nate Berkopec.
We decided to replace MRI 2.6.3 with Fullstaq Ruby 2.6.3 built with jemalloc to see how it would perform.
Now that’s the difference!
We tried Fullstaq Ruby on a commercial application that runs in production and serves requests from paying clients around the clock.
First of all: nothing broke. Zero downtime!
Now, take a look at these monitoring graphs. Memory bloat in long-running processes is practically gone!
- Web application processes have become very stable in memory consumption (4 times less memory!). Bloat still occurs sporadically, but even then the readings show about 50% less memory consumed during spikes.
- Background job workers (we use Sidekiq) also lost two-thirds of their weight: from 1.5-2 GB before to 500-700 MB after the migration to Fullstaq Ruby.
- There is no noticeable difference in memory consumption for short processes (e.g., cron jobs).
- We didn’t notice any changes in response times or CPU utilization.
The graphs above strongly suggest that memory fragmentation was the main reason for the high memory consumption.
And that’s it. Quite an improvement for swapping one Ruby binary for another, isn’t it?
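If you want to make a similar before-and-after comparison without a full monitoring stack, a rough approach is to read the resident set size of a process from `/proc` on Linux. The sketch below is only an illustration: `rss_kb_from_status` is a hypothetical helper name, and it assumes a Linux host where `/proc/<pid>/status` contains a `VmRSS` line.

```shell
#!/bin/sh
# Print the resident set size (in kB) recorded in a Linux /proc/<pid>/status file.
# rss_kb_from_status is a hypothetical helper name, not part of any tool.
rss_kb_from_status() {
  awk '/^VmRSS:/ { print $2 }' "$1"
}

# Example: report this shell's own RSS when /proc is available (Linux only).
if [ -r "/proc/$$/status" ]; then
  echo "RSS of PID $$: $(rss_kb_from_status "/proc/$$/status") kB"
fi
```

Pointing the helper at the status file of a long-running Sidekiq or web server process, once under MRI and once under Fullstaq Ruby after a comparable uptime, should give numbers you can compare with the ones above.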
Alternatives?
If jemalloc isn’t an option for you or you cannot afford to replace MRI with something else, you can try the MALLOC_ARENA_MAX=2 spell to adjust the behavior of the standard glibc malloc that MRI uses. The results will be close enough to Fullstaq Ruby to treat them as almost equal.
In our case, Ruby with a limited number of malloc arenas (on the right) consumed about 50-100 MB more memory than Ruby with jemalloc (on the left).
Read more and see benchmarks of MALLOC_ARENA_MAX=2 in our blog post Cables vs. malloc_trim, or yet another Ruby memory usage benchmark.
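MALLOC_ARENA_MAX is an environment variable read by glibc, so it has to be present in the environment of the Ruby process itself. A minimal sketch of one way to do that from a shell follows; `with_two_arenas` is a hypothetical wrapper name used only for this example.

```shell
#!/bin/sh
# Run any command with glibc malloc limited to two arenas.
# with_two_arenas is a hypothetical wrapper name used only for this example.
with_two_arenas() {
  env MALLOC_ARENA_MAX=2 "$@"
}

# Show that the child process really sees the variable.
with_two_arenas sh -c 'echo "MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX"'
```

In practice you would wrap the real entry point, for example `with_two_arenas bundle exec sidekiq`, or set the variable once in the container environment. Since it is a glibc malloc tunable, it is not expected to change anything for a Ruby that already runs on jemalloc.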
We decided to stick with Fullstaq Ruby.
How to install?
At the moment, the only way to install Fullstaq Ruby is to use deb or rpm packages (either installed directly or via repositories). But we deploy our app to a Kubernetes cluster, so we need a Docker image. There is no “container edition” available on the official website yet, so let’s build our own image. It is actually not that hard!
Let’s use Debian 9, as this is the Linux distribution used by the official Ruby Docker image, and define the Ruby version:
FROM debian:stretch-slim
ARG RUBY_VERSION=2.6.3-jemalloc
Then install the prerequisites, add the Fullstaq Ruby APT repository, install Ruby itself, and clean up the apt caches, all in a single command to reduce the Docker layer size:
RUN apt-get update -q \
&& apt-get dist-upgrade --assume-yes \
&& apt-get install --assume-yes -q --no-install-recommends curl gnupg apt-transport-https ca-certificates \
&& curl -SLf https://raw.githubusercontent.com/fullstaq-labs/fullstaq-ruby-server-edition/master/fullstaq-ruby.asc | apt-key add - \
&& echo "deb https://apt.fullstaqruby.org debian-9 main" > /etc/apt/sources.list.d/fullstaq-ruby.list \
&& apt-get update -q \
&& apt-get install --assume-yes -q --no-install-recommends fullstaq-ruby-${RUBY_VERSION} \
&& apt-get autoremove --assume-yes \
&& rm -fr /var/cache/apt /var/lib/apt/lists/*
Fullstaq Ruby also installs Rbenv as a dependency, but we don’t need it in Docker, so let’s add the Ruby and gem binaries to the system $PATH in the same way that the official Docker image for Ruby does:
ENV GEM_HOME /usr/local/bundle
ENV BUNDLE_PATH="$GEM_HOME" \
BUNDLE_SILENCE_ROOT_WARNING=1 \
BUNDLE_APP_CONFIG="$GEM_HOME" \
RUBY_VERSION=$RUBY_VERSION \
LANG=C.UTF-8 LC_ALL=C.UTF-8
# path recommendation: https://github.com/bundler/bundler/pull/6469#issuecomment-383235438
ENV PATH $GEM_HOME/bin:$BUNDLE_PATH/gems/bin:/usr/lib/fullstaq-ruby/versions/${RUBY_VERSION}/bin:$PATH
CMD [ "irb" ]
And that’s it!
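A quick sanity check after building is to confirm that `ruby` on the `$PATH` resolves to the Fullstaq Ruby directory configured above rather than to some other interpreter. The sketch below only inspects the path; `is_fullstaq_ruby` is a hypothetical helper name, and the pattern assumes the `/usr/lib/fullstaq-ruby/versions/<version>/bin` layout used in the `ENV PATH` line.

```shell
#!/bin/sh
# Succeed if the given path points into a Fullstaq Ruby version directory.
# is_fullstaq_ruby is a hypothetical helper name for this example.
is_fullstaq_ruby() {
  case "$1" in
    /usr/lib/fullstaq-ruby/versions/*/bin/ruby) return 0 ;;
    *) return 1 ;;
  esac
}

# Resolve ruby on the current PATH; fall back to an empty string if it is missing.
ruby_path=$(command -v ruby || true)
if is_fullstaq_ruby "$ruby_path"; then
  echo "Fullstaq Ruby found at $ruby_path"
else
  echo "ruby resolves to: ${ruby_path:-not found}"
fi
```

Run inside a container built from the Dockerfile above, the first branch should be taken. This only confirms which binary is picked up, not which allocator it uses.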
We’ve already built and published this image. You can pull it from our repository at quay.io:
docker pull quay.io/evl.ms/fullstaq-ruby:2.6.3-jemalloc-stretch-slim
Dockerfiles are available at github.com/evilmartians/fullstaq-ruby-docker.
Now we can just replace the base image in our application Dockerfile:
-ARG RUBY_VERSION=2.6.3
+ARG RUBY_VERSION=2.6.3-jemalloc
-FROM ruby:${RUBY_VERSION}-stretch-slim
+FROM quay.io/evl.ms/fullstaq-ruby:${RUBY_VERSION}-stretch-slim
And deploy it to staging and then to production.
Feel free to do the same!
Recap
- Migration is smooth. Just reinstall Ruby and gems, and everything should just work.
- Application server and background job worker processes should consume drastically less memory.
- There is no noticeable difference in memory consumption for short processes (like cron jobs or scripts).
- Performance may slightly improve, but it depends on your workload profile.