We’ve found and fixed a couple of performance regressions in PostCSS.
As of release 5.0.11, PostCSS is 1.5 times faster!
Here is a current benchmark for the most common preprocessor tasks:
PostCSS 5.0.11: 40 ms
PostCSS 5.0.10: 60 ms (1.5 times slower)
Rework: 75 ms (1.9 times slower)
libsass: 76 ms (1.9 times slower)
Less: 147 ms (3.7 times slower)
Stylus: 166 ms (4.1 times slower)
Stylecow: 258 ms (6.4 times slower)
Ruby Sass: 1042 ms (26.0 times slower)
Important disclaimer: PostCSS, libsass and Less are already fast enough for any real-world task. Running benchmarks like these is like listening to audiophiles comparing gold-plated cables for their hobby systems. Don’t use these benchmarks as the main criterion when you are choosing a tool.
After the PostCSS 5.0 release, we noticed that performance had regressed by a factor of 1.5. Initially, we deemed it unimportant: CSS processing is fast enough already. However, a month ago the libsass team did some impressive work and made libsass twice as fast. So at that moment PostCSS and libsass were performing at about the same speed.
That is not exactly bad news; however, I do believe that the programming language is not the defining factor for good performance. Good architecture and proper usage of profiling tools should always come first. And I like to think that PostCSS on JavaScript vs. libsass on C++ was a great example for young developers in that regard. So I decided to find the source of the performance regression in 5.0.
Hunting the regression down
The main problem was that we did not have big internal changes in 5.0. We had just added a few new methods and renamed some old ones. Finding out exactly what was wrong seemed quite complicated. But my Martian colleague Ravil Bayramgalin taught me not to be scared of profiling: just “split and check”.
The idea is quite straightforward: split your code into two parts and test the performance of both parts. Then investigate the slower part in detail.
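When the candidates form an ordered list of changes, “split and check” can be mechanized as a bisection. The sketch below is only an illustration: `findRegression`, the `changes` array and the `isSlow` callback are hypothetical names, and it assumes that a single change is responsible for the slowdown.

```javascript
// Hypothetical helper illustrating "split and check" as a bisection.
// Assumes: with no changes applied the code is fast, with all changes
// applied it is slow, and one change is responsible for the slowdown.
function findRegression(changes, isSlow) {
  let fast = 0;              // applying the first `fast` changes is known to be fast
  let slow = changes.length; // applying the first `slow` changes is known to be slow

  while (slow - fast > 1) {
    const mid = Math.floor((fast + slow) / 2);
    if (isSlow(changes.slice(0, mid))) {
      slow = mid; // the culprit is among the first `mid` changes
    } else {
      fast = mid; // the culprit comes after the first `mid` changes
    }
  }
  return changes[slow - 1];
}

// Usage with a fake check. In practice `isSlow` would apply the given
// changes and run a benchmark against them.
const changes = ["rename methods", "add new API", "global counter", "docs"];
const culprit = findRegression(changes, (applied) =>
  applied.includes("global counter")
);
console.log(culprit); // global counter
```

If more than one change contributes to the slowdown, the search can be repeated after reverting the change it finds.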
But to do that, you should have a set of good benchmarks first. Here is something to remember: do not even talk about performance improvements if you don’t have a set of proper benchmarks.
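As a rough idea of what a minimal benchmark might look like, here is a sketch for Node.js. It is not the benchmark suite used for PostCSS: the `bench` helper and both workloads are made up for illustration, and a real benchmark should process realistic CSS input instead.

```javascript
// Hypothetical micro-benchmark helper for Node.js; not the PostCSS benchmark suite.
function bench(name, fn, iterations = 1000) {
  // Warm-up runs give the JIT a chance to optimize before timing starts.
  for (let i = 0; i < 100; i++) fn();

  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedNs = process.hrtime.bigint() - start;

  return { name, msPerRun: Number(elapsedNs) / 1e6 / iterations };
}

// Stand-in workloads; replace them with the two versions you want to compare.
const input = Array.from({ length: 10000 }, (_, i) => i);
const results = [
  bench("for loop", () => {
    let sum = 0;
    for (let i = 0; i < input.length; i++) sum += input[i];
    return sum;
  }),
  bench("reduce", () => input.reduce((sum, n) => sum + n, 0)),
];

for (const { name, msPerRun } of results) {
  console.log(`${name}: ${msPerRun.toFixed(4)} ms per run`);
}
```

The absolute numbers will vary between machines and Node.js versions, so the useful part is the comparison between two runs on the same setup.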
So, I took a set of benchmarks and ran them against PostCSS 4.0 and PostCSS 5.0 to start a binary search:
- I replaced several complicated plugins with one straightforward plugin, `postcss-calc`. The regression was still there, which meant that the issue was in the core of PostCSS and not in the plugins.
- I removed all the `postcss-calc` code and left only the `root.each` code. The regression was still there, but now I had found the method causing it.
- I checked all the code differences and reverted the code in `root.each` line by line to get to the cause of the regression.
As a result, I came across two simple changes in `root.each`:
// Global counter with closure, instead of instance counter
--- this.lastEach += 1;
+++ lastEach += 1;
// Cleaning system cache after using
+++ delete this.indexes;
I’ve reverted these changes—and PostCSS became 1.5 times faster, just like that.
Conclusions
It is quite strange that a change as simple as that has such a huge influence on the performance of the whole library, right?
“Any sufficiently advanced technology is indistinguishable from magic”
I tend to think that the source of this issue was related to V8 optimization. If a JavaScript class is simple enough, V8 compiles it to very efficient typed native code. But if V8 thinks that the code is “too tricky”, it keeps it in a slow but dynamic form. Here is a good article by Google on the matter.
For example, in the first change I used a variable defined outside of the class, so the method depended on external state. In the second one, I changed the structure of the object on the fly by deleting one of its properties. As a result, V8 did not compile the code to efficient typed native code.
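To make the two patterns concrete, here is a simplified sketch. It is not the actual PostCSS source: both classes are hypothetical stand-ins, and whether the second one is measurably slower depends on the engine and its version, so treat it as something to benchmark rather than as a rule.

```javascript
// Hypothetical stand-ins for a container node; not the actual PostCSS source.

// Variant 1: all state lives on the instance and every property is
// created in the constructor, so the set of properties never changes.
class StableContainer {
  constructor(nodes) {
    this.nodes = nodes;
    this.lastEach = 0; // per-instance counter
    this.indexes = {}; // created up front and never removed
  }

  each(callback) {
    this.lastEach += 1;
    const id = this.lastEach;
    this.indexes[id] = 0;
    while (this.indexes[id] < this.nodes.length) {
      callback(this.nodes[this.indexes[id]]);
      this.indexes[id] += 1;
    }
    delete this.indexes[id]; // removes a key from the map, not a property of the instance
  }
}

// Variant 2: the counter is shared state outside the class, and the
// `indexes` property is added to and deleted from the instance on every call.
let lastEach = 0;

class ShapeChangingContainer {
  constructor(nodes) {
    this.nodes = nodes;
  }

  each(callback) {
    lastEach += 1;
    const id = lastEach;
    if (!this.indexes) this.indexes = {};
    this.indexes[id] = 0;
    while (this.indexes[id] < this.nodes.length) {
      callback(this.nodes[this.indexes[id]]);
      this.indexes[id] += 1;
    }
    delete this.indexes; // changes the set of properties on the instance
  }
}

// Both variants visit the same nodes; only their internal bookkeeping differs.
const stable = new StableContainer(["a", "b", "c"]);
const changing = new ShapeChangingContainer(["a", "b", "c"]);
stable.each((node) => console.log("stable:", node));
changing.each((node) => console.log("changing:", node));
```

The behavior visible to the caller is identical, which is exactly why a difference like this is hard to spot without a benchmark.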
What advice can I give following this little optimization adventure of mine?
- Having good benchmarks is the first and most important step to improving performance.
- Do not even try to write what you think is efficient code before benchmarking. The VM has many clever optimizations, and even if you somehow learn all of them, they can still change in the next release. Instead, write simple and clean code, make a benchmark, find the real bottleneck, and rewrite the code in small parts.
- Do not think that programming in C++ or any other lower-level language is a must for having good performance. Good architecture, benchmarking and profiling are far more important.