Autoptimize minor update and beyond

I just released an update to Autoptimize, bumping the version to 1.9.2. The main new features:

  • New: support for an alternative cache-directory and file-prefix, as requested by (among others) Jassi Bacha, Cluster666 and Baris Unver.
  • Improvement: hard-exclude all linked-data JSON objects (script type=application/ld+json)
  • Improvement: several filters added to the API, e.g. to alter the optimized HTML, CSS or JS (see the sketch below this list)
  • Some bugfixes
  • Swedish translations updated & Ukrainian added, courtesy of Zanatoly of SebWeo.com
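To give an idea of how such a filter would be hooked from a theme's functions.php or a small plugin, here is a minimal sketch. The filter name "autoptimize_html_after_minify" is an assumption on my part; check the plugin source for the exact hooks your version exposes.

// minimal sketch of hooking one of the API filters; the filter name
// "autoptimize_html_after_minify" is an assumption, check Autoptimize's
// source for the exact hook names your version exposes
add_filter( 'autoptimize_html_after_minify', 'my_ao_html_filter' );
function my_ao_html_filter( $html ) {
   // e.g. append a comment to the optimized page HTML
   return $html . '<!-- optimized -->';
}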

I’m already thinking about version 2.0 (which should fix the two big issues some people face: exploding cache size due to page-specific inline code, and the rare but nasty white screen of death due to CSS minification issues) and about some powerful new features that could extend Autoptimize for professionals and power-users in need of something more. 2015 is going to be great; I hope you guys & girls will be part of that!
Anyway, enjoy the end-of-year festivities and above all, have fun & share some of the happiness!
 

Amazed by Autoptimize take-up

Less than a year after reaching 100,000 downloads, Autoptimize broke through the 200,000 barrier just last week.
It’s also exciting to see how people are blogging (and tweeting) about it.

So yeah, I’m pretty amazed by how well Autoptimize is doing. Thanks for the confidence!

WP SEO vs Autoptimize; who broke your WordPress?

Autoptimize 1.9 was released yesterday, but unfortunately reports started coming in about JS optimization being broken. At first I suspected the problem was related to small changes that added semicolons to individual blocks of script (before aggregation), but tests with some impacted users showed this was not the case.
The breakthrough came in this thread on Autoptimize’s support forum, where user “grief-of-these-days” confirmed the problem started with the update of WP SEO, and specifically with the “sitelinks search box” functionality that was added in WP SEO 1.6. The sitelinks search box comes as an inline script of type “application/ld+json” that contains a nameless JSON object with “linked data”. Autoptimize detected, aggregated and minified this nameless object, which not only defeated the sitelinks search box mechanism, but potentially also broke the optimized JS itself. So I updated & enabled WP SEO, confirmed the problem, identified “potentialAction” as a unique string to base the exclusion on, and pushed out 1.9.1, which no longer autoptimizes the sitelinks search box code.
So who broke your WordPress today, WP SEO or Autoptimize? Well, WP SEO’s update may have made the bug appear, but given that JSON-LD is standardized and will probably turn up in other guises as well, Autoptimize should really just exclude any script of the “application/ld+json” type from being aggregated & minified (and not just that of the sitelinks search box). Adding it to the to-do list now; the check would boil down to something like the sketch below.
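A minimal sketch of that exclusion (illustrative code, not Autoptimize’s actual implementation; $script_tags is assumed to hold the full script-tags found in the HTML):

// skip any linked-data script before aggregation & minification
function is_ldjson( $tag ) {
   // $tag is a full "<script ...>...</script>" string
   return (bool) preg_match( '#type\s*=\s*["\']application/ld\+json["\']#i', $tag );
}
foreach ( $script_tags as $tag ) {
   if ( is_ldjson( $tag ) ) {
      continue; // leave JSON-LD untouched in the HTML
   }
   // ... aggregate & minify as before ...
}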

(When) should you Try/Catch JavaScript?

Autoptimize comes with an “Add try-catch wrapping?” option, which wraps every aggregated script in a try-catch block, to prevent an error in one script from blocking the others.
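Conceptually, the wrapping is as simple as this (a simplified sketch, not the plugin’s literal code):

// simplified sketch of try/catch wrapping: every script's code is
// wrapped before being appended to the aggregated blob, so a runtime
// error in one script cannot stop the scripts that come after it
// (note that the error is silently swallowed)
$aggregated = '';
foreach ( $scripts as $code ) {
   $aggregated .= 'try{' . $code . '}catch(e){};';
}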
I considered enabling this option by default, as it would prevent JS optimization from occasionally breaking sites badly. I discussed this with a number of smart people and searched the web, eventually stumbling on this blogpost, which offers an alternative to try-catch because:

Some JavaScript engines, such as V8 (Chrome) do not optimize functions that make use of a try/catch block as the optimizing compiler will skip it when encountered. No matter what context you use a try/catch block in, there will always be an inherent performance hit, quite possibly a substantial one. [Testcases] confirm [that] not only is there up to a 90% loss in performance when no error even occurs, but the declination is significantly greater when an error is raised and control enters the catch block.

So given this damning evidence of severe performance degradation, “try/catch wrapping” will not be enabled by default, and although Ryan’s alternative approach has its merits, I’m wary of the caveats, so I won’t include it (for now anyway). If your site breaks when enabling JS optimization in Autoptimize, you can enable try/catch wrapping as a quick workaround, but finding the offending script and excluding it from optimization is clearly the better solution.

Next Autoptimize eliminates render-blocking CSS in above-the-fold content

Although current versions of Autoptimize can already tackle Google PageSpeed Insights’ “Eliminate render-blocking CSS in above-the-fold content” tip, the next release will allow you to do so in an even better way. As of version 1.9 you’ll be able to combine the best of the “inline CSS” and “defer CSS” worlds: “defer” effectively becomes “inline and defer”, allowing you to specify the “above the fold CSS”, which is then inlined in the head of your HTML, while your normal autoptimized CSS is deferred until the page has finished loading.
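In terms of generated output, the idea looks roughly like the sketch below ($above_the_fold_css and $autoptimized_css_url are placeholders; the actual loader markup Autoptimize emits may differ):

// rough sketch of "inline and defer": the critical CSS is inlined in
// the head, the full autoptimized stylesheet is only attached once the
// page has finished loading
$head_css = '<style type="text/css">' . $above_the_fold_css . '</style>';
$defer_js = "<script>window.addEventListener('load',function(){"
   . "var l=document.createElement('link');l.rel='stylesheet';"
   . "l.href='" . $autoptimized_css_url . "';"
   . "document.getElementsByTagName('head')[0].appendChild(l);});</script>";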
Other improvements in the upcoming Autoptimize 1.9.0 include:

  • Inlined Base64-encoded background images will now be cached as well, and the threshold for inlining these images has been bumped up to 4096 bytes (from 2560).
  • Separate cache-directories for CSS and JS in /wp-content/cache/autoptimize, which should result in faster cache pruning (and in some cases possibly faster serving of individual aggregated files).
  • CSS is now added before the <title>-tag, JS before </body> (and after </title> when forced in head). This can be overridden in the API (see the sketch after this list).
  • Some usability improvements of the administration page
  • Multiple hooks added to the API, among others filters to not aggregate inline CSS or JS, and filters to aggregate but not minify CSS or JS.
  • Multiple bugfixes & improvements
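The injection itself is plain string surgery; roughly like this sketch ($css_link_tag and $js_script_tag are placeholders for the generated tags, and error handling for strpos returning false is omitted):

// sketch of position-based injection: CSS goes in right before <title>,
// JS right before the closing </body>
$html = substr_replace( $html, $css_link_tag, strpos( $html, '<title' ), 0 );
$html = substr_replace( $html, $js_script_tag, strrpos( $html, '</body>' ), 0 );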

On the todo-list: testing, some translation updates (I’ll contact you translators in the coming week) and updating the readme.txt.
The first test-version of what will become 1.9.0 (still tagged 1.8.5 for now, though) has been committed to wordpress.org’s plugin SVN and can be downloaded here. Anyone wanting to help out testing this new release: go grab your copy and provide me with feedback.

PHP HTML parsing performance shootout; regex vs DOM

As I wrote earlier, an Autoptimize user proposed switching from regular-expression-based script & style extraction to using native PHP DOM functions (optionally with XPath). I created a small test-script to compare performance, and the DOM methods are on average 500% slower than the preg_match-based solution. Here are some details:

  • There are 3 tests: regular-expression-based (preg_match), DOM + getElementsByTagName and DOM + XPath. You can see the source here and see it in action here.
  • The code in all 3 test cases does what Autoptimize does to start with when optimizing JavaScript:
    1. extract all JavaScript (code if inline, URL if external) and add it to an array
    2. remove the JavaScript from the HTML
  • With each load of the test-script, the 3 tests get executed 100 times and the total time per method is displayed.
  • The test-script was run 5 times on 3 different HTML-files: one small mobile page with some JavaScript and two bigger desktop ones with lots of JS.
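In code, the two approaches boil down to roughly the following (a condensed sketch that doesn’t distinguish inline code from external URLs; the real test-script is the one linked above):

// 1) regex-based: find and strip all script-tags in one pass
preg_match_all( '#<script.*?>.*?</script>#is', $html, $matches );
$scripts = $matches[0];
$html = preg_replace( '#<script.*?>.*?</script>#is', '', $html );

// 2) DOM-based: parse, collect the script-nodes, then remove them
$dom = new DOMDocument();
@$dom->loadHTML( $html );
$nodes = array();
foreach ( $dom->getElementsByTagName( 'script' ) as $node ) {
   $nodes[] = $node; // copy first; the DOMNodeList is "live"
}
foreach ( $nodes as $node ) {
   $node->parentNode->removeChild( $node );
}
$html = $dom->saveHTML();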

The detailed results:

page            | total time regex | total time DOM | total time DOM+XPath
arturo’s HP     | 0.611            | 4.836          | 64.977
deredactie HP   | 2.332            | 25.61          | 55.879
m deredactie HP | 0.0696           | 0.4604         | 0.4558

(times in seconds)

So while parsing HTML with regular expressions might be frowned upon in developer communities (and rightly so, as a lot can go wrong with PCRE in PHP), it is vastly superior with regard to performance. In the very limited scope of Autoptimize, where the regex-based approach is tried & tested on thousands of blogs, using DOM would simply create too much overhead.

Some HTML DOM parsing gotchas in PHP’s DOMDocument

Although I had used Simple HTML DOM parser for WP DoNotTrack, I’ve been looking into native PHP HTML DOM parsing as a possible replacement for regular expressions in Autoptimize, as proposed by Arturo. I won’t go into the performance comparison results just yet, but here are some of the things I learned while experimenting with DOMDocument, which in turn might help innocent passers-by of this blogpost.

  • loadHTML doesn’t always play nice with different character encodings; you might need something like mb_convert_encoding to work around that (a sketch of that workaround follows the example below).
  • loadHTML will try to “repair” your HTML to make sure an XML-parser can work with it. So what goes in will not come out the same way.
  • loadHTML will spit out tons of warnings or notices about the HTML not being XML; you might want to suppress error-reporting by prepending the command with an @ (e.g. @$dom->loadHTML($htmlstring);)
  • Using e.g. getElementsByTagName to extract nodes into a separate DOMNodeList and then using that list to change the DOMDocument can result in … unexpected behavior, as the DOMNodeList gets updated when changes are made to the DOMDocument. Copy the DOMNodes from the DOMNodeList into a new array (which will not get altered) and iterate over that to update the DOMDocument, as seen in the example below.
  • removeChild is a method of DOMNode, not of DOMDocument. This means $dom->removeChild($node) will not work; instead, invoke removeChild on the parent of the node you want to remove, as seen in the example below.
// load HTML from string, suppressing parsing errors
$dom = new DOMDocument();
@$dom->loadHTML($html);
// get all script-nodes
$_scripts = $dom->getElementsByTagName("script");
// move the result from the (live) DOMNodeList into an array
$scripts = array();
foreach ($_scripts as $script) {
   $scripts[] = $script;
}
// iterate over the array and remove the script-tags from the DOM
foreach ($scripts as $script) {
   $script->parentNode->removeChild($script);
}
// write the DOM back to the HTML-string
$html = $dom->saveHTML();
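And for the encoding gotcha from the first bullet, the common workaround is converting everything to HTML entities before parsing; a sketch:

// workaround for loadHTML mangling non-ASCII characters: convert the
// input to HTML entities first, so the parser's encoding guess is moot
$html = mb_convert_encoding( $html, 'HTML-ENTITIES', 'UTF-8' );
$dom = new DOMDocument();
@$dom->loadHTML( $html );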

Now chop chop, back to my code to finish that performance comparison. Who knows what else we’ll learn 😉

How to keep Autoptimize’s cache size under control (and improve visitor experience)

Update 2016: since AO 2.0, inline JS is no longer aggregated by default, which should prevent cache-size problems from occurring. The easiest solution for cache-size issues is to make sure the “aggregate inline JS” (and CSS) option is disabled. The HowTo below remains relevant in case you decide to enable the aggregation of inline code.


Confession time: Autoptimize does not have a proper cache-purging mechanism. There are some good reasons for that (see below), but in most cases this is not something to worry about.
Except when it is something to worry about, of course. In some cases the amount of cache-files generated by Autoptimize can grow to several gigabytes. Why, you might wonder? Well, for each page being loaded, Autoptimize aggregates all JS (and CSS), calculates the hash of that string and checks whether an optimized version is in cache under that hash. If there is any difference (even just a comma), the hash is not the same and the aggregated CSS/JS is cached separately. This behavior is typically caused by plugins that generate JavaScript variables (or CSS selectors) that are specific to each page (or, even worse, to each page request). That not only leads to a huge amount of files in the cache, but also impacts visitors, as their browsers have to request a different optimized CSS- or JS-file for each page instead of reusing the same file across several pages.
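In simplified pseudo-PHP the cache check amounts to this (a sketch, not the actual code; the path mirrors the cache layout mentioned earlier and minify_js is a hypothetical stand-in for the real minifier):

// simplified sketch of the cache lookup: the cache key is a hash of the
// aggregated code, so any page-specific difference (a nonce, a post-id
// in an inline variable) produces a brand new cache file
$aggregated = implode( '', $all_js ); // inline code + content of linked files
$hash = md5( $aggregated );
$cachefile = WP_CONTENT_DIR . '/cache/autoptimize/js/autoptimize_' . $hash . '.js';
if ( ! file_exists( $cachefile ) ) {
   file_put_contents( $cachefile, minify_js( $aggregated ) ); // hypothetical minifier
}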
This is what you can do if you want a healthier cache, both from a server- and a visitor-perspective (based on JavaScript, but the same principle applies to CSS):

  1. Open two similar pages (posts).
  2. View source of the optimized JavaScript in those two pages.
  3. Copy the source of each to a separate file and replace all semicolons (“;”) with semicolon + linefeed (“;\n”) in both files.
  4. Execute an automatic comparison between the two using e.g. diff (or “compare” in Notepad++); this should give you one or more lines that will probably be almost the same, but not exactly (e.g. with a different nonce or post-id in them).
  5. Now disable JS optimization and look for similar strings in the inline and the external JavaScript.
  6. If you find it in the inline JavaScript, try to identify a unique string in there (the name of a specific variable, probably) and write that down. If the variable JS is in a file, jot down the filename.
  7. Go to the autoptimize settings page and make sure the advanced settings are shown.
  8. Now add the strings or filenames from (6) to “Exclude scripts from Autoptimize:” (which is a comma-separated list).
  9. Re-enable JS optimization.
  10. Save settings & clear cache.

This does require some digging, but the advantages are clear: a (much) smaller cache-size on disk and better performance for your visitors. Everyone will be so happy; people will want to hug you, and there will be much rejoicing, generally.
So why doesn’t Autoptimize have automatic cache pruning? Well, the problem is that a page-caching layer (which could be a browser, a caching reverse proxy or a WordPress page-caching plugin) contains pages that refer to the aggregated JS/CSS-files. If those optimized files were automatically removed while the page remained in the page-caching layer, people would get the cached page without any JS- or CSS-files being available. And as I don’t want Autoptimize to break your pages, I didn’t include an automatic cache-purging mechanism. But if you have a bright idea of how this problem could be tackled, I’d be happy to reconsider, of course!

Should you inline or defer blocking CSS?

You care about web performance, so you dutifully aggregate and minify your CSS. But then a couple of months ago Google PageSpeed Insights (for mobile) started identifying CSS as a render-blocking resource, and you wonder whether you should inline your CSS, or even defer loading it. Based on tests executed on a multi-site WordPress installation, deferring CSS is not the best idea just yet, but inlining might be worthwhile! Read on for the hard numbers and other details.

The test-setup

  • 4 test-blogs on a multi-site WordPress instance
  • using the Expound theme (which is interesting because its main stylesheet imports 2 other CSS-files)
  • using Lite Cache for page caching (a new WordPress page-caching plugin by the Hyper Cache author)
  • the same content was imported on all 4 blogs
  • all 4 had Autoptimize HTML & JS optimization active
  • the difference was in the Autoptimize CSS settings, where:
    • blog 1 had no CSS optimization at all (baseline )
    • blog 2 had standard CSS aggregation and minification, linked in head
    • blog 3 inlined the optimized CSS
    • blog 4 deferred the optimized CSS
  • each blog was tested on webpagetest.org‘s Amsterdam node on a DSL profile using IE9, doing 9 test-runs on one specific blogpost which contained a 16 KB image (I excluded favicon.ico as it seemed to pollute results)
  • each blog was analyzed by Google PageSpeed Insights for both mobile & desktop

The results

test                      | report url    | first byte | start render | doc complete | fully loaded | mobile pagespeed | desktop pagespeed
1. no CSS optimization    | 140212_Z1_MKN | 0.299s     | 2.246s       | 2.221s       | 2.221s       | 79               | 92
2. optimized CSS linked   | 140212_H7_MKP | 0.239s     | 0.608s       | 1.390s       | 1.390s       | 91               | 97
3. optimized CSS inlined  | 140212_A3_NJA | 0.232s     | 0.348s       | 0.658s       | 0.658s       | 99               | 99
4. optimized CSS deferred | 140212_8J_P1G | 0.248s     | 0.357s       | 1.034s       | 1.034s       | 99               | 95

The conclusions

Based on these tests (your mileage may vary, always test your results):

  • Deferring all CSS is useless: performance is worse, the desktop PageSpeed score is (slightly) lower and there is a “flash of unstyled content” between the rendering of the page and the application of the CSS.
  • Inlining CSS yields the best results, both from a page-speed and a PageSpeed perspective. Although the base HTML is larger as it carries the CSS payload, this has almost no impact in this specific context and rendering is almost instantaneous. Of course, in a context where multiple other pages from the same site, with the exact same CSS, would be loaded, the impact would be significant. Hence inlining CSS is especially interesting for sites with a low pageviews-per-visitor ratio.

The future; inline + defer

Deferring CSS may seem pretty useless, but the sweet spot may just be inlining base CSS (everything needed for initial rendering above the fold) and deferring everything else. This is what CSS optimizing tools should focus on in 2014 and you can certainly expect something along these lines in one of the next major Autoptimize releases (although diehards can already test this approach).

Irregular Expressions have your stack for lunch

I love me some regular expressions (problems), but have you ever seen one crash Apache? Well I have! This regex is part of YUI-CSS-compressor-PHP-port, the external CSS minification component in Autoptimize, my WordPress JS/CSS optimization plugin:
/(?:^|\})(?:(?:[^\{\:])+\:)+(?:[^\{]*\{)/
Executing that on a large chunk of CSS (lots of selectors for one declaration block, which cannot be ripped apart) triggers a stack overflow in PCRE, which crashes Apache and shows up as a “connection reset” error in the browser.
Regular-expression-triggered segfaults are no exception in the PHP bugtracker, and each and every one of those tickets gets labeled “Not a bug” while pointing the finger at PCRE, whose man-pages and bug tracker indeed confirm that stack overflows can occur. This quote from that PCRE bug report says it all, really:

If you are running into a problem of stack overflow, you have the
following choices:
  (a) Work on your regular expression pattern so that it uses less
      memory. Sometimes using atomic groups can help with this.
  (b) Increase the size of your process stack.
  (c) Compile PCRE to use the heap instead of the stack.
  (d) Set PCRE's recursion limit small enough so that it gives an error
      before the stack overflows.

Are you scared yet? I know I am. But this might be a consolation: if you test your code on XAMPP (or another Apache-on-Windows setup), you’re bound to detect the problem early on, as the default ThreadStackSize there is a mere 1MB instead of the whopping 8MB on Linux.
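Option (d), by the way, can be approximated from PHP userland: lower pcre.recursion_limit so PCRE gives up with an error instead of overflowing the stack, then check preg_last_error() afterwards. A sketch ($pattern, $replacement and $css are placeholders):

// make PCRE bail out cleanly instead of blowing the stack
// (the limit value here is illustrative, tune it for your setup)
ini_set( 'pcre.recursion_limit', '10000' );
$minified = preg_replace( $pattern, $replacement, $css );
if ( $minified === null && preg_last_error() === PREG_RECURSION_LIMIT_ERROR ) {
   $minified = $css; // the regex gave up; fall back to the unminified CSS
}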
As for the problem in YUI-CSS-compressor-PHP-port: I logged it on their GitHub issue-list, and I think I might just have a working alternative, which will be in Autoptimize 1.8.