How to keep Autoptimize’s cache size under control (and improve visitor experience)

Update 2016: since AO 2.0, inline JS (and CSS) is no longer aggregated by default, which should prevent cache-size problems from occurring. The easiest solution for cache-size issues is to make sure the “aggregate inline JS” (and CSS) option is disabled. The how-to below remains relevant in case you decide to enable the aggregation of inline code.


Confession time: Autoptimize does not have its own cache purging mechanism. There are some good reasons for that (see below), but in most cases this is not something to worry about.

Except when it is something to worry about, of course. Because in some cases the amount of cache files generated by Autoptimize can grow to several gigabytes. Why, you might wonder? Well, for each page being loaded Autoptimize aggregates all JS (and CSS), calculates the hash of that string and checks if an optimized version is in cache using that hash. If there is a difference (even if just a comma), the hash is not the same and the aggregated CSS/JS is cached separately. This behavior is typically caused by plugins that generate JavaScript variables (or CSS selectors) that are specific to each page (or even worse, to each page request). That not only leads to a huge number of files in the cache, but also impacts visitors, as their browsers will have to request a different optimized CSS or JS file for each page instead of reusing the same file for several pages.
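To make the effect concrete, here is a minimal sketch in Python (not Autoptimize's actual PHP code). It assumes MD5 as a stand-in hash and a made-up inline variable `currentPost`; both are illustrative assumptions. It counts how many distinct cache keys 100 pages would produce with and without a page-specific inline snippet in the aggregated string.

```python
import hashlib


def cache_key(aggregated_js: str) -> str:
    """Return a content hash standing in for the cache file name (MD5 is an assumption)."""
    return hashlib.md5(aggregated_js.encode("utf-8")).hexdigest()


# Code that is identical on every page of the site.
SHARED_JS = "function initMenu(){document.body.className+=' ready';}"


def aggregate(post_id: int, include_inline: bool) -> str:
    """Concatenate the shared JS with, optionally, a page-specific inline snippet."""
    inline = f"var currentPost={{id:{post_id}}};"  # hypothetical per-page variable
    return (inline if include_inline else "") + SHARED_JS


# Simulate 100 different posts being requested.
with_inline = {cache_key(aggregate(pid, True)) for pid in range(1, 101)}
without_inline = {cache_key(aggregate(pid, False)) for pid in range(1, 101)}

print(len(with_inline))     # one cache file per page
print(len(without_inline))  # a single file shared by all pages
```

With the inline snippet aggregated, every page yields its own key; with it left out, all pages share one.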

This is what you can do if you want a healthier cache from both a server and a visitor perspective (the steps are described for JavaScript, but the same principle applies to CSS):

  1. Open two similar pages (posts).
  2. View source of the optimized JavaScript in those two pages.
  3. Copy the source of each to a separate file and replace all semi-colons (“;”) with semi-colon+linefeed (“;\n”) in both files.
  4. Execute an automatic comparison between the two using e.g. diff (or “compare” in Notepad++); this should give you one or more lines that will probably be almost the same, but not exactly (e.g. with a different nonce or a postid in them).
  5. Now disable JS optimization and look for similar strings in the inline and the external JavaScript.
  6. If you find it in the inline JavaScript, try to identify a unique string in there (the name of a specific variable, probably) and write that down. If the variable JS is in a file, jot down the filename.
  7. Go to the Autoptimize settings page and make sure the advanced settings are shown.
  8. Now add the strings or filenames from (6) to “Exclude scripts from Autoptimize:” (which is a comma-separated list).
  9. Re-enable JS optimization.
  10. Save settings & clear cache.
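Steps 3 and 4 above can also be scripted. The following is a rough sketch in Python using the standard `difflib` module; the two sample strings and the helper names `split_statements` and `differing_lines` are made up for illustration. In practice you would paste in the contents of the two optimized files.

```python
import difflib


def split_statements(js: str) -> list[str]:
    """Step 3: break after every semicolon so each statement gets its own line."""
    return js.replace(";", ";\n").splitlines()


def differing_lines(js_a: str, js_b: str) -> list[str]:
    """Step 4: return only the lines that differ, prefixed with '- ' (first file) or '+ ' (second file)."""
    lines_a = split_statements(js_a)
    lines_b = split_statements(js_b)
    matcher = difflib.SequenceMatcher(a=lines_a, b=lines_b, autojunk=False)
    result = []
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag != "equal":
            result.extend(f"- {line}" for line in lines_a[a1:a2])
            result.extend(f"+ {line}" for line in lines_b[b1:b2])
    return result


# Hypothetical optimized JS from two similar posts.
page_a = "var post={id:12,nonce:'a1b2'};function initMenu(){return 1;}"
page_b = "var post={id:34,nonce:'c3d4'};function initMenu(){return 1;}"

for line in differing_lines(page_a, page_b):
    print(line)
```

The lines that show up here are the candidates to look for in step 5; a variable name such as `post` in this made-up sample is the kind of string you would add to the exclusion list in step 8.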

This does require some digging, but the advantages are clear; a (much) smaller cache-size on disk and better performance for your visitors. Everyone will be so happy, people will want to hug you and there will be much rejoicing, generally.

So why doesn’t Autoptimize have automatic cache pruning? Well, the problem is that a page caching layer (which could be a browser, a caching reverse proxy or a WordPress page caching plugin) contains pages that refer to the aggregated JS/CSS files. If those optimized files were to be automatically removed while the page remained in the page caching layer, people would get the cached page without any JS or CSS files being available. And as I don’t want Autoptimize to break your pages, I didn’t include an automatic cache purging mechanism. But if you have a bright idea of how this problem could be tackled, I’d be happy to reconsider, of course!
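The failure mode can be shown with a toy model in Python. The page cache is a plain dictionary of URL to HTML, the file names follow the autoptimize_xyz.js pattern with invented hashes, and `broken_references` is a hypothetical helper; none of this is Autoptimize code.

```python
import re


def broken_references(cached_pages: dict[str, str], files_on_disk: set[str]) -> dict[str, list[str]]:
    """Map each cached page to the optimized files it references that no longer exist."""
    pattern = re.compile(r"autoptimize_[0-9a-f]+\.(?:js|css)")
    broken = {}
    for url, html in cached_pages.items():
        missing = [name for name in pattern.findall(html) if name not in files_on_disk]
        if missing:
            broken[url] = missing
    return broken


# Pages held by a page cache, each pointing at an optimized file (invented names).
page_cache = {
    "/post-1/": '<script src="/cache/autoptimize_ab12.js"></script>',
    "/post-2/": '<script src="/cache/autoptimize_cd34.js"></script>',
}
disk = {"autoptimize_ab12.js", "autoptimize_cd34.js"}

# An automatic pruning run removes one file, but the page cache is not told.
disk_after_prune = disk - {"autoptimize_ab12.js"}

print(broken_references(page_cache, disk))
print(broken_references(page_cache, disk_after_prune))
```

Before pruning nothing is broken; afterwards the cached copy of /post-1/ still points at a file that is gone, which is exactly the situation described above.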

27 thoughts on “How to keep Autoptimize’s cache size under control (and improve visitor experience)”

  1. Nabha

    Hmm… Could you remove caches older than, say, a week? Or allow the user to set a “number of days until cache expires” setting?

  2. Binh

    Hi again Frank,

    1. People should control their cached page. So you should have option to auto purge cache file on time interval like WP-Super-Cache.

    2. I noticed this problem that Autoptimize automatically grab all my javascript inside my HTML and put in the bunch. It should not do this because people often put js variables inside the page. It’s good you have this function but it will be much better if there is a tick to enable it.

    3. Cache file name using hash of content? You should use hash of filenames instead. This way there will not be a new version of the cache file every time the content changes. Especially should avoid script files ending with php because that definitely is a variable script file. Also, getting the hash of content seems to be CPU consuming.

    4. The plugin is great but I notice a half second increase in load time. Perhaps this is due to the minify process is taking CPU and memory resource. This is a trade off that’s why I needed to use WP-Super-Cache in front of it. But it will be a nice option for “only optimize for unknown visitors” or “skip optimization for logged in user (with level)”.

    5. Deferred CSS is good but not that usable compared to deferred JS. I know there are a few plugins for async JS but since you already touched the field you may consider putting a checkbox on the option page.

    6. Inline CSS is good for small CSS only. Should there be a textbox for max file size? Also regarding this, I suppose you don’t grab all inline style and put in the cache file, do you? Because that’s really bad idea and that’s the main reason cache folder grows big.

    7. Other ideas as part of the word “Auto optimization”:
    + Async Image loading
    + Async JS + Deferred JS (my idea: async external CDN js files and deferring the local js files to avoid broken script)

    Last but not least, I know this plugin is free and you won’t take money for it. So all above are just ideas. If you have time to mug around with it, it’s great. Otherwise, it’s still great and I love the plugin as well as your idea of contribution.

    Many thanks,
    Binh

  3. frank Post author

    Hi Binh & Nabha;
    Thanks for your feedback! Cache size & automatic pruning is one of my … areas of interest. But although automatic cache pruning might be added at a later stage, I’m very reluctant to do so, as this has a big chance of breaking pages that are cached at another layer. If Autoptimize is configured correctly (i.e. excluding all random stuff as explained in this blogpost), cache size typically remains within reasonable amounts (just checked, I’m at a measly 4Mb after 2 weeks).

    Binh, for your other questions (some of which are pretty interesting by the way);
    2. you can exclude whatever you like from being aggregated (both for inline CSS & JS), so excluding variables is pretty simple.
    3. there should be a new version of the cache file if the content changes, that’s why I hash on content instead of filename. css/ js generated by PHP are never aggregated.
    4a. increase in load time indeed is due to HTML parsing, aggregating CSS/JS, calculating hash and (optionally) minimizing. I indeed explicitly advise against using Autoptimize without a page cache component for that reason. That being said, performance of the autoptimization-process itself is one of my top priorities for the next versions.
    4b. you can easily not-optimize for logged in users using the API (cfr. example for the noptimize-filter in autoptimize_helper.php_example)
    5. js is async (or deferred) by default already (i.e. except when you force JS in head, in which case deferring or asyncing it would break things that need JS early)
    6a. based on some tests I ran (see this blogpost), inlining all CSS can be a huge performance-improvement as well in certain contexts.
    6b. why do you consider keeping inline CSS in a cachefile a problem?

    have a nice weekend!
    frank

  4. Binh

    Hi Frank,
    Most things are good.

    3. there should be a new version of the cache file if the content changes, that’s why I hash on content instead of filename. css/ js generated by PHP are never aggregated.

    => Yes, but you can use the content hash to decide if you need to rewrite the cache file, but use the file name hash to be the cache filename, that way the cache file will still remain the same while it’s content is updated.

    => Also for this reason, you could just store the file modification date somewhere and do an interval check instead. OR for performance’s sake you could just ignore this and give a big button: Refresh cache. Because people seldom (perhaps once a month?) modify their CSS file.

    => Or could give a development mode which allow the cache to refresh when file change. But believe me, when doing development mode I will kindly switch your plugin off anyway.

    4a. increase in load time indeed is due to HTML parsing, aggregating CSS/JS, calculating hash and (optionally) minimizing.

    =>From 3 I can see reason why it’s slower. Doesn’t this mean you are reading the files every time a page load before deciding to return the cached file? This simply mean the cache is useless because resources are still used

    => noticeable performance lost.

    ==> If you use cache file the purpose is to not re-merge the file again, Frank. This is good news. I will try.

    5. js is async (or deferred) by default already (i.e. except when you force JS in head, in which case deferring or asyncing it would break things that need JS early).

    => No, Frank. The plugin currently only merges the local JS, not the CDN JS files. Async for CDN files is hard because CDN files are most of the time available later than the local JS because of ping latency. That’s why I suggest we async the CDN JS and defer the local JS by a few seconds (waiting). I like the way you implement “try catch” very much, but it will still break my layout due to the CDN files not being ready yet.

    CDN async is so simple because you only need to add the word “async” to the script tag. But defer in the tag only defers till the page is loaded; it won’t defer till the previous JS files are fully loaded. Currently I see no async attribute, just defer, but it still won’t work as expected.

    Here is my little log when I activate your JS optimization:
    Uncaught Error: Bootstrap’s JavaScript requires jQuery bootstrap.min.js:6
    Uncaught ReferenceError: jQuery is not defined chosen.jquery.min.js:2
    Uncaught ReferenceError: jQuery is not defined jquery.colorbox-min.js:7
    Uncaught ReferenceError: jQuery is not defined jquery.fancybox.pack.js:46

    6a. based on some tests I ran (see this blogpost), inlining all CSS can be a huge performance-improvement as well in certain contexts.

    => certain context is that only when the CSS files are small. I am using bootstrap, chosen, colorbox, etc.. and total CSS files become >500KB…. This is huge performance lost if 1 user visits more than one page on my site. Also a big bandwidth waste, you know.

    6b. why do you consider keeping inline CSS in a cachefile a problem?

    => The same as keeping inline JS in a cache file, Frank. Because we pass the php variables from theme options to print as inline CSS and Javascript.

    I was very shocked when I opened my HTML code for 2 pages and see 2 different versions of cache file.. Just to find out I had variable post_id and a few more things in inline JS. Imagine if people have in header to set their CSS attributes like “color”, font-style, font-size for different language, or different versions… and then find out the css cache files also more than one.

    That’s an algorithm flaw, Frank. CSS HTML and JS HTML should be a one to multi relationship.

    And I suppose that’s my main idea to discuss here: “Fixing the reason why the cache folder grows big”, instead of “UGH! the cache folder became big, what should I do now?”

    Many thanks,
    Binh

    1. frank Post author

      regarding content vs filename hash; the content has to be taken into account not for development purposes, but for updated plugins that come with a new version of their css or js.

      don’t agree as to the “algorithm flaw”; css & js are in a one to multi relationship (check e.g. my blog, but this is the case on most other implementations I saw). The only thing breaking that is (as you confirm) some inlined specifics which can (and should) be excluded from being aggregated as discussed in this blog post. That being said I might add something in the API to stop Autoptimize from aggregating inline CSS/JS in one of the next versions, but from a performance point of view you want to aggregate as much as possible.

      RE: inlining CSS; cfr. my remark about the context (which is explained in the blogpost I linked to); if you have a low pageview/visitor rate then inlining CSS makes sense. If your visitors instead on average request a lot of pages, then you’d better not inline CSS.

      Because we pass the php variables from theme options to print as inline CSS and Javascript.

      So you’re a theme developer? Nice! In that case just put those between noptimize-tags and all will be well! :-)

      frank

      PS: Don’t hesitate to provide patches/ code contributions, I always check those out and who knows you’d become an AO co-maintainer ;-)

  5. Chris

    Hi,

    I’ve gone a bit blank on number 2 ” View source of the optimized JavaScript in those two pages.”

    I know how to view a page source of course, but I don’t have javascript on them, just links to .js files and some .js.php files.. Do you mean to view the source of those and compare, or am I missing something?

    Chris

    1. frank Post author

      no, check if there is inline javascript in the HTML itself, typically very short snippets setting page-specific js-variables.

  6. Eddie

    What about creating some kind of redirect to those files that are cached and then purged?

    Or using the same naming convention for each pages cached files and only overwriting that individual file when that specific page changes?

    1. frank Post author

      Autoptimized files are re-used across all pages that have the exact same JS (or CSS) in them. As such, there’s no one page that can be referred to in the naming convention.

      Creating a redirect-based mechanism (well, a 404-mechanism actually) has come to mind and I actually have some ideas that may one day turn into code in Autoptimize. Can I interest you in developing this part? :-)

  7. Rob

    Hi frank,

    After finding my autoptimize cache had bloated to 9.2GB and filled up my server, I tried following your instructions. However, I couldn’t find any differences between the Javascript code on two similar pages.

    Is there anything else to look out for? At the moment I’ve had to resort to disabling the JS optimisation.

    Thanks,
    Rob

    1. frank Post author

      You could open 2 very similar pages (2 blogposts) and see if they use the same autoptimize_xyz.js-file. If not, you can;

      1. open those 2 js-files
      2. un-optimize them both using http://jsbeautifier.org/
      3. compare them in your favorite IDE/ text editor

      If you’re not into that (it’s not … pleasant work), you could try the “only look in head”-option (easy but far from optimal as your linked JS at the bottom of the HTML won’t be optimized) or use the “autoptimize_js_include_inline” filter in AO’s API (less easy, but inherently better) to disable the autoptimization of any inline JS (which will solve 99% of all occurrences of this problem). You can see the code for that in autoptimize_helper.php_example.

      Hope this helps,
      frank

      1. Sam

        Thanks. I did this (added the variable) and now I don’t see my javascript files cache growing. It is less than 10 js files, i.e. about 1 MB only now. Earlier it was more than 10 GB in size.

  8. Barry Williams

    Hi Frank, I have read your post on Cache sizes and others comments.
    I am not an expert in all of this but understand a reasonable amount.
    What is the simplest way for me to reduce the cache from its current 200MB and to stop it growing again.
    Autoptimise works well on my site (better performance) but the cache is another matter.
    Looking forward to hearing from you.
    Regards

    1. frank Post author

      Hi Barry;
      The technique described here is not simple, I’ll admit. A next version of AO (the big two point ohhh) will have an option to enable/ disable the aggregation of inline styles & javascript, which in 99% of the cases are the root cause of the cache size growing.

      In the meantime you could use the AO API to exclude that inline code from being optimized by adding something like this in your (child) theme’s functions.php (warning: theme updates would override those changes, so remember to re-apply until AO 2.0 is released);


      // Returning false tells Autoptimize not to aggregate inline JS or CSS.
      add_filter('autoptimize_js_include_inline','baliasli_ao_include_inline',10,1);
      add_filter('autoptimize_css_include_inline','baliasli_ao_include_inline',10,1);
      function baliasli_ao_include_inline() {
        return false;
      }

      hope this helps,
      frank

  9. Rudra

    Ok… so this way I can handle the autoptimize cache buildup issue for sometime now. I hope v2.0 is scheduled to be released soon.

  10. mathew

    FYI, cache buildup is triggered by Jetpack, which I imagine a lot of people use.

    I think I’ve fixed the problem by adding the strings “window.WPCOM_sharing_counts,/jetpack-comment/” to the JavaScript block list.

  11. Mark

    Real world calling – afraid it’s just not possible to go through the complicated processes you describe here to resolve this for me and I’ve had to uninstall after a 3GB js cache built up over 3 months and ground my hosting and backups to a halt. Really needs to be addressed – shame as otherwise great idea/solution.

    1. frank Post author

      Morning Mark;
      The next version of AO will have an option not to aggregate inline JS (and CSS), which is in 99,999% of the cases the reason for the cache build-up.

      In the meantime you can add this code to your (child) theme’s functions.php to enforce the same behavior;

      // Returning false tells Autoptimize not to aggregate inline JS.
      add_filter('autoptimize_js_include_inline','realworld_ao_js_include_inline',10,1);
      function realworld_ao_js_include_inline() {
        return false;
      }

      hope this helps,
      frank

  12. Khalid

    Hello
    I am new to WordPress and don’t understand much of what you mention about controlling cache size. So I have a simple question: can I clear the cache and save changes without changing any settings?
    Thanks in advance

    Khalid

      1. Khalid

        Thanks Franks
        I unchecked this option aggregate inline JS and then hit save changes and clear cache.
        Wow almost 800MB clear.
        One more question, can i do same in a month or so.

      2. frank Post author

        well with “aggregate inline JS” turned off, your cache size should not grow that aggressively any more, so … :)

  13. Adam Davies

    The cache size seems ok when used in conjunction with w3tc. I disabled their caching functions in favour of autoptimize. Now I just use it to purge the cache I customised…. Works great.

  14. Lost

    Could you make an easy to understand step by step guide? WordPress became a platform for almost “anyone”. And having broad knowledge of web technologies is not really that desired for average joe. E.g.

    2. View source of the optimized JavaScript in those two pages. HOW ?
    4. Now disable JS optimization and look for similar strings in the inline and the external JavaScript. WHAT DOES IT EVEN MEAN?

    I would really appreciate Easy Step by Step guide on how to do it, because your plugin works but after a week my cache is 1GB !!!!!!!

    Thanks!

    1. frank Post author

      the easy explanation; disable “aggregate inline JS” (and also CSS).

      the other stuff is complicated, no easy step by step possible.