Archive for the ‘drupal’ tag
AddToAny now includes Lockerz tracking
AddToAny, one of the most popular sharing-widgets around, has had 3rd party tracking by Media6degrees for quite some time already. I wasn’t too happy about that, but it did have the no_3p option to disable this “functionality”. Half a year ago however AddToAny was acquired by Lockerz.com and it now includes tracking by Lockerz.com which cannot be turned off and does not check for navigator.doNotTrack either.
I’ve contacted the developer (Pat’s a swell guy, really) and he answered he would look into honoring the DoNotTrack header, which he wrote he’d love to include in Q1 somewhere. In the mean time, if you have AddToAny on your site, you can already hide the Lockerz “Earn” tab. And if you’re on WordPress, you could install (or upgrade) WP DoNotTrack, which I’ve updated to stop the Lockerz tracking (make sure lockerz.com is your blacklist).
If there’s a Drupalista out there that uses AddToAny and would like to stop Lockerz tracking; I’d be happy to co-author a Drupal DoNotTrack module, do get in touch!
Ranting & Raving at Drupal Summit 2011
I attended Drupal Summit in Genk a couple of days ago and amidst the general “Drupal is the best thing since sliced bread” atmosphere, there were some interesting discussions about the platform’s maturity. Especially the presentations of Peter Van Den Broeck (for VRT) and Wouter Mertens (for competitor VMMA) seemed to be on opposite sides, with VMMA having multiple successful Drupal-sites in production and VRT struggling to get their projects finished, telling Captain Buytaert “Dries, we’re not there yet“. But underneath the surface and despite the differences (a dedicated team of sysadmins & developers at VMMA vs fixed price, project-based external development for VRT), both were talking about the same problems on a technical level; modules & performance.
The Drupal module community is a great bunch of enthusiastic developers, building an impressive number of modules that cover almost any feature one would want to have in a website. But module quality, support & compatibility varies enormously. Some modules seem to be true Rube Goldberg machines, providing tons of functionality that only few people need but which makes the the UI a usability-hell and the code complex, error-prone and possibly a real performance-hog.
And while we’re on the subject; performance doesn’t come out of the box and Drupal as such does not scale very well. Install it with a bunch of modules, generate a reasonable amount of page requests and you have a CPU-intensive system that generates a crapload of database-connections. Adding memcache between your Drupal & MySQL helps, as most requests will be handled from cache. And putting a caching reverse proxy (Varnish, Squid or even Apache’s mod_proxy+mod_cache if you insist) in front of your Drupal does miracles, serving visitors the same content without the need for Drupal to bootstrap. So sure, you can build a scalable solution that provides great performance, but one could say this is despite Drupal, not because of it. After all, when using Memcache & Varnish almost any CMS will have great performance, won’t it?
So yeah, Drupal can be a nice solution to your problem, but it does require more than just a superficial knowledge of how to install it together with some modules and a theme. Make sure there are smart people on your team or project, that have a profound knowledge of modules & module development and who know a lot about MySQL (and clustering, mirroring or master/slave setups), Memcache and Varnish (or Squid, forget about mod_cache if you can). You’re bound to run into some problems, as Peter Van Den Broeck confirmed, but with the right people and architecture your Drupal-project can indeed be the best thing since sliced bread.
Drupal, mod_cache & RFC2616 caching
Suppose you’re setting up a Drupal-based site for which you have to implement a caching reverse proxy and for reasons beyond your comprehension Varnish (or even Squid) are not an option. Oh no, you’re stuck with Apache’s mod_proxy and mod_cache! What should you do?
First of all, Drupal 6 doesn’t like reverse proxies. If you don’t want to wait for version 7, which should do better in this respect, you might want to look at Pressflow. This Drupal 6 “distro” has everything on board to work with reverse proxies. So install Pressflow (or try to apply this out of date diff to stock Drupal) and in the Performance-screen set “Caching Mode” to “External” and “Page Cache Maximum Age” to the number of minutes you consider a cached page valid. Voila, you’re done in Drupal (edit: almost, as you might also want to change the $base_url in sites/default/settings.php to reverse proxy URL after you configured Apache).
Next up: Apache! A simple configuration like this one should do the trick:
ProxyRequests Off ProxyPass /rp_drupal http://localhost/pressflow ProxyPassReverse /rp_drupal http://localhost/pressflow CacheEnable disk /rp_drupal/ CacheRoot c:/TEMP/apacache CacheDefaultExpire 3600
OK, this must surely work, no? Well it should, but it doesn’t! When setting your Apache-loglevel to debug you’ll see “not cached” entries in your error-log, with the following reason:
Expires header already expired, not cacheable
Expires in the past, what does Pressflow think it’s doing deep down in includes/bootstrap.inc?
// HTTP/1.0 proxies do not support the Vary header, so prevent any caching
// by sending an Expires date in the past. HTTP/1.1 clients ignores the
// Expires header if a Cache-Control: max-age= directive is specified (see RFC
// 2616, section 14.9.3).
drupal_set_header('Expires', 'Sun, 11 Mar 1984 12:00:00 GMT');
// [...]
$max_age = variable_get('cache', CACHE_DISABLED) == CACHE_AGGRESSIVE && (!isset($_COOKIE[session_name()]) || isset($hook_boot_headers['vary'])) ? variable_get('page_cache_max_age', 0) : 0;
$default_headers['Cache-Control'] = 'public, max-age=' . $max_age;
Darn, those Pressflow-guys seem to have read up on their RFC’s! And indeed, 2616 confirms that cache-control’s max-age overrules expires;
If a response includes both an Expires header and a max-age directive, the max-age directive overrides the Expires header, even if the Expires header is more restrictive. This rule allows an origin server to provide, for a given response, a longer expiration time to an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache.
Mod_cache’s code seems to take a much simpler approach; at line 503 it decides not to cache based on an Expires-header in the past, totally dismissing the potential presence of cache-control’s max-age.
else if (exp != APR_DATE_BAD && exp < r->request_time)
{
/* if a Expires header is in the past, don't cache it */
reason = "Expires header already expired, not cacheable";
}
But you’re not interested in code which does or does not adhere to whatever RFC some spec-buffs came up with, you just want to cache your frigging’ Drupal-site! Well, fear not little hacker-boy, here’s some Apache-magic to cure your ailments, to be copy/pasted in the config before ProxyPass and ProxyPassReverse:
<Location /rp_drupal>
SetEnvIf Request_Protocol "HTTP/1.1" expires_overrule
# homework: add a SetEnvIf to see if cache-control max-age is present
Header unset Expires env=expires_overrule
</Location>
So there you have it, a rudimentary caching setup for Drupal (in the guise of Pressflow) using nothing but Apache’s mod_proxy and mod_cache. Now go do your homework and test and do some finetuning and test some more. Happy caching!
AddToAny: removing the “spy” from the share-ware
Update 01-2012: AddToAny now includes tracking by parent company Lockerz.com which cannot easily be disabled.
After discovering AddToAny secretly enrolls all of my blogs visitors in a behavioral marketing platform, I disabled the plugin and mailed the author for more information. He answered the media6degrees-integration was a partner-test, only providing them with non-personally identifiable data, which the company indeed can use for targeted advertising. But the good news was that AddToAny would also offer a “publisher opt-out mechanism” shortly. And indeed, last week, Pat announced the brand new a2a api and mailed me the following opt-out code;
var a2a_config = a2a_config || {};
a2a_config.no_3p = 1;
These two lines of javascript, which have to be placed in front of the http://static.addtoany.com/menu/page.js script-include, should disable all current and future 3rd party tracking. I hope the web-guys from e.g. deredactie.be and standaard.be (and there are many others) implement this as soon as possible!
So now we can opt-out from having our visitors being spied upon by media6degrees, what more could one want? Well, since you’re asking, here’s a small list of things AddToAny could really should do;
- transparency; tell users that their visitors’ information will be shared with 3rd parties (in all relevant places)
- documentation: show them how to “remove the spy” on the AddToAny api page (“no_3p” isn’t there)
- ease-of-use: allow the tracking to be disabled with a simple checkbox in the WordPress and Drupal plugins
The opt-out code is a important first step and I’m sure concerns such as those voiced on the WordPress-forums will help AddToAny to further make the right decisions!


