Category Archives: drupal

On my -limited- experience with Drupal.

On WordPress GET floods, plugin fingerprinting & keeping safe

flood sign, by alan in belfast from flickr under ccInfosec consultant and blogger Xavier Mertens suffered from a GET flood last week. The would-be DDOS originated from WordPress blogs that seemed non-related both geographically and content-wise, were using different versions of WordPress and seemed not to be compromised.

So what gives? WordPress blogs autonomously trying to DDOS other WordPress blogs? My best guess; a WordPress plugin gone rogue. The great WordPress HTTP API makes It easy enough to create a plugin that fetches targets from a control server and issues requests to those targets at a given time. It’s only a matter of hiding that behavior inside a plugin that seems useful and getting people to install that. The easiest way; finding an older plugin with an existing userbase and taking over development from the original developer (as i did with Autoptimize) is the easiest route to create your own little DDOS-ing botnet.

But all of this is pure speculation (although the UA matches the one used by WordPress HTTP API) off course. The only way to know for sure is to, for at least a sample of the flooding blogs, check what plugins they have in common. Doing so is frightfully easy using the NMAP HTTP WordPress Pugins script and if I am not mistaking Xavier is indeed looking into this.

But given the ease with which the NMAP-script can scan for WordPress plugins (there are similar scripts for e.g. Drupal modules), you might want to stop this from happening? I for one added this line in my WordPress .htaccess:

RedirectMatch 404 ^/wp-content/plugins/[^\/]*/$

Maybe you would even want to return a 404 for plugin readme.txt and index.html files as well, but I’ll consider that an assignment for you guys to chew over ;-)

And now you can even have my WordPress password!

Being slightly obsessed with security, I was delighted to discover that two factor-authentication (OTP) using Google Authenticator client is not restricted to Google applications, but is fully standardized and as such can be implemented without dependency on Google services on any system. There is code (off course varying in quality and scope) available for PHP, .NET, Java and Python (and I’m sure there are others).

As you might expect after reading the title, there is a great Google Authenticator WordPress plugin which I installed in 5 minutes time earlier today. For the Drupal-heads; Antwerp-based Attiks have a module that implements Google Authenticator OTP which looks worth checking out as well (and I’m interested in your experiences with it, actually).

Site slower with CDN? Check your CSS!

A couple of days ago we implemented a CDN for a Drupal-based website, using MaxCDN/ NetDNA and the Drupal CDN module. We were very surprised to discover the site became … slower. It took us some time to identify the problem, but CSS turned out to be the culprit;

  1. The supplier created separate CSS-file for each and every template using SASS, causing some automated duplication of CSS
  2. Some of that CSS was not in the theme .info-file and wasn’t added using drupal_add_css either, but the link was hardcoded in the template-file
  3. The CSS that was added the normal way (i.e. not hardcoded in the template) was picked up and modified by the CDN module (changing paths for e.g. images and fonts into URL’s pointing to the CDN)
  4. CSS that was hardcoded in the template was not visible for the CDN-module, so the paths were not updated and still pointed to the origin webserver
  5. Because of duplication of e.g. background-images and fonts, pointing at both the CDN and the origin-server, these assets were downloaded twice and the extra file-size resulted in a site that was slower with than without a CDN

Once we understood the problem, the solution was pretty simple;

  1. clean up the CSS, avoiding to re-declare e.g. @font-face in multiple templates (resulting in a smaller CSS file download size as well)
  2. add CSS using drupal_add_css (with the option not to aggregate, as IE8 might choke on the massive amount of CSS) for the CDN module to take that into account as well

Web Performance Optimization can be so much fun!

AddToAny now includes Lockerz tracking

Update 02-2015: things change, blogposts get out of date and indeed A2A is not owned by Lockerz any more.

AddToAny, one of the most popular sharing-widgets around, has had 3rd party tracking by Media6degrees for quite some time already. I wasn’t too happy about that, but it did have the no_3p option to disable this “functionality”. Half a year ago however AddToAny was acquired by Lockerz.com and it now includes tracking by Lockerz.com which cannot be turned off and does not check for navigator.doNotTrack either.

I’ve contacted the developer (Pat’s a swell guy, really) and he answered he would look into honoring the DoNotTrack header, which he wrote he’d love to include in Q1 somewhere. In the mean time, if you have AddToAny on your site, you can already hide the Lockerz “Earn” tab. And if you’re on WordPress, you could install (or upgrade) WP DoNotTrack, which I’ve updated to stop the Lockerz tracking (make sure lockerz.com is your blacklist).

If there’s a Drupalista out there that uses AddToAny and would like to stop Lockerz tracking; I’d be happy to co-author a Drupal DoNotTrack module, do get in touch!

Ranting & Raving at Drupal Summit 2011

I attended Drupal Summit in Genk a couple of days ago and amidst the general “Drupal is the best thing since sliced bread” atmosphere, there were some interesting discussions about the platform’s maturity. Especially the presentations of Peter Van Den Broeck (for VRT) and Wouter Mertens (for competitor VMMA) seemed to be on opposite sides, with VMMA having multiple successful Drupal-sites in production and VRT struggling to get their projects finished, telling Captain Buytaert “Dries, we’re not there yet“. But underneath the surface and despite the differences (a dedicated team of sysadmins & developers at VMMA vs fixed price, project-based external development for VRT), both were talking about the same problems on a technical level; modules & performance.

The Drupal module community is a great bunch of enthusiastic developers, building an impressive number of modules that cover almost any feature one would want to have in a website. But module quality, support & compatibility varies enormously. Some modules seem to be true Rube Goldberg machines, providing tons of functionality that only few people need but which makes the the UI a usability-hell and the code complex, error-prone and possibly a real performance-hog.

And while we’re on the subject; performance doesn’t come out of the box and Drupal as such does not scale very well. Install it with a bunch of modules, generate a reasonable amount of page requests and you have a CPU-intensive system that generates a crapload of database-connections. Adding memcache between your Drupal & MySQL helps, as most requests will be handled from cache. And putting a caching reverse proxy (Varnish, Squid or even Apache’s mod_proxy+mod_cache if you insist) in front of your Drupal does miracles, serving visitors the same content without the need for Drupal to bootstrap. So sure, you can build a scalable solution that provides great performance, but one could say this is despite Drupal, not because of it. After all, when using Memcache & Varnish almost any CMS will have great performance, won’t it?

So yeah, Drupal can be a nice solution to your problem, but it does require more than just a superficial knowledge of how to install it together with some modules and a theme. Make sure there are smart people on your team or project, that have a profound knowledge of modules & module development and who know a lot about MySQL (and clustering, mirroring or master/slave setups), Memcache and Varnish (or Squid, forget about mod_cache if you can). You’re bound to run into some problems, as Peter Van Den Broeck confirmed, but with the right people and architecture your Drupal-project can indeed be the best thing since sliced bread.

Drupal, mod_cache & RFC2616 caching

Suppose you’re setting up a Drupal-based site for which you have to implement a caching reverse proxy and for reasons beyond your comprehension Varnish (or even Squid) are not an option. Oh no, you’re stuck with Apache’s mod_proxy and mod_cache! What should you do?

First of all, Drupal 6 doesn’t like reverse proxies. If you don’t want to wait for version 7, which should do better in this respect, you might want to look at Pressflow. This Drupal 6 “distro” has everything on board to work with reverse proxies. So install Pressflow (or try to apply this out of date diff to stock Drupal) and in the Performance-screen set “Caching Mode” to “External” and “Page Cache Maximum Age” to the number of minutes you consider a cached page valid. Voila, you’re done in Drupal (edit: almost, as you might also want to change the $base_url in sites/default/settings.php to reverse proxy URL after you configured Apache).

Next up: Apache! A simple configuration like this one should do the trick:

ProxyRequests Off
ProxyPass /rp_drupal http://localhost/pressflow
ProxyPassReverse /rp_drupal http://localhost/pressflow
CacheEnable disk /rp_drupal/
CacheRoot c:/TEMP/apacache
CacheDefaultExpire 3600

OK, this must surely work, no? Well it should, but it doesn’t! When setting your Apache-loglevel to debug you’ll see “not cached” entries in your error-log, with the following reason:

Expires header already expired, not cacheable

Expires in the past, what does Pressflow think it’s doing deep down in includes/bootstrap.inc?

// HTTP/1.0 proxies do not support the Vary header, so prevent any caching
// by sending an Expires date in the past. HTTP/1.1 clients ignores the
// Expires header if a Cache-Control: max-age= directive is specified (see RFC
// 2616, section 14.9.3).
drupal_set_header('Expires', 'Sun, 11 Mar 1984 12:00:00 GMT');
// [...]
$max_age = variable_get('cache', CACHE_DISABLED) == CACHE_AGGRESSIVE && (!isset($_COOKIE[session_name()]) || isset($hook_boot_headers['vary'])) ? variable_get('page_cache_max_age', 0) : 0;
$default_headers['Cache-Control'] = 'public, max-age=' . $max_age;

Darn, those Pressflow-guys seem to have read up on their RFC’s! And indeed, 2616 confirms that cache-control’s max-age overrules expires;

If a response includes both an Expires header and a max-age directive, the max-age directive overrides the Expires header, even if the Expires header is more restrictive. This rule allows an origin server to provide, for a given response, a longer expiration time to an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache.

Mod_cache’s code seems to take a much simpler approach; at line 503 it decides not to cache based on an Expires-header in the past, totally dismissing the potential presence of cache-control’s max-age.

else if (exp != APR_DATE_BAD && exp < r->request_time)
    {
        /* if a Expires header is in the past, don't cache it */
        reason = "Expires header already expired, not cacheable";
    }

But you’re not interested in code which does or does not adhere to whatever RFC some spec-buffs came up with, you just want to cache your frigging’ Drupal-site! Well, fear not little hacker-boy, here’s some Apache-magic to cure your ailments, to be copy/pasted in the config before ProxyPass and ProxyPassReverse:

<Location /rp_drupal>
     SetEnvIf Request_Protocol "HTTP/1.1" expires_overrule
     # homework: add a SetEnvIf to see if cache-control max-age is present
     Header unset Expires env=expires_overrule
</Location>

So there you have it, a rudimentary caching setup for Drupal (in the guise of Pressflow) using nothing but Apache’s mod_proxy and mod_cache. Now go do your homework and test and do some finetuning and test some more. Happy caching!

AddToAny: removing the “spy” from the share-ware

Update 02-2015: the information below does not reflect the way AddToAny works now and as such only has historical value. The comment by A2A’s developer below, explains what has been done between 2010 and 2015.

After discovering AddToAny secretly enrolls all of my blogs visitors in a behavioral marketing platform, I disabled the plugin and mailed the author for more information. He answered the media6degrees-integration was a partner-test, only providing them with non-personally identifiable data, which the company indeed can use for targeted advertising. But the good news was that AddToAny would also offer a “publisher opt-out mechanism” shortly. And indeed, last week, Pat announced the brand new a2a api and mailed me the following opt-out code;

var a2a_config = a2a_config || {};
a2a_config.no_3p = 1;

These two lines of javascript, which have to be placed in front of the http://static.addtoany.com/menu/page.js script-include, should disable all current and future 3rd party tracking. I hope the web-guys from e.g. deredactie.be and standaard.be (and there are many others) implement this as soon as possible!

So now we can opt-out from having our visitors being spied upon by media6degrees, what more could one want? Well, since you’re asking, here’s a small list of things AddToAny could really should do;

  • transparency; tell users that their visitors’ information will be shared with 3rd parties (in all relevant places)
  • documentation: show them how to “remove the spy” on the AddToAny api page (“no_3p” isn’t there)
  • ease-of-use: allow the tracking to be disabled with a simple checkbox in the WordPress and Drupal plugins

The opt-out code is a important first step and I’m sure concerns such as those voiced on the WordPress-forums will help AddToAny to further make the right decisions!