Why your WordPress blog needs DoNotTrack

So what’s with all that nagging about tracking and that DoNotTrack plugin, you might wonder? Well, it’s pretty simple actually.

  1. Some very popular WordPress plugins include 3rd party tracking, sometimes even without properly disclosing, often without means to disable this behavior
  2. 3rd party tracking has privacy implications: all your visitors are tracked by the 3rd party, in general for behavioral marketing purposes (depending on what data is captured, tracking might even be illegal in some countries)
  3. 3rd party tracking has a performance impact: every visit to your blog will include between 2 and 5 extra requests for the 3rd party tracking to succeed, effectively delaying full page rendering

It is my conviction that blog owners should be able to install and use WordPress plugins without having to worry about undisclosed tracking and that plugins should provide a way to disable such 3rd party tracking if included.
As this is not the case yet, we have to resort to (messy) solutions to stop unwanted tracking from happening. And that’s exactly what DoNotTrack does. It’s a small javascript-hack in a WordPress-plugin to stop 3rd party tracking introduced by some of the most popular plugins.
Some details from the readme.txt:

  • What works:
  • What does not work (yet): Tracking code added using innerHTML or appendChild/insertBefore is not yet intercepted (but I’m working a solution for that)
  • What else might be added:
  • How you can help:
    • Provide me with links to plugins that include browser-based tracking + domain where the tracking is done.
    • Provide me with known opt-out code (javascript) to disable tracking services on a site.
    • Tell plugin writers you’re not happy with 3rd party tracking!
    • Tell your visitors about tracking & privacy, link to e.g. http://www.privacychoice.org/

And remember: if you host your WordPress blog yourself, you and nobody else should be able to decide who tracks your users!

Google Analytics for the privacy aware

While the entire German blogosphere seems to have discovered the pretty unpleasant, secretive inclusion of Quantcast tracking in the “WordPress.com Stats” plugin, I found an article on the blog that broke the story in Germany, that explains how you can somewhat limit (valid) privacy-concerns with Google Analytics.
You just have to push “_gat._anonymizeIp” as an option in the _gaq object, as shown on line 5 in this code snippet:


According to the relevant Google Analytics docs page, this:

“Tells Google Analytics to anonymize the information sent by the tracker objects by removing the last octet of the IP address prior to its storage. Note that this will slightly reduce the accuracy of geographic reporting.”

Call me naive (or overly idealistic), but shouldn’t your Google Analytics implementation have this option on as well?

Quantcast spyware puts selfhosted WordPress blogs in Automattic network

A quick update about the WordPress.com Stats plugin secretive inclusion of Quantcast tracking:

Coding for the New Year

Just a quickie before diving into 2011;

And this is how I feel about 2011:

Jon Hopkins - Light Through The Veins (Full 9 Minute HQ Version)

Have a great New Year!

WordPress.com Stats trojan horse for Quantcast tracking

Suppose you’re a blogger who values website performance and online privacy. You may have ditched Google Analytics because you think the do-no-evilers do not have to know who is on your site. Maybe you removed AddtoAny because of the 3rd party tracking code that slows down your site ever oh so slightly. And you don’t want the omnipresent Facebook Like widget for all the above reasons. No, the only 3rd party javascript you allow is the one pushed by the WordPress.com Stats plugin; one javascript-file  and one pixel and you get some nice stats in return. And come on, WordPress, those are the good guys, right?
Well, apparently not. While performing a test on for example webpagetest.org, you’ll see two requests to the quantserve.com domain;

http://edge.quantserve.com/quant.js
http://pixel.quantserve.com/pixel;r=705640318;fpan=1;fpa=P0-450352291-1292419712624;ns=0;url=http%3A%2F%2Fblog.futtta.be%2F;ref=;ce=1;je=1;sr=1024x768x32;enc=n;ogl=;dst=1;et=1292419712624;tzo=300;a=p-18-mFEk4J448M;labels=type.wporg

Ouch, that hurts! But surely Quantcast aren’t in the same league as AddtoAny’s media6degrees, who do behavioral advertising based on data captured all across the web? Well … Quantcast might be better known, but they do exactly the same thing; collecting user information and providing that info for targeted advertising. And just so you know, Quantcast is one of the companies that is on trial for restoring deleted cookies using Flash (“zombie cookies”). So no, I’m not comfortable with Quantcast collecting data on my blog’s visitors.
Now I know that I opted in on user-tracking by WordPress (or rather Automattic). And I can live with them knowing who visits my blog, I can live with the small performance-impact that the stats-plugin has on my site that way. But I did not sign up for 3rd party tracking, the plugin-page conveniantly fails to mention the extra tracking, there’s no opt-out mechanism in the plugin and there’s no info to be found on how to disable Quantcast tracking users on my own blog. I am not a happy WordPress-blogger!
So Automattic; please fess up and at least provide instructions on how to disable 3rd party tracking, just like AddtoAny’s Pat gracefully did?


Update 20 january 2011; Automattic seems unwilling to acknowledge there is a problem, the thread on wordpress.org forums where this was discussed has been closed. I created a small WordPress plugin, DoNotTrack, to stop Quantcast tracking. you can download it here.

Venus doesn’t love noscript

Damn, Venus doesn’t love noscript!
You’ve got no clue what I’m rambling about, do you? Well, allow me to explain;

  • Venus is a planet-like web application that aggregates rss- and atom feeds from a community or about a specific subject (see e.g. planet mozilla, planet debian and the much-loved planet grep)

So now you know the context, let me reiterate; Venus doesn’t treat noscript the way it should! It not only strips out javascript as it should (are you listening tt-rss?) but it replaces noscript-tags and all HTML inside with escaped HTML (with HTML-entities actually). And that, my beloved ones, means that the HTML that WP YouTube Lyte generates, doesn’t work properly on Venus-based planets.
So I started looking at the Venus source and mailed with Planet Grep’s Wouter Verhelst to solve this issue. At first sight the solution seemed pretty straightforward; Venus shouldn’t ‘escape’ noscript but should instead just strip the opening and closing noscript-tag. Wouter installed a small sed-filter I wrote and added noscript to the whitelist of Venus’s sanitizer (which is based on Universal Feed Parser) and … it did not work.
The problem apperantly is with another sanitizing component in Venus; html5lib. Sam Ruby, the developer of Venus, wrote on the mailinglist;

There are multiple sanitization passes involved here. […] The html5parser seems to think that noscript is to be parsed as text only, which would result in the behavior that you describe.  Looking at the current HTML5 spec, it appears that this does not match the expected behavior — so perhaps that changed too.

So I started looking at html5lib and … well, I’m stuck, html5lib is a pretty complex beast for a smalltime non-developer to dive into. So earlier today I turned to the html5lib discussion list to ask how sanitization can be configured not to escape noscript, let’s hope someone will enlighten me. Because until then those poor Planet Greppers won’t be able to see (a thumbnail of) Al Jarreau’s great version of Take Five way back in 1976:

Al Jarreau 1976 -Take Five

The bulleted WP YouTube Lyte bulletin

It’s been a while since my last “state of WP YouTube Lyte” post, so here are the most important tidbits since;

Feedback is welcome; see info in the FAQ for bug reports/ feature requests and feel free to rate and/or report on compatibility on wordpress.org.
And as befits a post about WP YouTube Lyte on a Friday afternoon, here’s a little YouTube from Gold Panda you can start your weekend with;

The state of WP YouTube Lyte (now with fresh Pomplamoose)

Although it has been a few months since I last wrote about my baby WordPress plugin, time did not stand still between version 0.3.0 and 0.5.2; the player size can now be changed in the options-screen, I’ve replaced my newTube html5-hack with Google’s official (yet experimental) new html5-compatible embed code and I started migrating the CSS from the mess that had become the JavaScript-file. And I almost forgot what may be the most important change; I started searching for blogs that use WP-YouTube-Lyte to see how it behaves in the wild. Some of the bugs I discovered that way;

But with all those changes you might start to wonder if WP-YouTube-Lyte still reduces download size & rendering time substantially, no? So I ran a couple of new tests for this page on my blog (it has 3 embedded YouTube’s) on webpagetest.org (settings: 5 runs on IE7 via Amsterdam, excluding requests to stats.wordpress.com). The difference is … well, judge for yourself (or see below the tables for the summary)
With normal Flash-based embeds (full results here):

Document CompleteFully Loaded
Load TimeFirst ByteStart RenderTimeRequestsBytes InTimeRequestsBytes In
First View1.850s0.634s1.330s1.850s15343 KB5.350s22524 KB
Repeat View1.142s0.346s0.497s1.142s517 KB2.455s517 KB

And with WP YouTube Lyte (full results here):

Document CompleteFully Loaded
Load TimeFirst ByteStart RenderTimeRequestsBytes InTimeRequestsBytes In
First View1.201s0.355s0.974s1.201s1055 KB2.065s20103 KB
Repeat View0.605s0.352s0.473s0.605s212 KB1.447s514 KB

Did you see that? Less requests, less data and faster rendering for first and repeat views. Hurray for WP-YouTube-Lyte! But enough with that ego-tripping already, I’ve got an Opera-bug to look into! Or wait, I’ll watch this great new Pomplamoose+Ben Folds+Nick Hornby  videosong first:

Ben Folds, Nick Hornby, & Pomplamoose VideoSong!!!!

Protecting wp-contact-form from spam

Ever since I installed WordPress on my (virtual) server, I’ve been using the WP Contact Form plugin to provide me with simple contact form. The plugin isn’t exactly under active development (Last Updated: 2009-8-28), but it got the job done and I was quite happy with it. Until spammers found the page and started abusing it, that is. There’s a bunch of other Contactform-plugins in the wordpress.org plugins repository, but most of them were either too feature-packed or development for them seemed to have stopped.
I considered adding ReCaptcha at first, but why would I want to put my visitors through such an ordeal; the captcha’s seem to have gotten very difficult to decipher.  Next possibility; implement Akismet (Mollom would have been a great choice as well)? There’s a great Akismet PHP5-class, you just provide your API-key and off you go. But it seemed kind of inefficient to have to do all that with the official Akismet-plugin already in place?
But wait a minute, why not just piggyback on the Akismet-plugin, as the Clean-contact plugin and wp-contactform-akismet did? Keep it simple stupid and so I just copy/pasted the clean_contact_akismet-function from Clean Contact’s code into my wp-content/plugins/wp-contact-form/wp-contactform.php and on line 142 I changed:

mail($recipient, $subject, $fullmsg, $headers);
$results = '
' . $success_msg . '
'; echo $results;

into:

$akismet=clean_contact_akismet($msg,$subject,$email,$name);
if (!$akismet) {
mail($recipient, $subject, $fullmsg, $headers);
$results = $success_msg;
} else {
$results = 'If it looks like spam and smells like spam, it must be spam. Leave (or rephrase)!';
}
echo '
'.$results.'
';

That was all it took to add Akismet spam-filtering to that KISS-y wp-contact-form plugin. I wonder why this isn’t in the plugin already?

WordPress stats oddity

A couple of months ago I removed Google Analytics from this wee little blog, but I still use the less sophisticated WordPress.com stats plugin to follow up on what is being read around here. Loading the stats-page in my Android-browser is far from optimal (it uses Flash to draw the charts and it is a pretty big page), so I was pleased to read that version 1.3 of WordPress for Android features a stats-section. But the reports don’t work, I just get “No stats data found, please try again later”.
Now while playing around with the stats API over the weekend, I noticed some unusual things:

  • “blog_uri=futtta.wordpress.com” works
  • “blog_uri=blog.futtta.be” results in “Error: zero rows returned.”

The API also supports blog_id instead of blog_uri and after some digging for blog_id’s I found that they are listed in (the html source of) the blog dropdown-list on your wordpress.com-dashboard stats page. And there the problem became clear: I had two blog_id ‘s for the same blog_uri (blog.futtta.be) and the first one was defunct:

  • “blog_id=2184847” results in “Error: zero rows returned.”
  • “blog_id=2185033” works just fine

As I can’t remove the entry with the faulty blog_id from the Stats DB and as I can’t change the Android WordPress app to use one of multiple blog_id’s instead of the blog_uri, I can’t fix this little bugger myself, so I contacted WordPress Support. But how did I end up with 2 blog_id’s in the database?