PHP HTML parsing performance shootout; regex vs DOM

As I wrote earlier, an Autoptimize user proposed switching from regular-expression-based script & style extraction to PHP's native DOM functions (optionally with XPath). I created a small test-script to compare performance, and the DOM methods are on average 500% slower than the preg_match-based solution. Here are some details:

  • There are 3 tests: regular-expression-based (preg_match), DOM + getElementsByTagName and DOM + XPath. You can see the source here and see it in action here.
  • The code in all 3 test cases does what Autoptimize does to start with when optimizing JavaScript:
    1. extract all JavaScript (code if inline, URL if external) and add it to an array
    2. remove the JavaScript from the HTML
  • With each load of the test-script, the 3 tests get executed 100 times and the total time per method is displayed.
  • That test-script was run 5 times on 3 different HTML files: one small mobile page with some JavaScript and two bigger desktop ones with lots of JS.
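
The two steps each test performs can be sketched as follows. This is only an illustration in JavaScript, not the PHP preg_match code from the test-script: the function names `extractScripts` and `timeRuns`, the regular expressions and the sample page are all assumptions made for the example.

```javascript
// Hypothetical sketch of the regex-based approach; not Autoptimize's actual code.
// Step 1: collect every script (code if inline, URL if external) in an array.
// Step 2: remove the script tags from the HTML.
function extractScripts(html) {
  const scripts = [];
  // Non-greedy, case-insensitive match of <script ...>...</script>, newlines included.
  const scriptTag = /<script\b([^>]*)>([\s\S]*?)<\/script\s*>/gi;
  const stripped = html.replace(scriptTag, (match, attrs, body) => {
    const src = /\bsrc\s*=\s*(["'])(.*?)\1/i.exec(attrs);
    if (src) {
      scripts.push({ type: "external", url: src[2] });
    } else {
      scripts.push({ type: "inline", code: body.trim() });
    }
    return ""; // drop the tag from the HTML
  });
  return { scripts: scripts, html: stripped };
}

// Run a function a number of times and return the total time in milliseconds,
// mirroring the "execute 100 times, show total time" setup described above.
function timeRuns(fn, runs) {
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    fn();
  }
  return Date.now() - start;
}

// Made-up sample page for the demonstration.
const page =
  '<html><head><script src="/js/app.js"></script></head>' +
  "<body><p>Hello</p><script>var a = 1;</script></body></html>";

const result = extractScripts(page);
console.log(result.scripts);
console.log(result.html);
console.log("100 runs took", timeRuns(() => extractScripts(page), 100), "ms");
```

A pattern like this is easily fooled, for instance by a script tag inside an HTML comment or by a closing script tag inside a JavaScript string, which is part of why regex-based HTML parsing is frowned upon.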

The detailed results:

| | total time regex | total time DOM | total time DOM+XPath |
|---|---|---|---|
| arturo’s HP | 0.611 | 4.8366 | 4.977 |
| deredactie HP | 2.3322 | 5.615 | 5.879 |
| m deredactie HP | 0.0696 | 0.4604 | 0.4558 |

So while parsing HTML with regular expressions might be frowned upon in developer communities (and rightly so, as a lot can go wrong with PCRE in PHP), it is vastly superior with regard to performance. In the very limited scope of Autoptimize, where the regex-based approach is tried & tested on thousands of blogs, using DOM would simply create too much overhead.
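
To put a number on the gap, the slowdown per page can be computed from the results table. The sketch below assumes the three values in each row are regex, DOM and DOM+XPath, in that order, and that the digits split as shown in the `results` array; the helper name `meanSlowdown` is made up for the example.

```javascript
// Timings copied from the results table (assumed column order: regex, DOM, DOM+XPath).
const results = [
  { page: "arturo's HP", regex: 0.611, dom: 4.8366, xpath: 4.977 },
  { page: "deredactie HP", regex: 2.3322, dom: 5.615, xpath: 5.879 },
  { page: "m deredactie HP", regex: 0.0696, dom: 0.4604, xpath: 0.4558 },
];

// Average of (method time / regex time) over all pages.
function meanSlowdown(rows, method) {
  const ratios = rows.map((row) => row[method] / row.regex);
  return ratios.reduce((sum, ratio) => sum + ratio, 0) / ratios.length;
}

for (const row of results) {
  console.log(
    row.page,
    "DOM:", (row.dom / row.regex).toFixed(1) + "x",
    "DOM+XPath:", (row.xpath / row.regex).toFixed(1) + "x"
  );
}
console.log("mean DOM slowdown:", meanSlowdown(results, "dom").toFixed(2) + "x");
console.log("mean DOM+XPath slowdown:", meanSlowdown(results, "xpath").toFixed(2) + "x");
```

Under those assumptions both DOM variants take between five and six times as long as the regex version on average, with the smallest gap on the deredactie homepage.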

Sharing widgets harm your website’s performance


[UPDATE: I reworked lyteShare into a standalone javascript-thingie]
Doing Web Performance can be so easy, really! I was asked to do a performance analysis of a new website and one of the things I didn’t like was the fact that the footer contained social media sharing buttons using the ShareThis widget. I’m not a fan of sharing widgets in general, as they tend to slow webpage loading and rendering down and as they almost invariably come with “3rd party tracking” for behavioral marketing purposes.
So why not do a quick comparison between a simple page with ShareThis, AddThis, AddToAny/Lockerz share and one which uses inline JavaScript to render the buttons? For that purpose I quickly created lyteShare, an inline JavaScript thingie that dynamically adds the Facebook, Twitter and Google Plus sharing buttons after the load event has been fired. I’m not going to bother you with code (but you can look at the page’s source here if you want); it’s probably far from perfect and it sure isn’t pretty, but it works and the webpagetest.org results tell it all.

| | ShareThis | AddThis | Lockerz/AddToAny | inline JS (“lyteShare”) |
|---|---|---|---|---|
| Document Complete | 0.677s | 0.487s | 1.352s | 0.283s |
| Start Render | 0.715s | 0.279s | 0.304s | 0.298s |
| Fully Loaded | 1.507s | 3.718s | 1.407s | 0.500s |
| Full Download size | 70 KB | 384 KB | 111 KB | 7 KB |
| Test Report | sharethis result | addthis result | lockerz/addtoany result | lyteshare result |
| 3rd party tracking? | yes | yes | yes | no |

So yep, ShareThis, AddThis and AddToAny/Lockerz (and all sharing widgets really) are performance hogs that also track your visitors’ every move while offering little or no added value over what anyone could do with some simple JavaScript (or server-side code, for that matter).
Conclusion: if performance is of any importance for your website (and it should be), you really have to avoid using 3rd party widgetery!
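
For readers wondering what "some simple JavaScript" could look like, here is a rough sketch of the idea. It is not the lyteShare source: the function names, the `share` element id and the exact share URLs are assumptions for the example (and Google Plus has since been shut down, so that link is of historical interest only).

```javascript
// Hypothetical sketch of inline share links; not the actual lyteShare code.
// Builds plain share URLs, so no third-party script is loaded and nothing is tracked.
function buildShareLinks(pageUrl, pageTitle) {
  const u = encodeURIComponent(pageUrl);
  const t = encodeURIComponent(pageTitle);
  return [
    { name: "Facebook", href: "https://www.facebook.com/sharer/sharer.php?u=" + u },
    { name: "Twitter", href: "https://twitter.com/intent/tweet?url=" + u + "&text=" + t },
    { name: "Google Plus", href: "https://plus.google.com/share?url=" + u },
  ];
}

// Append one plain <a> element per share link to the given container.
function renderShareLinks(container, links) {
  links.forEach((link) => {
    const a = document.createElement("a");
    a.href = link.href;
    a.textContent = link.name;
    a.target = "_blank";
    a.rel = "noopener";
    container.appendChild(a);
    container.appendChild(document.createTextNode(" "));
  });
}

// Only in a browser: wait for the load event so the links never delay rendering.
if (typeof window !== "undefined") {
  window.addEventListener("load", () => {
    const container = document.getElementById("share"); // assumed element id
    if (container) {
      renderShareLinks(container, buildShareLinks(location.href, document.title));
    }
  });
}
```

Because the links are added only after the load event, they cannot push back Start Render or Document Complete, and the whole thing weighs a few hundred bytes.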

Tomorrow’s phone, now!

As every boy could tell you, it’s our toys that keep us kind of young. Because of that, and as I work for a telco, I can’t help but regularly buy a new phone. Over the years I’ve had, among others, a Nokia 7110, a Sony-Ericsson T68i and a Qtek 9100, and I currently own a secondhand Nokia E61i. But time flies and my E61i is aging fast (maybe if I didn’t drop it that often …), so in a few months’ time I’m buying a new smartphone. Time to start shopping for pics, specs and reviews!
Smetty recently asked for advice on this topic as well; she was thinking about the Nokia E71 as a cheaper alternative to the iPhone 3G. But I won’t be buying Apple’s must-have gadget any time soon: although it has some superb features (OS, browser and that multi-touch interface), it lacks a real keyboard, has no tethering and doesn’t allow applications to run in the background. And last but not least, the platform is far too closed to appeal to an open-standards and open-source-minded wannabe-geek like me. All Windows Mobile-based devices are banned from my shortlist as well; I really don’t like the OS and its GUI, it feels too much like Windows 3.11 to me.
I’ll probably end up buying either the Nokia E71, an HTC Dream (the Google-phone) or the Palm Pré. So let’s do a pro & con list; comparison tables are always fun, no?

Pro, Nokia E71:
  • Symbian is a proven OS
  • Lots of great software
  • Great battery life (1500mAh battery and only QVGA)
  • Builds on Nokia’s experience with the E61(i)
  • It’s a bit smaller than my E61i (which is … biggish)
  • Has tethering
Pro, HTC Dream:
  • Google Android is a Linux-based OS
  • Google is an important player, lots of companies will be releasing Android-based phones in the coming months
  • HTC is one of the greatest cellphone manufacturers, they have loads of experience. My Qtek 9100 was an HTC device as well.
  • Higher screen resolution (HVGA)
Con, Nokia E71:
  • Symbian feels old and is not always that reliable on my E61i (why does it soft-reset when the browser crashes?)
  • Lower screen resolution (QVGA)
  • Less readable than the E61i (same resolution but smaller screen)?
Con, HTC Dream:
  • No tethering!
  • Battery life not that great (1150mAh battery combined with a thirsty HVGA screen)
  • It’s early days for Android, not sure if it’s mature enough
  • Not available through normal channels in Belgium, except for some obscure webshop where it’s already sold out
Con, Palm Pré:
  • How about battery life (rumours claim 1150-1350 mAh, combined with power-hungry HVGA)?
  • Not available yet, no release date announced (not for USA, and certainly not for Europe)

The conclusion: although it still is vaporware, there’s some extreme chemistry going on between me and that darned Palm Pré. It’s the most exciting device by far and if it is for sale in Belgium, it’ll be hard to resist. The HTC Dream doesn’t seem to do it for me: no chemistry on the one hand, and not the “safe choice” either, as that award is easily claimed by the Nokia E71. So Palm Pré if available in June/July, Nokia E71 otherwise?