Monthly Archives: May 2012

Simple HTML DOM Parser not that simple

Notwithstanding the name, using PHP Simple HTML DOM Parser isn’t always simple. While working on some issues with WP DoNotTrack’s SuperClean mode, I encountered these two quirks:

  1. By default, Simple HTML DOM removes line breaks. That means that when you write the modified DOM back to a string for output, some (sloppy) JavaScript is bound to break. The solution: pass extra arguments to the DOM-creating functions, as “documented” in Simple HTML DOM’s source code. For str_get_html it reads:
    function str_get_html($str, $lowercase=true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT)
    

    Set the fifth argument ($stripRN) to false to tell the parser not to remove “\r\n” sequences.

  2. Simple HTML DOM is very liberal. It is so liberal, in fact, that it will try to make a DOM out of whatever you throw at it, without even blinking. Until you try to find elements using “find” on the DOM object, that is, because at that point you might get a “Fatal error: Call to a member function find() on a non-object” thrown back at you. You can avoid that nastiness by checking that the object has a find method and, while you’re at it, also checking that there is an HTML element in the DOM:
    $html = file_get_html('http://url.to/filename.html');
    // first check that we got an object back and that $html->find exists
    if (is_object($html) && method_exists($html, 'find')) {
         // then check if the html element exists to avoid trying to parse non-html
         if ($html->find('html')) {
              // and only then start searching (and manipulating) the dom
         }
    }
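
Both quirks can be folded into a pair of small helpers. This is only a minimal sketch: the function names `sketch_parse_keeping_linebreaks` and `sketch_is_usable_dom` are made up for illustration and are not part of Simple HTML DOM, and the first helper assumes the library has been included elsewhere (it simply returns false if it has not).

```php
<?php
// Hypothetical helper: parse a string without stripping "\r\n".
// Argument order follows the str_get_html() signature quoted above.
function sketch_parse_keeping_linebreaks($markup) {
    if (!function_exists('str_get_html')) {
        return false; // Simple HTML DOM has not been loaded
    }
    // 5th argument ($stripRN) set to false keeps the line breaks
    return str_get_html($markup, true, true, DEFAULT_TARGET_CHARSET, false);
}

// Hypothetical helper: true only if $dom can be searched and contains <html>.
function sketch_is_usable_dom($dom) {
    // the parser may hand back a non-object, so test before calling find()
    if (!is_object($dom) || !method_exists($dom, 'find')) {
        return false;
    }
    // an empty result means the input was not an HTML document
    return (bool) $dom->find('html');
}
```

With these in place, the manipulation code only runs when `sketch_is_usable_dom($dom)` returns true, and anything else is passed through untouched.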

So that’s how to put the simple back into PHP Simple HTML DOM Parser. Until the next quirk comes up, because that’s what parsing HTML is all about after all, no?

WP YouTube Lyte: audio-only fixed and a small warning

So audio-only and playlist embedding were broken due to changes on YouTube’s side, but version 1.1.4 fixes that.

Some information for those users who implement WP YouTube Lyte’s audio-only embedding feature: it got broken because YouTube started to enforce the minimal player size they recently added to their TOS. Audio-only is, for many reasons, something that YouTube doesn’t want to support and indeed might even want to block, so this might break again. So if you want to be on the safe side or if you just don’t want to piss YouTube off, you probably should reconsider doing audio-only.

That being said, here’s a nice audio-only track for your audio-only ears to celebrate this new release:

Rahsaan Roland Kirk – Spirits Up Above

WP DoNotTrack 0.7.0: SuperClean and EU Cookie Law

Last night I released WP DoNotTrack version 0.7.0, which adds a new filtering mode called SuperClean. Whereas the previous version only acted on elements added to the DOM, SuperClean also lets you filter the base HTML of your pages. To do this, SuperClean uses the PHP output buffer to catch the full HTML before it is sent to the browser. That HTML is then parsed with PHP Simple HTML DOM Parser and filtered based on your blacklist or whitelist (SuperClean + whitelist = running a very tight ship, really). Currently SuperClean is not available if you have configured WP DoNotTrack to only stop tracking for people who have set the DoNotTrack flag in their browser.
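
The output-buffer part can be sketched roughly as follows. This is not the plugin’s actual code: the function names are hypothetical, `tracker.example` is a placeholder host, and a plain regular expression stands in for the Simple HTML DOM parsing step so that the example stays self-contained. It only handles the simplest case, an external script tag whose src contains a blacklisted host.

```php
<?php
// Hypothetical filter: remove <script src="..."></script> tags that load
// from a blacklisted host. A regex stand-in for the real DOM-based filtering.
function sketch_strip_blacklisted_scripts($html, array $blacklist) {
    foreach ($blacklist as $host) {
        $pattern = '#<script\b[^>]*\bsrc=["\'][^"\']*'
                 . preg_quote($host, '#')
                 . '[^"\']*["\'][^>]*>\s*</script>#i';
        $html = preg_replace($pattern, '', $html);
    }
    return $html;
}

// Hypothetical entry point: start buffering so the complete page passes
// through the filter before PHP sends it to the browser.
function sketch_start_superclean(array $blacklist) {
    ob_start(function ($buffer) use ($blacklist) {
        return sketch_strip_blacklisted_scripts($buffer, $blacklist);
    });
}
```

Calling `sketch_start_superclean(array('tracker.example'))` early in the request is enough: PHP runs the callback when the buffer is flushed at the end of the script.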

While we’re on the subject of conditional filtering: I’ve updated the code that checks for the DoNotTrack flag to work around differences in browser implementations. Conditional filtering is pretty important, as it can help websites to comply with the (for now UK-only) “EU Cookie Law”, which requires websites to ask their visitors for explicit consent prior to setting cookies. With WP DoNotTrack you can have your cookie and eat it too: you keep your existing tracking scripts for users who give consent, while still being able to serve a “clean” website to users who enabled DoNotTrack in their browser. Given that similar laws will be coming to an EU country near you, conditional filtering is something I’ll be looking into further, so any feedback on the current implementation is more than welcome!
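
For illustration, a server-side check for the flag could look something like the sketch below. It is an assumption about how such a check might be written, not the plugin’s implementation: the function name is hypothetical, and the second header (`X-Do-Not-Track`, used by some early implementations) is included only as an example of a browser difference one might want to cover.

```php
<?php
// Hypothetical check: did the visitor send a Do Not Track preference of "1"?
// $server is expected to look like $_SERVER; it is passed in as a parameter
// so the function can be exercised without a real request.
function sketch_visitor_opted_out(array $server) {
    // standard "DNT" header first, then the older "X-Do-Not-Track" variant
    foreach (array('HTTP_DNT', 'HTTP_X_DO_NOT_TRACK') as $key) {
        if (isset($server[$key]) && trim($server[$key]) === '1') {
            return true;
        }
    }
    return false;
}
```

A conditional setup would then only start filtering when `sketch_visitor_opted_out($_SERVER)` returns true, and leave the page alone otherwise.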

New Samsung firmware fixes nasty ICS Exchange bug

Last Friday I downloaded the newest official ROM from Samsung for my Galaxy SII from SamMobile.com and flashed it. I had no time over the weekend, but I have just deactivated the workaround I found on xda-developers and run some tests with meeting invitation responses and read receipts. I’m glad to confirm that I9100XWLPD indeed seems to solve the “connection error” bug which ruined my initial Ice Cream Samsung experience. Yay!