Simple HTML DOM Parser not that simple

Notwithstanding the name, using PHP Simple HTML DOM Parser isn’t always simple. While working on some issues with WP DoNotTrack’s SuperClean mode, I encountered these two quirks:

  1. By default Simple HTML DOM removes line breaks. That means that when you write the modified DOM back to a string for output, some (sloppy) JavaScript, such as inline code that relies on line breaks instead of semicolons, is bound to break. The solution: pass extra arguments to the DOM-creating functions, as “documented” in Simple HTML DOM’s source code. For str_get_html the signature reads:
    function str_get_html($str, $lowercase=true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT)
    

    Set the 5th argument ($stripRN) to false to tell the parser not to remove “\r\n” sequences.

  2. Simple HTML DOM is very liberal. It is so liberal, in fact, that it will try to make a DOM out of whatever you throw at it, without even blinking. Until you try to find elements using “find” on the DOM object, that is, because at that point you might get a “Fatal error: Call to a member function find() on a non-object” thrown back at you. You can avoid that nastiness by checking the object for the existence of the find method and, while you’re at it, also checking whether there is an HTML element in the DOM:
    $html = file_get_html('http://url.to/filename.html');
    // first check if $html->find exists
    if (method_exists($html,"find")) {
         // then check if the html element exists to avoid trying to parse non-html
         if ($html->find('html')) {
              // and only then start searching (and manipulating) the dom
         }
    }
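
Both workarounds can be combined in one small helper. What follows is only a sketch: the function name is made up for illustration, it assumes simple_html_dom.php has already been loaded, and it simply hands back the original markup whenever parsing looks unsafe.

```php
<?php
// Hypothetical helper, not part of Simple HTML DOM or WP DoNotTrack.
// Returns the re-serialized markup, or the input unchanged when it cannot
// be parsed safely.
function reserialize_keeping_linebreaks($markup)
{
    // Assumption: simple_html_dom.php has been loaded elsewhere; if it has
    // not, leave the markup alone.
    if (!function_exists('str_get_html')) {
        return $markup;
    }

    // Arguments 2 to 4 repeat the defaults from the signature quoted above;
    // the 5th ($stripRN) is false so "\r\n" sequences are kept.
    $html = str_get_html($markup, true, true, DEFAULT_TARGET_CHARSET, false);

    // Same guards as above: a usable object that contains an html element.
    if (!is_object($html) || !method_exists($html, 'find') || !$html->find('html')) {
        return $markup;
    }

    // ... search and manipulate the DOM here ...

    return $html->save();
}

// This fragment has no html element, so it comes back untouched.
echo reserialize_keeping_linebreaks("<p>first line\nsecond line</p>");
```

Falling back to the untouched input means that a page the parser cannot handle is sent out as-is instead of dying with a fatal error.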

So that’s how to put the simple back into PHP Simple HTML DOM Parser. Until the next quirk comes up, because that’s what parsing HTML is all about after all, no?

WP YouTube Lyte: audio-only fixed and a small warning

So audio-only and playlist embedding were broken due to changes on YouTube’s side, but version 1.1.4 fixes that.
Some information for users who rely on WP YouTube Lyte’s audio-only embedding feature: it broke because YouTube started to enforce the minimal player size they recently added to their TOS. Audio-only is, for many reasons, something that YouTube doesn’t want to support and indeed might even want to block, so this might break again. So if you want to be on the safe side or if you just don’t want to piss YouTube off, you probably should reconsider doing audio-only.
That being said, here’s a nice audio-only track for your audio-only ears to celebrate this new release:

(Rahsaan Roland Kirk – Spirits Up Above)

As found on the web (May 23rd)

frank published Firefox Mobile Beta: native UI at last!

WP DoNotTrack 0.7.0: SuperClean and EU Cookie Law

Last night I released WP DoNotTrack version 0.7.0, which adds a new filtering mode called SuperClean. Whereas the previous version only acted on elements added to the DOM, SuperClean now also allows you to filter the base HTML of your pages. To do this, SuperClean uses the PHP output buffer to catch the full HTML before it is sent to the browser. That HTML is then parsed with PHP Simple HTML DOM Parser and the filtering is applied based on your blacklist or whitelist (SuperClean + whitelist = running a very tight ship, really). Currently SuperClean is not available if you have configured WP DoNotTrack to only stop tracking for people who have set the DoNotTrack flag in their browser.
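
The output-buffering step can be sketched roughly as follows. This is not the plugin’s actual code: the function name, the blacklisted host tracker.example.com and the sample markup are made up for illustration, and a simple regular expression stands in for the Simple HTML DOM parsing that SuperClean really uses.

```php
<?php
// Hypothetical filter: drops external <script> tags whose src contains a
// blacklisted host. A regex stands in for real DOM parsing here.
function strip_blacklisted_scripts($html, array $blacklist)
{
    foreach ($blacklist as $host) {
        $pattern = '#<script\b[^>]*\bsrc=["\'][^"\']*' . preg_quote($host, '#')
                 . '[^"\']*["\'][^>]*>\s*</script>#i';
        $html = preg_replace($pattern, '', $html);
    }
    return $html;
}

// Everything echoed after ob_start() is buffered and handed to the
// callback before it is sent to the browser.
ob_start(function ($buffer) {
    return strip_blacklisted_scripts($buffer, array('tracker.example.com'));
});

echo '<html><body><p>Hello</p>';
echo '<script src="http://tracker.example.com/t.js"></script>';
echo '</body></html>';

// Runs the callback on the buffered page and sends the result.
ob_end_flush();
```

Because the callback sees the complete page at once, it can act on markup that is already in the base HTML, which is exactly what filtering only DOM insertions cannot do.
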
While we’re on the subject of conditional filtering: I’ve updated the code that checks for the DoNotTrack flag to work around differences in browser implementations. Conditional filtering is pretty important, as it can help websites to comply with the (for now UK-only) “EU Cookie Law”, which requires websites to ask their visitors for explicit consent prior to setting cookies. With WP DoNotTrack you can have your cookie and eat it too: you keep your existing tracking scripts for users who give consent, while still being able to serve a “clean” website to users who enabled DoNotTrack in their browser. Given that similar laws will be coming to an EU country near you, conditional filtering is something I’ll be looking into further, so any feedback on the current implementation is more than welcome!
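
For illustration, one possible server-side check for the DoNotTrack flag could look like the sketch below. The function name is hypothetical, and checking the older X-Do-Not-Track header next to DNT is an assumption; which variants WP DoNotTrack really tests for is not spelled out here.

```php
<?php
// Hypothetical helper: true when the request carries a DoNotTrack opt-out.
// $server is expected to look like PHP's $_SERVER array.
function visitor_sent_do_not_track(array $server)
{
    // A "DNT: 1" request header shows up in PHP as HTTP_DNT.
    // Assumption: the older X-Do-Not-Track header is honoured as well.
    foreach (array('HTTP_DNT', 'HTTP_X_DO_NOT_TRACK') as $key) {
        if (isset($server[$key]) && trim($server[$key]) === '1') {
            return true;
        }
    }
    return false;
}

if (visitor_sent_do_not_track($_SERVER)) {
    // serve the filtered ("clean") page
} else {
    // serve the page with the regular tracking scripts
}
```

Passing the array in as an argument, instead of reading $_SERVER inside the function, keeps the check easy to try out with hand-made input.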

New Samsung firmware fixes nasty ICS Exchange bug

Last Friday I downloaded the newest official ROM from Samsung for my Galaxy SII from SamMobile.com and flashed it. I had no time over the weekend, but I have just deactivated the workaround I found on xda-developers and run some tests with meeting invitation responses and read receipts. I’m glad to confirm that I9100XWLPD indeed seems to solve the “connection error” bug which ruined my initial Ice Cream Samsung experience. Yay!

Firefox Mobile Beta: native UI at last!

The wait is finally over, no need to go through the daily Aurora upgrade process any more; Firefox Mobile 14 beta (available in the Google Play store) is out with all the improvements that were in the Aurora builds.
The main difference from the previous (non-Aurora) versions: Firefox on Android doesn’t use XUL (the Mozilla cross-platform UI toolkit) any more, but has switched to native Android UI elements. This (and other less visible changes) results in faster startup, lower memory usage and better overall performance. There’s Flash in it as well, but with a ‘tap to play’ option, so the impact, I’m happy to report, is pretty limited. And the start page is pretty nifty, with “Top Sites”, “Tabs from last time” and “Tabs on other computers” on one nice screen.
I must admit I was slightly worried at first, as I couldn’t get Sync to work at all (“could not connect to server” and similar error messages), but after uninstalling Aurora, Firefox Mobile Beta can sync just fine. All in all Firefox Mobile is an even greater browser than it was before.

Fix Samsung ICS Exchange connection errors

[Update 21-5-2012: Samsung released new firmware, version I9100XWLPD, which seems to fix the bug.]
Since updating my Samsung Galaxy S II to Ice Cream Sandwich, I’ve regularly been experiencing the dreaded “connection error” in the mail client when trying to fetch mail from the corporate Exchange server. A colleague of mine, who agreed to have me upgrade his SGS2 after I promised everything worked flawlessly, had the problem even more regularly.
Searching the web turned up this interesting thread on xda-developers, which had, among other things, a fix for the adventurous, but also this eye-opening comment:

The messages in question are Read Receipts, Delivery Receipts and similar messages. Once there is one of those in your inbox, you’re stuck until you delete it. […] A better solution which has worked for me is to create a folder for your receipts. Then, on your PC, create a rule to move the receipts to the folder on arrival. This will obviously also work when your PC is off, as the rules are stored and executed on the server. You will have to create a rule which processes emails on arrival, matches a series of strings in either subject or body of the message and moves them to the folder.

And that’s exactly what I did: mails sent only to me with “Declined:”, “Accepted:”, “Tentative:”, “Read:” or “Not read:” in the subject line are automatically moved into a “tmp” folder. Your mileage may vary (apparently there are other conditions under which the Android/Samsung mail client has problems downloading items from Exchange), but based on my limited experience so far, this workaround gets most problematic items in my Inbox out of the way. Now let’s hope Samsung fixes this blatant error (and that it isn’t in the ICS version on that beautiful Samsung Galaxy S III)!