Venus doesn’t love noscript

Damn, Venus doesn’t love noscript!

You’ve got no clue what I’m rambling about, do you? Well, allow me to explain:

Now that you know the context, let me reiterate: Venus doesn’t treat noscript the way it should! It strips out JavaScript, as it should (are you listening, tt-rss?), but it also replaces noscript tags and all the HTML inside them with escaped HTML (HTML entities, actually). And that, my beloved ones, means that the HTML that WP YouTube Lyte generates doesn’t work properly on Venus-based planets.
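To make the symptom concrete, here is a minimal Python sketch that only mimics what readers of a Venus-based planet end up seeing. It is not Venus code: the function name `escape_noscript` and the sample markup are hypothetical, and escaping the whole noscript block (tags included) is an assumption based on the description above.

```python
import html
import re

# Matches a complete noscript block, including its opening and closing tags.
NOSCRIPT_BLOCK = re.compile(
    r"<noscript\b[^>]*>.*?</noscript>", re.IGNORECASE | re.DOTALL
)


def escape_noscript(markup: str) -> str:
    """Mimic the symptom: turn each noscript block into entity-escaped text."""
    return NOSCRIPT_BLOCK.sub(
        lambda match: html.escape(match.group(0), quote=False), markup
    )


# Hypothetical fallback markup: a linked thumbnail inside noscript.
sample = '<noscript><a href="https://example.com/watch"><img src="thumb.jpg"></a></noscript>'
print(escape_noscript(sample))
```

The printed result is no longer markup but text full of `&lt;` and `&gt;` entities, so a browser shows the raw tags instead of the thumbnail.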

So I started looking at the Venus source and exchanged emails with Planet Grep’s Wouter Verhelst to solve this issue. At first sight the solution seemed pretty straightforward: Venus shouldn’t ‘escape’ noscript but should instead just strip the opening and closing noscript tags. Wouter installed a small sed filter I wrote and added noscript to the whitelist of Venus’s sanitizer (which is based on Universal Feed Parser) and … it did not work.
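The idea behind the filter, stripping the noscript tags while keeping what they wrap, can be sketched in Python roughly as follows. This is not the actual sed filter (which isn’t reproduced here); the function name `unwrap_noscript` and the sample markup are made up for illustration, and a regex like this is only reasonable for simple, well-formed feed markup.

```python
import re

# Matches an opening <noscript ...> or a closing </noscript> tag, case-insensitively.
NOSCRIPT_TAG = re.compile(r"</?noscript\b[^>]*>", re.IGNORECASE)


def unwrap_noscript(markup: str) -> str:
    """Remove the noscript tags themselves but keep the HTML they wrap."""
    return NOSCRIPT_TAG.sub("", markup)


# Hypothetical fallback markup: a linked thumbnail inside noscript.
sample = '<noscript><a href="https://example.com/watch"><img src="thumb.jpg"></a></noscript>'
print(unwrap_noscript(sample))
```

With the tags gone, the fallback link and thumbnail are ordinary HTML that a later sanitizing pass could, in principle, let through untouched.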

The problem apparently lies with another sanitizing component in Venus: html5lib. Sam Ruby, the developer of Venus, wrote on the mailing list:

There are multiple sanitization passes involved here. […] The html5parser seems to think that noscript is to be parsed as text only, which would result in the behavior that you describe. Looking at the current HTML5 spec, it appears that this does not match the expected behavior, so perhaps that changed too.

So I started looking at html5lib and … well, I’m stuck. html5lib is a pretty complex beast for a small-time non-developer to dive into. So earlier today I turned to the html5lib discussion list to ask how sanitization can be configured not to escape noscript. Let’s hope someone will enlighten me, because until then those poor Planet Greppers won’t be able to see (a thumbnail of) Al Jarreau’s great version of Take Five from way back in 1976:

Al Jarreau 1976 -Take Five


One thought on “Venus doesn’t love noscript”

  1. Kylewm

    Sigh, seeing the same issue 5 years later in my application. Valid noscript with an image turns into

    <img class="u-photo square" src="…" alt="" width="500" height="500" />

    Thanks for documenting your problem back then, too bad the html5lib folks didn’t respond!
