Monthly Archives: January 2018

How to extract blocks from Gutenberg

So Gutenberg, the future of WordPress content editing, allows users to create, add and re-use blocks in posts and pages in a nice UI. These blocks are added in WordPress’ the_content inside HTML-comments to ensure backwards-compatibility. For WP YouTube Lyte I needed to extract information to be able to replace Gutenberg embedded YouTube with Lytes and I took the regex-approach. Ugly but efficient, no?

But what if you need a more failsafe method to extract Gutenberg block-data from a post? I took a hard look at the Gutenberg code and came up with this little proof-of-concept to extract all data in a nice little (or big) array:

add_action('the_content','gutenprint',10,1);  function gutenprint($html) {    	// check if we need to and can load the Gutenberg PEG parser    	if ( !class_exists("Gutenberg_PEG_Parser") && file_exists(WP_PLUGIN_DIR."/gutenberg/lib/load.php") ) {    		include_once(WP_PLUGIN_DIR."/gutenberg/lib/load.php");  	}      	if ( class_exists("Gutenberg_PEG_Parser") && is_single() ) {  	  // do the actual parsing  	  $parser = new Gutenberg_PEG_Parser;  	  $result = $parser->parse(  _gutenberg_utf8_split( $html ) );  	    	  // we need to see the HTML, not have it rendered, so applying htmlentities    	  array_walk_recursive($result,  		function (&$result) { $result = htmlentities($result); }  		);    	  // and dump the array to the screen  	  echo "<h1>Gutenprinter reads:</h1><pre>";  	  var_dump($result);  	  echo "</pre>";  	} else {  	  echo "Not able to load Gutenberg parser, are you sure you have Gutenberg installed?";  	}        	// remove filter to avoid double output    	remove_filter('the_content','gutenprint');        	// and return the HTML  	return $html;  }

I’m not going to use it for WP YouTube Lyte as I consider the overhead not worth it (for now), but who know it could be useful for others?

Of bugs, inconsistencies and tag soup in (future) core

In general i rarely bother looking into WordPress core code or what’s on the horizon. The last month or so however I came across 3 problems that were due to core.

1. Shortly after pushing Autoptimize 2.3.x out, a subset of users started complaining about a flood of “cachebuster”-requests bringing their site to a crawl. It turned out all of them were using Redis or Memcached and that due to a longstanding bug in WordPress core Autoptimize did not get the correct version-string from the database, triggering the update-procedure over and over, purging and then preloading the cache. A workaround -involving a transient featuring my wife and daughter- was introduced to prevent this, but “oh my gawd” what an ugly old WordPress core bug that is! Can someone please get this fixed?

2. A couple of users of WP YouTube Lyte noticed their browsers complaining about unbalanced tags in the Lyte HTML output (which is part of the_content). It took me a lot of time to come to the conclusion that WordPress core’s wpautop was messing things up severely due to the noscript and meta-tags in Lyte’s output. As wpautop has no filters or actions to alter the way it works, I ended up disabling wpautop when lyte’s were found in the_content.

3. My last encounter was with the ghost of WordPress Yet-to-come; Gutenberg … To allow WP YouTube Lyte to turn Gutenberg embedded YouTube’s into Lyte’s, it turned out I had to dive into the tag soup that Gutenberg adds as HTML-comments to the_content. I have yet to find documented functions to extract the information the smart way, so regexes to the rescue. But any plugin that hooks into the_content better be aware that Gutenberg (as part of WordPress 5.0) will potentially mess things up severely.

Although I cursed and sighed and I am now complaining, I felt great relief when having fixed/ worked around these issues. But working in the context of a large and successful open source software project and depending on it’s quality can sometimes almost be a pain as much as it is a joy. Almost … ;-)