Quick & Dirty WordPress Plugin Benchmarking in Debug Bar

At tonight's PDX WordPress Dev meetup (thanks for the pizza, Digital Trends), Daniel Bachhuber had some questions about benchmarking a plugin. Benchmarking WordPress itself is easy, but it's harder to isolate a specific plugin, much less a few calls to preg_match_all() within it. The plugin in question, SEO Auto Linker, makes those calls on every page load, so any running time adds latency to every page. Speculation at the meetup was that a PHP regex operating on post content (a blob) and looping through hundreds of links could be pretty slow. Too much caffeine today meant I had to give it a try.


Attempted Benchmarks

My first attempt to benchmark the plugin was the P3 Profiler. All it showed was that my dev machine spent a good amount of time running plugins: almost as much for SEO Auto Linker as for Relevanssi, and a bit less than for my theme. So ... yeah. That really didn't tell me whether the plugin as a whole was doing a lot or whether just certain parts of it were slow. Plus, mocking up a big volume of links in SQL to test with was taking a long time, and running the plugin multiple times to get a true benchmark wasn't possible with a whole-WP testing tool.

Then I gave wp-cli a try. Daniel is a fan, so it seemed like a fast way to get more insight. Unfortunately, enabling the Auto Linker for wp-cli was taking far more time than I'd hoped. OK, not really: enabling WP_CLI and testing to resolve my mistakes is actually what was taking forever.

A Working Stub (You're using Debug Bar, right?)

After those dead ends I hammered out a brute force benchmarking script:

<?php
/*
 * Usage:
 * 1. Paste into Debug Bar console and Run
 * 2. Run it several times (just in case iTunes is running on your dev server)
 *
 * Benchmarking Alternatives:
 * - Wrap the real SEO Auto Linker for wp-cli and replace the call to content()
 * - Add this quick & dirty benchmark loop to Debug Bar
 *
 */

// Get us some content to filter
$remote = wp_remote_get( 'http://www.danielbachhuber.com/' );
// wp_remote_retrieve_body() returns '' when the request failed with a WP_Error
$content = wp_remote_retrieve_body( $remote );

// Repeat times
$repeat = 20;

// For the record
echo "Looping $repeat times<br /><br />";

// Time the preg_match way
$time_a1 = microtime( true );
echo "Start preg_match: $time_a1<br />";

// The real work
$filtered = '';
for ( $i = 0; $i < $repeat; $i++ ) {
    $filtered .= content( $content );
}

// Timing output
$time_a2 = microtime( true );
echo "End preg_match: $time_a2<br /><br />";
echo "Total time: " . 1000 * ( $time_a2 - $time_a1 ) . "&nbsp;milliseconds<br />";
echo "Time per execution: " . 1000 * ( ( $time_a2 - $time_a1 ) / $repeat ) . "&nbsp;milliseconds<br /><br />";
echo "<h2>Original</h2>" . $content;
echo "<br /><br />";
echo "<h2>Filtered</h2>" . $filtered;

// -- Plugin Stubs--------------------------------------------------------------
// From SEO Auto Linker, http://wordpress.org/extend/plugins/seo-auto-linker/
// -----------------------------------------------------------------------------
function content( $content ) {
    $header_replacements    = array();
    $link_replacements      = array();
    $other_replacements     = array();
    $shortcode_replacements = array();
    $filtered = $content;

    // Swap shortcodes for placeholders so keywords inside them are not linked
    preg_match_all( '/' . get_shortcode_regex() . '/', $filtered, $scodes );
    if ( ! empty( $scodes[0] ) ) {
        $shortcode_replacements = gen_replacements( $scodes[0], 'shortcode' );
        $filtered = replace( $shortcode_replacements, $filtered );
    }

    // Same for headings
    preg_match_all( '/<h[1-6][^>]*>.+?<\/h[1-6]>/iu', $filtered, $headers );
    if ( ! empty( $headers[0] ) ) {
        $header_replacements = gen_replacements( $headers[0], 'header' );
        $filtered = replace( $header_replacements, $filtered );
    }

    // Same for img and input tags
    preg_match_all( '/<(img|input)(.*?) \/?>/iu', $filtered, $others );
    if ( ! empty( $others[0] ) ) {
        $other_replacements = gen_replacements( $others[0], 'others' );
        $filtered = replace( $other_replacements, $filtered );
    }

    // Not using Links post types, build here
    preg_match_all(
        '/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/iu',
        $filtered,
        $links
    );
    if ( ! empty( $links[0] ) ) {
        $start = count( $link_replacements );
        $tmp = gen_replacements( $links[0], 'links', $start );
        $filtered = replace( $tmp, $filtered );
        $link_replacements = array_merge(
            $link_replacements,
            $tmp
        );
    }

    $regex = get_kw_regex();
    $url   = 'http://www.danielbachhuber.com';
    $max   = 1;

    // This check sat inside a loop in the plugin; outside a loop, `continue`
    // is an error in current PHP, so bail out with what we have instead
    if ( ! $regex || ! $url || ! $max ) {
        return $filtered;
    }

    $target = '_self';
    $filtered = preg_replace(
        $regex,
        '$1<a href="' . esc_url( $url ) . '" title="$2" target="' . $target . '">$2</a>$3',
        $filtered,
        absint( $max )
    );

    return $filtered;
}

// Swap each original fragment (array values) for its placeholder (array keys)
function replace( $arr, $content ) {
    return str_replace(
        array_values( $arr ),
        array_keys( $arr ),
        $content
    );
}

// Build a placeholder => original fragment map, numbering from $start
function gen_replacements( $arr, $key, $start = 0 ) {
    $hash = md5( 'seo-auto-linker' );
    $rv = array();
    foreach ( $arr as $a ) {
        $rv["<!--{$hash}-{$key}-{$start}-->"] = $a;
        $start++;
    }
    return $rv;
}

// Hardcoded test keywords, joined into one case-insensitive alternation
function get_kw_regex() {
    $keywords = array(
        'wordpress',
        'post',
        'url',
        'test',
        'image',
        'hello',
        'world',
        'WordPress',
        'code',
        'git'
    );
    return sprintf( '/(\b)(%s)(\b)/ui', implode( '|', $keywords ) );
}

Running just a stubbed version of the plugin means only the questionable parts get tested. There is a neutering problem with the above code, though: I held back from stubbing a links post type, so it parses content through the regexes but doesn't have any links to replace matches with. Note that it also requires the Debug Bar Console, but that seemed a light prerequisite after all the dead ends.
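
For readers unfamiliar with the placeholder trick the stub relies on: protected fragments (headings, for example) are swapped for comment tokens before keywords are linked, and would normally be swapped back afterwards. The stub as listed only performs the first half. Below is a minimal, self-contained sketch of the full round trip. The helper names `protect_fragments()` and `restore_fragments()`, the `demo` token prefix, the keyword, and the example.com URL are all assumptions made for illustration; this is not the plugin's actual code.

```php
<?php
// Hypothetical helper: swap every regex match for a unique HTML-comment token.
// Returns array( content with tokens, map of token => original fragment ).
function protect_fragments( string $content, string $pattern, string $key ): array {
    $map = array();
    if ( preg_match_all( $pattern, $content, $matches ) ) {
        foreach ( array_unique( $matches[0] ) as $i => $fragment ) {
            $map[ "<!--demo-{$key}-{$i}-->" ] = $fragment;
        }
        // Search for the fragments, replace them with their tokens.
        $content = str_replace( array_values( $map ), array_keys( $map ), $content );
    }
    return array( $content, $map );
}

// Hypothetical helper: put the original fragments back.
function restore_fragments( string $content, array $map ): string {
    return str_replace( array_keys( $map ), array_values( $map ), $content );
}

$html = '<h2>Hello world</h2><p>Hello again, world.</p>';

// 1. Hide headings so keywords inside them are never linked.
list( $protected, $map ) = protect_fragments( $html, '/<h[1-6][^>]*>.+?<\/h[1-6]>/iu', 'header' );

// 2. Link the first remaining occurrence of the keyword (a limit of 1, as in the stub).
$linked = preg_replace( '/\b(hello)\b/ui', '<a href="http://example.com/">$1</a>', $protected, 1 );

// 3. Swap the headings back in.
$result = restore_fragments( $linked, $map );

echo $result, PHP_EOL;
```

The token text itself must not contain any keyword, otherwise step 2 would link inside the token; the stub's md5-based token is one way to make that collision unlikely.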

Results & Conclusions

  1. Running time depends most on the length of the content being parsed. This suggests preg_match_all() is the most significant factor, and str_replace() could be a good alternative.
  2. The number of filtered keywords or links does affect running time, but within a loop they don't matter as much as the regex, so limiting the keywords won't help as much.
  3. Regardless of post size or keyword count, I think the parsing this plugin does should be cached: parse times for a single post on my dev machine ranged from 10ms to 60ms, with about 10 to 50 keywords to parse (as noted, links don't work right in the stub above). If caching isn't built into the plugin itself, then a page cache should be required for larger sites.
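
The caching recommendation in point 3 can be sketched without WordPress. The following is a minimal, self-contained illustration, not a drop-in fix. It assumes a hypothetical `slow_filter()` standing in for the plugin's regex pass and a hypothetical in-memory `cached_filter()` keyed by an md5 of the content; in a real plugin the array would presumably be replaced by a persistent store such as WordPress transients or the object cache. The `benchmark()` helper is also illustrative and simply generalizes the timing loop from the stub.

```php
<?php
// Hypothetical stand-in for the plugin's regex-heavy content filter.
function slow_filter( string $content ): string {
    // Scan for links the way the stub does, then link-ify one keyword.
    preg_match_all( '/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/iu', $content, $links );
    $out = preg_replace( '/\b(wordpress|post|code)\b/ui', '<b>$1</b>', $content, 1 );
    // preg_replace() returns null on a PCRE error; fall back to the input.
    return null === $out ? $content : $out;
}

// Memoize the filter result per unique content, keyed by a hash of the input.
// Assumption: an in-memory array; swap in a persistent cache for real use.
function cached_filter( string $content ): string {
    static $cache = array();
    $key = md5( $content );
    if ( ! isset( $cache[ $key ] ) ) {
        $cache[ $key ] = slow_filter( $content );
    }
    return $cache[ $key ];
}

// Run $fn( $arg ) $repeat times and return the average milliseconds per call.
function benchmark( callable $fn, string $arg, int $repeat = 20 ): float {
    $start = microtime( true );
    for ( $i = 0; $i < $repeat; $i++ ) {
        $fn( $arg );
    }
    return 1000 * ( microtime( true ) - $start ) / $repeat;
}

// Build a large fake post body so the regexes have real work to do.
$content = str_repeat( '<p>A post about code and <a href="http://example.com/">links</a>.</p>', 2000 );

printf( "Uncached: %.3f ms per call\n", benchmark( 'slow_filter', $content ) );
printf( "Cached:   %.3f ms per call\n", benchmark( 'cached_filter', $content ) );
```

The cached variant still pays for one real parse plus an md5 on every call, so the gap between the two numbers depends on content length and repeat count; the point is only that repeated parses of unchanged content can be avoided.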
