Fixing a segmentation fault in Drupal

"[notice] child pid 45617 exit signal Segmentation fault (11)": This is usually the start of a very bad day. Since a segfault is a low-level error in native machine code (in this case the PHP interpreter), many typical debugging techniques don't apply. Today I decided to try something new:


Xcode Instruments

Apple's Xcode includes an analyzer called Instruments (which is built on Sun's DTrace). A stack trace from any process can be captured with point-and-click simplicity:

  1. Launch Instruments
  2. Start recording an activity monitor
  3. Attach a spin monitor to an httpd or php process
  4. Visit the page causing the segfault (you might have to repeat this a few times depending on how many PHP processes are running)
  5. Uncheck "hide missing symbols"

[Figure: stack trace showing PCRE]

(I arrived at the spin monitor with a little trial and error. If someone with more Xcode expertise knows of a better procedure, please let me know in the comments.)

Monkey-patching with Runkit

Now we know the segfault is caused by PCRE (the Perl Compatible Regular Expressions library). This isn't a big surprise, since PCRE is by far the most frequent cause of PHP segfaults. However, it still doesn't reveal which regex triggered the error.

I wanted a way to log every call to preg_match() (it's not the only regex function, but a good start). Since this is a built-in function it's not possible to add debugging code. That is, unless Runkit is installed. Runkit enables "Monkey patching" – replacing code at runtime. (If this sounds like a bad idea, that's because it usually is.)

I added the following code to settings.php:

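// Overriding a built-in requires runkit.internal_override = 1 in php.ini.
runkit_function_copy('preg_match', 'preg_match_original');

// Log the arguments before every call. The file is overwritten each time, so after
// a crash it holds the last call. A by-reference $matches argument is not filled in.
runkit_function_redefine('preg_match', '', 'file_put_contents(\'/tmp/args\',
   print_r(func_get_args(), TRUE)); return call_user_func_array
   (\'preg_match_original\', func_get_args());');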

After visiting the page again, my log file contained this:

Array (
  [0] => /^([0-9]+,)*[0-9]+$/
  [1] => 69726,69731,541476,...
)

Which matches:

  profiles/commons/modules/contrib/views/includes/handlers.inc
  1157:  elseif (preg_match('/^([0-9]+,)*[0-9]+$/', $str)) {
 
  1133  function views_break_phrase($str, &$handler = NULL) {
  ...
  1157    elseif (preg_match('/^([0-9]+,)*[0-9]+$/', $str)) {
  1158      $handler->operator = 'and';
  1159      $handler->value = explode(',', $str);
  1160    }
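
The check on line 1157 only asks whether the string is a comma-separated list of integers, and that question can be answered without the regex engine at all. Here is a minimal sketch, assuming a hypothetical helper named is_comma_separated_ints(). It is not the Views code and not necessarily how the problem was resolved upstream:

```php
<?php
// Hypothetical helper (not part of Views): TRUE if $str looks like "1,22,333".
// It uses plain string functions, so the work grows linearly with the input
// and PCRE is never involved.
function is_comma_separated_ints($str) {
  if ($str === '') {
    return FALSE;
  }
  foreach (explode(',', $str) as $part) {
    // Reject empty items ("1,,2", "1,2,") and anything that is not all digits.
    if ($part === '' || strspn($part, '0123456789') !== strlen($part)) {
      return FALSE;
    }
  }
  return TRUE;
}

// Roughly the size of the input that triggered the crash.
$long_list = implode(',', range(1, 10000));
$ok = is_comma_separated_ints($long_list);
```

One behavioral difference to be aware of: the original pattern's $ anchor also tolerates a single trailing newline, while this sketch rejects it.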

The regex is part of Views's argument handling, and the subject contains almost 10,000 arguments! This strange crash results from the confluence of two issues:

At this point you might be wondering why preg_match() is allowed to segfault in the first place. Why doesn't it handle backtracking overflows more gracefully? Rasmus Lerdorf has the answer:

The problem here is that there is no way to detect run-away regular expressions here without huge performance and memory penalties. Yes, we could build PCRE in a way that it wouldn't segfault and we could crank up the default backtrack limit to something huge, but it would slow every regex call down by a lot. If PCRE provided a way to handle this in a more graceful manner without the performance hit we would of course use it.
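
PHP does expose the limits Rasmus mentions as ini settings (pcre.backtrack_limit and pcre.recursion_limit), and preg_last_error() reports when a call gave up because of one. The sketch below lowers the backtrack limit and checks the error code. The helper name checked_preg_match() is hypothetical, and the pattern is a stock catastrophic-backtracking example rather than the Views regex. Whether a lower limit actually prevents a stack overflow depends on the PHP and PCRE versions in use, so treat this as an illustration of the API and not as a guaranteed fix:

```php
<?php
// Hypothetical wrapper: run preg_match() and describe PCRE failures
// instead of silently treating FALSE as "no match".
function checked_preg_match($pattern, $subject) {
  $result = preg_match($pattern, $subject);
  if ($result === FALSE) {
    $code = preg_last_error();
    if ($code === PREG_BACKTRACK_LIMIT_ERROR) {
      return 'backtrack limit hit';
    }
    if ($code === PREG_RECURSION_LIMIT_ERROR) {
      return 'recursion limit hit';
    }
    return 'PCRE error ' . $code;
  }
  return $result; // 1 on match, 0 on no match
}

// Make a runaway pattern fail early. The value 1000 is an arbitrary
// choice for this demonstration, not a recommended production setting.
ini_set('pcre.backtrack_limit', '1000');

// Nested alternation under a star, and a subject with no '!' or '?' to end on.
$outcome = checked_preg_match('/(?:\D+|<\d+>)*[!?]/', 'foobar foobar foobar');
```

A caller can then distinguish "the subject did not match" (0) from "PCRE gave up" (a string) and log the offending pattern and subject.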

Lessons

  • Instruments was fun to experiment with; however, if you don't have it available, it's usually a good guess that a PHP segfault is PCRE-related.
  • I got lucky by guessing preg_match() on the first try, but the Runkit method would be time-consuming if you had to trace a lot of different functions. Xdebug tracing can be triggered with xdebug_start_trace(), and will trace all functions at once.
  • Take an hour to read about and understand Catastrophic Backtracking.
