Fixing a segmentation fault in Drupal

"[notice] child pid 45617 exit signal Segmentation fault (11)":

This is usually the start of a very bad day. Since a segfault is a low-level error in native machine code (in this case the PHP interpreter), many typical debugging techniques don't apply. Today I decided to try something new:

Xcode Instruments

Apple's Xcode includes an analyzer called Instruments (which is driven by Sun's DTrace). A stack trace from any process can be captured with point-and-click simplicity:

  1. Launch Instruments
  2. Start recording an activity monitor
  3. Attach a spin monitor to an httpd or php process
  4. Visit the page causing the segfault (you might have to repeat this a few times depending on how many PHP processes are running)
  5. Uncheck "hide missing symbols"

[Image: stack trace showing PCRE]

(I arrived at the spin monitor through a little trial and error. If someone with more Xcode expertise knows of a better procedure, please let me know in the comments.)

Monkey-patching with Runkit

Now we know the segfault is caused by PCRE (the Perl Compatible Regular Expressions library). This isn't a big surprise, since PCRE is by far the most frequent cause of PHP segfaults. However, it still doesn't reveal which regex triggered the error.

I wanted a way to log every call to preg_match() (it's not the only regex function, but a good start). Since this is a built-in function it's not possible to add debugging code. That is, unless Runkit is installed. Runkit enables "Monkey patching" – replacing code at runtime. (If this sounds like a bad idea, that's because it usually is.)

I added the following code to settings.php:

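// Overriding a built-in needs runkit.internal_override=1 in php.ini.
// /tmp/args is overwritten on every call, so after a crash it holds the last
// arguments seen. The by-reference $matches argument is not passed through.
runkit_function_copy('preg_match', 'preg_match_original');
runkit_function_redefine('preg_match', '', '
  file_put_contents(\'/tmp/args\', print_r(func_get_args(), TRUE));
  return call_user_func_array(\'preg_match_original\', func_get_args());
');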

After visiting the page again, my log file contained this:

Array (
  [0] => /^([0-9]+,)*[0-9]+$/
  [1] => 69726,69731,541476,...

Which matches:

  function views_break_phrase($str, &$handler = NULL) {
    // ...
    elseif (preg_match('/^([0-9]+,)*[0-9]+$/', $str)) {
      $handler->operator = 'and';
      $handler->value = explode(',', $str);
    }

The regex is part of Views's argument handling, and the subject contains almost 10,000 arguments! This strange crash results from the confluence of two issues:

At this point you might be wondering why preg_match() is allowed to segfault in the first place. Why doesn't it handle backtracking overflows more gracefully? Rasmus Lerdorf has the answer:

The problem here is that there is no way to detect run-away regular expressions here without huge performance and memory penalties. Yes, we could build PCRE in a way that it wouldn't segfault and we could crank up the default backtrack limit to something huge, but it would slow every regex call down by a lot. If PCRE provided a way to handle this in a more graceful manner without the performance hit we would of course use it.
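
Short of rebuilding PCRE, PHP does expose two knobs that help here: the pcre.recursion_limit ini setting and preg_last_error(). The sketch below lowers the limit around a single call, so that PCRE is more likely to give up with an error code than to exhaust the stack. The helper name guarded_preg_match() and the limit of 10000 are assumptions for illustration. A safe value depends on the stack size of your build, and whether the limit is consulted at all varies with the PHP and PCRE versions and with JIT.

```php
<?php
// Hypothetical helper: run preg_match() under a lowered recursion limit and
// report the PCRE error code instead of silently treating FALSE as "no match".
function guarded_preg_match($pattern, $subject, $recursionLimit = 10000)
{
    // ini_set() returns the previous value (a string) or FALSE on failure.
    $previous = ini_set('pcre.recursion_limit', (string) $recursionLimit);

    $result = preg_match($pattern, $subject); // 1, 0, or FALSE on error
    $error  = preg_last_error();              // PREG_NO_ERROR when all went well

    // Restore the original limit so other regex calls are unaffected.
    if ($previous !== false) {
        ini_set('pcre.recursion_limit', $previous);
    }

    return array('matched' => $result === 1, 'error' => $error);
}
```

If the limit is hit, 'error' would hold something other than PREG_NO_ERROR (for example PREG_RECURSION_LIMIT_ERROR), which can be logged rather than discovered through a dead child process.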


  • Instruments was fun to experiment with; however, if you don't have it available, it's usually a good guess that a PHP segfault is PCRE-related.
  • I got lucky by guessing preg_match() on the first try, but the Runkit method would be time-consuming if you had to trace a lot of different functions. Xdebug tracing can be triggered with xdebug_start_trace(), and will trace all functions at once.
  • Take an hour to read about and understand Catastrophic Backtracking.
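
For a check as simple as "comma-separated list of integers", one way to sidestep the regex engine entirely is plain string handling, whose cost grows linearly with the input and involves no backtracking. The following is only a sketch, not the fix that Views adopted; is_integer_list() is a hypothetical helper name. It is also slightly stricter than the original pattern, which (because of how $ works) also accepts a single trailing newline.

```php
<?php
// Hypothetical helper (not part of Views): returns TRUE when $str is a
// comma-separated list of unsigned integers, e.g. "69726,69731,541476".
function is_integer_list($str)
{
    if (!is_string($str) || $str === '') {
        return false;
    }
    foreach (explode(',', $str) as $piece) {
        // ctype_digit() is TRUE only for non-empty strings made of 0-9,
        // so empty pieces from "1,,2" or "1,2," are rejected.
        if (!ctype_digit($piece)) {
            return false;
        }
    }
    return true;
}
```

A call such as is_integer_list($str) could stand in for the preg_match() test in the elseif branch, leaving the explode() that follows unchanged.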
Thanks for writing this! Had a similar problem with Advagg + YUI minifier and it really helped me out!

