"[notice] child pid 45617 exit signal Segmentation fault (11)":
This is usually the start of a very bad day. Since a segfault is a low-level error in native machine code (in this case the PHP interpreter), many typical debugging techniques don't apply. Today I decided to try something new:
Apple's Xcode includes an analyzer called Instruments (which is driven by Sun's DTrace). A stack trace from any process, such as the httpd or php process, can be captured with point-and-click simplicity.
The stack trace shows that the segfault is caused by PCRE (the Perl Compatible Regular Expressions library). This isn't a big surprise, since PCRE is by far the most frequent cause of PHP segfaults. However, it still doesn't reveal which regex triggered the error.
I wanted a way to log every call to preg_match() (it's not the only regex function, but a good start). Since this is a built-in function, it's not possible to add debugging code. That is, unless Runkit is installed. Runkit enables "monkey patching" – replacing code at runtime. (If this sounds like a bad idea, that's because it usually is.)
I added the following code to settings.php:
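// Keep the built-in reachable under a new name.
runkit_function_copy('preg_match', 'preg_match_original');
// Replace preg_match() with a wrapper that logs its arguments, then delegates.
// The log file is overwritten on every call, so after a crash it holds the last call made.
runkit_function_redefine('preg_match', '',
  'file_put_contents(\'/tmp/args\', print_r(func_get_args(), TRUE));'
  . 'return call_user_func_array(\'preg_match_original\', func_get_args());');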
After visiting the page again, my log file contained this:
Array
(
    [0] => /^([0-9]+,)*[0-9]+$/
    [1] => 69726,69731,541476,...
)
Which matches:
profiles/commons/modules/contrib/views/includes/handlers.inc
1157: elseif (preg_match('/^([0-9]+,)*[0-9]+$/', $str)) {

1133 function views_break_phrase($str, &$handler = NULL) {
     ...
1157   elseif (preg_match('/^([0-9]+,)*[0-9]+$/', $str)) {
1158     $handler->operator = 'and';
1159     $handler->value = explode(',', $str);
1160   }
The regex is part of Views's argument handling, and the subject contains almost 10,000 arguments! This strange crash results from the confluence of two issues:
At this point you might be wondering why preg_match() is allowed to segfault in the first place. Why doesn't it handle backtracking overflows more gracefully? Rasmus Lerdorf has the answer:
The problem here is that there is no way to detect run-away regular expressions here without huge performance and memory penalties. Yes, we could build PCRE in a way that it wouldn't segfault and we could crank up the default backtrack limit to something huge, but it would slow every regex call down by a lot. If PCRE provided a way to handle this in a more graceful manner without the performance hit we would of course use it.
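The limits mentioned in the quote are exposed in PHP as the pcre.backtrack_limit and pcre.recursion_limit ini settings, and preg_last_error() reports when a match was abandoned because one of them was hit. The following is only a minimal sketch of how that could be checked for the Views pattern: describe_match() is a hypothetical helper, the limit of 1000 is an arbitrary value chosen for illustration, and whether the large subject yields an error or a match depends on the PHP version, the PCRE build and whether JIT is enabled.

```php
<?php
// Hypothetical helper (not part of PHP or Views): runs the pattern from
// views_break_phrase() and describes the outcome.
function describe_match($str) {
    $result = preg_match('/^([0-9]+,)*[0-9]+$/', $str);
    if ($result === false) {
        // preg_match() returns false when PCRE gives up; preg_last_error()
        // returns the matching PREG_*_ERROR code.
        return 'error ' . preg_last_error();
    }
    return $result === 1 ? 'match' : 'no match';
}

// Assumption: lowering the recursion limit makes PCRE return an error
// before it can exhaust the stack. The value 1000 is illustrative only.
ini_set('pcre.recursion_limit', '1000');
$subject = implode(',', range(1, 10000)); // a list of 10,000 numbers
echo describe_match($subject), "\n";
ini_restore('pcre.recursion_limit');
```

A lowered limit trades the segfault for a failed match, so code that calls preg_match() on large input would still need to treat a return value of false differently from 0.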
I happened to pick preg_match() on the first try, but the Runkit method would be time-consuming if you had to trace a lot of different functions. Xdebug tracing can be triggered with xdebug_start_trace(), and will trace all functions at once.
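A rough sketch of the Xdebug approach, assuming the Xdebug extension is loaded (Xdebug 3 additionally needs xdebug.mode=trace). trace_call() is a hypothetical helper and /tmp/segfault-trace is an arbitrary path picked for illustration. The sketch falls back to a plain call when Xdebug is missing.

```php
<?php
// Hypothetical helper: runs $callback with Xdebug function tracing enabled
// when the extension is available, and returns the callback's result.
function trace_call($callback, $trace_prefix = '/tmp/segfault-trace') {
    $tracing = function_exists('xdebug_start_trace');
    if ($tracing) {
        // Xdebug appends its own extension (.xt) to the given file name.
        xdebug_start_trace($trace_prefix);
    }
    $result = call_user_func($callback);
    if ($tracing) {
        // Never reached if the callback segfaults; the trace file then
        // simply ends near the last function that was entered.
        xdebug_stop_trace();
    }
    return $result;
}

// Usage: trace a single suspicious call.
echo trace_call(function () {
    return preg_match('/^([0-9]+,)*[0-9]+$/', '69726,69731,541476');
}), "\n";
```

When hunting a crash, the interesting part is the tail of the trace file, since the last entries should point at the function that was running when the process died.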