Mechanizing Git bisect: Bug hunting for the lazy

Git bisect is a powerful automated tool for searching deep into a project's history. Instead of searching for relevant commit messages (git log) or patches (git log -S), bisect lets you run a functional test against revisions until the first bad commit is identified. (Okay, it doesn't test every revision; it performs a binary search, which needs only about log2(N) tests for N revisions. This allows a relatively large history to be searched quickly.)
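
To get a feel for how slowly the number of tests grows, here is a rough shell sketch that counts how many halvings it takes to narrow N candidate revisions down to one. The function name bisect_steps is made up for this illustration, and git's own estimate can differ by a step depending on the shape of the history.

```shell
# bisect_steps N: count the halvings needed to narrow N candidates to one.
# (Hypothetical helper for illustration; not part of git.)
bisect_steps() {
  n=$1
  steps=0
  while [ "$n" -gt 1 ]; do
    n=$(( (n + 1) / 2 ))   # keep the larger half, rounding up
    steps=$(( steps + 1 ))
  done
  echo "$steps"
}

bisect_steps 1000      # prints 10
bisect_steps 1000000   # prints 20
```

A thousand revisions need about ten tests, and a million need only about twenty.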

The test can be done interactively, with the human performing each check, or mechanically if you can supply a testing script. Randy Fay has done a nice screencast on the interactive method; this post will instead focus on mechanizing the process.

For an example, let's look at a core Drupal bug that impacts this very site: #812990: Search page title changes to Home. For the moment, we'll pretend the cause of this bug isn't already known, and hunt it with git-bisect.

Before running bisect, I'll need to find a known good revision. I picked an arbitrary revision from April 1 and manually confirmed that the page title, "Search", was correct.

Note - Make sure you test that the <good> commit you specify is genuinely good, or git will simply report its next child as the first bad commit.

Starting bisect

git bisect start HEAD 6e894d75b5160b703566ab674f7dd99d998ac4a6
git bisect run ./test.php
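
If you'd like to watch the whole cycle before pointing it at a real project, here's a minimal sketch that builds a throwaway repository and bisects it. Everything in it is invented for the demo: the demo_bisect function, the title.txt and counter.txt files, and the tiny test.sh script standing in for test.php. It assumes git and mktemp are installed.

```shell
# demo_bisect: build a scratch repo where commit 6 introduces the "bug",
# then let git bisect find it. The function body is a subshell, so the
# caller's working directory is left untouched.
demo_bisect() (
  repo=$(mktemp -d) || exit 1
  cd "$repo" || exit 1
  git init -q .
  git config user.name "Bisect Demo"
  git config user.email "demo@example.com"

  # Commits 1-5 have the correct title; commits 6-8 have the wrong one.
  for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -ge 6 ]; then echo "Home"; else echo "Search"; fi > title.txt
    echo "$i" > counter.txt
    git add title.txt counter.txt
    git commit -q -m "commit $i"
  done

  # Untracked test script: exits 0 (good) only when the title is correct.
  printf '%s\n' '#!/bin/sh' "grep -q '^Search\$' title.txt" > test.sh
  chmod +x test.sh

  good=$(git rev-list --max-parents=0 HEAD)   # the root commit
  git bisect start HEAD "$good" >/dev/null
  git bisect run ./test.sh >/dev/null

  # When the run finishes, refs/bisect/bad points at the first bad commit.
  git log -1 --format=%s refs/bisect/bad

  git bisect reset >/dev/null 2>&1
  cd / && rm -rf "$repo"
)

demo_bisect   # prints "commit 6"
```

Note the git bisect reset at the end: it returns the working tree to where it was before bisecting started.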

test.php

This is a basic test that uses PHP's DOM and SimpleXML extensions. There are many other ways this test could be written: Selenium, SimpleTest, or even just wget.

#!/usr/bin/php
<?php
 
# Clean the repository to prevent collisions with untracked files.
# Make sure your .gitignore is up-to-date!
`git clean -f`;
 
# Re-install core.  This is necessary because HEAD
# revisions aren't necessarily compatible.
`drush -y site-install`;
 
# Grant "anonymous" permission to use the search page.
`drush sql-query "INSERT INTO role_permission (rid, permission, \
  module) VALUES (1, 'search content', 'search')"`;
 
# Core likes a non-writeable sites directory, but this can interfere
# with git's ability to update default.settings.php.
# So the directory is made writeable again.
chmod('sites/default', 0755);
 
# The path to a search result page.
$url = 'http://d7.jasper/search/node/foo';
 
# Use SimpleXML to locate the page title.
$html = new DOMDocument();
if (@$html->loadHTMLFile($url)) {
  chmod('sites/default', 0755);
  if ($xml = simplexml_import_dom($html)) {
    # xpath() returns an array; array_shift() needs a real variable.
    $matches = $xml->xpath('//h1[1]');
    $title = $matches ? trim((string) $matches[0]) : '';
    if (!empty($title)) {
      print "title: $title\n";
      # Exit status of zero indicates the correct title was found.
      exit($title == 'Search' ? 0 : 1);
    }
  }
}
 
// Indicates this revision cannot be tested.
exit(125);
?>
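
The same check doesn't have to be PHP. Here's a rough shell sketch of just the decision logic: pull the first h1 out of some HTML, then map it to the exit status that bisect expects (0 for good, 1 for bad, 125 for "can't test this revision, skip it"). The function names extract_h1 and check_title are hypothetical, and the sed pattern is deliberately naive: it assumes the heading sits on one line and contains no nested tags.

```shell
# extract_h1: read HTML on stdin and print the text of the first <h1>.
extract_h1() {
  sed -n 's/.*<h1[^>]*>\([^<]*\)<\/h1>.*/\1/p' | head -n 1
}

# check_title TITLE: turn the observed title into a bisect exit status.
check_title() {
  case "$1" in
    "")     return 125 ;;  # no title found: this revision can't be tested
    Search) return 0 ;;    # correct title: good revision
    *)      return 1 ;;    # anything else: bad revision
  esac
}

title=$(printf '<body><h1 class="title">Home</h1></body>\n' | extract_h1)
echo "title: $title"     # prints "title: Home"
check_title "$title"
echo "status: $?"        # prints "status: 1"
```

In a real test script you would feed extract_h1 from something like wget -qO- http://d7.jasper/search/node/foo, and finish with exit rather than return.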

Results

A few minutes after starting bisect, we have the answer:

Bisecting: 441 revisions left to test after this (roughly 9 steps)
[33f4bfc13d106f697a1f09fb6fedd9ec16696b74] - Patch #312144 by CorniI, dropcube, EvanDonovan, Damien Tournoud, David_Rothstein, tstoeckler: install fails when default.settings.php is not present.
running ./test.php
title: Home
Bisecting: 220 revisions left to test after this (roughly 8 steps)
[cddd4f8cf2dce0a0bad37186bd3c939c3200e724] #777738 by Garrett Albright: Fixed #states: Correct spelling of Dependent.
running ./test.php
title: Search
Bisecting: 110 revisions left to test after this (roughly 7 steps)
[83c375da758447e7336dc6c7693970f32b97fd41] #721400 follow-up by pwolanin: Remove unnecessary query strings from CSS/JS files.
running ./test.php
title: Home
Bisecting: 54 revisions left to test after this (roughly 6 steps)
[284cd3c472a5d561cff5e54f973fe6a72643e48c] #596614 by asimmonds: Rename node_update_7006() context parameter to sandbox, for consistency.
running ./test.php
title: Home
Bisecting: 27 revisions left to test after this (roughly 5 steps)
[f4dd428815e58b48ebd023ea155d5477036b7a1c] - Patch #555830 by effulgentsia: better code comments.
running ./test.php
title: Search
Bisecting: 13 revisions left to test after this (roughly 4 steps)
[3bd9de3c96751c853d370df76bdfc3f5b830570c] #679960 follow-up by lambic: Add test for 'Notice: undefined variable cids' when there are no comments available.
running ./test.php
title: Home
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[1b426dc86a6cb5ee1fe1da689073489691484dfa] - Patch #719686 by duellj: tests for search weighting for HTML tags.
running ./test.php
title: Home
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[bf354f944559f21ee5cd0c3dd435cca0a417c8f5] #777100 by jhodgdon: Fixed  hook_field_storage_update_field() is not documented.
running ./test.php
title: Search
Bisecting: 1 revision left to test after this (roughly 1 step)
[f792269874c012edc0e2f28794ec8178c3403c07] #245103 by chx, jhodgdon, merlinofchaos, douggreen, jrbeeman: Fixed Search page tabs not highlighting.
running ./test.php
title: Home
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[6c2976e6b3f89a5c8427c8929da0c9198d046272] #783438 by sun: Fixed #states doesn't work for #type item.
running ./test.php
title: Search
f792269874c012edc0e2f28794ec8178c3403c07 is the first bad commit
commit f792269874c012edc0e2f28794ec8178c3403c07
Author: webchick <webchick>
Date:   Sat May 1 01:04:24 2010 +0000
 
    #245103 by chx, jhodgdon, merlinofchaos, douggreen, jrbeeman: Fixed Search page tabs not highlighting.
 
:040000 040000 2ef1c3ea6f6e198977257e1fc9825119e11eaf8f 9a04134995cfba314e2ebf72c4bb5315b20acd53 M	modules
bisect run success


Dylan,

I think what Joaquin might have been asking is how the script tells git whether the commit is bad.

The answer of course is the exit code. 0 means success (or a good commit) and 1 (or any other number(?)) means failure. Bash has a nicely backwards interpretation of true (0) and false (1).

Most scripting languages have an exit function that allows you to pass an exit code.

Of course, if you have unit tests, the command you use to run them should return the appropriate exit code too, without any extra fuss. What I'm not sure about is what happens if you only just added the unit test in the latest commit. Presumably the command to run the test would fail everywhere else, because the test file would disappear whenever you check out an earlier revision.


Ah, yes - git is looking for the exit status. It's not exactly fair to blame bash for having "backwards" truth values, though. The exit status isn't boolean (although it's commonly used that way in practice), and this has been a standard part of Unix at least since the 1980s and is probably much older.

You can look at sysexits.h on your system to see some standard status codes. (It's in /usr/include/sysexits.h on OS X).
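
If you want to poke at this yourself, here's a small shell sketch showing that the exit status is just a small integer. The status_of helper is a made-up name for this illustration.

```shell
# status_of CMD...: run a command quietly and print its exit status.
status_of() {
  "$@" >/dev/null 2>&1
  echo "$?"
}

status_of true               # prints 0
status_of false              # prints 1
status_of sh -c 'exit 125'   # prints 125
```

For git bisect run specifically, 0 means good, 125 means skip, and any other status from 1 to 127 means bad; anything above that aborts the bisect.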

About the Author

Dylan Tack, Director of Technology

Dylan is a software engineer with more than a decade of experience working with a wide variety of clients including the Linux Foundation, PBS, Habitat for Humanity, TV.com and the Emmys. His background includes training as an electrical engineer, but he became passionate about open source through his work with a university genetics lab.

Dylan is a proud member of the Drupal community, a member of the Drupal security team, and has extensive experience with Perl and Java. His other interests include computer security, embedded design, climbing, and brewing.

His latest talk at the Pacific Northwest Summit was titled: "Drupal Security for People Who Don't Care".
