Using PHP/MySQL w/Drupal to update missing Google Geocode data

Barrett

We recently needed to fill missing geocode values in a project's database. Using some PHP magic and taking advantage of hook_cron() this is how we filled the missing values:


/**
 * Implementation of hook_cron()
 * When the cron runs, try to resolve any locations that have a lat or lon = 0
 */
function overmap_cron() {
  // Fetch all locations with lat == 0 || lon == 0 and bin into $locations array
  $locations = array();
  $query = "
    SELECT *
    FROM {location} l
    WHERE l.latitude=0 
    AND l.longitude=0";
  $result = db_query($query);
  while ($data = db_fetch_array($result)) {
    $locations[] = $data;
  }
  
  // Loop through the geocode-less addresses and re-submit to google maps to try to get valid location data
  $identifiers = array('name', 'street', 'city', 'province', 'postal_code', 'country');
  foreach ($locations as $location) {
    // Boolean sentinel variable that control while loop
    $geocode_pending = true;
    while ($geocode_pending) {
      // Build address array and implode into single string
      $search = array();
      foreach ($identifiers as $identifier) {
        if ($location[$identifier] != '') {
          $search[] = trim($location[$identifier]);
        }
      }
      $search = implode(' ', $search);
    
      // Execute query and loda into an xml object
      $search = 'http://maps.google.com/maps/geo?output=xml&q=' . urlencode($search);
      $xml = simplexml_load_file($search);
      
      // Deal with returned values based on the status code
      $status = $xml->Response->Status->code;
      if (strcmp($status, "200") == 0) {
        // Successful geocode
        $geocode_pending = false;
        $coordinates = $xml->Response->Placemark->Point->coordinates;
        $coordinatesSplit = split(",", $coordinates);
        // Format: longitude, latitude
        $lat = $coordinatesSplit[1];
        $lng = $coordinatesSplit[0];
        // Create new object and update record in DB correct geocode data
        $record = new stdClass();
        $record->lid = $location['lid'];
        $record->latitude = $lat;
        $record->longitude = $lng;
        drupal_write_record('location', $record, 'lid');
      } else if (strcmp($status, "620") == 0) {
        // sent geocodes too fast
        $delay += 100000;
      } else {
        // failure to geocode
        $geocode_pending = false;
      }
      usleep($delay);
    }
  }
}

Comments

You may want to limit the

You may want to limit the number of records updated so that your cron job doesn't run out of control...

Good point. We were working

Good point. We were working with a relatively small dataset.

This is an amazing little

This is an amazing little tool. Thank You. I have been looking for it for while.

My only comment is for people that are only getting some locations to update. If you remove the 'name' from the identifiers array google will find the addresses easier.

cheers,

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd><blockquote>
  • Lines and paragraphs break automatically.
  • You can enable syntax highlighting of source code with the following tags: <code>, <blockcode>, <c>, <cpp>, <drupal5>, <drupal6>, <java>, <javascript>, <php>, <python>, <ruby>. Beside the tag style "<foo>" it is also possible to use "[foo]".

More information about formatting options