URL Shorteners Must Die

URL shorteners (such as bit.ly and tinyurl) have been called the "herpes of the web". Beyond link rot, a public shortening service is per se an open redirect vulnerability. Their ubiquity makes them an easy vector for spammers, phishers, and cross-site request forgery attacks.

Joshua Schachter writes:

With a shortening service, you're adding something that acts like a third DNS resolver, except one that is assembled out of unvetted PHP and MySQL, without the benevolent oversight of luminaries like Dan Kaminsky and St. Postel.

Luckily, you don't have to contribute to this scourge.

Drupal 7 has adopted the shortlink microformat, which adds a <head> element like so:

<link rel="shortlink" href="http://www.example.com/node/1" />

When we rebuilt our site in D7, we decided to ditch bit.ly in favor of these built-in shortlinks. However, I also felt the /node/ piece of the path was superfluous, and even strange-looking to visitors outside the Drupal community.

So, we decided to shorten them even further, removing both the "www" and /node/ from the URL. This required only a few minor changes:

.htaccess

Care was taken not to add the "www" prefix for these shortlinks, because doing so would result in multiple redirects (which still works, but is inefficient).

# Redirect all paths to the "www" prefix, except for /NNNN
RewriteCond %{HTTP_HOST} ^metaltoad\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/\d+$
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteCond %{REQUEST_URI} !^/node/\d+
RewriteRule ^(.*)$ http://www.metaltoad.com/$1 [L,R=301]
 
# Allow short URLs of the form metaltoad.com/123
RewriteCond %{REQUEST_URI} ^/\d+$
RewriteRule ^(.*)$ index.php?q=node/$1 [L,QSA]
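
To make the interplay of the two rule sets easier to reason about, here is a minimal PHP sketch that models the decisions they make. It is an illustration only, not code from the site: the function name classify_request() and its return labels are hypothetical, and the model ignores query strings and the second pass Apache makes after an internal rewrite.

```php
<?php
/**
 * Hypothetical model of the .htaccess rules above (illustration only).
 *
 * @param string $host HTTP Host header, e.g. "metaltoad.com".
 * @param string $uri  Request path with leading slash, e.g. "/318".
 * @return string One of "redirect_to_www", "short_url", "pass_through".
 */
function classify_request($host, $uri) {
  $bare_host = (bool) preg_match('/^metaltoad\.com$/i', $host);
  $is_short  = (bool) preg_match('#^/\d+$#', $uri);
  $is_index  = (bool) preg_match('#^/index\.php#', $uri);
  $is_node   = (bool) preg_match('#^/node/\d+#', $uri);

  // First rule set: bare domain, and none of the three exceptions apply.
  if ($bare_host && !$is_short && !$is_index && !$is_node) {
    return 'redirect_to_www';
  }
  // Second rule set: /NNN is rewritten internally to index.php?q=node/NNN.
  if ($is_short) {
    return 'short_url';
  }
  // Everything else falls through to the remaining rules.
  return 'pass_through';
}
```

Under this model, a request for a made-up path such as /about on the bare domain is classified as "redirect_to_www", while /318 on the bare domain is classified as "short_url" and is never bounced to the "www" host first.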

Redirect node/NNN to the alias

The next piece was to redirect to the actual path alias. At the time this site was built, neither the Global Redirect module nor its successor, Redirect, was deemed production-ready, so we rolled our own interim solution:

/**
 * Implements hook_init().
 */
function mymodule_init() {
  // Redirect node/NNN to the path alias, if available.
  // The globalredirect module isn't currently available for D7.
  if (preg_match('/^node\/\d+$/', request_path())) {
    $alias = drupal_get_path_alias();
    if ($alias != request_path()) {
      // Setting the redirect headers manually allows them to be
      // cached, which drupal_goto does not.
      drupal_add_http_header('Location', url($alias,
        array('absolute' => TRUE)));
      drupal_add_http_header('Status', '301 Moved Permanently');
      print '301 Moved Permanently';
      drupal_page_footer();
      exit();
    }
  }
}
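
The decision at the heart of this hook can also be isolated into a pure function, which makes it testable outside Drupal. The sketch below is an illustration under stated assumptions rather than the code we ran: shortlink_redirect_target() is a hypothetical helper, and the alias is passed in as a parameter instead of being looked up with drupal_get_path_alias().

```php
<?php
/**
 * Hypothetical helper: decide where a node/NNN request should redirect.
 *
 * @param string $path     Requested internal path, e.g. "node/318".
 * @param string $alias    Alias for that path; equal to $path if none exists.
 * @param string $base_url Site base URL, e.g. "http://www.metaltoad.com".
 * @return string|null Absolute redirect target, or NULL for no redirect.
 */
function shortlink_redirect_target($path, $alias, $base_url) {
  // Only bare node paths (node/NNN) are candidates for redirection.
  if (!preg_match('/^node\/\d+$/', $path)) {
    return NULL;
  }
  // No alias defined: serve the node at its system path.
  if ($alias === $path) {
    return NULL;
  }
  // Build the absolute target on the canonical ("www") domain.
  return rtrim($base_url, '/') . '/' . ltrim($alias, '/');
}
```

Returning NULL when the alias equals the path is what keeps an unaliased node from redirecting to itself in a loop.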

Altering the <head> in template.php

This code was added to template.php to alter the <head> element. Since the head elements are a renderable array in D7, it's easy!

/**
 * Implements hook_html_head_alter().
 * Generates shorter shortlinks of the form metaltoad.com/NNN.
 */
function metaltoad_html_head_alter(&$head_elements) {
  foreach ($head_elements as $key => $element) {
    if (isset($element['#attributes']['rel']) &&
      $element['#attributes']['rel'] == 'shortlink') {
      $href =& $head_elements[$key]['#attributes']['href'];
      if (preg_match('/^\/node\/\d+$/', $href)) {
        $href = str_replace('/node/', '', $href);
        $href = "http://metaltoad.com/$href";
      }
    }
  }
}
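
The string transformation inside this hook can be checked in isolation. The following is a minimal sketch, assuming the shortlink href is a root-relative path of the form /node/NNN; shorten_shortlink_href() and its $short_base parameter are hypothetical names introduced for illustration.

```php
<?php
/**
 * Hypothetical helper mirroring the href rewrite in the hook above.
 *
 * @param string $href       Original shortlink href, e.g. "/node/318".
 * @param string $short_base Short domain, e.g. "http://metaltoad.com".
 * @return string Shortened URL, or $href unchanged if it is not /node/NNN.
 */
function shorten_shortlink_href($href, $short_base) {
  // Capture the node ID and rebuild the URL on the short domain.
  if (preg_match('/^\/node\/(\d+)$/', $href, $matches)) {
    return rtrim($short_base, '/') . '/' . $matches[1];
  }
  // Anything else (aliases, external URLs) is left untouched.
  return $href;
}
```

With these assumptions, shorten_shortlink_href('/node/318', 'http://metaltoad.com') yields http://metaltoad.com/318, and any href that does not match the pattern passes through unchanged.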

$base_url

Lastly, we made sure to explicitly set $base_url in settings.php. This ensures that when the Location header is set, it uses the correct domain, including the "www", again avoiding inefficient multiple redirects.

$base_url = 'http://www.metaltoad.com';

Now, we have our own share-friendly links that are only a few characters longer than bit.ly!
Before: http://bit.ly/c9Vk1R
After: http://metaltoad.com/318

Date posted: October 5, 2010

Comments

Cool idea for one's own site, but generally speaking I believe Google's URL shortener, launched just last week, has gone a long way to addressing the concerns you list (link rot, security).

shorturl looks excellent as well. As a matter of taste, I thought the /NNN numeric paths looked less strange than the encoded output of shorturl (which uses letters and digits). But shorturl definitely has some advantages, especially if you have a larger number of nodes.

shorturl also allows you to create redirects for any URL, even external sites - not just nodes. This isn't a feature we really needed but I can see the utility of it.

Nice work, D! Quick question though, what's the difference between this approach and simply setting pathauto to generate an alias with only the node ID?

With this method your pages have 2 URLs:
A "canonical" URL, which is the big SEO friendly pathauto version
A "shortlink", which redirects to the longer canonical URLs.

Also last I checked, the [nid] token didn't really work in pathauto because the alias is generated prior to saving the node, so the nid doesn't exist yet.

Thanks for the post Dylan. I like the concept of avoiding third party shorteners and your implementation.

<sigh>

I guess I need to walk the walk. I just like the immediate gratification and easy view of how many clicks came through bit.ly...

Seems like you should be able to get the same information from our Google Analytics - that'll tell you what percentage of your traffic came through Twitter, Facebook, etc. Or are you looking for some other bit of information?

I love bit.ly because of the easy to read graphic display of how many people have clicked through on a particular link - which generally corresponds to what I send out via Twitter:

[Screenshot of bit.ly after logging in]

In this way I can get a quick sense of what my followers care about, and what they don't. Can we reproduce that within Drupal?

I had coded the 'redirect to canonical URL feature' in redirect.module but at one point it didn't work, so I had commented it out. I went back today after reading this, tested it, and confirmed it's working again. I've also filed a patch for redirect.module to support the nid short-link redirect. http://drupal.org/node/933888
