
Using Amazon CloudFront with Drupal
We like to use our own site to experiment with different technologies. CDNs are nothing new, and Metal Toad has projects running on competing systems including Akamai and Level 3. Still, I think Amazon CloudFront is an interesting offering and I wanted to give it a spin. Here's my review of the service after setting it up with Drupal:
Pros:
- Easy setup, low-cost, only pay for what you use
- Support for CNAMEs and domain sharding (e.g. static[1234].metaltoad.com)
- Supports byte-range requests via the Range / Accept-Ranges headers (important for media files)
- Supports gzip compression (assuming your origin server does the compression)
- Supports HTTPS
- Honors cache-control headers
- Custom HTTP headers are passed through to edge requests (including CORS)
Cons:
- No global or directory "purge" command - each file must be invalidated individually
- No custom SSL certificates, so you can't use CNAMEs with HTTPS
- Fussy about SSL ciphers on the origin server (CloudFront wasn't compatible with the ciphers configured on our load balancer, so I had to use "http-only" instead of "match-viewer")
- Only minimal logging, no reports or graphs
Domain sharding
Multiple CNAMEs add a little overhead from extra DNS lookups, but they increase the number of parallel downloads (browsers impose a per-hostname connection limit). To minimize the upstream bandwidth needed, these domains should be cookie-free. Since I chose subdomains of the site itself, I needed to adjust some cookie settings in Drupal:
- In settings.php, set $cookie_domain = 'www.metaltoad.com';
- In the Google Analytics module, set the tracking to "One domain with multiple subdomains"
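To make the sharding idea concrete, here is a minimal standalone sketch (no Drupal required) of mapping a file path to one of the static[1234] hostnames. The helper name shard_host() and its $shards and $domain parameters are hypothetical, chosen only for illustration. Hashing the path means a given file always resolves to the same hostname, so it is cached once rather than once per shard.

```php
<?php

/**
 * Maps a file path to one of N shard hostnames (hypothetical helper).
 *
 * @param string $path
 *   The file path to hash, e.g. 'misc/drupal.js'.
 * @param int $shards
 *   How many static hostnames exist (assumed to be 4 here).
 * @param string $domain
 *   The parent domain of the shard hostnames.
 */
function shard_host($path, $shards = 4, $domain = 'metaltoad.com') {
  // crc32() can return a negative integer on 32-bit PHP, so take abs() of
  // the remainder to keep the shard number in the range 1..$shards.
  $shard = abs(crc32($path) % $shards) + 1;
  return "static$shard.$domain";
}

// Each path gets a stable hostname; different paths spread across shards.
foreach (array('misc/drupal.js', 'misc/jquery.js', 'themes/bartik/logo.png') as $path) {
  echo $path . ' => ' . shard_host($path) . "\n";
}
```

More shards means more parallel connections but also more DNS lookups, so a small number such as the four used here is a reasonable starting point.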
Thanks to some new alter hooks in Drupal 7, all you need to implement a static file CDN is hook_file_url_alter(). There's an actively maintained CDN module, but in the spirit of inquiry I decided to implement the hook directly.
Module file
/**
 * Implements hook_file_url_alter().
 */
function mymodule_file_url_alter(&$uri) {
  // Route static files to Amazon CloudFront.
  if ($_SERVER['HTTP_HOST'] == 'www.metaltoad.com') {
    if ($GLOBALS['is_https']) {
      // CloudFront doesn't support custom SSL certs, so we need to use Amazon's.
      $cdn = 'https://abcdef12345.cloudfront.net';
    }
    else {
      // Multiple hostnames to parallelize downloads. abs() guards against
      // the negative values crc32() can return on 32-bit PHP.
      $shard = abs(crc32($uri) % 4) + 1;
      $cdn = "http://static$shard.metaltoad.com";
    }
    $scheme = file_uri_scheme($uri);
    if ($scheme == 'public') {
      // Convert public:// URIs to a path relative to the web root.
      $wrapper = file_stream_wrapper_get_instance_by_scheme('public');
      $path = $wrapper->getDirectoryPath() . '/' . file_uri_target($uri);
      $uri = "$cdn/$path";
    }
    elseif (!$scheme && strpos($uri, '//') !== 0) {
      // Relative paths (e.g. core and module assets), but not
      // protocol-relative URLs.
      $uri = "$cdn/$uri";
    }
  }
}

/**
 * Implements hook_boot().
 */
function mymodule_boot() {
  // Make sure Amazon CloudFront doesn't serve dynamic content.
  if (!empty($_SERVER['HTTP_X_AMZ_CF_ID']) && !strstr($_GET['q'], 'files/styles')) {
    header("HTTP/1.0 404 Not Found");
    print '404 Not Found';
    exit();
  }
}

/**
 * Implements hook_css_alter().
 */
function mymodule_css_alter(&$css) {
  // Mangle the paths slightly so that drupal_build_css_cache() will generate
  // different keys on HTTPS. Necessary because CDN URL varies by protocol.
  if ($GLOBALS['is_https']) {
    foreach ($css as $key => $style) {
      if ($style['preprocess'] && $style['type'] == 'file') {
        $css[$key]['data'] = './' . $style['data'];
      }
    }
  }
}
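The hook_boot() guard above boils down to a small decision: a request is treated as coming from CloudFront when the X-Amz-Cf-Id header is present, and it is let through only when the path points at an image style derivative, which Drupal generates on first request. Here is a minimal sketch of that decision as a pure function, so it can be tried outside Drupal. The name mymodule_should_block() and its parameters are hypothetical and not part of the module above.

```php
<?php

/**
 * Decides whether a request should receive a 404 (hypothetical helper).
 *
 * @param array $server
 *   Server variables, e.g. $_SERVER.
 * @param string $path
 *   The Drupal path, e.g. $_GET['q'].
 */
function mymodule_should_block(array $server, $path) {
  // The X-Amz-Cf-Id request header shows up in PHP as HTTP_X_AMZ_CF_ID.
  $from_cdn = !empty($server['HTTP_X_AMZ_CF_ID']);
  // Image style derivatives are built by Drupal on first request, so
  // those paths must still be allowed through.
  $is_image_style = strpos((string) $path, 'files/styles') !== FALSE;
  return $from_cdn && !$is_image_style;
}

// 'example-id' is a made-up header value for illustration.
$cdn = array('HTTP_X_AMZ_CF_ID' => 'example-id');
var_dump(mymodule_should_block($cdn, 'node/1'));     // bool(true)
var_dump(mymodule_should_block($cdn, 'sites/default/files/styles/thumbnail/public/a.png')); // bool(false)
var_dump(mymodule_should_block(array(), 'node/1'));  // bool(false)
```

With Drupal's stock rewrite rules, files that already exist on disk are served by the web server directly, so they never reach this check.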
.htaccess
# Set CORS header on static assets for CDN.
<FilesMatch "\.(ttf|otf|eot|woff|css|css\.gz|js|js\.gz)$">
  <IfModule mod_headers.c>
    Header set Access-Control-Allow-Origin "*"
  </IfModule>
</FilesMatch>
The HTTPS and sharding support adds a little complexity, but overall the integration is straightforward. I'd recommend CloudFront to anyone who wants an easy and cost-effective scalability win.
Comments
Great article! :)
Just one nitpick: you should be using Far-Future expiration for optimal performance. Then the ability to purge isn't meaningful anyway: if you want to force users to get new content, just generate a new URL for it!
For those who don't want to maintain manual code, here's a tutorial for Amazon CloudFront + the CDN module.
Fri, 10/05/2012 - 20:34
Hi Wim, Thanks for the feedback! We've left Drupal's default 2 week expiration in place. What do you think of opening a core issue to raise this value? In terms of perceived freshness 2 weeks is essentially infinite (no visitor or site operator would wait this long for new content), yet this value is shorter than what's commonly recommended (I've seen everything from 30 days to 10 years).
Also, do you have any suggestions on handling updates to image style presets? From what I've seen, Drupal automatically purges the directory when the form at admin/config/media/image-styles/edit/%style is submitted, but the URLs aren't versioned so there's no way to update downstream caches.
I agree the global purge is mostly unnecessary; I mention it mainly as a difference from other services.
Fri, 10/05/2012 - 20:59
If core's going to use Far Future expiration, it should also ensure URLs change whenever files change. I don't believe it's core's responsibility to do this — yet. Especially because it can present a scalability issue for certain sites (Drupal may hit the FS for many files for each generated page, depending on how the unique file identifier generation is configured, the number of files and whether you have page caching enabled).
Your point about image styles is a great one. It's one that's actually addressed through the CDN module's Far Future expiration functionality as well. A simple solution could be to include a hash of the image style's configuration in the URL — whenever the image style configuration changes, the URLs would also change.
In general: when you have any feedback on the CDN module, let me know in the issue queue — I'd be more than happy to work with you! :) I'm always trying to make it better.
Likewise, if you're working on WPO issues in Drupal core — let me know :)
Fri, 10/12/2012 - 17:20
Interesting, in practice I've observed Drupal often does take responsibility for changing the filename when new versions are uploaded (FILE_EXISTS_RENAME is the default for file_copy() and related functions, and its effects are seen when e.g. uploading a new theme logo or filefield). I would agree we're not very intentional about following this practice in core.
Anyway, I'm aware of a few WPO issues that might interest you:
- #1514760: Image style URLs (the question mentioned above)
- #1034208: drupal_sort_css_js() ignores media and browser keys. CSS and JS sorting (for aggregation) gives insertion order a higher precedence than the browser and media keys, which really makes no sense and results in splitting aggregates in unnecessary and arbitrary places.
BTW Metal Toad is using the CDN module on fearnet.com; many thanks for your contributions!
Wed, 01/13/2016 - 11:31
Hi Tack,
Great article for someone like me, who needs this as soon as possible.
I'd like to know: what if we want to serve only static content from the CDN and keep the rest of the load on the primary server? I suspect this would be good for search engine optimization, since it avoids duplicate content, so I'd like a solution for this in this particular thread.
Wed, 09/12/2012 - 17:42