For a staging site, it's important to exclude crawlers. You wouldn't want your content to get indexed at the wrong URL! The conventional wisdom is to use HTTP Basic authentication. There are some disadvantages to this approach, however, and I've found I prefer using a newer HTTP header called X-Robots-Tag. Note that this assumes your only objective is to prevent indexing by benevolent crawlers; if you need to keep secrets, this method is obviously unsuitable.
X-Robots-Tag works like the `<meta name="robots" ...>` tag, but since it's sent as an HTTP response header, it works for all file types, not just HTML. With Apache's mod_headers enabled, you can set it for the whole server:

```apache
<Directory />
    # Globally disallow robots from the development server
    Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</Directory>
```
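To confirm the header is actually being sent, a quick check with curl should print something like the following (staging.example.com is a placeholder for your staging hostname):

```
$ curl -sI http://staging.example.com/report.pdf | grep -i x-robots-tag
X-Robots-Tag: noindex, noarchive, nosnippet
```

Requesting a PDF rather than an HTML page makes the advantage over the meta tag visible: the header rides along on every response, regardless of content type.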