Robots.txt optimisation
When we first wrote this post back in 2011 we outlined best practice for the robots.txt file: blocking irrelevant parts of the file structure to prevent search engines from indexing certain areas of a site. That remained best practice until late 2014, when Google released the Fetch and Render tool. From that point on, blocking parts of your site such as the theme and plugin folders can have negative effects, as Google now advises allowing bots to crawl anything containing CSS, scripts or other resources that alter how a page appears to a user.
As a result, we generally keep robots.txt to the bare basics and adjust it per situation. For reference, here is our base robots.txt, which blocks some resource hogs, scanners and potentially bad bots.
User-agent: *
Disallow: /cgi-bin/
Sitemap: https://www.YOURDOMAIN.co.uk/sitemap_index.xml

User-agent: MJ12bot
Disallow: /

User-agent: Yandex
Disallow: /

User-agent: moget
User-agent: ichiro
Disallow: /

User-agent: NaverBot
User-agent: Yeti
Disallow: /

User-agent: Baiduspider
User-agent: Baiduspider-video
User-agent: Baiduspider-image
Disallow: /

User-agent: sogou spider
Disallow: /

User-agent: YoudaoBot
Disallow: /

User-agent: AhrefsBot
Disallow: /
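In line with the crawl guidance above, if you do block a directory such as the plugins folder, make sure the CSS and JavaScript inside it stays crawlable. A minimal sketch for a WordPress site (the wildcard and Allow handling shown here is Googlebot-specific, and the paths are the standard WordPress ones; adjust to suit):

User-agent: Googlebot
Disallow: /wp-content/plugins/
Allow: /wp-content/plugins/*.css$
Allow: /wp-content/plugins/*.js$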
Redirect non-www to www:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^yourdomain\.com [NC]
RewriteRule (.*) http://www.yourdomain.com/$1 [R=301,L]
Redirect www to non-www:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^www\.yourdomain\.com [NC]
RewriteRule ^(.*)$ http://yourdomain.com/$1 [L,R=301]
Protecting your .htaccess file
You really don't want people looking at your .htaccess file, so use this to block access to it.
<Files .htaccess>
order allow,deny
deny from all
</Files>
Redirect a dedicated IP to the domain (duplicate content fix)
# IP TO DOMAIN REDIRECT
RewriteCond %{HTTP_HOST} ^211\.122\.10\.10$
RewriteRule ^(.*)$ https://www.yourdomain.co.uk/$1 [L,R=301]
Redirect HTTP to HTTPS (SSL)
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*) https://%{SERVER_NAME}/$1 [R,L]
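Note that the rule above sends a temporary (302) redirect, because the R flag has no status code attached. For SEO you will normally want a permanent redirect; a sketch of the same rule using a 301 and the hostname the visitor actually requested:

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]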
Single Page 301 htaccess redirect
Redirect 301 /oldfileorurl.php https://www.domain.co.uk/new-page
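Bear in mind that Redirect matches the URL path only and passes any query string straight through. If the old URL depends on a query parameter, a mod_rewrite sketch along these lines is needed (the id=42 parameter is purely an example):

RewriteEngine On
# Match the old path with that exact query string, then drop the query on redirect
RewriteCond %{QUERY_STRING} ^id=42$
RewriteRule ^oldfileorurl\.php$ https://www.domain.co.uk/new-page? [R=301,L]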
Redirect entire directory and child URLs to one page or domain
RedirectMatch 301 ^/old-directory/ https://www.newdomain.co.uk/new-directory
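The rule above sends every child URL to a single new page. If each child URL should instead map to its equivalent under the new directory, a capture-group variant (assuming the structures match one to one) looks like this:

RedirectMatch 301 ^/old-directory/(.*)$ https://www.newdomain.co.uk/new-directory/$1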
Enable Gzip Compression
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE application/x-httpd-php
AddOutputFilterByType DEFLATE application/x-httpd-fastphp
AddOutputFilterByType DEFLATE image/svg+xml
SetOutputFilter DEFLATE
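If there is any chance mod_deflate is not loaded on the server, wrap the directives in an <IfModule> guard so the site does not throw a 500 error. A short sketch with a few of the same types (the remaining lines slot in identically):

<IfModule mod_deflate.c>
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/javascript
</IfModule>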
Expires Header caching (Leverage Browser Caching)
ExpiresActive On
ExpiresByType image/jpg "access 2 week"
ExpiresByType image/jpeg "access 2 week"
ExpiresByType image/gif "access 2 week"
ExpiresByType image/png "access 2 week"
ExpiresByType text/css "access 2 week"
ExpiresByType application/pdf "access 2 week"
ExpiresByType text/x-javascript "access 2 week"
ExpiresByType application/x-shockwave-flash "access 2 week"
ExpiresByType image/x-icon "access 2 week"
ExpiresDefault "access 2 week"
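Different resource types can also be given different lifetimes. A sketch using the longer "access plus" wording from the mod_expires documentation, with illustrative values you should tune to how often your assets change:

ExpiresActive On
# Images rarely change once published
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
# Stylesheets and scripts change with site updates
ExpiresByType text/css "access plus 1 week"
ExpiresByType application/javascript "access plus 1 week"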
Prevent directory listing / Browsing
IndexIgnore *
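IndexIgnore * leaves the listing page in place but hides every file from it. A stricter alternative, where the host allows Options overrides in .htaccess, is to switch directory indexes off so Apache returns a 403 instead:

Options -Indexes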
WordPress Hardening
Below are a couple of .htaccess additions which will help harden a WordPress-based website.
Protect wp-config.php
<Files wp-config.php>
order allow,deny
deny from all
</Files>
Secure / harden the WordPress includes folder
RewriteEngine On
RewriteBase /
RewriteRule ^wp-admin/includes/ - [F,L]
RewriteRule !^wp-includes/ - [S=3]
RewriteRule ^wp-includes/[^/]+\.php$ - [F,L]
RewriteRule ^wp-includes/js/tinymce/langs/.+\.php - [F,L]
RewriteRule ^wp-includes/theme-compat/ - [F,L]