Archive for July, 2006
Case studies, blog postings, research papers, tools info and eBook notes
We have bumped into a number of questions recently about how to migrate a site from one technology to another, or even from one domain to another, without massacring Google referrals. We wrote the following post to one of the email lists we contribute to and figured it might be a good time to mention it again here.
If your primary concern is to defend the traffic you already have from Google then the issue of rolling out a new technology platform for your site is a very simple one to manage:
Your existing site will have a number of locations (URLs) indexed in Google, a bunch of which will also appear on third-party sites around the web. As a consequence Google will return periodically to these pages to attempt to re-index them and in the meantime will present these locations to potential users as relevant locations on which to find information to support their search queries. Google also evaluates the third-party links, in terms of their subject matter and authority, and confers additional relevance to the pages they point at.
So, when you dump all the old locations of a site by re-publishing it under a completely new organisational structure, with new paths and file names, a number of things happen – if you do not act in advance:
1. Google attempts to reindex a location it has in its index and is presented with a 404 File Not Found. Shortly after this, Google will dump this location from its index. It may hold onto it for a bit longer if third-party sites continue to point at it, but not for ever.
2. Users referred into the site by external links, to locations that have been superseded by the new organisation of the site, are also presented with a 404 and their journey comes to an end. Some sites present a custom 404 page that attempts to be sensitive to the user’s needs, but this is never what the visitor expects, and it can often be very hard for them to find the content they were looking for afterwards – if they can be bothered to look; usually they’ll head back to Google and look for an easier source.
3. Behind the scenes Google is devaluing the inbound links to the site, as they no longer point to content, but to 404 dead locations. These links will still have some value in Google’s eyes, because they point to the domain, but not the value that they once had. The site’s performance in Google will start to fall off rapidly as the extent of dead locations is found by Google’s crawling activities.
4. Slowly, Google is also discovering the new locations on the site created by the new technology, but depending on the scale and popularity of the site, it may take a long time to re-index all of it. Even when it has re-indexed everything, the site won’t perform as well as before, because it has dumped the value of all of the links into it (other than those to the home page) by not respecting their individual locations.
The sad thing is that this scenario is happening daily on big and small sites alike.
And the way to avoid this happening is astonishingly simple!
For every location of the old site create a map to identify where that content is going to be located on the new site. For example (the URLs here are purely illustrative):

http://www.example.com/products/widget.asp

will be found at

http://www.example.com/products/widget/
A spreadsheet of such locations, which can be converted into rules on bigger sites, can then be given to the implementation team with the following clear instruction:
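One way such a spreadsheet can be converted into rules is with a small script; the sketch below (the paths in the example map are invented for illustration) turns two-column old-URL/new-URL rows into Apache-style Redirect 301 directives:

```python
import csv
import io

def map_to_redirects(csv_text):
    """Turn two-column old-URL,new-URL rows into
    Apache-style 'Redirect 301' directives, one per line."""
    rules = []
    for old, new in csv.reader(io.StringIO(csv_text)):
        rules.append(f"Redirect 301 {old.strip()} {new.strip()}")
    return "\n".join(rules)

# Purely illustrative map – these paths are invented for the example.
example_map = "/products/widget.asp,/products/widget/\n/about.asp,/about/\n"
print(map_to_redirects(example_map))
```

On a big site the same idea scales up: generate the rules from the spreadsheet rather than typing them by hand, so the map stays the single source of truth.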
A. When the server receives a request for one of the old locations, respond with a 301 status code (this tells the visitor that the location has moved permanently – it is called a 301 Redirect) and supply the new location (as defined by the map) in the Location header of the response.
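On an Apache server, for example, a row of that map might become a directive like the one below (a sketch only – the paths are illustrative, and a large site would more likely use mod_rewrite rules than one directive per URL):

```apache
# Serve a 301 for the old location, pointing at its new home
Redirect permanent /products/widget.asp http://www.example.com/products/widget/
```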
A visiting web user may not even notice what happens: they’ll have clicked on a link on a site or in Google and will have ended up on the right piece of content. The fact that it now sits at a different location in the address bar is likely to be lost on them.
A visiting search engine will definitely notice what happens: it will have received a clear instruction that the old location it had for the content is now obsolete and should be replaced with the new location, as served.
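The mechanics can be sketched with a toy server (Python standard library only; the map, paths and content below are invented for illustration) – the client asks for the old location, is handed a 301 with the new location, and ends up on the content without any intervention:

```python
import http.server
import threading
import urllib.request

# Hypothetical one-entry map from an old location to its new home.
REDIRECT_MAP = {"/about.asp": "/about/"}

class MigrationHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in REDIRECT_MAP:
            # Old location: answer with a 301 and the new location.
            self.send_response(301)
            self.send_header("Location", REDIRECT_MAP[self.path])
            self.end_headers()
        elif self.path == "/about/":
            # New location: serve the content as normal.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"About us")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # silence request logging for the demo

server = http.server.HTTPServer(("127.0.0.1", 0), MigrationHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# urllib follows the 301 automatically, just as a browser or crawler would.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/about.asp") as resp:
    final_url = resp.geturl()
    status = resp.status
server.shutdown()

print(status, final_url)  # a 200, at a URL ending in /about/
```

The user-facing half of the story is the `urlopen` call following the redirect silently; the search-engine-facing half is the explicit 301 status, which is the instruction to update the index entry.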
The consequences of this simple change could not contrast more clearly with the outcomes described above:
1. Google attempts to re-index the old location and is told that the content is no longer there, but is to be found at a new location permanently. Google will then simply swap out the location information in its index entry for the page and carry on as if nothing had happened.
2. Users seamlessly arrive on the new location for the content and carry on as if nothing had happened.
3. Behind the scenes, Google is conferring all existing reputation from inbound links to the new locations.
4. Google will slowly swap out all the old locations for the new ones as it crawls the site with no noticeable impact on traffic or relevance.
Time and again, such a strategy sees new technology implemented without a moment’s blip in the traffic from Google.
(Filed in Blog, July 28th, 2006)
Google is returning the DMOZ title for my site (screenshot) when a search reflects the precise words in my DMOZ listing. So I have added the NOODP meta tag, as below, so that I get the full title in a search for me. Let’s see how long it will take to change. Update: Not any more, it isn’t.
(Filed in Blog, July 25th, 2006)
To prevent all search engines that support the meta tag from using ODP information (Google’s words, not mine), use the following:
<META NAME="ROBOTS" CONTENT="NOODP">
This is a welcome addition, as Google’s methodology for the inclusion of this data has appeared inconsistent. Of course, Google’s appetite for your pages will determine how quickly they will change, if you are suffering from this condition. If you are not, do it anyway. That’s all you need to know for now, so off you go, get on with it!
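For reference, the tag sits alongside any other meta tags in the head of the page – an illustrative fragment:

```html
<head>
  <title>Your Actual Page Title</title>
  <meta name="robots" content="noodp">
</head>
```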
(Filed in Blog, July 13th, 2006)