An XML sitemap is a great way to help crawlers such as Google, MSN Search, Yahoo and ASK.com crawl your site. It also lets you tell them how important the pages on your site are relative to each other.
Luckily, there is a great module for creating XML sitemaps with EPiServer available on EPiCode. You can read more about it on Jacob Khan's blog.
Once you have an XML sitemap set up, the search engines need to be told about it. One place to do so, and one you should always use, is the robots.txt file (via a Sitemap: line that points at the sitemap URL). There are, however, two other options that also let you notify the search engines when the sitemap has been updated, which should speed up their indexing of new content. And as we all know, time is money :)
The first way of telling the search engines that your sitemap has been updated is to use their webmaster tools. This does, however, require a registered account with each search engine and some manual work from the editor. The other option is to ping the search engines (so far this works with Google and MSN) with the URL of your updated sitemap.
I like the idea of new content being indexed as quickly as possible without the editor having to think about it, so I put together a small class library that does the pinging. You can download the source code here.
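To make the ping idea concrete, here is a minimal Python sketch of what such a request could look like. It is not the class library's code (that is a .NET assembly), and the function names `build_ping_url` and `ping` are made up for illustration. It assumes the ping endpoint ends in a query parameter that expects the sitemap URL, percent-encoded, appended directly to it; whether a given search engine still accepts such pings should be checked against its current documentation.

```python
from urllib.parse import quote
from urllib.request import urlopen


def build_ping_url(target_url: str, sitemap_url: str) -> str:
    """Append the percent-encoded sitemap URL to a ping endpoint (hypothetical helper)."""
    # safe="" also encodes ":" and "/" so the sitemap URL travels as a single query value.
    return target_url + quote(sitemap_url, safe="")


def ping(target_url: str, sitemap_url: str, timeout: float = 10.0) -> int:
    """Send the ping as a plain GET request and return the HTTP status code."""
    with urlopen(build_ping_url(target_url, sitemap_url), timeout=timeout) as response:
        return response.status
```

Calling `build_ping_url` with a ping endpoint and a sitemap URL returns the full address to request; `ping` performs the request itself and raises an exception from `urllib` if the endpoint cannot be reached.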
Using the class library
To use the class library, simply put it in your bin folder and configure it in web.config (see the Configuration section below). Once that’s done, you can create an event handler that calls the UpdatedSitemapNotify method in the SitemapUpdateNotifier class each time a page that is visible in menus is published.
Another option, which is probably better since Google recommends that we should not ping it more than once per hour, is to use the scheduled job "Search engine pinger", which should be available in admin mode once the class library is deployed to your bin folder. It calls an overload of the UpdatedSitemapNotify method with the date of the last successful execution of the job. The UpdatedSitemapNotify method then checks whether the sitemap contains any page that has been updated after that date and, if so, pings the configured search engines. The job should be configured to run once per hour, but it can of course also be started manually when a major release is done.
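The check the scheduled job performs can be sketched roughly as follows. This is a Python illustration of the idea only, not the library's actual implementation: the function name `updated_since` is hypothetical, and it assumes the sitemap follows the sitemaps.org 0.9 schema with `lastmod` values in W3C date or datetime format. A sitemap index would additionally require fetching each child sitemap, which is left out here.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace used by sitemaps.org 0.9 documents (both urlset and sitemapindex).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def _parse_lastmod(text: str) -> datetime:
    """Parse values such as 2008-05-01 or 2008-05-01T12:00:00Z."""
    parsed = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    # Assumption: values without a time zone are treated as UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def updated_since(sitemap_xml: str, last_run: datetime) -> bool:
    """Return True if any lastmod entry is newer than last_run (time zone aware)."""
    root = ET.fromstring(sitemap_xml)
    return any(
        _parse_lastmod(node.text) > last_run
        for node in root.iter(SITEMAP_NS + "lastmod")
        if node.text
    )
```

A job built on this would only send pings when `updated_since` returns True, which keeps an hourly schedule from notifying the search engines when nothing has changed.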
Configuration
The class library requires a new configuration section in web.config that specifies the URL of the sitemap (or sitemap index) and the notification targets, that is, the URLs that should be pinged. For a site hosted at http://www.mydomain.com/ that uses the above-mentioned XML sitemap generator and should ping Google and MSN, the configuration should look like this:
<configuration>
  <configSections>
    <section name="sitemapUpdateNotification"
             type="SitemapUpdateNotification.Configuration.SitemapUpdateNotificationConfiguration, SitemapUpdateNotification" />
  </configSections>
  <sitemapUpdateNotification
      sitemapUrl="http://www.mydomain.com/SearchEngineSitemaps.SiteMapIndex.aspx">
    <notificationTargets>
      <add name="Google"
           targetUrl="http://www.google.com/webmasters/tools/ping?sitemap=" />
      <add name="MSN"
           targetUrl="http://webmaster.live.com/ping.aspx?siteMap=" />
    </notificationTargets>
  </sitemapUpdateNotification>
</configuration>
A sample configuration is also included in the download.
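To show which values the section carries, here is a small Python sketch that reads a configuration of this shape and returns the sitemap URL together with the named ping targets. The class library itself reads web.config through .NET's configuration system; the function `read_notification_config` below is a hypothetical stand-in used only to illustrate the structure.

```python
import xml.etree.ElementTree as ET


def read_notification_config(config_xml: str):
    """Return (sitemap_url, {target name: target URL}) from a config document."""
    root = ET.fromstring(config_xml)
    section = root.find("sitemapUpdateNotification")
    if section is None:
        raise ValueError("sitemapUpdateNotification section is missing")
    # Each <add> element contributes one named ping endpoint.
    targets = {
        add.get("name"): add.get("targetUrl")
        for add in section.findall("notificationTargets/add")
    }
    return section.get("sitemapUrl"), targets
```

Each target URL ends with an open query parameter, so the sitemap URL is meant to be appended to it when the ping is sent.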