How To Locate A Sitemap In A Robots.txt File

                                   



Hello all, and welcome back to APAJR Lab. Over the last few days I have posted about 10 articles on SEO topics; to read them all, click here. Today I am going to show you how to locate a sitemap in a robots.txt file. Let’s begin. If you are the owner, webmaster, or developer of a website, you will want your website or blog to be seen in search results. To be shown in search results, your website or blog and its various web pages need to be crawled and indexed by search engine bots (robots) such as Googlebot, Bingbot, and Yahoo’s Slurp.

There are two different files on the code side of your website or blog that help these search engine bots find what they need.

They are:
  • Robots.txt
  • Sitemap


Robots.txt and Sitemap :

Robots.txt is a simple text file placed in your website’s root directory. It tells search engine robots (bots) which pages of your website or blog to crawl and which not to crawl. It also contains directives that specify which search engine robots are allowed to crawl and which are not.
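To make this concrete, here is a minimal sketch in Python that reads a robots.txt and asks which URLs a given bot may crawl. It uses the standard library's urllib.robotparser; the robots.txt content, the bot name "BadBot", and the example.com URLs are assumptions made up for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Any bot may crawl ordinary pages...
print(parser.can_fetch("Googlebot", "http://www.example.com/index.html"))         # True
# ...but not pages under /private/.
print(parser.can_fetch("Googlebot", "http://www.example.com/private/page.html"))  # False
# The bot named in the first group is blocked from the whole site.
print(parser.can_fetch("BadBot", "http://www.example.com/index.html"))            # False
```

Real search engines use their own parsers, so treat this only as a rough way to see how the rules read.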

Search engine bots usually look for the robots.txt file as soon as they enter a website. It is therefore important to have a robots.txt file in the first place. Even if you want all search robots to crawl all the pages on your website or blog, it is still recommended to have a default robots.txt that allows this. Please read our beginner’s guide on robots.txt if you want to learn more about SEO.

Robots.txt also contains one more important piece of information: the location of your sitemaps. In this article, we are going to elaborate on this feature of robots.txt. But before that, let’s see what a sitemap is and why it is important for your website or blog.

Sitemap :

There are two different types of sitemap. They are:
  • XML Sitemap
  • HTML Sitemap

 XML Sitemap

An XML sitemap is an XML file that contains a list of all the webpages on your website. It may also contain additional information about each URL in the form of metadata. Just like robots.txt, a sitemap is a must-have. It helps search engine bots explore, crawl, and index all the webpages in a site.
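As a rough illustration of what such a file contains, the sketch below builds a tiny XML sitemap with Python's standard xml.etree.ElementTree module. The urlset, url, loc, and lastmod elements and the namespace come from the Sitemaps.org protocol; the function name, the page URLs, and the dates are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from a list of (url, lastmod) pairs.

    lastmod is optional metadata; pass None to leave it out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages and dates, for illustration only.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2024-01-01"),
    ("http://www.example.com/about.html", None),
])
print(sitemap_xml)
```

A real sitemap file normally also starts with an XML declaration line, which this sketch leaves out.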

HTML Sitemap

An HTML sitemap is an HTML file that also contains a list of all the important webpages on your website. This sitemap is totally different from an XML sitemap. It is not used by search engine bots; it is meant for visitors to your website, to help them navigate.

How Are Robots.Txt And Sitemaps Related?

Back in 2006, Yahoo, Microsoft, and Google united to support a standardized protocol for submitting the pages of a website via sitemaps. You were required to submit your sitemaps through Google Webmaster Tools, Bing Webmaster Tools, and Yahoo.

About six months later, in April 2007, they joined in support of a system for finding the sitemap via robots.txt, called autodiscovery of sitemaps. This means that even if you did not submit the sitemap to individual search engines, it was not a problem: they would find the sitemap location from your website’s robots.txt file. (NOTE: Submitting sitemaps is, however, still done on most search engines that allow URL submissions.)

Hence, the robots.txt file became even more significant for webmasters, because with it they can easily pave the way for search engine robots (bots) to discover all the pages on their website.

How To Create Robots.txt File With Sitemap Location?

Here are three simple steps to create a robots.txt file with sitemap location:

 

Step #1: Locate Your Sitemap URL

If your website or blog was developed by a third-party developer, you first need to check whether they provided your website with a sitemap. The URL of your site’s sitemap usually looks like this:
http://www.yourwebsite.com/sitemap.xml

So type this URL in your browser with your domain in place of ‘yourwebsite’.
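If you prefer to check from a script instead of the browser, the following is a minimal sketch. The function names are hypothetical, and it assumes the sitemap sits at the conventional /sitemap.xml path, which not every site uses. Some servers also reject HEAD requests, so a False result is only a hint, not proof that the sitemap is missing.

```python
import urllib.request


def guess_sitemap_url(host):
    """Return the conventional sitemap URL for a host such as 'www.example.com'."""
    return f"http://{host}/sitemap.xml"


def url_exists(url, timeout=10):
    """Return True if the server answers a HEAD request for the URL with status 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError (404 and so on) and timeouts are all OSError subclasses.
        return False
```

For example, url_exists(guess_sitemap_url("www.yourwebsite.com")) would tell you whether something is served at that address.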

You can also locate your sitemap via Google search by using search operators, as shown in the examples below:

site:example.com filetype:xml

OR

filetype:xml site:example.com inurl:sitemap

But this will only work if your site is already crawled and indexed by Google.

If you do not find a sitemap on your website or blog, you can create one yourself using this XML Sitemap generator or follow the protocol explained at Sitemaps.org.

Step #2: Locate Your Robots.txt File

You can check whether your site has a robots.txt file by typing www.yourdomainname.com/robots.txt into your browser.

If you do not have a robots.txt file, you will have to create one and add it to the top-level directory (root directory) of your web server. You will need access to your web server. Usually, it goes in the same place as your site’s main “index” page. The location of these files depends on the kind of web server software you have. Ask a web developer for help if you are not familiar with these files.
Just remember to use all lower case for the file name that contains your robots.txt content. Do not use Robots.TXT or Robots.Txt as your filename.
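Because robots.txt always lives at the root of the site, its address can be worked out from the URL of any page. Here is a small sketch using Python's standard urllib.parse; the function name and the sample URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL for the site that serves page_url.

    The path is always the lower-case '/robots.txt' at the root,
    whatever page the URL points to.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/blog/2020/post.html?ref=home"))
# http://www.example.com/robots.txt
```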

Step #3: Add Sitemap Location To Robots.txt File

Now, open up robots.txt at the root of your website. Again, you need access to your web server to do so. So ask a web developer to do it for you if you do not know how to locate and open your website’s robots.txt file.

To facilitate auto-discovery of your sitemap file through your robots.txt, all you have to do is place a directive with the URL in your robots.txt, as shown in the sample below:

Sitemap:  http://www.example.com/sitemap.xml

So, the robots.txt file looks like this:

Sitemap: http://www.example.com/sitemap.xml
User-agent:*
Disallow:

NOTE: The directive containing the sitemap location can be placed anywhere in the robots.txt file. It is independent of the user-agent line, so it does not matter where it is placed.
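One way to see that the position does not matter is to parse two variants of the same file. The sketch below uses the site_maps() method of Python's urllib.robotparser (available from Python 3.8); the helper name find_sitemaps is hypothetical, and real search engines may parse the file somewhat differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules, with the Sitemap directive first and last.
SITEMAP_FIRST = """\
Sitemap: http://www.example.com/sitemap.xml
User-agent: *
Disallow:
"""

SITEMAP_LAST = """\
User-agent: *
Disallow:
Sitemap: http://www.example.com/sitemap.xml
"""


def find_sitemaps(robots_text):
    """Return the sitemap URLs listed in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap directive is present.
    return parser.site_maps() or []


print(find_sitemaps(SITEMAP_FIRST))
print(find_sitemaps(SITEMAP_LAST))
```

Both calls report the same single sitemap URL.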

What If You Have Multiple Sitemaps?

Every sitemap can contain up to 50,000 URLs. So for a larger website with a large number of URLs, you can create multiple sitemap files. You must list the locations of these sitemap files in a sitemap index file. The XML format of the sitemap index file is similar to that of the sitemap file, which means that it is effectively a sitemap of sitemaps.
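The splitting and the index file can be sketched in Python as follows. The sitemapindex, sitemap, and loc elements and the namespace come from the Sitemaps.org protocol; the function names and the example.com URLs are hypothetical, and writing the individual sitemap files to disk is left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of page URLs into groups of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build sitemap index XML that lists the location of each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="unicode")


# Hypothetical site with 120,000 pages: this needs three sitemap files.
pages = [f"http://www.example.com/page{i}.html" for i in range(120000)]
chunks = chunk_urls(pages)
sitemap_files = [
    f"http://www.example.com/sitemap_{n}.xml" for n in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(sitemap_files)
print(len(chunks), len(chunks[-1]))  # 3 20000
```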

When you have multiple sitemaps, you can either specify your sitemap index file URL in your robots.txt file as shown in the example below:

Sitemap: http://www.example.com/sitemap_index.xml
User-agent:*
Disallow:

Or, you can specify individual URLs of your multiple sitemap files, as shown in the example below:

Sitemap: http://www.example.com/sitemap_host1.xml
Sitemap: http://www.example.com/sitemap_host2.xml
User-agent:*
Disallow:

Finally, there is one thing you need to pay attention to when adding the Sitemap directive to the robots.txt file.

Generally, it is advised to add the ‘Sitemap’ directive along with the sitemap URL anywhere in the robots.txt file. But in some cases this has been known to cause parsing errors. You can check Google Webmaster Tools for any such errors about a week after you have updated your robots.txt file with your sitemap location.

To avoid this error, it is recommended that you leave a blank line after the sitemap URL.
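If you update robots.txt from a script, this advice could be applied as in the sketch below. The function name add_sitemap is hypothetical, and putting the directive at the top of the file is just one possible choice, since its position does not matter.

```python
def add_sitemap(robots_text, sitemap_url):
    """Return robots.txt text with a Sitemap directive for sitemap_url.

    The directive goes at the top, followed by a blank line. If the same
    sitemap is already listed, the text is returned unchanged.
    """
    for line in robots_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_text
    return f"Sitemap: {sitemap_url}\n\n{robots_text}"


updated = add_sitemap("User-agent: *\nDisallow:\n",
                      "http://www.example.com/sitemap.xml")
print(updated)
```

Running the function a second time on its own output leaves the file as it is, so it is safe to call repeatedly.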

I hope it is now pretty clear how to create a robots.txt file with a sitemap location. Do it; it will help your website get more traffic.



