What is the X-Robots-Tag
When you want to keep specific HTML pages of a site out of the index, the meta robots noindex tag is enough. Documents that are not in HTML format (PDF, audio, Word, Excel, PowerPoint, etc.) have no place to put a meta tag, so search engines such as Google and Yahoo support the X-Robots-Tag directive, which is declared directly in the HTTP response header. This makes it usable with any document format. Here are some pointers on what the X-Robots-Tag is and how to use it in your SEO campaign.
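As a minimal sketch of the idea, the Python function below decides which extra response header to send for a given URL path. The function name robots_headers and the extension list NOINDEX_EXTENSIONS are hypothetical and only serve the illustration; how the header is actually attached depends on your server or framework.

```python
from pathlib import PurePosixPath

# Hypothetical list of non-HTML file types we want kept out of the index.
NOINDEX_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".mp3"}


def robots_headers(url_path: str) -> dict:
    """Return the extra HTTP response headers to send for this URL path."""
    suffix = PurePosixPath(url_path).suffix.lower()
    if suffix in NOINDEX_EXTENSIONS:
        # Same effect as a meta robots noindex tag, but carried in the HTTP header.
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    print(robots_headers("/files/report.pdf"))
    print(robots_headers("/index.html"))
```

The returned dictionary can then be merged into the response headers by whatever serves the file.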
How to enforce it
The X-Robots-Tag works alongside the robots.txt file. The name of that file must be written exactly this way, in the plural: robots.txt. Any spelling mistake will make it useless. When a website has a robots.txt file that Google cannot interpret, the robot may stop crawling the address and all its contents. So if you decide to use a robots.txt file, it must be accessible, readable and contain instructions that robots can understand, or else they will not explore (and therefore not index) the new information that you offer.
The X-Robots-Tag header corresponds more or less to the <meta name="robots" content="…"/> tag. Its purpose is to give guidelines (followed or not) to Googlebot. Be careful with the value you send: the value "none" in particular is interpreted as "noindex, nofollow". It would be a shame to generate headers automatically or without thinking and find yourself de-indexed quickly.
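To catch a risky value before it goes live, a small check can help. The sketch below is in Python; the helper names parse_x_robots_tag and blocks_indexing are hypothetical. It assumes a simple comma-separated list of directives, ignores optional user-agent prefixes such as "googlebot:", and treats "none" as shorthand for "noindex, nofollow".

```python
def parse_x_robots_tag(value: str) -> set:
    """Split an X-Robots-Tag header value into a set of lowercase directives."""
    directives = set()
    for part in value.split(","):
        token = part.strip().lower()
        if not token:
            continue
        if token == "none":
            # Assumption: "none" is shorthand for "noindex, nofollow".
            directives.update({"noindex", "nofollow"})
        else:
            directives.add(token)
    return directives


def blocks_indexing(value: str) -> bool:
    """Return True if this header value would keep the document out of the index."""
    return "noindex" in parse_x_robots_tag(value)


if __name__ == "__main__":
    for header in ("none", "noarchive, nosnippet", "noindex, nofollow"):
        print(header, "->", blocks_indexing(header))
```

Running a check like this on automatically generated headers is one way to avoid de-indexing a document by accident.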
It helps your SEO
Crawling and indexing of pages by bots are two essential elements of any effective SEO strategy. However, to rank well in organic search and succeed on the web, it is sometimes necessary to keep certain content out of search engines. This includes confidential documents and obsolete or duplicated pages that do not add value to your online visibility. With its multiple instructions and directives, this HTTP header is useful for steering bots toward the content that is most valuable to your SEO.
Remember: Google will only see the X-Robots-Tag if it is allowed to crawl the page. So, as with the meta name="robots" tag, here's a tip. If you want crawlers to understand and respect your guidelines, do not block the crawl of the URLs concerned with a disallow rule in robots.txt. It's technical, but it's flexible, efficient, and elegant.
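This conflict can be checked with Python's standard urllib.robotparser module. In the sketch below, the function name header_is_visible, the sample rules and the example.com URLs are hypothetical. The standard-library parser also does not necessarily match every rule the way Googlebot does, so treat the result as a first indication only.

```python
from urllib.robotparser import RobotFileParser


def header_is_visible(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the crawler fetch the URL,
    which it must do before it can read the X-Robots-Tag header."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical robots.txt content used only for this example.
    rules = "User-agent: *\nDisallow: /private/\n"
    for url in (
        "https://example.com/private/report.pdf",
        "https://example.com/docs/report.pdf",
    ):
        print(url, "->", header_is_visible(rules, url))
```

If the function returns False for a URL that carries a noindex header, the crawler never receives that header, and the disallow rule should be removed for the directive to take effect.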