robots.txt File

This article explains how to create and optimize a robots.txt file for your site.

What is robots.txt File

The robots.txt file is meant to be read by search engine spiders (robots, bots), not by humans like you and me.

robots.txt indicates the parts of your site (posts, pages and directories) that you don’t want search engine spiders to access.

Pages and images listed in the Sitemap.xml file are offered to spiders for crawling and indexing, although listing a URL there does not guarantee that it gets indexed.

The robots Meta Tag decides which posts and pages get indexed and/or cached by search bots. The Meta Tag always supersedes Sitemap.xml.

A well-behaved search engine robot visits your robots.txt file before crawling your blog.

A search spider first visits robots.txt, then Sitemap.xml, and finally the Meta Tag on each page.

Take advantage of this, and add the location of your XML sitemaps to your robots.txt file.
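A sitemap location is declared with a Sitemap line in robots.txt. The sketch below, using Python's standard urllib.robotparser (Python 3.9 or later), reads such lines back out of a file. The robots.txt content and the sitemap URL are assumed examples, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the sitemap URL is an assumed example.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://kunaldesai.blog/sitemap.xml
"""


def list_sitemaps(robots_txt: str) -> list[str]:
    """Return every sitemap URL declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(list_sitemaps(ROBOTS_TXT))  # ['https://kunaldesai.blog/sitemap.xml']
```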

robots.txt can block access only to paths inside the root directory.

By default, anything outside the root directory is blocked by the hosting company.

Best Location to Place robots.txt

Always place the robots.txt file in the root directory of your domain.

If your domain is https://kunaldesai.blog/, then your robots.txt file should be found at https://kunaldesai.blog/robots.txt.
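The same rule can be written as a small helper that derives the robots.txt address from any page URL on the site. This is a minimal sketch; the function name and the sample post path are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt lives at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The post path below is a made-up example.
print(robots_txt_url("https://kunaldesai.blog/some-post/?page=2"))
# https://kunaldesai.blog/robots.txt
```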

How to Create robots.txt File for Your Site

You need to create and upload a robots.txt file for your WordPress blog.

If you use the Microsoft Windows operating system, create a new file in Notepad and save it as “robots.txt”. Then upload it to the root directory of your domain with the help of an FTP client like FileZilla.
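If you would rather generate the file with a script than type it in an editor, a sketch like the following writes the rules as plain UTF-8 text. The helper name is hypothetical, and the demo writes into a temporary directory; uploading the result to your server is still a separate step.

```python
import tempfile
from pathlib import Path


def write_robots_txt(directory: str, rules: str) -> Path:
    """Write rules to a file named robots.txt inside directory."""
    target = Path(directory) / "robots.txt"
    # newline="\n" keeps the line endings the same on every operating system.
    with target.open("w", encoding="utf-8", newline="\n") as handle:
        handle.write(rules)
    return target


# Demo: write an allow-everything file into a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    created = write_robots_txt(folder, "User-agent: *\nDisallow:\n")
    print(created.name)  # robots.txt
```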

Sample WordPress robots.txt File

Sample WordPress robots.txt

How to Optimize WordPress robots.txt File

A robots.txt file consists of one or more blocks of directives.
Each block starts with a “User-agent” line that names the spider the block addresses.
You have two options for that name:
1) Use the wildcard character (*) to address all search engines.
2) Or use a specific user-agent to address one search engine.
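To see how the two options interact, the sketch below feeds a two-block file to Python's standard urllib.robotparser. A spider named in its own block follows that block, and every other spider falls back to the wildcard block. The /private/ and /blog/ paths are assumed examples.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one block for Googlebot, one wildcard block for the rest.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot follows its own block: only /private/ is off limits.
print(parser.can_fetch("Googlebot", "/blog/"))         # True
print(parser.can_fetch("Googlebot", "/private/page"))  # False
# bingbot has no block of its own, so the wildcard block applies.
print(parser.can_fetch("bingbot", "/blog/"))           # False
```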

To Allow all Search Engines to Crawl all Posts/Pages and Directories
User-agent: *
Disallow:

Here * denotes all search engines. Disallow blocks access to the specified posts/pages or directories; because its value is empty here, nothing is blocked.

To Address Specific Search Engine using User-Agent
User-agent: Googlebot
Disallow:

To Block Crawling of all Posts/Pages and Directories
User-agent: *
Disallow: /
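The difference between an empty Disallow and Disallow: / is a single character, so it is worth checking. This sketch compares the two with Python's standard urllib.robotparser; the helper name and the /any-post/ path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty value: nothing is blocked
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # "/" matches every path


def can_crawl(robots_txt: str, agent: str, path: str) -> bool:
    """Report whether agent may crawl path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


print(can_crawl(ALLOW_ALL, "Googlebot", "/any-post/"))  # True
print(can_crawl(BLOCK_ALL, "Googlebot", "/any-post/"))  # False
```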

To Block a Whole Directory but Allow a Specific Page
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
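This works because Google documents that the most specific (longest) matching rule wins, so the longer Allow path overrides the shorter Disallow path. The sketch below is a simplified version of that comparison written by hand: it handles plain path prefixes only, not the * and $ wildcards, and the function name is hypothetical. Python's standard urllib.robotparser applies rules in file order instead, so it would report admin-ajax.php as blocked for this rule order.

```python
def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Longest-match check over (directive, path_prefix) pairs of one block."""
    best_length = -1
    allowed = True  # a path that matches no rule may be crawled
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty values and non-matching prefixes are skipped
        is_allow = directive.lower() == "allow"
        # A longer prefix wins; on equal length, Allow beats Disallow.
        if len(prefix) > best_length or (len(prefix) == best_length and is_allow):
            best_length = len(prefix)
            allowed = is_allow
    return allowed


RULES = [("Disallow", "/wp-admin/"), ("Allow", "/wp-admin/admin-ajax.php")]

print(is_allowed("/wp-admin/admin-ajax.php", RULES))  # True
print(is_allowed("/wp-admin/options.php", RULES))     # False
print(is_allowed("/blog/", RULES))                    # True
```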

Names (user agents) of Search Engine Spiders

  • Googlebot
  • Googlebot-Image
  • Googlebot-Mobile
  • Googlebot-News
  • Googlebot-Video
  • Mediapartners-Google
  • AdsBot-Google
  • bingbot

How to Validate robots.txt File

One of the best places to do this is the robots.txt Tester in Google Search Console.

Test robots.txt in Google Search Console
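Before uploading, a quick local check can catch simple typos. The following is a rough sketch of such a check, not a replacement for a search engine's own tester: the function name is hypothetical, and the set of accepted field names is an assumption that covers the directives used in this article plus the commonly seen Crawl-delay.

```python
# Assumed set of field names; extend it if your file uses other directives.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint_robots_txt(text: str) -> list[str]:
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, separator, _value = line.partition(":")
        field = field.strip().lower()
        if not separator:
            problems.append(f"line {number}: missing ':' separator")
        elif field not in KNOWN_FIELDS:
            problems.append(f"line {number}: unknown field '{field}'")
        elif field == "user-agent":
            seen_user_agent = True
        elif field in {"allow", "disallow"} and not seen_user_agent:
            problems.append(f"line {number}: rule before any User-agent line")
    return problems


print(lint_robots_txt("User-agent: *\nDisallow: /wp-admin/\n"))  # []
print(lint_robots_txt("Disallow: /x\nUser agent *\nFoo: bar\n"))
```

The second call reports three problems: a rule before any User-agent line, a missing colon, and an unknown field.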

robots.txt Notes

  • Blocking a page/post in robots.txt will not remove it from a search engine's index.
  • Allow search engines to access even low quality pages, so that spiders can read a noindex Meta Tag on them.

Moral of the Story

The robots.txt file tells search engine spiders which posts, pages, directories, and media they can and cannot crawl.