This article explains how to create and optimize a Robots.txt file for your WordPress site.
What is Robots.txt File
A Robots.txt file is meant to be read by search engine spiders (robots, bots), not by humans like you and me.
Robots.txt indicates the parts of your site (posts, pages and directories) that you don’t want search engine spiders to access.
Pages and images listed in the Sitemap.xml file are offered to spiders for crawling and indexing, although a sitemap entry alone does not guarantee either.
The robots meta tag decides which posts and pages get indexed or cached by search bots. The meta tag always supersedes Sitemap.xml.
A well-behaved search engine robot visits your Robots.txt file before crawling your blog.
A search spider first visits Robots.txt, then Sitemap.xml, and finally the meta tag on the page.
Take advantage of this and add the location of your XML sitemaps to your Robots.txt file.
Robots.txt allows or blocks access only within the root directory.
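As a minimal sketch, the sitemap location can be declared with a Sitemap line. The file name sitemap.xml below is an assumption; use whatever URL your sitemap plugin actually generates.

```text
# Sitemap takes an absolute URL and is independent of any User-agent block.
# The path /sitemap.xml is assumed for illustration.
Sitemap: https://kunaldesai.blog/sitemap.xml
```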
By default, anything outside the root directory is blocked by the hosting company.
Best Location to Place WordPress Robots.txt
Always place the Robots.txt file in the root directory of your domain.
If your domain is https://kunaldesai.blog/, then your Robots.txt file should be found at https://kunaldesai.blog/robots.txt.
Create Robots.txt file for WordPress
You need to create and upload a Robots.txt file for your WordPress blog.
Create a new file in Notepad if you are using Microsoft Windows. Save it as “robots.txt” (all lowercase, because spiders request that exact file name). Then upload it to the root directory of your domain with an FTP client such as FileZilla.
Sample WordPress Robots.txt
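The sample below is a sketch of a common starting point, not a definitive file. It mirrors the rules WordPress typically serves by default and adds a Sitemap line; the sitemap URL is an assumption.

```text
# Applies to every spider
User-agent: *
# Keep spiders out of the admin area...
Disallow: /wp-admin/
# ...but leave the AJAX endpoint reachable, since themes and plugins rely on it
Allow: /wp-admin/admin-ajax.php

# Assumed sitemap location; replace with your real sitemap URL
Sitemap: https://kunaldesai.blog/sitemap.xml
```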
Optimize WordPress Robots.txt
A Robots.txt file consists of one or more blocks of directives [or syntax].
Each block begins with a “User-agent” line that names the spider it addresses.
You have two options for this line:
1) Use the wildcard character (*) to address all search engines.
2) Or use a specific user-agent name to address a specific search engine.
Index all Posts/Pages and Directories by all Search Engines
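A minimal sketch of a file that lets every spider crawl the whole site:

```text
User-agent: *
Disallow:
```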
In the User-agent line, * denotes all search engines. Disallow blocks access to the specified posts, pages or directories; an empty Disallow value blocks nothing.
Address Specific Search Engine using User-Agent
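For example, the sketch below applies one rule to Googlebot only and leaves every other spider unrestricted. The /drafts/ directory is a hypothetical path used for illustration.

```text
# Rules for Google's crawler only
User-agent: Googlebot
Disallow: /drafts/

# All other spiders: nothing is blocked
User-agent: *
Disallow:
```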
Noindex all Posts/Pages and Directories
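A minimal sketch, assuming the goal is to keep every spider away from the entire site. It uses Disallow, which stops crawling rather than issuing a true noindex.

```text
User-agent: *
# A single slash matches every URL on the site
Disallow: /
```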
Noindex Whole Directory but Allow Specific Page
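A sketch using a hypothetical /downloads/ directory and a hypothetical page inside it. Major search engines such as Google and Bing apply the most specific (longest) matching rule, so the Allow line wins for that one page.

```text
User-agent: *
# Block the whole directory (hypothetical path)
Disallow: /downloads/
# Re-open a single page inside it (hypothetical file)
Allow: /downloads/free-guide.html
```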
Names (user agents) of Search Engine Spiders
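The tokens below are the user-agent names commonly documented for a few major search engines. The list is a partial sketch, not an exhaustive reference, and the empty Disallow line is only there to make the fragment a valid rule group.

```text
# Google
User-agent: Googlebot
# Bing
User-agent: Bingbot
# Yahoo
User-agent: Slurp
# DuckDuckGo
User-agent: DuckDuckBot
# Baidu
User-agent: Baiduspider
# Yandex
User-agent: Yandex
# One group may list several user agents; this one blocks nothing
Disallow:
```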
Validate WordPress Robots.txt
One of the best places to validate your file is the Robots.txt tester in Google Search Console.
WordPress Robots.txt Notes
1) WordPress Robots.txt will not remove a page/post from a search engine's index.
2) Never block xmlrpc.php and /wp-includes/ from search bots.
3) Allow search engines to access even low-quality pages.
Moral of the Story
WordPress Robots.txt allows or blocks search engine spiders' access to posts, pages and directories.