
A correct robots.txt and why it matters

Want to know how to close your site to search engine indexing when you don't need it indexed?

It turns out it's not that difficult. All it takes is a correct robots.txt file placed in the root folder of your website.

Now, let's take it step by step.

robots.txt is a text file that sets out recommendations for search engine robots. It is the first thing they look for the moment they "cross the threshold" of your website. If the file is missing, or present but empty, search bots take this as permission to "walk" through the entire site without any restrictions.

Conversely, if it spells out specific instructions prohibiting indexing, search robots will try to follow them.


How robots.txt works and how to set it up

A correct robots.txt consists of entries, each of which begins with a User-agent line. This line names the robot to which the instructions on the following line(s) apply.

If the instructions apply to all indexing robots, the asterisk symbol is used instead of the name.
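
A minimal sketch of such an entry, in standard robots.txt syntax:

    User-agent: *
    # the instructions for the robots named above follow on the next line(s)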

Next comes a line with the Disallow directive plus a few special characters, chosen according to the purpose of the instruction.

Closing the site from indexing? Nothing could be simpler!

In fact, the main job of robots.txt is to prohibit indexing. Prohibit what, exactly? That is for you to choose, and there are plenty of options:

  1. Completely prohibit site indexing. This means turning away a visiting robot at the door so it cannot crawl your web resource. It can be useful in the early stages of site development, when publication of content has already begun but has not yet been brought up to the right level. In that case, indexing of non-optimized pages is undesirable, so as not to "spoil" the site's reputation in advance; see the sketch below.
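     A minimal sketch of a complete ban, in standard robots.txt syntax:

     User-agent: *
     Disallow: /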

  2. Close a section / category from indexing. This is used on a fully operational web resource that already enjoys a certain standing in the eyes of search engines, when a new section or category is being prepared whose indexing is still undesirable; see the sketch below.
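     A sketch of closing one section (the directory name /new-section/ is a made-up placeholder):

     User-agent: *
     Disallow: /new-section/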

  3. Deny indexing of a page. This is handy when the site contains documents that are needed but should not be indexed or influence the overall rating of the web resource. For example, a "Privacy Policy" page consisting of non-unique text; see the sketch below.
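     A sketch of closing a single page (the file name /privacy-policy.html is a made-up placeholder):

     User-agent: *
     Disallow: /privacy-policy.html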

Configuring robots.txt: 10 important tips

  1. If indexing is prohibited in robots.txt, the ban acts on the principle of seniority: it applies to all files, pages, and directories subordinate to the specified element, as the sketch below shows.
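     For example, closing a directory also closes everything nested in it (the paths are made-up placeholders):

     User-agent: *
     Disallow: /blog/
     # this also blocks /blog/2017/ and /blog/2017/post.html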
     
  2. A correct robots.txt always contains at least one User-agent line; keep that in mind.
     
  3. robots.txt can also be set up so that a record for a single bot contains several instructions at once, as in the sketch below.
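     A sketch of one record with several instructions (Googlebot is a real bot name; the paths are made-up placeholders):

     User-agent: Googlebot
     Disallow: /drafts/
     Disallow: /tmp/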

  4. The “*” symbol in front of a word helps prohibit the indexing of all objects whose names contain that word, as in the sketch below.
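     A sketch with the wildcard (“slovo” is just an example string):

     User-agent: *
     Disallow: /*slovo
     # blocks every URL whose path contains "slovo" (for bots that support the * wildcard)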

  5. The “/” symbol should be used both at the beginning and at the end of a directory name. Otherwise you risk prohibiting the indexing of every page whose name contains “slovo”, as the sketch below shows.
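     A sketch of the difference (the paths are example placeholders):

     User-agent: *
     Disallow: /slovo/
     # with the trailing slash, only the /slovo/ directory is closed;
     # "Disallow: /slovo" would also close /slovo.html, /slovodom/ and so on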

  6. An empty Disallow directive gives the robot permission to index all pages of the web resource; see the sketch below.
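     A sketch of an "allow everything" record, in standard robots.txt syntax:

     User-agent: *
     Disallow: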

  7. It is advisable for a correct robots.txt to indicate where the site map is located. This significantly speeds up page indexing and removes the chance of the robot accidentally passing some pages by; see the sketch below.
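     A sketch with the Sitemap directive (the URL is a made-up placeholder):

     Sitemap: https://example.com/sitemap.xml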

  8. A correct robots.txt contains instructions written in lower case only.
     
  9. Each Disallow can point to only one file / section / page and must be written on a new line, as in the sketch below.
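     One object per Disallow, each on its own line (the paths are made-up placeholders):

     User-agent: *
     Disallow: /old-page.html
     Disallow: /archive/
     # wrong: "Disallow: /old-page.html /archive/" on a single line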
     
  10. You cannot write Disallow first and User-agent after it. Setting up robots.txt that way is a waste of time, because bots will not be able to understand such instructions; see the sketch below.
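      A sketch of the order that works (the path is a made-up placeholder):

      # wrong, the directive will be ignored:
      # Disallow: /private/
      # User-agent: *

      # right:
      User-agent: *
      Disallow: /private/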

And the most important rule: before you upload the finished robots.txt to the root of the website, make sure it is correct. Check it for errors several times. Better still, let someone else check it; a fresh pair of eyes spots typos and other problems in the file body more easily.

Only a correct robots.txt setup will prevent the indexing of precisely those elements of your site that you have decided, for now, to hide from the "prying eyes" of search engines.

Still have questions? Ask away! We are waiting for you in the comments!