- How to get rid of duplicate pages
- Where to start when removing duplicates
- Ways to solve the identified problems
- If it is not possible to remove duplicate pages…
How to get rid of duplicate pages
Once duplicate pages have been discovered on your web resource, you need to decide how to remove them. Even if there are only a few such repetitions, they will still hurt your site's rankings: search engines may penalize you by lowering its positions. It is therefore important to remove duplicate pages regardless of their number.
Where to start when removing duplicates
To begin with, identify the reason the duplicated content appeared. Most often it is one of the following:
- Errors in the structure of the web resource.
- The quirks of some modern site engines, which, when configured incorrectly, quite often generate copies of pages automatically and store them at different addresses.
- Incorrectly configured site search filters.
Ways to solve identified problems
After finding the cause of the duplication and eliminating it, you need to decide how to remove the duplicate pages that already exist. In most cases, one of these methods will do:
- Delete duplicate pages manually. This method suits small web resources of up to 100–150 pages, which you can sort through entirely on your own.
- Configure robots.txt. Suitable for hiding duplicate pages that have not yet been indexed. The Disallow directive forbids bots from crawling unnecessary pages. To tell the Yandex bot that it should not index pages whose URL contains “stranitsa”, you need to add:
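A minimal sketch of such a rule, using the “stranitsa” fragment mentioned above (the exact path pattern is an assumption):

```
User-agent: Yandex
Disallow: /*stranitsa
```

The `/*` prefix lets the rule match “stranitsa” anywhere in the path; Yandex supports the `*` wildcard in Disallow directives.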
- Use the “noindex” meta tag. This will not remove duplicate pages either, but it hides them from indexing, as in the previous method. It is written in the HTML code of the page that search engines should “forget” (in the head section), in this form:
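A sketch of what that tag looks like inside the page’s head section:

```html
<meta name="robots" content="noindex">
```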
There is one nuance: if the duplicate page already appears in search results, it will keep appearing there until it is re-indexed, and re-indexing may itself be blocked by the robots.txt file.
- Delete duplicate pages using an HTTP 410 response. A good option instead of the previous two methods. It notifies a visiting search robot that the page no longer exists and that there is no alternative document (strictly speaking, 410 “Gone” is a status code rather than a redirect). It is added to the server configuration file .htaccess as:
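A minimal sketch using Apache’s mod_alias Redirect directive, assuming the duplicate lives at the hypothetical path /stranitsa-kopiya/:

```apacheconf
Redirect 410 /stranitsa-kopiya/
```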
As a result, when you try to go to the address of the take page, you will see:
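With Apache’s default error page, the response looks roughly like this (the exact wording varies by server):

```
410 Gone
The requested resource is no longer available on this server.
```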
- Specify a canonical page for indexing. For this, the rel="canonical" attribute is used; it is added to the head section of the HTML code of the pages that are unnecessary copies.
This will not physically get rid of the duplicates; it only tells search engine bots which page is the canonical (baseline) one that needs indexing.
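A sketch of the tag as it would appear in a duplicate’s head section (the domain and path are hypothetical):

```html
<link rel="canonical" href="https://example.com/stranitsa/">
```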
- Merge the pages. For this, a 301 redirect is used. This option also will not remove the duplicate pages, but it lets you transfer up to 99% of the external and internal link weight to the desired page. Example:
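A sketch of such a redirect in .htaccess, with hypothetical source and target addresses:

```apacheconf
Redirect 301 /stranitsa-kopiya/ https://example.com/stranitsa/
```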
If it is not possible to remove duplicate pages...
…or you do not want to delete them, you can at least protect the pages linked to them through internal linking. To do this, use the rel="nofollow" attribute: if you add it to links, they will no longer pass weight.
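A sketch of a link carrying the attribute (the URL and anchor text are placeholders):

```html
<a href="/stranitsa-kopiya/" rel="nofollow">duplicate page</a>
```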
Now you know enough ways to remove duplicate pages. If you combine them skillfully, you can ensure that not a single instance of duplicated content remains. Only then can you count on maximum efficiency when promoting your site.
If you have any questions on this topic, do not forget to ask them in the comments!