I see a few common mistakes come up again and again. One wrong idea is that, to de-index a page, you should block the crawl paths to it. Makes sense, right? If you don't want a page indexed, why would you want it crawled? Unfortunately, this sounds logical and is completely wrong. Let's take a look at an example ...
Example: Product Reviews
Suppose we have a decent-sized e-commerce site with 1,000 unique product pages. These pages look like this:
Each product has its own URL, of course, and these URLs are structured as follows:
http://www.***.com/product/1
http://www.***.com/product/2
http://www.***.com/product/3
...
http://www.***.com/product/1000
Now let's say that each of these product pages links to a review page for that product:
These review pages also have their own unique URLs (tied to the product ID), like this:
http://www.***.com/review/1
http://www.***.com/review/2
http://www.***.com/review/3
...
http://www.***.com/review/1000
Unfortunately, we've just spun out 1,000 duplicate pages, since each review page is really just a form with no unique content. These review pages have no search value and just dilute our index. So we decide it's time to take action ...
The "Fix", Part 1
We want these pages gone, so we decide to use the NOINDEX (meta robots) tag. Because we really, really want the pages out of the index, we also decide to nofollow the review links. Our first attempt at a fix ends up looking like this:
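To make the two directives concrete, here is a minimal sketch in Python that inspects a page for a meta robots NOINDEX and for nofollowed links. The markup in `REVIEW_PAGE` and `PRODUCT_PAGE` is hypothetical, written only to mirror the setup described above; the class and function names are likewise illustrative.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collects the meta robots NOINDEX flag and any nofollowed link targets."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            content = attrs.get("content") or ""
            directives = {d.strip().lower() for d in content.split(",")}
            if "noindex" in directives:
                self.noindex = True
        elif tag == "a":
            # rel is a space-separated list of link types.
            rel = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel and attrs.get("href"):
                self.nofollow_links.append(attrs["href"])


def inspect(html):
    """Return (has_noindex, nofollowed_hrefs) for one HTML document."""
    parser = RobotsDirectiveParser()
    parser.feed(html)
    return parser.noindex, parser.nofollow_links


# Hypothetical markup for the first attempt at a fix.
REVIEW_PAGE = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body>Review form</body></html>"
)
PRODUCT_PAGE = (
    '<html><body><a href="/review/1" rel="nofollow">Write a review</a>'
    "</body></html>"
)

print(inspect(REVIEW_PAGE))   # (True, [])
print(inspect(PRODUCT_PAGE))  # (False, ['/review/1'])
```

Note that the NOINDEX lives on the review page itself, so a crawler has to fetch that page before it can see the directive.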
On the surface, it makes sense. Here's the problem, though: those red arrows are now cut paths, potentially blocking the spiders. If the spiders never go back to the review pages, they'll never read the NOINDEX, and they won't de-index the pages. In the best case, it will take a lot longer (and de-indexation already takes too long on large sites).
The Fix, Part 2
Instead, let's leave the path open (let the links be followed). This way, crawlers will continue to visit the pages, and the duplicate review URLs should gradually disappear:
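The difference between the two approaches can be sketched with a toy crawl simulation. This is a simplified model, not how any real search engine works: it assumes a hypothetical home page `/` linking to every product, a crawler that discovers pages only by following links, and a crawler that never follows a nofollowed link. Real crawlers can also reach URLs by other routes (for example, URLs they already know about), which is why the blocked case is "much slower" in practice rather than strictly "never".

```python
from collections import deque


def build_site(num_products, nofollow_review_links):
    """Return {url: (noindex, [(target, nofollow), ...])} for a toy site."""
    product_links = [(f"/product/{i}", False) for i in range(1, num_products + 1)]
    site = {"/": (False, product_links)}  # hypothetical home page
    for i in range(1, num_products + 1):
        site[f"/product/{i}"] = (False, [(f"/review/{i}", nofollow_review_links)])
        site[f"/review/{i}"] = (True, [])  # every review page carries NOINDEX
    return site


def crawl(site, start="/"):
    """Breadth-first crawl that skips nofollowed links.

    Returns the set of URLs on which the crawler actually read a NOINDEX.
    """
    seen = {start}
    queue = deque([start])
    noindex_read = set()
    while queue:
        url = queue.popleft()
        noindex, links = site[url]
        if noindex:
            noindex_read.add(url)
        for target, nofollow in links:
            if not nofollow and target not in seen:
                seen.add(target)
                queue.append(target)
    return noindex_read


blocked = crawl(build_site(1000, nofollow_review_links=True))
open_paths = crawl(build_site(1000, nofollow_review_links=False))
print(len(blocked), len(open_paths))  # 0 1000
```

With the review links nofollowed, the simulated crawler never reaches a review page and so never reads a single NOINDEX; with the paths left open, it reads all 1,000.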
Stick with it, and keep in mind that the process still takes some time (weeks, in most cases). Monitor your index (with the "site:" operator) daily. What you're looking for is a gradual decrease over time. If that's happening, you're in good shape. Pro tip: don't take any single day's "site:" count too seriously, since it can be unreliable from time to time. Look at the trend over time.
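One way to look at the trend rather than at any single day is to fit a straight line through the daily counts and check its slope. The sketch below assumes you have recorded the "site:" counts by hand; the numbers in `daily_counts` are made up for illustration, and the least-squares slope is just one reasonable choice of trend measure.

```python
def trend_slope(counts):
    """Least-squares slope of the counts against day number (pages per day)."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two daily counts")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


# Made-up daily "site:" counts, including a noisy upward blip on the fourth day.
daily_counts = [2000, 1980, 1950, 2010, 1900, 1870, 1820]
slope = trend_slope(daily_counts)
print("decreasing" if slope < 0 else "not decreasing")
```

A single noisy reading barely moves the fitted slope, so the overall direction stays visible even when one day's count jumps.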