Analyze how to improve Baidu's website indexing

Source: Internet
Author: User

Has the page been crawled?

Does the page quality pass?
In the previous article, we already mentioned the indexing-rate metric. Many websites are reluctant to track it: "I can't get that data for my site!" In fact, without this metric there is no way to even start. You identify problems from the data, let the data guide the solution, and then verify the results against the data. I recently read a primer on data analysis and found it quite good; it describes the data-analysis method vividly, and I recommend a copy to anyone interested in the subject. Any data analysis consists of four steps: Objective -> Analysis -> Evaluation -> Decision-making.
Objective: see how well the website is indexed and whether there is room for SEO improvement.
Analysis: How do we tell whether indexing is good or bad? Can it be measured with a metric? Is looking at the whole site too general? Should the site be broken down into different page types?
Evaluation: we need the following data.
The page-tier structure of the website
 
SEO traffic contributed by pages at each tier
Indexing rate of pages at each tier
 
The SEO traffic proportion can be filtered out of Google Analytics.
The number of pages can be pulled from the database, or counted by scraping with a collector tool or a home-made script.
The indexing rate can be checked by querying each page in the search engine, again by hand or with the collector.
A quick plug for a free indexing-query tool here: http://www.gnbase.com/forum.php?mod=viewthread&tid=11468&highlight=%CA%D5%C2%BC%B2%E9%D1%AF
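Once the three data series are collected, the comparison is simple arithmetic. A minimal sketch, where every tier name and count is a hypothetical placeholder rather than the article's real data:

```python
# Combine page counts, indexed counts, and SEO traffic share per page tier
# to spot where indexing effort would pay off. All numbers are invented.

tiers = {
    # tier: (total_pages, indexed_pages, seo_traffic_share)
    "home":      (1,      1,     0.05),
    "channel":   (20,     18,    0.10),
    "directory": (5_000,  1_500, 0.50),   # high traffic, low indexing rate
    "product":   (80_000, 8_000, 0.35),
}

for tier, (total, indexed, traffic) in tiers.items():
    rate = indexed / total
    print(f"{tier:<10} indexing rate {rate:6.1%}  traffic share {traffic:.0%}")
```

A table like this makes the mismatch jump out: the tier with half the SEO traffic has a poor indexing rate, so that is where the work should go.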
The problem immediately becomes apparent!
The level-1 and level-2 directory pages bring a lot of traffic, yet their indexing rate is poor. Here is the breakthrough for improving traffic through indexing!
 
There is a huge number of product pages, and their indexing is far from ideal, but the traffic they generate is limited. Beyond indexing, they also have page-content problems, which this article leaves aside.
Decision-making: our conclusion is to immediately launch an effort to optimize the indexing of the directory pages.
Note how the initial goal, "improve traffic by optimizing indexing",
has evolved into a new goal: "how to increase the number of indexed directory pages".
Can we apply the data-analysis method to SEO again?
The answer is yes!
Let's repeat the Objective -> Analysis -> Evaluation -> Decision-making process.
Objective: to increase the indexing volume of directory pages
Analysis: based on the two factors raised at the start of this article, we need to check whether the pages have been crawled and whether their quality passes.
1. To determine the crawler situation we need to analyze the logs, so we split a series of data out of the logs to see whether the pages were actually crawled.
2. Because page quality is hard to measure directly, for pages sharing the same template we can use the ratio:
Number of crawled pages that were indexed / Number of crawled pages
to evaluate how the template's page quality affects indexing. If every crawled page ends up indexed, it at least suggests the search engine accepts the content of this template. (Reality is more complex: a page may be indexed and later deleted for quality reasons, but having some reference is better than none!)
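The template-quality ratio above can be sketched in a few lines; the URLs here are hypothetical examples:

```python
# Of the pages the crawler actually fetched, how many ended up indexed?
# A ratio near 1.0 suggests the template's content is acceptable.

def crawl_to_index_rate(crawled_urls, indexed_urls):
    """Share of crawled pages that were also indexed."""
    crawled = set(crawled_urls)
    if not crawled:
        return 0.0
    return len(crawled & set(indexed_urls)) / len(crawled)

crawled = ["/dir/1", "/dir/2", "/dir/3", "/dir/4"]
indexed = ["/dir/1", "/dir/2", "/dir/3"]
print(crawl_to_index_rate(crawled, indexed))  # 0.75
```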
Evaluation: (sensitive information is replaced by numbers, all of which are real data)
Let's take a look at the crawler logs, analyzed with shell scripts.
The total number of crawler fetches of directory pages is about 13,000.
The number of distinct directory pages crawled is about 5,500.
Almost 100% of the directories under Channel A were crawled at least once; Channel B is also decent, with 70% crawled at least once.
Directories in the other channels have crawl coverage below 30%.
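The article does this log splitting with shell scripts; here is an equivalent sketch in Python. The log format, the channel URL scheme, and the sample lines are all assumptions for illustration:

```python
# Pull Baiduspider hits on directory URLs out of an access log and count
# total vs. distinct fetches per channel. URL scheme is hypothetical.
import re
from collections import Counter

DIR_URL = re.compile(r'"GET (/(channel-[a-z]+)/dir/\S+) HTTP')

def crawl_stats(log_lines):
    total = Counter()   # all spider fetches per channel
    unique = {}         # distinct directory URLs per channel
    for line in log_lines:
        if "Baiduspider" not in line:
            continue
        m = DIR_URL.search(line)
        if not m:
            continue
        url, channel = m.groups()
        total[channel] += 1
        unique.setdefault(channel, set()).add(url)
    return total, {c: len(s) for c, s in unique.items()}

sample = [
    '1.2.3.4 - - [10/Oct/2012] "GET /channel-a/dir/1 HTTP/1.1" 200 ... Baiduspider',
    '1.2.3.4 - - [10/Oct/2012] "GET /channel-a/dir/1 HTTP/1.1" 200 ... Baiduspider',
    '1.2.3.4 - - [10/Oct/2012] "GET /channel-b/dir/9 HTTP/1.1" 200 ... Baiduspider',
]
print(crawl_stats(sample))
```

Comparing the distinct-URL counts against the total number of directory pages per channel gives exactly the coverage figures quoted above.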
 
Don't be surprised by this result; many websites face equally bad problems. As long as you keep segmenting, segmenting, and segmenting the data, you will always find clues.
For log analysis, don't rely on any off-the-shelf log-analysis software; it is for the lazy. Home-made scripts plus Excel are king: you can slice and display any data you want. You can even skip Excel.
Then we counted the indexing rate of the directory pages in the two most-crawled channels, A and B.
 
Channels A and B are reassuring: page quality is not the problem. But the indexing of the remaining channels is worrying.
Decision-making: from the data evaluation above, we draw the following conclusions.
Page quality is not what is hurting indexing.
Channel A and Channel B's crawl volume is unusually high. On investigation, we found that the home page displays all the directory pages under Channel A, and the home page carries the highest weight on the whole site; Channel B has stronger external-link resources than the other channels, so its weight is also high.
For the channels other than A and B, crawling is not optimistic: they have too few crawl entry points, which hurts indexing.
Clearly, Channel A hoards too much of the site's weight. We need to "rob the rich to feed the poor": reduce the crawl volume spent on Channel A and redirect it to the other channels, while also providing crawlers with more entry points into the channel pages.
Now that the problem is clearer, the work splits into two parts: 1. provide more entry points; 2. spread resources evenly across channels rather than concentrating them on a few.
Work on providing entry points:
1. Build a sitemap from the directory-page URLs, submit it to the search engine, and set a relatively high crawl priority.
2. Improve the breadcrumb navigation and break it into finer levels to provide more entry points.
3. Recommend directory pages from other product pages.
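Point 1 above can be sketched as follows, assuming a hypothetical URL scheme. Note that `<priority>` in the sitemaps.org protocol is only a hint to crawlers, not a guaranteed crawl weight:

```python
# Emit an XML sitemap for the directory pages so they can be submitted
# to the search engine. URLs and the priority value are placeholders.
from xml.sax.saxutils import escape

def build_sitemap(urls, priority="0.8"):
    rows = "\n".join(
        f"  <url><loc>{escape(u)}</loc><priority>{priority}</priority></url>"
        for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{rows}\n</urlset>"
    )

xml = build_sitemap(["http://example.com/dir/1", "http://example.com/dir/2"])
print(xml)
```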
Resource redistribution (background: any page can become a crawler entry point; Baidu's crawler has a limited crawl depth, so the shallower a page sits relative to the entry point, the higher its probability of being crawled):
1. The home page originally pointed to Channel A's directory and product pages; nofollow all of those links, so that crawlers entering through the home page crawl the channel pages first and reach directory pages through them (in practice this one matters less).
2. Each channel page originally pointed to its own product pages; nofollow all of those links (so crawlers entering via a channel page crawl the directory pages to the maximum extent).
3. nofollow all the links from directory pages back to the home page.
4. Reduce irrelevant links on some pages. (In some circumstances this is very effective.)
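As an illustration of the nofollow rules above, here is a naive sketch that stamps rel="nofollow" onto links pointing at Channel A's pages. The path prefixes are hypothetical, and a real site would set this in its templates rather than rewriting HTML with regexes:

```python
# Mark links into Channel A with rel="nofollow" so crawl "weight" flows
# to the under-crawled channels instead. Illustration only; a regex is
# too fragile for production HTML.
import re

NOFOLLOW_PREFIXES = ("/channel-a/dir/", "/channel-a/product/")

def add_nofollow(html):
    def repl(m):
        href = m.group(1)
        if href.startswith(NOFOLLOW_PREFIXES):
            return f'<a href="{href}" rel="nofollow">'
        return m.group(0)
    return re.sub(r'<a href="([^"]+)">', repl, html)

page = '<a href="/channel-a/dir/1">dir</a> <a href="/channel-b/dir/2">dir</a>'
print(add_nofollow(page))
```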
Now the work can start.
Results
What came of all this? Let's look at the data one month after the changes.
 
The indexing rate of directory pages increased by 100%!
The indexing rate of product pages also improved somewhat, thanks to the directory pages now exposing products well.
SEO performance of directory pages:
The share of SEO traffic increased by 15%.
The number of keywords bringing visits increased by 10% (including new pages).
SEO traffic increased by more than 50% (including seasonal factors).
Note:
1. Besides indexing, ranking is also an issue; keep an eye on it.
2. In special cases you could even block Channel A entirely, though that takes a little technical effort.
3. Baidu's support for nofollow is said to be erratic; if you know someone at Baidu, please ask.

