Site Traffic Anomaly Tracking Documentation



Part One: Operations and Maintenance

1. CDN

2. Robots

3. UA/IP blocking

4. Security problems / poor management

A. Wildcard domain resolution (pan-resolution)

B. The website was hacked

C. Pages injected with hidden malicious code

D. UGC site flooded by spammers


Part Two: Feature Page Analysis

1. Analysis flowchart

2. Finding feature pages

3. Analyzing abnormal characteristics

A. No rankings, no traffic

B. Partial loss of rankings, large traffic loss


Part Three: Related Factors: Backlink Explosion

1. Homepage, channel pages, and other important pages receive malicious votes

2. User-created pages, spam personal pages, and content pages receive malicious votes

3. The stitched search results page vulnerability

4. How to prevent the creation of stitched search results pages

What Is a Traffic Anomaly?


If traffic from Baidu Search suddenly drops by more than 50% and the drop is persistent, that is, traffic shows no significant recovery for 4 to 5 consecutive days, we treat it as a traffic anomaly.
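To make that rule concrete, here is a minimal Python sketch of the threshold just described, assuming you can export daily Baidu-search visit counts from your analytics; the function and variable names are illustrative and not part of any Baidu tool.

    # Minimal sketch of the anomaly rule above: traffic from Baidu Search
    # drops by more than 50% versus a baseline and stays down for several
    # consecutive days. Data source and names are illustrative assumptions.

    def is_traffic_anomaly(daily_visits, baseline, drop_ratio=0.5, min_days=4):
        """daily_visits: most recent daily visit counts, oldest first.
        baseline: average daily visits before the drop."""
        threshold = baseline * (1 - drop_ratio)
        recent = daily_visits[-min_days:]
        # Anomaly: every one of the last `min_days` days is below threshold.
        return len(recent) >= min_days and all(v < threshold for v in recent)

    # Example: baseline 10,000 visits/day, then a sustained drop below 5,000.
    print(is_traffic_anomaly([9800, 4100, 3900, 4300, 4000], baseline=10000))  # True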


If the above happens, we recommend diagnosing the cause from three angles: operations and maintenance, page characteristics, and related factors.

Part One: Operations and Maintenance Causes of Traffic Anomalies


1. CDN

Some websites use CDN acceleration, and in recent weeks the Webmaster Platform has received one or two CDN-related cases almost every week. The problem is this: CDN providers differ in how many nodes they deploy and where, and when a site uses a CDN, the IP address the CDN returns to the spider in a given region should be the same as the one it returns to users. In reality, some CDN providers change IP addresses for cost and resource-utilization reasons without notifying their customers. The spider then runs into a new-IP-versus-old-IP problem when crawling: because the site never announced the IP change, the spider assumes the site itself is having trouble, first reduces the crawl volume, and then decides whether to drop the pages it can no longer reach. At this point the IP users reach and the IP the spider crawls are inconsistent. We recommend choosing a stable CDN provider and, whenever an IP changes, running a fetch test with the Webmaster Platform's Crawl Diagnostics tool; if the IP it reports is wrong, click the "Error" prompt in the red box to notify the Webmaster Platform.
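If you want to spot-check the user-facing side of an IP change yourself, the following Python sketch lists what a domain currently resolves to and compares it against the IP list your CDN provider reported. It queries only the local resolver and is an illustration under those assumptions, not a substitute for the Crawl Diagnostics tool; the domain and IPs are placeholders.

    # Spot-check what IPs your domain currently resolves to, so you can
    # compare them against the IPs your CDN provider reported and against
    # what the crawl-diagnosis tool fetches.
    import socket

    def resolve_ips(hostname):
        infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
        return sorted({info[4][0] for info in infos})

    known_cdn_ips = {"1.2.3.4", "5.6.7.8"}   # IPs your CDN says it uses (example values)
    current = resolve_ips("www.example.com")  # placeholder domain
    unexpected = [ip for ip in current if ip not in known_cdn_ips]
    if unexpected:
        print("Resolver returned IPs not on the CDN's list:", unexpected)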

2. Robots

Webmasters are no strangers to robots, so why mention it? Consider this situation: a site updates its robots file to disallow or open up crawling of part of the site, but never checks whether the change took effect, and the webmaster simply assumes it did. There is also a propagation delay to consider. The site should therefore run a fetch test with the robots detection tool on the Baidu Webmaster Platform to verify that the change is live. If the robots content the tool shows is inconsistent with your changes, there are two likely possibilities: first, the file may not be fully deployed, since some sites run servers in many locations and you need to confirm the robots file was pushed to all of them; second, Baidu may not have updated its copy yet, in which case you can click the Update button below the detection tool to tell Baidu the robots file has changed.
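Before and after using Baidu's robots detection tool, you can also sanity-check the live file locally. A minimal sketch with Python's standard urllib.robotparser, assuming placeholder URLs and paths:

    # Check that a robots.txt change behaves the way you intended.
    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://www.example.com/robots.txt")
    rp.read()  # fetches and parses the live file

    # Does the live file actually allow/deny what you expect for Baiduspider?
    print(rp.can_fetch("Baiduspider", "https://www.example.com/public/page.html"))
    print(rp.can_fetch("Baiduspider", "https://www.example.com/private/secret.html"))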

3. UA/IP blocking

UA blocking is generally not a deliberate operation by the site; it is usually an accident. The site filters on User-Agent somewhere in its code and, without realizing it, bans the spider. This low-probability problem is typically found only through step-by-step investigation, which is why we put it in the operations chapter: pay attention to these details. You should only configure such a block when you do not want Baiduspider to access your site; if you do want Baiduspider to visit, check the User-Agent settings that relate to Baiduspider and fix them promptly. To forbid all crawling by Baidu, the robots rule is:

    User-agent: Baiduspider
    Disallow: /
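One way to notice an accidental UA block early is to scan your access logs for Baiduspider requests that were answered with 403. A small sketch, assuming a combined-format access log whose path is a placeholder:

    # Spot accidental UA blocking: count requests whose User-Agent contains
    # "Baiduspider" that were answered with 403. Log path and format
    # (combined log) are assumptions.
    import re

    LOG_LINE = re.compile(r'" (\d{3}) \d+ "[^"]*" "([^"]*)"$')

    def blocked_spider_hits(log_path):
        count = 0
        with open(log_path) as fh:
            for line in fh:
                m = LOG_LINE.search(line)
                if m and m.group(1) == "403" and "Baiduspider" in m.group(2):
                    count += 1
        return count

    print(blocked_spider_hits("access.log"))  # many 403s here suggest a UA filter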
IP blocking usually happens during CC attacks, when the site cannot tell real Baiduspider from counterfeit Baiduspider and bans the spider's IPs along with the attackers'. We recommend using reverse DNS lookup to let the site identify the real spider. The verification method differs by platform; the methods for Linux, Windows, and Mac OS are as follows. A. On Linux, use the host command to reverse-resolve the IP and determine whether the fetch came from Baiduspider: host xxx.xxx.xxx.xxx. Baiduspider hostnames are named in the *.baidu.com or *.baidu.jp format; anything that is not *.baidu.com or *.baidu.jp is an impostor.

B. On Windows or IBM OS/2, use the nslookup command to reverse-resolve the IP. Open a command prompt and enter nslookup xxx.xxx.xxx.xxx (the IP address). The same rule applies: hostnames in the *.baidu.com or *.baidu.jp format are genuine Baiduspider; anything else is an impostor.
C. On Mac OS, use the dig command to reverse-resolve the IP. Open a terminal and enter dig -x xxx.xxx.xxx.xxx (the IP address), again checking for a hostname in the *.baidu.com or *.baidu.jp format; anything else is an impostor. For more information, see: http://zhanzhang.baidu.com/college/articleinfo?id=34
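For checking many IPs in bulk, the commands above can be reproduced in a few lines of Python. This sketch reverse-resolves an IP and accepts it only for a *.baidu.com or *.baidu.jp hostname; the forward-confirmation step is a common hardening practice added here as an assumption, and the sample IP is only an example value.

    # Programmatic version of the host/nslookup/dig -x checks above.
    import socket

    def is_real_baiduspider(ip):
        try:
            hostname, _, _ = socket.gethostbyaddr(ip)  # reverse (PTR) lookup
        except socket.herror:
            return False
        if not (hostname.endswith(".baidu.com") or hostname.endswith(".baidu.jp")):
            return False
        # Forward-confirm: the claimed hostname should resolve back to the IP.
        try:
            _, _, addresses = socket.gethostbyname_ex(hostname)
        except socket.gaierror:
            return False
        return ip in addresses

    print(is_real_baiduspider("123.125.71.12"))  # example IP, not authoritative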
4. Security problems / poor management

Most of the security cases seen on the Webmaster Platform come down to management problems: sites get hacked through vulnerabilities and are then penalized, and the severity of the penalty tends to grow with how long the vulnerability has existed. A. Wildcard domain resolution (pan-resolution). In recent months wildcard domain resolution has been a classic case. Many sites have weak security awareness and simple passwords; hackers exploit these holes and resolve large numbers of low-quality pages unrelated to the site under its domain. As a result, Baidu Search takes temporary measures against the entire site, and its traffic drops sharply or even to zero.
B. The website was hacked. This is similar to wildcard resolution: hackers create large numbers of garbage pages on the site, the site is penalized, and the blow can be fatal.
C. Pages injected with hidden malicious code. This is also a form of hacking, but more covert than openly publishing garbage pages. Such cases mainly involve corporate websites or relatively small sites. Hackers place ad code directly on the pages and decide whether each visitor is an ordinary user or a spider, then serve different content accordingly; or they serve the injected page only to users in certain regions while showing a normal page to spiders and everyone else. This behavior is very covert and hard to discover without user reports, but Baidu Search cannot tolerate such pages appearing in search results, so these sites are naturally penalized.
D. UGC site flooded by spammers. Finally, UGC sites: any site whose content is contributed by users must strengthen its review mechanisms. The platform receives many cases in which weak moderation on a UGC site allowed a large amount of garbage content to go live. Once the ratio of garbage content to normal content reaches a certain threshold, the whole site may be penalized by Baidu Search.
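A crude way to watch for this is to track the ratio of content removed as spam to total new content over time. A tiny sketch; the 10% alert line and the sample numbers are assumptions, not a published Baidu threshold:

    # Alert when moderation is falling behind: spam share of new content.
    def spam_ratio_alert(spam_count, total_count, alert_ratio=0.10):
        if total_count == 0:
            return False
        return spam_count / total_count >= alert_ratio

    print(spam_ratio_alert(spam_count=1200, total_count=9000))  # True: ~13% spam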

Part Two: Feature Page Analysis of Traffic Anomalies


1. Analysis flowchart

The flowchart follows three steps to let the site confirm whether a drop is normal or abnormal; the first step is for the webmaster to find the feature pages.
2. Finding feature pages

First, what is a feature page? It is a page class that carries a large share of traffic: pages with the same frame structure whose differing content ranks for different keywords. For example, the keywords "Beijing tourism", "Shanghai tourism", and "Tianjin tourism" all correspond to class-A pages; if such keywords disappear from the rankings, class-A pages get no traffic, so class-A pages are what we call the feature pages, and we look at what has changed on them. First, compare past traffic against the present and measure the gap over a period of time. Second, recall recent changes to this class of pages and whether those changes involve any of the operations problems above; then observe for a few days and establish the range of the traffic loss.
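The comparison step can be as simple as averaging daily traffic for one page class over a past window and the current window. A minimal sketch, with made-up data shapes and numbers:

    # Compare a page class's traffic before and after the drop.
    def traffic_loss(past_window, current_window):
        """Both args: lists of daily visit counts for the same page class."""
        past_avg = sum(past_window) / len(past_window)
        cur_avg = sum(current_window) / len(current_window)
        return (past_avg - cur_avg) / past_avg  # fraction of traffic lost

    # Example: a "city tourism" page class dropping from ~6,000 to ~1,500/day.
    loss = traffic_loss([6100, 5900, 6000, 6200], [1600, 1400, 1500])
    print(f"traffic loss: {loss:.0%}")  # ~75%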
3. Analyzing abnormal characteristics

From the range of the traffic loss, two broad cases can be distinguished. A. No rankings, no traffic. The keywords mentioned above no longer rank and the site gets no traffic from them: the site has most likely been penalized, either partially or as a whole. For specific causes, refer to the previously announced Pomegranate algorithm and Green 2.0 algorithm; of course many algorithms were never published, so you can also consult the "Baidu Web Search Quality White Paper" and check whether your site has the problems it describes.
B. Partial loss of rankings, large traffic loss. For example, feature page A ranks for multiple keywords; some of those keywords no longer surface page A while others still do. That basically shows the page is not within the scope of a penalty, and the cause may be an adjustment in Baidu's algorithm. If, however, the page is hard to find under all of its keywords, it has very likely been penalized.

Part Three: Related Factors Behind Traffic Anomalies: Backlink Explosion



Among the many cases on the Webmaster Platform, a large share involve abnormal backlinks (external links) together with a heavy traffic impact. First, here is where the backlink tool's data comes from and how it works: the tool reports, over a period of time, the URLs linking to your site, the pages they link to, the anchor text, and related content. Webmasters can use this data to judge which backlinks are expected votes and which are not. So if the backlink count explodes and most of the growth does not match your expectations, analyze and solve the problem starting from the following three situations.
1. Homepage, channel pages, and other important pages receive malicious votes

A. This kind of incident mainly shows up as large numbers of votes from inexplicable sites among the linking URLs. If you encounter this, take it seriously: it is very likely malicious voting whose purpose is to lower the site's evaluation in Baidu Search by having a large number of sites vote for it. B. The only countermeasure is for the site to reject backlinks more aggressively, blocking out the meaningless voting links.
2. User-created pages, spam personal pages, and content pages receive malicious votes

A. UGC sites should pay particular attention here. As discussed earlier, review and cleanup must be strengthened to keep users from creating spam pages and spam personal pages. Bad actors are good at ranking and will vote for these garbage pages. To be clear: if the site reviews content promptly, this does not happen; it only occurs when content is left unattended for a long time. B. The countermeasure is to strengthen review, close the garbage pages, and reject the votes these spam domains and sites cast.
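For both cases above, triage usually starts from the backlink tool's export. Below is a sketch that groups linking URLs by domain and surfaces the heaviest voters as candidates for rejection; the CSV layout and column name are assumptions to adapt to the real export:

    # Triage a backlink export: group by linking domain, surface domains
    # with abnormally many votes as candidates for the rejection tool.
    import csv
    from collections import Counter
    from urllib.parse import urlparse

    def disavow_candidates(csv_path, min_links=100):
        domains = Counter()
        with open(csv_path, newline="") as fh:
            for row in csv.DictReader(fh):  # assumed column: "linking_url"
                domains[urlparse(row["linking_url"]).netloc] += 1
        return [(d, n) for d, n in domains.most_common() if n >= min_links]

    for domain, n in disavow_candidates("backlinks.csv"):
        print(domain, n)  # review these before rejecting them in the tool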
3. The stitched search results page vulnerability

A stitched search results page is created dynamically by varying the parameters in the address bar. The titles of these pages often contain large amounts of spam wording; the pages are submitted to search engines, and to make them rank, bad actors cast excessive backlink votes for them.

This works because bad actors exploit the fact that the stitched search results page displays whatever spam content the stitched parameters inject into its title and description. In theory, Baidu Search refuses to index such pages, but with such a huge number of sites some slip through. To solve this kind of problem, you can report the following to the Baidu Webmaster Platform Feedback Center: 1) examples of the link pages that have already been indexed; 2) keywords that can be searched to find such link pages; 3) the number of such pages and the number of backlinks involved (from the data in the backlink analysis tool).
4. How to prevent the creation of stitched search results pages

As stated above, stitched search results pages exist because bad actors exploit page titles and descriptions they can control, creating large numbers of pages to carry out voting behavior. If you prevent these invalid parameters from reaching the title and description of the stitched search results page, the loophole naturally disappears. Take Ctrip as an example: the title and description on its stitched pages are fixed content, and no matter how the page parameters change, they do not change.
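A minimal sketch of that fix: build the title and description only from whitelisted, known values, never from raw query parameters. All names here are illustrative:

    # Generate page <title> and meta description from known data only,
    # never from raw query parameters, so injected spam strings can never
    # reach title/description. Framework-agnostic sketch.
    VALID_CITIES = {"beijing": "Beijing", "shanghai": "Shanghai"}  # known values only

    def page_meta(params):
        city = VALID_CITIES.get(params.get("city", "").lower())
        if city is None:
            # Unknown parameter value: fall back to fixed site-wide text.
            return "Travel Search - Example Site", "Find trips on Example Site."
        return f"{city} Travel - Example Site", f"Plan your {city} trip on Example Site."

    print(page_meta({"city": "beijing"}))
    print(page_meta({"city": "cheap-viagra-spam"}))  # gets the fixed fallback text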

