1) The simple, crude approach: ignore the browser's User-Agent, cookies, and similar information, and add one to the counter every time a page view (PV) is generated. Advantage: simple. Disadvantage: the number may not reflect real traffic, and it can be inflated by scripted or repeated requests.
2) Slightly more detailed statistics, which distinguish new users from returning ones. OP, you can study the Baidu Statistics SDK, which collects the user's browser information, operating system information, and geographic information. In other words, JavaScript in the browser exchanges data with the server, so the backend can obtain all of this. A site like Webmaster Home probably wants its statistics to reflect real user visits so that it can do some behavioral analysis, and for that it combines the user's IP, cookie information (that is, the session), and User-Agent. Note that the IP here is a mapped address: with ordinary home Internet access, the operator hands out a private intranet address in order to conserve IPv4 addresses, so the server sees the operator's mapped public address, which several users may share. That is why the User-Agent, IP, and cookie taken together, rather than the IP alone, can basically identify a user uniquely.
3) Going further, once you have this data, think about it from a design point of view. Showing the read count is not the highest priority on the page (the highest priority is the business content itself), but the count is still meaningful. So the question becomes: at the database design level, does writing this number really need locks and mutual exclusion? I recommend reading up on the CAP theorem.
4) So the solution may involve caching, IP checks, or cookie detection; you would have to experiment to know for sure. Personally, I think the most likely answer is that Autohome counts reads asynchronously: you generate a real read, and the read counter is incremented by 1 only after the backend has processed it.
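To make points 2) and 4) concrete, here is a minimal sketch of asynchronous counting. It is only an illustration under assumptions, not how any particular site does it: the request handler just enqueues the visit, and a background worker deduplicates by a fingerprint of IP, User-Agent, and cookie before incrementing. The names `fingerprint`, `AsyncReadCounter`, `track`, and `wait_until_processed` are made up for this example, and a real system would use a message queue and shared storage instead of an in-process thread.

```python
import hashlib
import queue
import threading
from collections import Counter


def fingerprint(ip: str, user_agent: str, cookie_id: str) -> str:
    """Combine IP, User-Agent and cookie into one visitor identifier."""
    raw = "\n".join((ip, user_agent, cookie_id))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class AsyncReadCounter:
    """Hypothetical counter: requests only enqueue, a worker does the counting."""

    def __init__(self) -> None:
        self._events: queue.Queue = queue.Queue()
        self._seen: set = set()      # (page_id, fingerprint) pairs already counted
        self.reads: Counter = Counter()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def track(self, page_id: str, ip: str, user_agent: str, cookie_id: str) -> None:
        """Called on the request path; returns immediately without counting."""
        self._events.put((page_id, ip, user_agent, cookie_id))

    def _run(self) -> None:
        # Background processing: the counter is incremented only here.
        while True:
            page_id, ip, user_agent, cookie_id = self._events.get()
            key = (page_id, fingerprint(ip, user_agent, cookie_id))
            if key not in self._seen:
                self._seen.add(key)
                self.reads[page_id] += 1
            self._events.task_done()

    def wait_until_processed(self) -> None:
        """Block until every queued visit has been handled (useful in tests)."""
        self._events.join()
```

Because the page itself never waits for the counter, the displayed number can lag slightly behind the real one, which is acceptable here since the read count is not the highest-priority content.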
Here are some ways to implement this idea:
A mechanism such as "at most two reads counted per IP", possibly with deeper logic behind it. For example, the IP records are cleared the next day, so the rule becomes: each IP can add at most two reads per day.
Within a fixed period of time (say 30 minutes), no matter how many times you visit with the same browser, the read count increases only once.
Verify the User-Agent, cookies, and other information, and insert a visitor record into a table on each view.
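The three rules above can be sketched together as follows. This is a rough in-memory illustration under assumptions, not a production design: the class `ReadCounter`, its `record` method, and the constants are invented for the example, the "table" is a plain list, and timestamps are passed in as seconds so the behavior is easy to follow. Keying the per-IP count by day number has the same effect as clearing the IP records each day.

```python
from dataclasses import dataclass, field

MAX_READS_PER_IP_PER_DAY = 2      # "at most two reads per IP per day"
WINDOW_SECONDS = 30 * 60          # "only once per 30 minutes"
SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class ReadCounter:
    """Hypothetical read counter applying the daily IP cap and the time window."""

    reads: int = 0
    ip_daily: dict = field(default_factory=dict)      # (ip, day number) -> counted reads
    last_counted: dict = field(default_factory=dict)  # visitor key -> time of last counted read
    visit_log: list = field(default_factory=list)     # stand-in for the visitor table

    def record(self, ip: str, user_agent: str, cookie_id: str, now: float) -> bool:
        """Log the visit; return True only if it increased the read count."""
        # Every view is logged, whether or not it ends up being counted.
        self.visit_log.append((now, ip, user_agent, cookie_id))

        visitor = (ip, user_agent, cookie_id)
        last = self.last_counted.get(visitor)
        if last is not None and now - last < WINDOW_SECONDS:
            return False  # same browser inside the 30-minute window

        day = int(now // SECONDS_PER_DAY)
        if self.ip_daily.get((ip, day), 0) >= MAX_READS_PER_IP_PER_DAY:
            return False  # this IP has used up today's quota

        self.ip_daily[(ip, day)] = self.ip_daily.get((ip, day), 0) + 1
        self.last_counted[visitor] = now
        self.reads += 1
        return True
```

For example, a visitor who reloads the page after one minute is logged but not counted, while the same visitor coming back the next day is counted again.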
Weibo implementation: I work on microblogging, so I'll describe what Weibo does. Read counts, likes, and per-user access limits are all implemented with Redis. The data is then synchronized to the database (according to certain rules, in batches, and so on) during off-peak hours every night.
If the user is logged in, the read is counted only once. If the user is a guest, the decision is based on a combination of IP, timestamp, cookies, and so on, and the read is likewise counted only once.
This prevents view counts from being artificially inflated.
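A minimal sketch of that flow, under stated assumptions: an in-memory set and counter stand in for Redis (with a real Redis, a set add such as `SADD` and a counter increment such as `INCR` would play these roles), and an SQLite table stands in for the database. The names `BufferedReadCounter`, `record_view`, `flush`, and `post_reads` are hypothetical, and treating "timestamp" as a per-day bucket for guests is my own guess at the rule, not Weibo's actual logic.

```python
import sqlite3
from collections import Counter

SECONDS_PER_DAY = 24 * 60 * 60


class BufferedReadCounter:
    """Hypothetical buffer: count in memory, write to the database in batches."""

    def __init__(self, db: sqlite3.Connection) -> None:
        self.db = db
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS post_reads ("
            "post_id TEXT PRIMARY KEY, reads INTEGER NOT NULL)"
        )
        self.pending: Counter = Counter()  # stand-in for Redis counters
        self.seen: set = set()             # stand-in for a Redis set of viewers

    def record_view(self, post_id, now, user_id=None, ip="", cookie_id=""):
        """Return True if this view counted as a new read."""
        if user_id is not None:
            viewer = ("user", user_id)          # logged in: once per post
        else:
            day = int(now // SECONDS_PER_DAY)   # assumption: guests are judged per day
            viewer = ("guest", ip, cookie_id, day)
        if (post_id, viewer) in self.seen:
            return False
        self.seen.add((post_id, viewer))
        self.pending[post_id] += 1
        return True

    def flush(self) -> int:
        """Batch-write buffered counts (the nightly sync); return posts written."""
        rows = list(self.pending.items())
        with self.db:  # one transaction for the whole batch
            for post_id, delta in rows:
                self.db.execute(
                    "INSERT OR IGNORE INTO post_reads (post_id, reads) VALUES (?, 0)",
                    (post_id,),
                )
                self.db.execute(
                    "UPDATE post_reads SET reads = reads + ? WHERE post_id = ?",
                    (delta, post_id),
                )
        self.pending.clear()
        return len(rows)
```

The database sees one small batch of writes per sync instead of one write per page view, which is the point of buffering the counts first.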