Simple concurrent page access statistics

Last Update:2016-02-22 Source: Internet

Author: User

Tags database sharding

You may have learned a simple way to implement page access statistics at school. For example, keep a page access count in the ServletContext and add 1 on each visit, or write a record to the database on every visit. These approaches are fine for experiments, but in a real application they have a significant impact on system performance.

In the first approach the visit count is a shared variable, so every modification needs a synchronization lock, and that lock significantly slows down access. The second approach has the same problem: hitting the database on every request is not reasonable.
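To make the cost concrete, here is a minimal sketch of the first approach. PageHitCounter is a hypothetical name; in a servlet application an instance like this would typically be stored as a ServletContext attribute.

```java
// Sketch of the lock-based counter described above (hypothetical class name).
class PageHitCounter {
    private long hits;

    // Every request thread takes the same lock, so concurrent requests
    // are serialized at this point.
    synchronized void increment() {
        hits++;
    }

    synchronized long current() {
        return hits;
    }
}
```

The counter is correct, but every page view has to pass through the single lock, which is the slowdown the paragraph above refers to.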

Not long ago, a friend asked me to help write some simple page statistics code. The requirements were:
1. Save the IP address, time, and other available information for each page access, and allow the saved information to be extended later.
2. Do not slow down the page access itself.
3. Support a certain amount of concurrent access.

After receiving this request, I thought of the following points:
1. How to select the pages that need statistics.
2. Separate access from statistics: the access thread does not store the access information; another thread saves it to the database.
3. Use a shared queue to hold the access information.
4. Save the access information in batches once a certain amount has accumulated.

Solution:

1. For the first question, I offered two methods. (a) Keep all the pages that need statistics in a Set, and check in a Filter whether the current request is in that set. (b) Include a shared snippet in each JSP page that calls CounterUtils.addCounter(request). The second method makes it easier to maintain which pages are counted, and it seems more efficient because no Filter interception is needed. My friend, however, insisted on the first method.

2. On every access, we wrap the information to be saved in an object and put it into the queue. Another thread then saves the queued information at regular intervals.
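The separation in point 2 can be sketched as follows. AccessRecord and AccessLog are hypothetical names, and the bounded capacity is an assumption of this sketch, not something the original demo is stated to do: a full queue drops the record instead of growing without limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical record of one page access.
class AccessRecord {
    final String ip;
    final String uri;
    final long timestampMillis;

    AccessRecord(String ip, String uri, long timestampMillis) {
        this.ip = ip;
        this.uri = uri;
        this.timestampMillis = timestampMillis;
    }
}

class AccessLog {
    // Bounded queue (assumption): a stalled consumer cannot exhaust the heap.
    private final BlockingQueue<AccessRecord> queue;

    AccessLog(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Called on the request thread. offer() never blocks; it returns
    // false when the queue is full and the record is dropped.
    boolean record(AccessRecord r) {
        return queue.offer(r);
    }

    // Called on the background thread. Removes up to maxBatch records
    // so they can be saved in one batch.
    List<AccessRecord> drainBatch(int maxBatch) {
        List<AccessRecord> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }
}
```

The request thread only pays for one offer() call; all database work happens on whichever thread calls drainBatch.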

I wrote a simple demo for my friend, but it soon came back to me. In their tests, once concurrency reached 200, records stopped being saved to the database and access slowed down. In the end, the heap memory overflowed.
I had no choice but to reproduce the test on my own machine with LoadRunner, watching memory usage with the JDK's jconsole tool. The result matched what my friend described. Everything ran normally at first. Once concurrency reached a certain level, saving started to slow down, and eventually, for reasons I could not determine, the storage thread stopped running altogether. The queue then grew larger and larger until the heap overflowed.

　　
This suggests that a single queue may not withstand such a large number of concurrent accesses. So I decided to use multiple queues, in a way similar to table sharding and database sharding: different requests are assigned to different queues.

　　
Some of the code is as follows:
1. Initialize the array of queues

    /**
     * Generate the array of queues based on the URIs in urls.xml.
     * @return the queue array, or null if initialization fails
     */
    private static LinkedList<RequestStc>[] initUris() {
        Digester digester = new Digester();
        String path = null;
        try {
            path = CounterUtils.class.getClassLoader().getResource("urls.xml").toURI().getPath();
        } catch (URISyntaxException e1) {
            e1.printStackTrace();
        }
        UriRuleSet ruleSet = new UriRuleSet();
        ruleSet.addRuleInstances(digester);
        try {
            // uri set
            uris = digester.parse(new File(path));
            // hashCode base: at least 1, so the modulo in addCounter never divides by zero
            BaseHash = uris != null ? Math.max(1, uris.size() / 3) : 1;
            LinkedList<RequestStc>[] listArr = new LinkedList[BaseHash];
            for (int i = 0; i < listArr.length; i++)
                listArr[i] = new LinkedList<RequestStc>();
            return listArr;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

2. Wrap the request in a statistics object

    /**
     * Add one request to the statistics.
     * @param request
     */
    public static void addCounter(HttpServletRequest request) {
        // Wrap the request in a statistics object, and use a hash
        // to distribute the objects across the queues.
        RequestStc stc = new RequestStc();
        stc.setIp(request.getRemoteAddr());
        stc.setUri(request.getRequestURI());
        stc.setNow(Calendar.getInstance().getTime());
        // hashCode() may be negative; floorMod keeps the index in [0, BaseHash)
        int index = Math.floorMod(request.hashCode(), BaseHash);
        // LinkedList is not thread-safe, so lock on the same object processCount() uses
        synchronized (buffers[index]) {
            buffers[index].push(stc);
        }
    }
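The queue selection step is worth isolating, because a plain `%` on a hash code can produce a negative index. The following is a minimal sketch under assumed names (ShardedBuffers, indexFor); it is not the article's CounterUtils class.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper that spreads items across several queues by hash.
class ShardedBuffers<T> {
    private final Deque<T>[] shards;

    @SuppressWarnings("unchecked")
    ShardedBuffers(int shardCount) {
        if (shardCount < 1) {
            throw new IllegalArgumentException("shardCount must be at least 1");
        }
        shards = new Deque[shardCount];
        for (int i = 0; i < shardCount; i++) {
            shards[i] = new ArrayDeque<>();
        }
    }

    // floorMod always returns a value in [0, shardCount), even when
    // hashCode() is negative; the % operator would return a negative
    // remainder in that case.
    int indexFor(Object key) {
        return Math.floorMod(key.hashCode(), shards.length);
    }

    // Each shard has its own lock, so threads that land on different
    // shards do not contend with each other.
    void add(Object key, T item) {
        Deque<T> shard = shards[indexFor(key)];
        synchronized (shard) {
            shard.addLast(item);
        }
    }

    int size(int index) {
        synchronized (shards[index]) {
            return shards[index].size();
        }
    }
}
```

With four shards, a key whose hash code is -7 maps to shard 1, whereas `-7 % 4` would give -3 and fail as an array index.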

3. Poll the queues in turn

    /**
     * Execute the statistics.
     */
    private void processCount() {
        try {
            // poll the queues in round-robin order
            while (true) {
                Thread.sleep(Sleep_MS);
                if (buffers == null) {
                    break;
                }
                Thread th = null;
                for (int i = 0, len = buffers.length; i < len; i++) {
                    LinkedList<RequestStc> stcList = null;
                    if (buffers[i].size() >= Execute_Base) {
                        // Copy the queue and clear it under the lock,
                        // then start a thread to save the copy.
                        synchronized (buffers[i]) {
                            stcList = (LinkedList<RequestStc>) buffers[i].clone();
                            buffers[i].clear();
                        }
                        th = new Thread(new ExecuteThread(stcList));
                        th.start();
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
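The copy-then-clear step only works if writers and the poller lock on the same object. A minimal sketch of that hand-off, using a hypothetical BatchBuffer class that also performs the threshold check inside the lock, might look like this:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical single-queue buffer with a batch threshold.
class BatchBuffer<T> {
    private final LinkedList<T> items = new LinkedList<>();
    private final int threshold;

    BatchBuffer(int threshold) {
        this.threshold = threshold;
    }

    // Writers take the same lock as drainIfFull().
    void add(T item) {
        synchronized (items) {
            items.add(item);
        }
    }

    // Checks the size, copies the items and clears the buffer in one
    // critical section. Returns an empty list if the threshold has not
    // been reached yet.
    List<T> drainIfFull() {
        synchronized (items) {
            if (items.size() < threshold) {
                return new ArrayList<>();
            }
            List<T> copy = new ArrayList<>(items);
            items.clear();
            return copy;
        }
    }
}
```

Because the check, the copy and the clear share one lock, a record added concurrently is either in the returned batch or still in the buffer, never lost and never saved twice.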

4. The ExecuteThread thread saves the access records in batches.

    // Save to the database in batches
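The article does not show the body of ExecuteThread. As a rough sketch of what a batch save could look like with plain JDBC, assuming a hypothetical table page_access with columns ip, uri and access_time, and hypothetical class names AccessRow and BatchSaveTask:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.List;

// One access record: the IP, URI and time mentioned in the article.
class AccessRow {
    final String ip;
    final String uri;
    final long timeMillis;

    AccessRow(String ip, String uri, long timeMillis) {
        this.ip = ip;
        this.uri = uri;
        this.timeMillis = timeMillis;
    }
}

// Hypothetical stand-in for the article's ExecuteThread.
class BatchSaveTask implements Runnable {
    // Assumed table and column names; adjust to the real schema.
    static final String INSERT_SQL =
            "INSERT INTO page_access (ip, uri, access_time) VALUES (?, ?, ?)";

    private final Connection connection; // supplied and closed by the caller
    private final List<AccessRow> rows;

    BatchSaveTask(Connection connection, List<AccessRow> rows) {
        this.connection = connection;
        this.rows = rows;
    }

    @Override
    public void run() {
        if (rows.isEmpty()) {
            return;
        }
        try (PreparedStatement ps = connection.prepareStatement(INSERT_SQL)) {
            for (AccessRow row : rows) {
                ps.setString(1, row.ip);
                ps.setString(2, row.uri);
                ps.setTimestamp(3, new Timestamp(row.timeMillis));
                ps.addBatch();   // queue this row in the statement's batch
            }
            ps.executeBatch();   // submit all queued rows together
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}
```

One statement is prepared per batch rather than per record, which is the point of saving in batches.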

There are two ways to save the data: 1. Save detailed access records, for example which IP address accessed which page and when. 2. Save only the total number of visits to each page per day.

For the first, use batch saving. For the second, you can use a Hashtable to maintain the visit increments of all pages over a certain period. It can be maintained as follows:

Maintain the RequestStc information in the Hashtable (the maintenance code is omitted here), and write a timer that regularly flushes the incremental data in the Hashtable to the database:

    /**
     * Flush the data and save it to the database.
     */
    public void flush() {
        Hashtable<String, RequestParam> saveTable = null;
        // Hashtable is thread-safe per call and clone() is synchronized,
        // but clone() followed by clear() is not atomic: a count added
        // between the two calls would be lost. Hold the table's own lock
        // across both calls.
        synchronized (counterTables) {
            saveTable = (Hashtable<String, RequestParam>) counterTables.clone();
            counterTables.clear();
        }
        if (saveTable.isEmpty())
            return;
        for (Entry<String, RequestParam> ent : saveTable.entrySet()) {
            String url = ent.getKey();
            RequestParam param = ent.getValue();
            System.out.println("url:" + url);
            System.out.println("pv:" + param.getPv());
            requestCount += param.getPv();
        }
        System.out.println("ip:" + ips.size());
        System.out.println("total access:" + requestCount);
    }
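The maintenance step that the article omits could look roughly like the sketch below. PageViewCounters is a hypothetical name, and it stores a plain Long per URI rather than the article's RequestParam object, whose fields are not shown.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical per-page view counter with a periodic flush.
class PageViewCounters {
    private final Hashtable<String, Long> counts = new Hashtable<>();

    // Hashtable.merge is a synchronized method, so the
    // read-add-write for one key happens atomically.
    void hit(String uri) {
        counts.merge(uri, 1L, Long::sum);
    }

    // Takes a snapshot and resets the table while holding the table's
    // own monitor, so no hit can fall between the copy and the clear.
    Map<String, Long> flush() {
        synchronized (counts) {
            Map<String, Long> snapshot = new HashMap<>(counts);
            counts.clear();
            return snapshot;
        }
    }
}
```

A timer thread would call flush() at a fixed interval and write the returned increments to the database, while request threads keep calling hit().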

5. How to intercept the access requests that need statistics
Method 1: In a Filter, check whether the URI is in the set of pages that need statistics.
Method 2: Add Java code such as CounterUtils.addCounter(request); to each JSP to be counted.

Method 3: Asynchronous access from JS, similar to Baidu Statistics. This method has the advantage of not affecting the page loading speed.
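For Method 1, the membership check itself is small. The sketch below uses a hypothetical CountedPages class and leaves out the servlet API; in a real Filter, doFilter would call shouldCount(request.getRequestURI()) before handing the request to the statistics code, and then continue the filter chain either way.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder for the set of URIs that should be counted.
class CountedPages {
    private final Set<String> uris;

    // Copies the input so later changes to the caller's set
    // do not affect the lookups.
    CountedPages(Set<String> uris) {
        this.uris = Collections.unmodifiableSet(new HashSet<>(uris));
    }

    // Exact-match lookup; a null URI is never counted.
    boolean shouldCount(String requestUri) {
        return requestUri != null && uris.contains(requestUri);
    }
}
```

Since the set is never modified after construction, request threads can read it concurrently without locking.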

After this change, tests with LoadRunner against Tomcat showed that the application could basically handle Tomcat's maximum number of concurrent users and beyond, while using few resources.

The other option is the approach Baidu Statistics takes: asynchronous statistics code on the JS side, which does not affect the page loading speed. Its implementation is not explored further here.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
