Simple implementation of concurrent page traffic statistics

Source: Internet
Author: User

Counting page visits is something you may have been taught a simple way to do at school: keep a counter for the page in the ServletContext and add 1 on every visit, or write a record to the database on every visit. These approaches are fine for an experiment, but in a real application they have a particularly large impact on system performance.

In the first approach the visit count is a shared variable, and modifying a shared variable usually requires a synchronization lock, which noticeably slows down access. The second approach has the same problem, and hitting the database on every request is not reasonable either.

Not long ago a friend asked me to write some simple page statistics code for them. The requirements were:
1. Save the visitor's IP, the access time and some other available information, and allow further fields to be added later.
2. Do not affect the current access speed.
3. Support a certain amount of concurrent access.

On receiving this request I thought of a few points:
1. How to filter out the pages that need to be counted.
2. Access and statistics need to be separated: the thread serving the request should not save the access information; another thread saves it to the database.
3. A shared queue can hold the access information.
4. Once a certain amount of access information has accumulated, it can be saved in bulk.
  

Solution:

1. For the first question I offered two methods. (a) Keep all pages that need statistics in a collection and have a filter check whether the current request is in that collection. (b) Include a piece of shared code in each JSP page, along the lines of CounterUtils.addCounter(request); this makes the set of counted pages easier to maintain and feels more efficient, since no filter interception is needed. My friend insisted on the first method, so that is what I used.
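The collection check in method (a) can be sketched as follows. This is a minimal illustration, not the code from the original demo: the class name TrackedPages and the method shouldCount are hypothetical, and the servlet filter itself is left out so that the example has no dependencies.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: holds the URIs whose visits should be counted. */
class TrackedPages {
    private final Set<String> uris;

    TrackedPages(Set<String> uris) {
        // Copy once at startup; the set is read-only afterwards, so lookups need no lock.
        this.uris = Collections.unmodifiableSet(new HashSet<String>(uris));
    }

    /** Returns true if the request URI is one of the pages to be counted. */
    boolean shouldCount(String requestUri) {
        return requestUri != null && uris.contains(requestUri);
    }

    public static void main(String[] args) {
        Set<String> configured = new HashSet<String>();
        configured.add("/index.jsp");
        configured.add("/news/list.jsp");
        TrackedPages pages = new TrackedPages(configured);
        System.out.println(pages.shouldCount("/index.jsp"));
        System.out.println(pages.shouldCount("/admin.jsp"));
    }
}
```

In a filter, one would presumably call shouldCount(request.getRequestURI()) and only then hand the request to CounterUtils.addCounter(request).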

2. On every visit, the information to be saved is wrapped in an object and put into the queue; another thread then saves the queued objects periodically.
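The separation between the request thread and the saving thread might look roughly like the sketch below. It is an assumption-laden illustration: the article's demo uses LinkedList, whereas this sketch swaps in java.util.concurrent.LinkedBlockingQueue, and the names Visit, VisitBuffer, record and drainBatch are hypothetical.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical record of one page visit. */
class Visit {
    final String ip;
    final String uri;
    final Date time;

    Visit(String ip, String uri, Date time) {
        this.ip = ip;
        this.uri = uri;
        this.time = time;
    }
}

/** Request threads call record(); a separate saver thread calls drainBatch(). */
class VisitBuffer {
    private final BlockingQueue<Visit> queue = new LinkedBlockingQueue<Visit>();

    /** Called on the request thread: only enqueues, never touches the database. */
    void record(String ip, String uri) {
        queue.offer(new Visit(ip, uri, new Date()));
    }

    /** Called on the saver thread: removes up to max visits for one bulk save. */
    List<Visit> drainBatch(int max) {
        List<Visit> batch = new ArrayList<Visit>();
        queue.drainTo(batch, max);
        return batch;
    }
}
```

The saver thread would call drainBatch periodically and write each non-empty batch to the database in one operation.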

So I wrote a simple demo for my friend, and before long it was sent back. In testing, before concurrency had even reached 200, the data suddenly stopped being saved to the database, access became very slow, and eventually the heap ran out of memory.
I had no choice but to test on my own machine with LoadRunner while watching memory with JConsole, the monitoring tool that ships with Java. The result matched what my friend described: everything ran normally at first, but once concurrency reached a certain level saving slowed down, and eventually, for reasons I could not determine, the saving thread stopped running altogether. The queue then kept growing, and a heap overflow was the natural result.

  
From this it seemed that a single queue might not cope with that much concurrent access, so I decided to use several queues, in a way similar to sharding databases and tables: different requests are assigned to different queues. The design became the following:

  
Some of the code is as follows:
1. Initialize the array of LinkedList queues

    /**
     * Generates an array of queues based on the configured URLs
     * @return the queue array, or null if the configuration could not be parsed
     */
    private static LinkedList<RequestStc>[] initUris() {
        Digester digester = new Digester();
        String path = null;
        try {
            path = CounterUtils.class.getClassLoader().getResource("urls.xml").toURI().getPath();
        } catch (URISyntaxException e1) {
            e1.printStackTrace();
        }
        UriRuleSet ruleSet = new UriRuleSet();
        ruleSet.addRuleInstances(digester);
        try {
            // URI collection
            uris = digester.parse(new File(path));
            // Hash radix: one queue per three URIs, but at least 1 so that
            // the modulo in addCounter never divides by zero
            baseHash = uris != null ? Math.max(1, uris.size() / 3) : 1;
            LinkedList<RequestStc>[] listArr = new LinkedList[baseHash];
            for (int i = 0; i < listArr.length; i++)
                listArr[i] = new LinkedList<RequestStc>();
            return listArr;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

2. Encapsulate the request into the required object for statistics

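    /**
     * Add request statistics
     * @param request
     */
    public static void addCounter(HttpServletRequest request) {
        // Wrap the request in a statistics object and use a hash to pick one of the queues
        RequestStc stc = new RequestStc();
        stc.setIp(request.getRemoteAddr());
        stc.setUri(request.getRequestURI());
        stc.setNow(Calendar.getInstance().getTime());
        // hashCode() may be negative, so clear the sign bit before taking the modulo
        LinkedList<RequestStc> queue = buffers[(request.hashCode() & 0x7fffffff) % baseHash];
        // LinkedList is not thread-safe; lock on the same object the polling thread locks on
        synchronized (queue) {
            queue.push(stc);
        }
    }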
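One detail of picking a queue by hash is worth illustrating: Java's % operator keeps the sign of the dividend, so a negative hash code would produce a negative array index. The sketch below shows one common way to map any int to a valid index; the class name ShardIndex is hypothetical and not part of the original code.

```java
/** Hypothetical helper showing how a hash code can be mapped to a queue index. */
class ShardIndex {
    /** Maps any int hash to an index in the range [0, shards). */
    static int of(int hash, int shards) {
        // Clearing the sign bit first keeps the remainder non-negative.
        return (hash & 0x7fffffff) % shards;
    }

    public static void main(String[] args) {
        int shards = 4;
        // The plain remainder of a negative hash is negative: unusable as an index.
        System.out.println(-7 % shards);
        // The masked version always lands inside the array bounds.
        System.out.println(ShardIndex.of(-7, shards));
        System.out.println(ShardIndex.of(Integer.MIN_VALUE, shards));
    }
}
```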

3. Poll LinkedList Queue Collection

    /**
     * Execute statistics
     */
    private void processCount() {
        try {
            // Poll the queues
            while (true) {
                Thread.sleep(SLEEP_MS);
                if (buffers == null) {
                    break;
                }
                Thread th = null;
                for (int i = 0, len = buffers.length; i < len; i++) {
                    LinkedList<RequestStc> stcList = null;
                    if (buffers[i].size() >= EXECUTE_BASE) {
                        // Copy the queue's elements, then empty it,
                        // and start a thread to save the copy
                        synchronized (buffers[i]) {
                            stcList = (LinkedList<RequestStc>) buffers[i].clone();
                            buffers[i].clear();
                        }
                        th = new Thread(new ExecuteThread(stcList));
                        th.start();
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

4. ExecuteThread: the thread that saves access logs in bulk

Saving to the database in bulk

There are two ways to do this:
1. Save detailed access records, for example that a certain IP accessed a certain page at a certain time.
2. Save only the total number of visits to each page per day.

For the first way, bulk saving can be used. For the second way, a Hashtable can hold, for every page, the increase in its visit count over a given period. It is maintained as follows:

The RequestStc information is merged into the Hashtable (that step is omitted here), and a timer flushes the incremental data in the Hashtable to the database at regular intervals:

  

    /**
     * Flush the data out and save it to the database
     */
    public void flush() {
        Hashtable<String, RequestParam> saveTable = null;
        // Hashtable's methods are individually synchronized, but clone() followed
        // by clear() is two separate calls; holding the table's own lock across
        // both prevents increments added in between from being lost
        synchronized (counterTables) {
            saveTable = (Hashtable) counterTables.clone();
            counterTables.clear();
        }
        if (saveTable.isEmpty()) return;
        for (Entry<String, RequestParam> ent : saveTable.entrySet()) {
            String url = ent.getKey();
            RequestParam param = ent.getValue();
            System.out.println("url:" + url);
            System.out.println("PV:" + param.getPv());
            requestCount += param.getPv();
        }
        System.out.println("IP:" + ips.size());
        System.out.println("Total number of visits:" + requestCount);
    }
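The maintenance step that the article omits, accumulating per-page increments and handing them over on flush, could be sketched as below. This is not the original implementation: it assumes a plain HashMap guarded by synchronized methods instead of the article's Hashtable of RequestParam objects, and the names PageCounter, increment and flush are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-page counter: request threads increment, a timer flushes. */
class PageCounter {
    private Map<String, Long> counts = new HashMap<String, Long>();

    /** Adds one visit for the given URL. */
    synchronized void increment(String url) {
        Long old = counts.get(url);
        counts.put(url, old == null ? 1L : old + 1L);
    }

    /**
     * Swaps in an empty map and returns the increments accumulated so far.
     * Because increment() and flush() share one lock, no visit is lost
     * between taking the snapshot and resetting the counters.
     */
    synchronized Map<String, Long> flush() {
        Map<String, Long> snapshot = counts;
        counts = new HashMap<String, Long>();
        return snapshot;
    }

    public static void main(String[] args) {
        PageCounter counter = new PageCounter();
        counter.increment("/index.jsp");
        counter.increment("/index.jsp");
        counter.increment("/news.jsp");
        // In a real system the returned increments would be written to the database.
        System.out.println(counter.flush());
    }
}
```

A timer such as java.util.Timer or a ScheduledExecutorService could call flush() at a fixed interval and add the returned increments to the totals stored in the database.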

5. How to intercept the requests that need statistics
Method one: check whether the URI is in the collection of pages that need statistics.
Method two: add Java code to each JSP that needs statistics, for example: CounterUtils.addCounter(request);

Method three: asynchronous access from JavaScript, similar to Baidu Statistics. The benefit of this method is that it does not affect page loading speed.

After these changes, in tests with LoadRunner and Tomcat the code could basically handle Tomcat's maximum number of concurrent users while using only a small amount of resources.

The other option is the Baidu Statistics approach mentioned above: asynchronous statistics code on the JavaScript side, whose advantage is that it does not affect the loading speed of the page. I did not look into how it is actually implemented.
