An optimization solution for processing big data in Java

Source: Internet
Author: User

As I mentioned before, our company is currently building a mobile app store. In testing we measured an average of about 2,000 requests per minute, which I estimated at roughly 50 requests per second (2,000 per minute is closer to 33), and real traffic is certainly higher now. Under heavy load there are 400 to 500 SQL insert operations at once, because every request writes a row to the log table to record user behavior. We do not use Hadoop or any other distributed system; the server has, as far as I recall, 8 GB of memory and a 16-core CPU. Under these conditions the application often ran into database connection timeouts. I had already done some optimization at the configuration and server level. Today I optimized again, this time at the code level.

As described above, every incoming request inserts a row of user information into the database. When 400 requests arrive at the same time, 400 insert statements run at the same time, and each insert holds its own connection. That is dangerous and often left us running out of connections. So we now start a dedicated thread: request handlers add the data to be inserted to a list owned by that thread, and the thread performs batch inserts. Even under heavy load, a single Connection is enough to write several hundred records to the database, which saves a lot of connection resources. Besides the log inserts, the project also has a push feature. When advertisements and similar content are pushed, more than 1,000 requests arrive every second, and each one inserts into the push table as well as the log table, which makes the load even worse. So the push inserts were optimized in the same way.
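The queue-and-batch idea described above can be sketched independently of the database. The following is a minimal illustration, not the project's actual code: `BatchingWriter`, `flushOnce`, and the `sink` callback are hypothetical names, and it uses a `BlockingQueue` from `java.util.concurrent` instead of a hand-synchronized list. In a real setup the sink would be a method that runs one JDBC batch insert over a single connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical helper: request threads enqueue items, one worker writes them in batches. */
public class BatchingWriter<T> implements Runnable {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final int maxBatch;
    private final Consumer<List<T>> sink; // assumed to perform one batch insert per call

    public BatchingWriter(int maxBatch, Consumer<List<T>> sink) {
        this.maxBatch = maxBatch;
        this.sink = sink;
    }

    /** Called by request threads; it only enqueues and never touches the database. */
    public void add(T item) {
        queue.add(item);
    }

    /** Drains up to maxBatch queued items without blocking; returns how many were written. */
    public int flushOnce() {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
        return batch.size();
    }

    /** Worker loop: block until something arrives, then write everything already queued. */
    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                T first = queue.take();             // blocks while the queue is empty
                List<T> batch = new ArrayList<>();
                batch.add(first);
                queue.drainTo(batch, maxBatch - 1); // take what else is waiting, up to the cap
                sink.accept(batch);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // restore the flag and exit the loop
        }
    }
}
```

The point of the design is that however many request threads call `add`, only the single worker ever calls the sink, so only one database connection is needed for logging.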

That is the whole idea. The result speaks for itself: we no longer have to worry about the application timing out when it connects to the database, unless the network itself is very slow.

I hope the ideas and the optimization above are useful to you. We plan to move to Hadoop soon and will share that experience later. Some sample code follows.

1. Thread code for log insertion

package com.xxx.appstore.util;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.xxx.appstore.Constants;
import com.xxx.appstore.UserData;
import com.xxx.common.util.db.DBUtil;

public class InsertLogThread extends Thread {
    private static Logger logger = LoggerFactory.getLogger(InsertLogThread.class);

    // Records queued by request threads, waiting to be written.
    private static List<UserData> saveList = new ArrayList<UserData>();

    // Records taken from saveList for the batch currently being written.
    private static List<UserData> actionList = new ArrayList<UserData>();

    public InsertLogThread() {
    }
 
    // Called by request threads: queue the record and wake the writer thread.
    public void addUserData(UserData data) {
        synchronized (saveList) {
            saveList.add(data);
            saveList.notify();
        }
    }
 
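    public void run() {
        while (true) {
            // Check, wait, and drain under the same lock that addUserData() uses,
            // so a notify() cannot be missed and the list is never read while it is modified.
            synchronized (saveList) {
                while (saveList.isEmpty()) {
                    try {
                        saveList.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return; // stop the writer when the thread is interrupted
                    }
                }
                while (!saveList.isEmpty() && actionList.size() <= 2000) {
                    actionList.add(saveList.remove(0));
                }
            }

            // Insert outside the lock so request threads are not blocked by the database.
            insertLog(actionList);
            actionList.clear();
        }
    }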
 
 
    /**
     * Insert the queued request records.
     */
    public void insertLog(List<UserData> actionList) {
 
            Connection con =null;
            PreparedStatement pst =null;
 
            String sql = "insert into tblLog(TId,UuId,Imsi,Brand,Model,Channel,Plat,AndroidVer,ScreenSize,Lang,AppStoreVer,Provider,ConnectionMode,GetLocType,LocStr,country,province,city,IpAddr,AccessType,CurrPage,ProPage,proContent,AppId,OtherParas,Created," +
                         "phone,product,sdk,display,codename,tCardSize,RAM,cpuClockSpeed,source,smsCenter,enc,pVer,imei,pkg) values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,sysdate," +
                         "?,?,?,?,?,?,?,?,?,?,?,?,?,?)";
            try{
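                con = DBUtil.getConnection(Constants.dbName);

                // Note: one sequence value is fetched per batch, so every row in this batch
                // gets the same TId. If TId must be unique per row, fetch it inside the loop.
                long id = DBUtil.getNextSeq(con, "seq_tbllog_id");
                pst = con.prepareStatement(sql);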
                for(int j =0; j < actionList.size(); j++){
                    UserData data = actionList.get(j);
                       pst.setLong(1, id);
                       pst.setString(2, data.aid);
                       pst.setString(3, data.imsi);
                       pst.setString(4, data.brand);
                       pst.setString(5, data.model);
                       pst.setString(6, data.key);
                       pst.setString(7, data.har);
                       pst.setString(8, data.release);
                       pst.setString(9, data.sc);
                       pst.setString(10, data.lang);
                       pst.setString(11, data.storeVer);
                       pst.setString(12, data.providerName);
                       pst.setString(13, data.netType);
                       pst.setString(14, null);
                       pst.setString(15, data.loc);
                       pst.setString(16, data.country);
                       pst.setString(17, data.province);
                       pst.setString(18, data.city);
                       pst.setString(19, data.ip);
                       pst.setString(20, getString(data.currentRequestType, 50));
                       pst.setString(21, getString(data.currentRequestContent, 50));
                       pst.setString(22, getString(data.lastRequestType, 50));
                       pst.setString(23, getString(data.lastRequestContent, 50));
                       pst.setString(24, data.appId);
                       pst.setString(25, getString(data.otherParams, 150));
                       pst.setString(26, data.phoneNum);
                       pst.setString(27, data.product);
                       pst.setString(28, data.sdk);
                       pst.setString(29, data.dis);
                       pst.setString(30, data.code);
                       pst.setString(31, data.tcard);
                       pst.setString(32, data.ram);
                       pst.setString(33, data.fre);
                       pst.setString(34, data.source);
                       pst.setString(35, data.smsCenter);
                       pst.setString(36, data.enc);
                       pst.setInt(37, data.pVer);
                       pst.setString(38, data.imei);
                       pst.setString(39, data.pkg);
 
                       pst.addBatch();
                }
              pst.executeBatch();
            }catch(Exception e){
                   logger.error("failed to insert log", e);
            }finally{
                   DBUtil.closePreparedStatement(pst);
                   DBUtil.closeConnection(con);
            }
    }
 
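    /**
     * Shortens str until its encoded size (platform default charset) is at most
     * length bytes, dropping up to 4 characters per step.
     */
    public String getString(String str, int length) {
        if (str == null) {
            return str;
        }

        while (str.getBytes().length > length) {
            // Math.max guards against a negative index on very short strings.
            str = str.substring(0, Math.max(0, str.length() - 4));
        }

        return str;
    }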
 
 
 
}
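The `getString` helper above trims four characters at a time and measures bytes in the platform default charset, so it can cut more than necessary. As a hedged alternative, here is a minimal sketch that keeps the longest prefix fitting a byte budget. It assumes the column limit is counted in UTF-8 bytes, which the original does not state; `ByteTruncate` and `truncateUtf8` are hypothetical names.

```java
public final class ByteTruncate {
    private ByteTruncate() {}

    /** Returns the longest prefix of s whose UTF-8 encoding is at most maxBytes long. */
    public static String truncateUtf8(String s, int maxBytes) {
        if (s == null) {
            return null;
        }
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // UTF-8 size of this code point: 1 to 4 bytes.
            int size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + size > maxBytes) {
                break; // the next character would exceed the budget
            }
            bytes += size;
            i += Character.charCount(cp); // advance by 2 for a surrogate pair
        }
        return s.substring(0, i);
    }
}
```

Because it walks code points, the result never ends in half of a surrogate pair, and it makes a single pass instead of re-encoding the string on every iteration.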

2. Calling code. The following three snippets live in different places in the project. For reasons of company confidentiality I am not publishing the full source, only the parts relevant to this approach.

public static InsertLogThread thread = new InsertLogThread(); // Declared as a static field of the class that makes the calls.


thread.start(); // Start the thread once, somewhere in the initialization code.

thread.addUserData(userData); // Call this wherever a request needs to be logged; userData is the UserData object for that request.
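The snippets above start the writer thread but never stop it. One way to manage its lifetime is sketched below, purely as an illustration: `WorkerLifecycle` is a hypothetical wrapper, not part of the original project, and it assumes the worker loop exits when its thread is interrupted.

```java
public final class WorkerLifecycle {
    private final Thread worker;
    private boolean started;

    public WorkerLifecycle(Runnable loop) {
        worker = new Thread(loop, "log-writer");
        worker.setDaemon(true); // do not keep the JVM alive just for logging
    }

    /** Starts the worker; calling it more than once is harmless. */
    public synchronized void start() {
        if (!started) {
            worker.start();
            started = true;
        }
    }

    /**
     * Interrupts the worker and waits up to timeoutMillis for it to finish.
     * Returns true if the worker is no longer running.
     * Pass a positive timeout: join(0) would wait indefinitely.
     */
    public boolean stop(long timeoutMillis) throws InterruptedException {
        worker.interrupt();
        worker.join(timeoutMillis);
        return !worker.isAlive();
    }
}
```

In a web application, `start` and `stop` would typically be called from the container's startup and shutdown callbacks, so that records still in memory can be flushed before the process exits.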

 

That is all the practical material for today. Time to head home.

 
