Performance Optimization in Practice: Real-Time Recording of Learning Duration on a Video Learning Website
I. Application Scenario Description
The video learning website supports online video learning for teachers, and each teacher's learning process is recorded. Each topic corresponds to multiple teaching videos, and each teaching video has a different duration. The recording rule is as follows: while a teacher watches a video, the page hosting the video submits a request once per minute; the request records the learning duration of that video and updates the record in the database.
Currently, the database has 8266 instructor users. Under the *** policy, most instructor users may watch videos online at the same time during certain periods. In extreme cases this means more than 6000 requests per minute, which puts a lot of pressure on the application server. In addition, before updating a learning duration record we compare it with the duration already stored (a real-time query is required): the record is updated only if the submitted duration is longer than the stored one. These frequent queries and updates seriously reduce the system's response speed, so optimizing the recording of learning duration is imperative.
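For reference, the unoptimized per-request flow implied by the rule above can be sketched as follows. This is an illustrative assumption, not the project's actual code (the dao update overload shown here is assumed); it makes clear that every per-minute request costs at least one real-time query and possibly one update:

// Hypothetical sketch of the original, unoptimized handling: one real-time
// query (and possibly one update) per request, once per minute per viewer.
protected static void recordDirectly(String userName, String videoId, int topicId, int learnTime) {
    // Query the duration already stored for this user/video
    int learnedTime = dao.getLearnedTime(userName, videoId, topicId);
    // Update only if the submitted duration is longer than the stored one
    if (learnTime > learnedTime) {
        dao.updateLearnTime(userName, videoId, topicId, learnTime); // assumed dao overload
    }
}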
II. Hardware and Software
Hardware: one server with 4 CPU cores
Software: Windows Server 2008 (64-bit), Tomcat 7, JDK 1.6, MySQL 5.5
III. Optimization Process
Phase 1
Analysis: the main pressure in this process comes from the large number of requests hitting the server and from frequent database connections. Therefore, the requests should be merged, which a caching mechanism can provide.
Solution: put user requests into a cache, process them centrally at regular intervals, and merge the updates.
Specific practice: use a LinkedHashMap wrapped for thread safety as the user request cache.
public static Map<LearnTime, Integer> map = Collections.synchronizedMap(new LinkedHashMap<LearnTime, Integer>());

// Put the request into the cache after it arrives from the user
protected static void put(String userName, String videoId, int topicId, int totalTime, int learnTime) {
    LearnTime learn = new LearnTime(userName, videoId, topicId, totalTime);
    // Ensures the cached duration is the longest duration watched so far
    if (map.containsKey(learn)) {
        if (learnTime > map.get(learn)) {
            map.put(learn, learnTime);
        }
    } else {
        // Each "user-video" pair is queried only once per cache cycle
        int learnedTime = dao.getLearnedTime(userName, videoId, topicId); // duration already learned
        if (learnTime > learnedTime) {
            map.put(learn, learnTime);
        } else {
            map.put(learn, learnedTime);
        }
    }
}
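As a usage example, the entry point that feeds this cache is the per-minute request handler. A minimal sketch follows; the servlet class name comes from the test code in Section V, while the request parameter names are assumptions:

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch of the servlet entry point: each per-minute request becomes a cache
// insertion via the put() method above instead of a direct database write.
public class LearnTimeHandleServlet extends HttpServlet {
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String userName = req.getParameter("userName");
        String videoId = req.getParameter("videoId");
        int topicId = Integer.parseInt(req.getParameter("topicId"));
        int totalTime = Integer.parseInt(req.getParameter("totalTime"));
        int learnTime = Integer.parseInt(req.getParameter("learnTime"));
        put(userName, videoId, topicId, totalTime, learnTime); // the put() shown above
        resp.setStatus(HttpServletResponse.SC_OK);
    }
}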
The cached data is processed once every hour: the HashMap entries are traversed, SQL statements are generated and concatenated, and the learning records are updated in the database within a single connection.
Iterator<LearnTime> i = s.iterator();
StringBuilder str = new StringBuilder();
while (i.hasNext()) {
    learn = i.next();
    userName = learn.getUserName();
    videoId = learn.getVideoId();
    topicId = learn.getTopicId();
    learnTime = map.get(learn);
    totalTime = learn.getTotalTime();
    if (learnTime < totalTime) {
        // Keep two decimal places for the score
        String sScore = new DecimalFormat("#.00").format(0.5 * learnTime / totalTime);
        double score = Double.valueOf(sScore);
        str.append("SQL statement");
    } else if (learnTime >= totalTime) {
        learnedTime = dao.getLearnedTime(userName, videoId, topicId); // duration already learned
        if (learnedTime != totalTime) {
            str.append("SQL statement");
        }
    }
}
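The hourly trigger itself is not shown above. A minimal sketch of one way to drive it, assuming a single background scheduler thread is acceptable, using java.util.concurrent (available in JDK 1.6):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: run the cache-flush routine once per hour on a dedicated thread.
public class FlushScheduler {
    private static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public static void start() {
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                // Traverse the cached map, build the concatenated SQL and
                // update the database, as in the processing code above.
                flushCache();
            }
        }, 1, 1, TimeUnit.HOURS);
    }

    private static void flushCache() {
        // placeholder for the traversal/update logic shown above
    }
}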
After this change, system performance improved, but the pressure on database connections remained quite high. As the code shows, before a learning record is first added to the cache, the database is still queried, which consumes a lot of connection resources. In addition, the large number of requests from clients to the server has not been well addressed. Further optimization is therefore required.
Phase 2
Analysis: two problems have been identified: the large number of concurrent requests on the application server, and the frequent access to the database. A third issue, not mentioned so far, is that while the cache is being read, writes to the cache are blocked; if traversing and updating the cache is too slow, requests will be blocked for a long time.
Solution: for the application server requests, each request now only checks the cache and writes to it, so processing is very short; we can therefore configure a larger thread pool in Tomcat to respond to that many requests. For the frequent database access, note that the updates have already been merged and the query only serves to decide whether an update is needed; if that decision is pushed into the database (into the update statement itself), all query requests are merged into the updates and the problem is solved. The cache traversal speed is determined by the cache size, so an appropriate cache cycle must be chosen. The update processing can be stripped out: while traversing the cache, extract the data and hand the update work to another thread, so the HashMap lock is released sooner and the blocking time is reduced. Traversing the cache and generating the SQL should complete within a few seconds, which has little impact.
Specific Practices:
1. Put the request into the cache
public static Map<LearnTime, Integer> map = Collections.synchronizedMap(new LinkedHashMap<LearnTime, Integer>());

// Put the request into the cache after it arrives from the user
protected static void put(String userName, String videoId, int topicId, int totalTime, int learnTime) {
    LearnTime learn = new LearnTime(userName, videoId, topicId, totalTime);
    // Ensures the cached duration is the longest duration watched so far
    if (map.containsKey(learn)) {
        if (learnTime > map.get(learn)) {
            map.put(learn, learnTime);
        }
    } else {
        // Put the data directly into the cache; whether it really needs updating is decided in the database
        map.put(learn, learnTime);
    }
}
2. Update process stripping
static class UpdateTask implements Runnable {
    private String sql;

    public UpdateTask(String sql) {
        this.sql = sql;
    }

    @Override
    public void run() {
        dao.updateLearnTime(sql);
    }
}
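At the call site, the stripping looks roughly like this (a sketch): once the traversal has concatenated the statements into str, the SQL string is handed to a new thread, so the slow database work no longer blocks threads that are calling put():

// After traversing the cache and concatenating the statements into str,
// hand the SQL to a separate thread so the HashMap lock is released sooner.
String sql = str.toString();
if (sql.length() > 0) {
    new Thread(new UpdateTask(sql)).start();
}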
3. Cache cycle cleanup
count++; // Each time an update is processed, the count is incremented by one
// When the cache cycle is reached, clear the cache and reset the count
if (count % clearCycle == 0) {
    map.clear();
    count = 0;
}
4. The verification is performed in the database, that is, a condition is added to the update statement (details omitted).
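What such a conditional update statement could look like is sketched below: the duration comparison is pushed into the WHERE clause, so no separate query is needed before the update. The table and column names are assumptions, not taken from the project, and a PreparedStatement would normally be preferred over string concatenation:

// Sketch: the WHERE clause performs the duration comparison in the database,
// so the row is only touched when the submitted duration is longer.
String sql = "UPDATE learn_record SET learn_time = " + learnTime
        + " WHERE user_name = '" + userName + "'"
        + " AND video_id = '" + videoId + "'"
        + " AND topic_id = " + topicId
        + " AND learn_time < " + learnTime + ";";
str.append(sql);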
5. Configure the Tomcat thread pool and connector of the application server.
1) Download tcnative-1.dll to support APR requests
2) Copy the DLL file to windows/system32 or add it to the PATH
3) Configure server.xml in Tomcat:

<Executor name="tomcatThreadPool"
          namePrefix="tomcatThreadPool-"
          maxThreads="1000"
          maxIdleTime="300000"
          minSpareThreads="100"
          prestartminSpareThreads="true" />

<Connector executor="tomcatThreadPool"
           URIEncoding="UTF-8"
           port="80"
           protocol="org.apache.coyote.http11.Http11AprProtocol"
           connectionTimeout="20000"
           redirectPort="8453"
           maxThreads="1000"
           minSpareThreads="200"
           acceptCount="1000" />
IV. Optimization Summary
Optimization in this scenario mainly involved four aspects: 1. merging database connection requests; 2. increasing the number of response threads on the application server; 3. stripping the actual update processing out of the cache traversal to reduce blocking; 4. weighing cache size against user habits and setting a reasonable cache-clearing cycle.
In addition, the initial bottleneck in this scenario was frequent database connections, which we mitigated by merging connections. What if, in an extreme case, connections cannot be merged? That would require a connection pool at the database access layer. Under the existing architecture we have not yet worked out how to configure the database connection pool; this remains an important optimization point to explore.
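For completeness, a minimal sketch of what a connection pool at the data access layer could look like, using Apache Commons DBCP (a common choice alongside Tomcat 7 / JDK 1.6); all pool sizes, credentials, and the database name are assumptions, not the configuration of this system:

import org.apache.commons.dbcp.BasicDataSource;
import javax.sql.DataSource;

// Sketch: a pooled DataSource for the data access layer (all values are assumptions).
public class DataSourceHolder {
    private static final BasicDataSource ds = new BasicDataSource();

    static {
        ds.setDriverClassName("com.mysql.jdbc.Driver");
        ds.setUrl("jdbc:mysql://localhost:3306/learning?useUnicode=true&characterEncoding=UTF-8");
        ds.setUsername("user");      // assumed credentials
        ds.setPassword("password");
        ds.setInitialSize(10);       // connections created at startup
        ds.setMaxActive(50);         // upper bound on concurrent connections (DBCP 1.x)
        ds.setMaxIdle(20);
        ds.setMaxWait(10000);        // milliseconds to wait for a free connection
    }

    public static DataSource getDataSource() {
        return ds;
    }
}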
V. Test Data
In the actual scenario, if more than 8000 users watch videos online at the same time, more than 8000 requests arrive per minute, an average of roughly 140 requests per second, with each request handled by its own thread. The test simulates this process.
Test case: 20 requests are submitted every 100 milliseconds, that is, 200 requests per second, and each request starts a new thread. A total of 2,000,000 requests are submitted, so the test runs for 10,000 seconds. This simulates 3000 users watching videos online; updates are processed every three minutes (because the simulation is rough, existing data is extracted from the database and requests with random durations are submitted by the test program), and the cache is cleared every eight processing cycles. The test shows no pressure at all. In the real scenario there may be more cache items to traverse, but experience shows the server can execute more than 3000 update statements per second without blocking the cache, so the performance requirements are fully met.
Changing the request frequency to 50 requests per 100 milliseconds, that is, 500 requests per second, with other conditions unchanged, gives a test duration of 4000 seconds. The results are also very satisfactory and change very little; in other words, since 500 requests per second corresponds to 30,000 users each submitting one request per minute, supporting 30,000 users recording their learning duration is not a problem. If anything, the server resources are more than sufficient!
The test code is as follows:
Random random = new Random();
LearnTime learn;
int index = 0;
int learnTime = 0;
for (int i = 0; i < 2000000; i++) {
    index = random.nextInt(3000);
    learn = learns[index];
    learnTime = random.nextInt(learn.getTotalTime() + 1);
    Thread thread = new Thread(new PutTask(learn, learnTime));
    thread.start();
    if (i % 50 == 0) {
        Thread.sleep(100);
    }
}

// The request thread simulation is as follows:
static class PutTask implements Runnable {
    private LearnTime learn;
    private int learnTime;

    public PutTask(LearnTime learn, int learnTime) {
        this.learn = learn;
        this.learnTime = learnTime;
    }

    public void run() {
        LearnTimeHandleServlet.put(learn, learnTime);
    }
}