Locating the cause of high CPU utilization on a server

Source: Internet
Author: User

Symptom: After the project has been running for a while, one service keeps CPU usage above 30%.

Environment: Windows 7, CPU: 8 cores, memory: 8 GB

Troubleshooting process:

Start the project and look up the Java process ID.

Check the CPU usage of the event processor; it is approximately 1%.

Turn on the simulator, send a few packets of data, and check again: the CPU usage has shot up.

Close the simulator, wait for the data to be processed, and check again: the CPU usage has not dropped.

Use Process Explorer to view the CPU usage of the event processor's threads. The CPU utilization of threads 40160 and 40156 stays high.

Use jstack -l 35684 to print the thread stacks of the event processor process, then find the entries matching the two thread IDs. The IDs must first be converted to hexadecimal (40160 -> 9ce0, 40156 -> 9cdc).
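
The decimal-to-hexadecimal conversion can also be done in code. The following is a minimal sketch; the class name ThreadIdToHex and the method toNid are made up for illustration. It formats the two thread IDs from the text the way jstack prints them in its nid=0x... field.

```java
// Convert the decimal thread IDs shown by Process Explorer into the
// hexadecimal form that jstack prints in its "nid=0x..." field.
class ThreadIdToHex {
    // Hypothetical helper: 40160 -> "0x9ce0"
    static String toNid(int decimalThreadId) {
        return "0x" + Integer.toHexString(decimalThreadId);
    }

    public static void main(String[] args) {
        int[] threadIds = {40160, 40156}; // thread IDs taken from the text
        for (int tid : threadIds) {
            System.out.println(tid + " -> nid=" + toNid(tid));
        }
    }
}
```

Searching the jstack output for the resulting nid values leads straight to the stack traces of the two busy threads.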

As you can see, the threads TaskScheduler_com.delta.atm.services.impl.AutoTTExecutorServiceImpl and TaskScheduler_com.delta.atm.services.impl.AlarmExecutorServiceImpl are running. Printing the thread information repeatedly shows that both threads are always in the RUNNABLE state, even though the simulator is turned off and the threads should be asleep.

Look at the code first, starting with AlarmServiceImpl.

Then check SchedulerServiceImpl.

Judging from the code, when there is no data takeTask should execute only once and then go to sleep, waiting to be woken up.

The printed thread stacks show that these two threads have been running in the takeTask method for a long time, so this part probably contains an infinite loop. Since this machine has 8 cores, one thread that keeps a core fully busy accounts for 1/8 = 12.5% of total CPU, so two threads spinning in a while loop consume 25% of the CPU resources.
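
The arithmetic can be written out as a small helper. This is only an illustrative sketch; the class SpinCpuShare and its method are hypothetical names, and it assumes each spinning thread saturates exactly one core.

```java
// Share of total CPU consumed by threads that each spin on one core.
class SpinCpuShare {
    // Assumption: every spinning thread keeps exactly one core 100% busy.
    static double percentOfTotalCpu(int spinningThreads, int cores) {
        return 100.0 * spinningThreads / cores;
    }

    public static void main(String[] args) {
        System.out.println(percentOfTotalCpu(1, 8)); // one thread on 8 cores: 12.5
        System.out.println(percentOfTotalCpu(2, 8)); // two threads on 8 cores: 25.0
    }
}
```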

Add output that prints the global count, and the cause appears: the thread sleeps only when count is 0, but after all the data has been processed the printed count is still not 0, so the thread keeps spinning.

But why is count not 0 once all the data has been processed?

According to the code, count is incremented by 1 each time AlarmService receives a packet of data and decremented by 1 each time a packet is processed. count is an AtomicInteger, which is thread-safe.

AlarmService uses a producer-consumer model, so it needs a global count that represents all of the data in the queues. When count is 0 the task scheduler sleeps; otherwise it keeps dispatching.
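
The original AlarmService code is not reproduced here, so the following is only a rough sketch of the structure just described, under stated assumptions: the class CountedScheduler, the methods submit and pending, and the use of wait/notifyAll are all hypothetical. Only the idea of a global AtomicInteger count that decides whether takeTask sleeps comes from the text.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a producer-consumer scheduler that sleeps while count is 0.
class CountedScheduler {
    private final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();
    private final AtomicInteger count = new AtomicInteger(); // global pending count
    private final Object lock = new Object();

    // Producer side: store the packet, bump the count, wake the scheduler.
    void submit(String packet) {
        queue.add(packet);
        count.incrementAndGet();
        synchronized (lock) {
            lock.notifyAll();
        }
    }

    // Consumer side: sleep while count is 0, otherwise take one packet.
    // Returns null if interrupted or if the queue is unexpectedly empty.
    String takeTask() {
        synchronized (lock) {
            while (count.get() == 0) {
                try {
                    lock.wait(); // sleep until a producer calls submit()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        String packet = queue.poll();
        if (packet != null) {
            count.decrementAndGet();
        }
        return packet;
    }

    int pending() {
        return count.get();
    }

    public static void main(String[] args) {
        CountedScheduler scheduler = new CountedScheduler();
        scheduler.submit("packet-1");
        scheduler.submit("packet-2");
        System.out.println(scheduler.takeTask());
        System.out.println(scheduler.takeTask());
        System.out.println("pending = " + scheduler.pending());
        // A third takeTask() here would block until another submit().
    }
}
```

In a design like this, the sleep decision depends entirely on count. If count ever stays above 0 while the queue is empty, takeTask returns immediately every time and the calling loop spins, which matches the symptom seen in the thread stacks.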

Inspecting the current queue shows that it is empty, which means all the data in the queue has been processed, yet the value of count is not 0.

The code reveals why. The queue is our own wrapper, with one queue per site. The key is dataTime; this value comes from the Indian box and has only 10 digits, accurate to the second. However, the simulator often sends multiple packets per second for each site, so duplicate dataTime values occur and data already in the queue is overwritten. At that point the total size of the queues no longer matches the recorded count.
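
The mismatch is easy to reproduce in isolation. The sketch below is not the project's queue wrapper; OverwriteDemo, its run method, and the sample timestamps are invented for illustration. It keys a TreeMap by a second-resolution timestamp, increments a counter on every insert, and decrements it on every removal.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

// Shows how duplicate keys make a separate counter drift from the map size.
class OverwriteDemo {
    // Returns {count, queueSize} after enqueueing all packets and draining the queue.
    static int[] run(long[] dataTimes) {
        TreeMap<Long, String> siteQueue = new TreeMap<>();
        AtomicInteger count = new AtomicInteger();

        for (long t : dataTimes) {
            siteQueue.put(t, "packet@" + t); // same key: the old packet is replaced
            count.incrementAndGet();         // but count still goes up
        }
        while (!siteQueue.isEmpty()) {
            siteQueue.pollFirstEntry();      // process one packet
            count.decrementAndGet();
        }
        return new int[] {count.get(), siteQueue.size()};
    }

    public static void main(String[] args) {
        // Two packets share the same second, the third is one second later.
        long[] dataTimes = {1500000000L, 1500000000L, 1500000001L};
        int[] result = run(dataTimes);
        System.out.println("count = " + result[0] + ", queue size = " + result[1]);
        // prints "count = 1, queue size = 0"
    }
}
```

Three packets are counted but only two entries survive in the map, so after the queue is drained count is stuck at 1 and never returns to 0.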

Internally this queue uses a TreeMap, so duplicate keys are not allowed. The best fix would be to write a sortable queue that does allow duplicate keys. However, in the real environment each site sends one packet of data every 3 minutes, so multiple packets per second cannot occur. In addition, TreeMap is backed by a red-black tree and is very efficient, whereas a hand-written sorting algorithm might not perform as well. So for now a check is used to avoid the problem: if the queue for the same site already contains the same dataTime, the packet is simply ignored.
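
A guard of this kind might look like the sketch below. The actual change in the project is not shown in the text, so the class GuardedSiteQueue, its offer and poll methods, and the synchronized locking are assumptions made for illustration. The point it demonstrates is that count is incremented only when an entry is really added.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

// Per-site queue that ignores a packet whose dataTime is already queued.
class GuardedSiteQueue {
    private final TreeMap<Long, String> queue = new TreeMap<>();
    private final AtomicInteger count; // global count shared by all site queues

    GuardedSiteQueue(AtomicInteger sharedCount) {
        this.count = sharedCount;
    }

    // Returns false and leaves count untouched when dataTime is a duplicate.
    synchronized boolean offer(long dataTime, String packet) {
        if (queue.containsKey(dataTime)) {
            return false; // ignore the duplicate packet
        }
        queue.put(dataTime, packet);
        count.incrementAndGet();
        return true;
    }

    // Removes the oldest packet, or returns null if the queue is empty.
    synchronized String poll() {
        Map.Entry<Long, String> entry = queue.pollFirstEntry();
        if (entry == null) {
            return null;
        }
        count.decrementAndGet();
        return entry.getValue();
    }

    public static void main(String[] args) {
        AtomicInteger count = new AtomicInteger();
        GuardedSiteQueue site = new GuardedSiteQueue(count);
        System.out.println(site.offer(1500000000L, "first"));     // true
        System.out.println(site.offer(1500000000L, "duplicate")); // false
        System.out.println("count = " + count.get());             // count = 1
    }
}
```

The trade-off is that the later packet with the same dataTime is dropped, which is acceptable only because duplicates are not expected in the real environment.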

Repackage, restart, and view the output: count now comes back down.

Check the CPU usage again.

Check the thread status: the threads are now in the WAITING state. Problem solved.
