C#: Locating High CPU Problems

Source: Internet
Author: User
Tags: wrappers, high cpu usage

Abstract:

When the CPU usage of a .NET application suddenly stays high in production, how can we locate the problem quickly and accurately while minimizing the impact on live business? How can we find out what the application is doing without capturing a dump or using live debugging? How can we see which thread is driving the CPU up and what code that thread is executing?

Analysis:
There are many possible reasons for high CPU usage:
1. Sometimes the application load is high, and the CPU naturally rises with the number of business requests;
2. Sometimes garbage collection (GC) itself consumes a lot of CPU;
3. Sometimes the code executed by a thread falls into an endless loop under certain conditions;
4. Sometimes lock contention is too fierce: whenever the lock on a resource is released, the waiting threads compete to acquire it;
5. Sometimes there are too many threads and context switching is too frequent;
6. Sometimes too many exceptions are thrown every second.
One-by-one analysis
1. We usually use counters to observe the load and concurrent requests of the application, such as the number of requests received per second, so high CPU caused by increased business volume is easy to identify.
2. The percentage of CPU time spent in GC has a dedicated counter.
3. If a piece of code falls into an endless loop and causes high CPU usage, capturing a single dump and inspecting it with ~*e !clrstack and !runaway is usually not enough to locate the problem:
A) Generally, you need to capture several consecutive dumps, use !runaway to see which threads show the largest increase in user-mode time between dumps, and then check the call stacks of those threads.
B) Alternatively, record the Thread\ID Thread and Thread\% Processor Time counters while capturing a dump, find the ID of the thread with high CPU consumption from the counters, and then inspect that thread's call stack, local variables, and parameters in the dump.
4. Lock contention also has related .NET counters.
5. The number of threads in each process and the number of context switches per second can be read directly from counters.
6. The number of exceptions thrown per second can also be read directly from a counter.
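The counters mentioned in this list can be read from code. The following is a minimal sketch, assuming the .NET Framework on Windows and assuming that the counter instance name equals the process name without ".exe" (this may not hold when several processes share a name). The class name CauseCounters and the helper FormatPath are hypothetical names used only for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class CauseCounters
{
    // Category / counter pairs for causes 2, 4, 6 and 5 in the list above.
    public static readonly string[][] Specs =
    {
        new[] { ".NET CLR Memory", "% Time in GC" },
        new[] { ".NET CLR LocksAndThreads", "Contention Rate / sec" },
        new[] { ".NET CLR Exceptions", "# of Exceps Thrown / sec" },
        new[] { "Process", "Thread Count" },
    };

    // Hypothetical helper: builds the counter path in the form Performance Monitor displays.
    public static string FormatPath(string category, string counter, string instance)
    {
        return string.Format(@"\{0}({1})\{2}", category, instance, counter);
    }

    public static void Main(string[] args)
    {
        // Assumption: instance name == process name; pass it as the first argument.
        string instance = args.Length > 0 ? args[0] : "HightCpuDemo";
        foreach (string[] spec in Specs)
        {
            string path = FormatPath(spec[0], spec[1], instance);
            try
            {
                using (var counter = new PerformanceCounter(spec[0], spec[1], instance, true))
                {
                    counter.NextValue();   // the first read of a rate counter returns 0
                    Thread.Sleep(1000);    // wait one sampling interval
                    Console.WriteLine("{0} = {1:F1}", path, counter.NextValue());
                }
            }
            catch (Exception ex)
            {
                // The category or instance may not exist on this machine.
                Console.WriteLine("{0}: unavailable ({1})", path, ex.Message);
            }
        }
    }
}
```

Each counter is read twice, one second apart, because rate counters need two samples to produce a meaningful value.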

Ideas:
1. From the above, the third case is the hardest to troubleshoot, and capturing a dump can sometimes hang or kill the service. For a stateful service the consequences of killing it are very serious, so we need a more lightweight way to obtain the call stack of every thread in the service. In fact, the CLR itself exposes interfaces that support debugging. They are all COM interfaces, and .NET provides managed wrappers for them, so you can use these debugging APIs from C#, including attaching to a process and obtaining the call stacks of all its threads. The DLL is called mdbgcore.dll in the .NET SDK.
2. In addition, .NET has ready-made classes for reading performance counters, which were introduced in a previous post.
3. .NET also provides process-management APIs that return the number of threads in a process and, for each thread, its start time, user-mode time, state, and priority.
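The process-management API from point 3 can already approximate the "several dumps plus !runaway" approach without attaching a debugger. The sketch below assumes that two snapshots one second apart are enough to rank threads by the user-mode time they gained; the names ThreadCpuSampler, Snapshot, and RankByDelta are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class ThreadCpuSampler
{
    // Snapshot the user-mode CPU time of every OS thread in the process.
    public static Dictionary<int, TimeSpan> Snapshot(Process process)
    {
        var result = new Dictionary<int, TimeSpan>();
        process.Refresh(); // discard cached thread data from earlier reads
        foreach (ProcessThread thread in process.Threads)
        {
            try
            {
                result[thread.Id] = thread.UserProcessorTime;
            }
            catch (Exception)
            {
                // The thread may have exited or access may be denied; skip it.
            }
        }
        return result;
    }

    // Rank threads by the user-mode time gained between two snapshots, largest first.
    public static List<KeyValuePair<int, TimeSpan>> RankByDelta(
        Dictionary<int, TimeSpan> before, Dictionary<int, TimeSpan> after)
    {
        var deltas = new List<KeyValuePair<int, TimeSpan>>();
        foreach (var pair in after)
        {
            TimeSpan earlier;
            if (before.TryGetValue(pair.Key, out earlier))
            {
                deltas.Add(new KeyValuePair<int, TimeSpan>(pair.Key, pair.Value - earlier));
            }
        }
        return deltas.OrderByDescending(d => d.Value).ToList();
    }

    public static void Main(string[] args)
    {
        // Assumption: the PID is the first argument; defaults to the current process.
        int pid = args.Length > 0 ? int.Parse(args[0]) : Process.GetCurrentProcess().Id;
        Process process = Process.GetProcessById(pid);

        var before = Snapshot(process);
        Thread.Sleep(1000);
        var after = Snapshot(process);

        foreach (var entry in RankByDelta(before, after).Take(5))
        {
            Console.WriteLine("Thread {0}: +{1:F0} ms user time",
                              entry.Key, entry.Value.TotalMilliseconds);
        }
    }
}
```

This only identifies the busy thread IDs; it says nothing about what code they run, which is why the call stacks still have to come from the debugging API.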

With the above knowledge points, we can combine them to write a tool that intelligently locates high CPU problems.

High CPU demo
First, we write a demo that drives the CPU high. Method A consumes almost no CPU because it sleeps, while method B never sleeps and performs floating-point operations in a tight loop, so CPU usage rises (one CPU core is fully occupied).

using System;
using System.Threading;

namespace HightCpuDemo
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            new Thread(A).Start();
            new Thread(B).Start();
            Console.ReadKey();
        }

        // Sleeps most of the time, so it uses almost no CPU.
        private static void A(object state)
        {
            while (true)
            {
                Thread.Sleep(1000);
            }
        }

        // Busy loop with floating-point work; occupies one CPU core.
        private static void B(object state)
        {
            while (true)
            {
                double d = new Random().NextDouble() * new Random().NextDouble();
            }
        }
    }
}

Code Implementation

Our goal is to find method B while the program is running and confirm that it is the cause of the high CPU usage. The code is as follows. I will not explain it in detail; it is not complicated, and the idea is what matters.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Threading;
using Microsoft.Samples.Debugging.MdbgEngine;

// One row of the report: counter data, thread data, and the managed call stack.
internal class MyThreadInfo
{
    public string CallStack = "null";
    public string Id;
    public string ProcessorTimePercentage;
    public string StartTime;
    public string UserProcessorTime;

    public override string ToString()
    {
        return
            string.Format(
                @"<table style=""width: 1000px;""><tr><td style=""width: 80px;"">ThreadId</td><td style=""width: 200px;"">{0}</td><td style=""width: 140px;"">% Processor Time</td><td>{1}</td></tr>
<tr><td style=""width: 80px;"">UserProcessorTime</td><td style=""width: 200px;"">{2}</td><td style=""width: 140px;"">StartTime</td><td>{3}</td></tr><tr><td colspan=""4"">{4}</td></tr></table>",
                Id, ProcessorTimePercentage, UserProcessorTime, StartTime, CallStack);
    }
}

// The two performance counters tracked for each thread instance.
internal class MyThreadCounterInfo
{
    public PerformanceCounter IdCounter;
    public PerformanceCounter ProcessorTimeCounter;

    public MyThreadCounterInfo(PerformanceCounter counter1, PerformanceCounter counter2)
    {
        IdCounter = counter1;
        ProcessorTimeCounter = counter2;
    }
}

internal class Program
{
    // Skip past fake attach events.
    private static void DrainAttach(MDbgEngine debugger, MDbgProcess proc)
    {
        bool fOldStatus = debugger.Options.StopOnNewThread;
        debugger.Options.StopOnNewThread = false; // skip while waiting for AttachComplete

        proc.Go().WaitOne();
        Debug.Assert(proc.StopReason is AttachCompleteStopReason);

        debugger.Options.StopOnNewThread = true; // needed for attach

        // Drain the rest of the thread create events.
        while (proc.CorProcess.HasQueuedCallbacks(null))
        {
            proc.Go().WaitOne();
            Debug.Assert(proc.StopReason is ThreadCreatedStopReason);
        }

        debugger.Options.StopOnNewThread = fOldStatus;
    }

    // Expects 1 arg, the PID as a decimal string.
    private static void Main(string[] args)
    {
        try
        {
            int pid = int.Parse(args[0]);

            var sb = new StringBuilder();
            Process process = Process.GetProcessById(pid);
            var counters = new Dictionary<string, MyThreadCounterInfo>();
            var threadInfos = new Dictionary<string, MyThreadInfo>();

            // Braces in the CSS are doubled because this is a format string.
            sb.AppendFormat(
                @"<html><head><title>{0}</title><style type=""text/css"">table, td {{border: 1px solid #000; border-collapse: collapse;}}</style></head><body>",
                process.ProcessName);

            Console.WriteLine("1. Collecting counters...");

            // Thread counter instances are named after the owning process.
            var cate = new PerformanceCounterCategory("Thread");
            string[] instances = cate.GetInstanceNames();
            foreach (string instance in instances)
            {
                if (instance.StartsWith(process.ProcessName, StringComparison.CurrentCultureIgnoreCase))
                {
                    var counter1 =
                        new PerformanceCounter("Thread", "ID Thread", instance, true);
                    var counter2 =
                        new PerformanceCounter("Thread", "% Processor Time", instance, true);
                    counters.Add(instance, new MyThreadCounterInfo(counter1, counter2));
                }
            }

            // Rate counters need two samples; take the first one, wait, then read.
            foreach (var pair in counters)
            {
                pair.Value.IdCounter.NextValue();
                pair.Value.ProcessorTimeCounter.NextValue();
            }
            Thread.Sleep(1000);
            foreach (var pair in counters)
            {
                try
                {
                    var info = new MyThreadInfo();
                    info.Id = pair.Value.IdCounter.NextValue().ToString();
                    info.ProcessorTimePercentage = pair.Value.ProcessorTimeCounter.NextValue().ToString("0.0");

                    threadInfos.Add(info.Id, info);
                }
                catch
                {
                    // The thread may have exited between the two samples.
                }
            }

            Console.WriteLine("2. Collecting thread information...");
            ProcessThreadCollection collection = process.Threads;
            foreach (ProcessThread thread in collection)
            {
                try
                {
                    MyThreadInfo info;
                    if (threadInfos.TryGetValue(thread.Id.ToString(), out info))
                    {
                        info.UserProcessorTime = thread.UserProcessorTime.ToString();
                        info.StartTime = thread.StartTime.ToString();
                    }
                }
                catch
                {
                    // The thread may have exited; skip it.
                }
            }

            var debugger = new MDbgEngine();

            MDbgProcess proc = null;
            try
            {
                proc = debugger.Attach(pid);
                DrainAttach(debugger, proc);

                MDbgThreadCollection tc = proc.Threads;
                Console.WriteLine("3. Attaching process {0} to get the call stack...", pid);
                foreach (MDbgThread t in tc)
                {
                    var tempStrs = new StringBuilder();
                    foreach (MDbgFrame f in t.Frames)
                    {
                        // Append, not AppendFormat: frame text may contain braces.
                        tempStrs.Append("<br />").Append(f);
                    }
                    MyThreadInfo info;
                    if (threadInfos.TryGetValue(t.Id.ToString(), out info))
                    {
                        info.CallStack = tempStrs.Length == 0 ? "no managed call stack" : tempStrs.ToString();
                    }
                }
            }
            finally
            {
                // Always detach so that the target process keeps running.
                if (proc != null)
                {
                    proc.Detach().WaitOne();
                }
            }
            foreach (var info in threadInfos)
            {
                sb.Append(info.Value.ToString());
                sb.Append("<hr />");
            }
            sb.Append("</body></html>");

            Console.WriteLine("4. Generating report...");
            using (var sw = new StreamWriter(process.ProcessName + ".htm", false,
                                             Encoding.Default))
            {
                sw.Write(sb.ToString());
            }

            Process.Start(process.ProcessName + ".htm");
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex);
        }
    }
}

Unit Test
Find the process ID of HightCpuDemo, for example 8724, then run printstack.exe 8724. The output is as follows:
E:/study/threadstack/printstack/bin/debug> printstack.exe 8724
1. Collecting counters...
2. Collecting thread information...
3. Attaching process 8724 to get the call stack...
4. Generating report...

At the end, a report named HightCpuDemo.htm is generated in the current directory. It shows at a glance which thread consumes a lot of CPU and what its managed call stack is, so the problem can be located quickly.

ThreadId            10280               % Processor Time    97.1
UserProcessorTime   00:00:20.2187500    StartTime           21:58:19
System.Random.Sample (source line information unavailable)
System.Random.NextDouble (source line information unavailable)
HightCpuDemo.Program.B (Program.cs:27)
System.Threading.ThreadHelper.ThreadStart_Context (source line information unavailable)
System.Threading.ExecutionContext.Run (source line information unavailable)
System.Threading.ThreadHelper.ThreadStart (source line information unavailable)

Reference

How the .NET debugger works
http://www.developerfusion.com/article/4692/how-the-net-debugger-works/
Working on managed wrappers for native debugging API
http://blogs.msdn.com/jmstall/archive/2006/07/05/managed-wrappers-for-native-debug-api.aspx
Runtime call stack analysis with .NET
http://www.ddj.com/184405715
MDbg linkfest
http://blogs.msdn.com/jmstall/archive/2005/11/08/mdbg_linkfest.aspx
Tool to get snapshot of managed callstacks
http://blogs.msdn.com/jmstall/archive/2005/11/28/snapshot.aspx
