High CPU usage is a common fault on web servers, and many windbg tutorials use it as their example. I ran into it myself on a company server in March. The debugging process was smooth and by the book, but the root cause turned out to be interesting: it was related to Trojan activity.
The w3wp process on the web server would often spike to high CPU without warning, as the monitoring graph for weeks 15-16 shows. The improvement after the fix is obvious.
Analysis process:
(1) windbg was already installed on the server, so at a CPU peak I ran adplus.vbs -hang -pn w3wp.exe -o d:\ops to generate dump files.
If windbg is not available, you can generate the dumps with the ntsd that ships with the system. For example:
ntsd -pv -pn w3wp.exe -logo d:\out.txt -lines -c ".dump /ma d:\testlocal.dmp;q"
For using ntsd on a 64-bit machine, see the article on the 32-bit w3wp process on 64-bit machines.
Because we are chasing a high CPU problem, the idea is to look at how much CPU time each thread has consumed since the process started. A single dump only shows cumulative totals, so you need to take several dumps during the high-CPU period and see which thread's CPU time grows the most between them. I ended up with two dump files, 1227.dmp and 1236.dmp.
(2) Open 1227.dmp in windbg and run the !runaway command, which shows the total CPU time consumed by each thread.
0:000> !runaway
 User Mode Time
  Thread       Time
  18:fdc       0 days 1:20:28.390
  19:1370      0 days 1:16:36.359
               0 days 1:08:28.765
  22:698       0 days 1:07:55.968
  20:1180      0 days 0:58:22.046
 138:1284      0 days 0:56:53.890
 136:f9c       0 days 0:49:38.609
   9:1094      0 days 0:44:26.312
 147:db8       0 days 0:25:16.234
 149:6f4       0 days 0:22:00.687
 148:c8c       0 days 0:20:29.156
               0 days 0:01:31.562
  12:d24       0 days 0:01:27.593
  14:5e8       0 days 0:01:26.203
  11:ce0       0 days 0:01:06.703
(3) Check the other dump file: open 1236.dmp in windbg.
0:000> !runaway
 User Mode Time
  Thread       Time
  18:fdc       0 days 1:21:09.125
  19:1370      0 days 1:20:20.468
               0 days 1:08:43.140
  22:698       0 days 1:08:28.812
  20:1180      0 days 1:03:01.078
 138:1284      0 days 0:57:49.281
 136:f9c       0 days 0:55:01.250
   9:1094      0 days 0:44:50.781
 146:db8       0 days 0:27:10.062
 147:c8c       0 days 0:25:17.828
 148:6f4       0 days 0:25:03.656
               0 days 0:01:32.328
(4) Subtracting the results of (2) from those of (3) shows that thread 136 (OS thread ID f9c) grew the most during this period, by about 5 minutes 23 seconds of CPU time. In other words, the CPU spent more of this interval running thread 136 than any other thread, so it is the most likely cause of the high CPU.
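The subtraction can also be scripted. The following is a minimal Python sketch, not part of the original workflow: the helper names parse_runaway and cpu_growth are made up for illustration, and the sample rows are a subset of the two !runaway listings. It matches threads by OS thread ID (the hex value after the colon) rather than by debugger index, because the index of a thread can differ between dumps (c8c is 148 in the first dump and 147 in the second).

```python
import re
from datetime import timedelta

# One !runaway row: "<index>:<os id hex>  <d> days <h>:<mm>:<ss>.<ms>"
ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d+):(\d+)\.(\d+)\s*$"
)

def parse_runaway(text):
    """Map OS thread id (lowercase hex string) to user-mode CPU time."""
    times = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip headers, blank lines and damaged rows
        _, os_id, d, h, mi, s, ms = m.groups()
        times[os_id.lower()] = timedelta(
            days=int(d), hours=int(h), minutes=int(mi),
            seconds=int(s), milliseconds=int(ms),
        )
    return times

def cpu_growth(before, after):
    """Return (os_id, delta) pairs sorted by CPU time growth, largest first."""
    a, b = parse_runaway(before), parse_runaway(after)
    deltas = {tid: b[tid] - a[tid] for tid in a.keys() & b.keys()}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

# Subset of the rows from 1227.dmp and 1236.dmp shown above.
DUMP_1227 = """
  18:fdc       0 days 1:20:28.390
  19:1370      0 days 1:16:36.359
  20:1180      0 days 0:58:22.046
 136:f9c       0 days 0:49:38.609
 148:c8c       0 days 0:20:29.156
"""
DUMP_1236 = """
  18:fdc       0 days 1:21:09.125
  19:1370      0 days 1:20:20.468
  20:1180      0 days 1:03:01.078
 136:f9c       0 days 0:55:01.250
 147:c8c       0 days 0:25:17.828
"""

growth = cpu_growth(DUMP_1227, DUMP_1236)
for os_id, delta in growth:
    print(os_id, delta)
```

With these rows, f9c comes out on top, which agrees with the manual subtraction. Several other threads (c8c, 1180, 1370) also grew by minutes, so they may be worth inspecting in the same way.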
To view the managed stack of thread 136, first load the SOS extension with .load sos.dll, then switch to the thread with ~136s, and run !clrstack to display the stack.
OS Thread Id: 0xf9c (136)
ESP       EIP
0c23e810 7a4c7af0 System.Text.RegularExpressions.RegexInterpreter.SetOperator(Int32)
0c23e814 7a4c7c8f System.Text.RegularExpressions.RegexInterpreter.Backtrack()
0c23e820 7a4c7adb System.Text.RegularExpressions.RegexInterpreter.Go()
0c23e91c 7a4b1615 System.Text.RegularExpressions.RegexRunner.Scan(System.Text.RegularExpressions.Regex, System.String, Int32, Int32, Int32, Int32, Boolean)
0c23e948 7a4b14f3 System.Text.RegularExpressions.Regex.Run(Boolean, Int32, System.String, Int32, Int32, Int32)
0c23e978 7a4d17d7 System.Text.RegularExpressions.Regex.IsMatch(System.String)
0c23e984 0363a858 com.*****.***(System.String, System.String)
***** Some content is omitted here because it shows the company name :) *****
The thread is running a regular expression match. Why does it take so long? My first guess was that the string being processed is too long. To verify this, let's look at the string the regular expression is working on.
(5) Run !clrstack -p
Compared with step (4), the output now also shows the memory addresses of the parameters. To keep the article short, the output is omitted here; the address of the parameter is 0x1c75df0c. The !do command displays the contents of a managed object.
0:136> !do 0x1c75df0c
Name: System.String
MethodTable: 790fd8c4
EEClass: 790fd824
Size: 12900(0x3264) bytes
 (C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: /pages.aspx?***id=***&***_id=4210281%3B%44%65%4C%43%61%52%45%20%40%53%20%4E%76%41%72%43%48%61%34%30%30%30%29%3B%53%65%54%20%40%53%3D%43%61%53%74%28%30%78%34%34%30%30%36%35%30%30%36%33%30%30%36%43%30%30%36%31%30%30%37%32%30%30%36%35%30%30%32%30%30%30%34%30%30%30%35%34%30%30%32%30%30%30%35%36%30%30%36%31%30%30%37%32%30%30%36%33%30%30%36%38%30%30%36%31%30%30%37%32%30%30%32%38%30%30%33%32%30%30%33%35%30%30%33%35%30%30%32%39%30%30%32%43%30%30%34%30%30%30%34%33%30%30%32%30%30%30%35%36%30%30%36%31%30%30%37%32%30%30%36%33%30%30%36%38%30%30%36%31%30%30%37%32%30%30%32%38%30%30%33%32%30%30...... (remainder omitted)
(6) This confirmed the cause: the regular expression was processing an extremely long string. The developers responsible confirmed that the code parses the URL in order to analyze user behavior, but it did not allow for overly long URLs.
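One way to handle this boundary condition is to reject overly long input before it ever reaches the regular expression. The sketch below is in Python rather than the site's .NET code, and everything in it is an assumption for illustration: the real pattern is not shown in this article, so PATTERN is a stand-in, and the 2048-character limit is an arbitrary choice.

```python
import re

# Assumed limit; pick a value that fits the URLs the site really serves.
MAX_URL_LENGTH = 2048

# Hypothetical pattern; the regular expression used on the site is not shown.
PATTERN = re.compile(r"[?&]id=(\d+)")

def url_matches(url, pattern=PATTERN, max_length=MAX_URL_LENGTH):
    """Return True if the URL matches, refusing to scan oversized input."""
    if len(url) > max_length:
        return False  # do not spend regex time on abnormal requests
    return pattern.search(url) is not None

print(url_matches("/pages.aspx?id=42"))
print(url_matches("/pages.aspx?id=42" + "%30" * 4000))
```

The string in the dump is 12900 bytes, roughly 6400 characters, so a limit of this size would have turned it away before any matching started. Newer .NET versions also let a match timeout be passed to the Regex constructor, which guards against slow matches on input that is within the length limit.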
Let's look again at this very long parameter. It is URL-encoded (percent-encoded). Google's query parameter uses the same encoding, so we can let Google decode it for us. Searching Google for ff gives the URL
http://www.google.cn/search?hl=zh-CN&newwindow=1&q=ff&meta=&aq=f&oq=
Replace ff with the string above. The Google results page is as follows:
The decoded content is DeCLaRE @S NvArCHaR(4000);SeT @S=CaSt(0x4400650063006C0061007200650020004000540....
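The decoding can also be done locally instead of through a search engine. The following Python sketch takes a short fragment of the captured parameter (starting at the %3B before SeT), percent-decodes it, and then decodes the hex literal inside CaSt(0x...) as UTF-16LE text. The function name decode_payload is made up for this example, and reading the hex literal as UTF-16LE is an assumption based on the 00 byte after every character.

```python
import re
from urllib.parse import unquote

# Fragment of the captured parameter, taken from the !do output above.
FRAGMENT = (
    "%3B%53%65%54%20%40%53%3D%43%61%53%74%28%30%78"
    "%34%34%30%30%36%35%30%30%36%33%30%30%36%43%30%30"
    "%36%31%30%30%37%32%30%30%36%35%30%30%32%30%30%30"
    "%34%30%30%30%35%34%30%30"
)

def decode_payload(fragment):
    """Return (percent-decoded text, text hidden in the 0x... hex literal)."""
    sql = unquote(fragment)
    inner = ""
    m = re.search(r"0x([0-9A-Fa-f]+)", sql)
    if m:
        hex_digits = m.group(1)
        # Keep only whole UTF-16 code units (4 hex digits each), since the
        # captured string may be cut off in the middle of a character.
        hex_digits = hex_digits[: len(hex_digits) // 4 * 4]
        inner = bytes.fromhex(hex_digits).decode("utf-16-le")
    return sql, inner

sql, inner = decode_payload(FRAGMENT)
print(sql)
print(inner)
```

The first layer gives the SeT @S=CaSt(0x... statement seen in the search results, and the second layer shows that the hex literal itself begins with another Declare statement.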
This is classic Trojan activity: a SQL injection payload used to plant malicious code in web pages. Google's search results also show that many websites have been infected this way. Our site was not infected, but simply processing these parameters was enough to drive the server's CPU usage up. In future, boundary conditions like this need to be taken into account.