High CPU usage is a common fault on web servers, and many windbg tutorials use it as their example. I ran into it on a company server in March. The debugging itself was smooth and by the book, but the root cause turned out to be interesting: it was related to Trojan activity.
The w3wp process on the web server often showed sudden CPU spikes, as in weeks 15-16 of the monitoring chart. The improvement after the fix is quite obvious.
Analysis process:
(1) windbg was already installed on the server, so at a CPU peak I ran
adplus.vbs -hang -pn w3wp.exe -o d:\ops\
to generate the dump files.
If windbg is not available on the server, you can generate the dump with the ntsd that ships with the system. For example:
ntsd -pv -pn w3wp.exe -logo D:\out.txt -lines -c ".dump /ma D:\testlocal.dmp;q"
For using ntsd on 64-bit machines, see the article on the 32-bit w3wp process on 64-bit machines.
!runaway reports the CPU time each thread has accumulated since the process started, not its current load. To find the culprit you therefore need several dumps taken during the high-CPU period, so you can see which thread's CPU time grew the most between them. I ended up with two dump files, 1227.dmp and 1236.dmp.
(2) Open 1227.dmp in windbg and run the !runaway command, which shows the total CPU time used by each thread.
0:000> !runaway
 User Mode Time
  Thread       Time
  18:fdc       0 days 1:20:28.390
  19:1370      0 days 1:16:36.359
               0 days 1:08:28.765
  22:698       0 days 1:07:55.968
  20:1180      0 days 0:58:22.046
 138:1284      0 days 0:56:53.890
 136:f9c       0 days 0:49:38.609
   9:1094      0 days 0:44:26.312
 147:db8       0 days 0:25:16.234
 149:6f4       0 days 0:22:00.687
 148:c8c       0 days 0:20:29.156
               0 days 0:01:31.562
  12:d24       0 days 0:01:27.593
  14:5e8       0 days 0:01:26.203
  11:ce0       0 days 0:01:06.703
(3) Open the other dump file, 1236.dmp, in windbg and run the same command.
0:000> !runaway
 User Mode Time
  Thread       Time
  18:fdc       0 days 1:21:09.125
  19:1370      0 days 1:20:20.468
               0 days 1:08:43.140
  22:698       0 days 1:08:28.812
  20:1180      0 days 1:03:01.078
 138:1284      0 days 0:57:49.281
 136:f9c       0 days 0:55:01.250
   9:1094      0 days 0:44:50.781
 146:db8       0 days 0:27:10.062
 147:c8c       0 days 0:25:17.828
 148:6f4       0 days 0:25:03.656
               0 days 0:01:32.328
(4) Subtract the times in (2) from those in (3). Thread 136 grew the most over this period (from 0:49:38.609 to 0:55:01.250, about 5 minutes 23 seconds of CPU time), so the CPU was spending more time on thread 136 than on any other thread. That makes it the prime suspect for the high CPU.
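The subtraction in step (4) can also be scripted. The following is a minimal Python sketch, not part of the original analysis: the function names parse_runaway and cpu_growth are made up for illustration, and the input is the !runaway text from the two dumps with the rows whose thread id is missing in the article left out. It keys threads by OS thread id rather than by windbg thread number, because the numbers shift between the two dumps (db8 is thread 147 in the first dump and 146 in the second).

```python
import re
from datetime import timedelta

# Matches one !runaway row such as " 136:f9c       0 days 0:49:38.609".
ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d{2}):(\d{2})\.(\d{3})\s*$"
)

def parse_runaway(text):
    """Return {os_thread_id: user-mode CPU time} for every well-formed row."""
    times = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # headers, blank lines, rows without a thread id
        _, os_id, d, h, mi, s, ms = m.groups()
        times[os_id.lower()] = timedelta(
            days=int(d), hours=int(h), minutes=int(mi),
            seconds=int(s), milliseconds=int(ms),
        )
    return times

def cpu_growth(before, after):
    """Threads present in both dumps, sorted by CPU time gained (largest first)."""
    common = before.keys() & after.keys()
    deltas = {tid: after[tid] - before[tid] for tid in common}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

DUMP_1227 = """
  18:fdc       0 days 1:20:28.390
  19:1370      0 days 1:16:36.359
  22:698       0 days 1:07:55.968
  20:1180      0 days 0:58:22.046
 138:1284      0 days 0:56:53.890
 136:f9c       0 days 0:49:38.609
   9:1094      0 days 0:44:26.312
 147:db8       0 days 0:25:16.234
 149:6f4       0 days 0:22:00.687
 148:c8c       0 days 0:20:29.156
"""

DUMP_1236 = """
  18:fdc       0 days 1:21:09.125
  19:1370      0 days 1:20:20.468
  22:698       0 days 1:08:28.812
  20:1180      0 days 1:03:01.078
 138:1284      0 days 0:57:49.281
 136:f9c       0 days 0:55:01.250
   9:1094      0 days 0:44:50.781
 146:db8       0 days 0:27:10.062
 147:c8c       0 days 0:25:17.828
 148:6f4       0 days 0:25:03.656
"""

growth = cpu_growth(parse_runaway(DUMP_1227), parse_runaway(DUMP_1236))
for os_id, gained in growth[:3]:
    print(os_id, gained)
```

With the figures above, OS thread f9c (thread 136) comes out on top, followed by c8c and 1180.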
To view the managed stack of thread 136, load the SOS extension with .load sos.dll, switch to the thread with ~136s, and then run !clrstack to display the stack:
OS Thread Id: 0xf9c (136)
ESP       EIP
0c23e810 7a4c7af0 System.Text.RegularExpressions.RegexInterpreter.SetOperator(Int32)
0c23e814 7a4c7c8f System.Text.RegularExpressions.RegexInterpreter.Backtrack()
0c23e820 7a4c7adb System.Text.RegularExpressions.RegexInterpreter.Go()
0c23e91c 7a4b1615 System.Text.RegularExpressions.RegexRunner.Scan(System.Text.RegularExpressions.Regex, System.String, Int32, Int32, Int32, Int32, Boolean)
0c23e948 7a4b14f3 System.Text.RegularExpressions.Regex.Run(Boolean, Int32, System.String, Int32, Int32, Int32)
0c23e978 7a4d17d7 System.Text.RegularExpressions.Regex.IsMatch(System.String)
0c23e984 0363a858 com.*****.***(System.String, System.String)
***** Some frames are omitted here because they show the company name :) *****
The thread is evaluating a regular expression. Why does it take so long? My first guess was that the input string was too long. To verify this, let's look at the string the regular expression is processing.
(5) Run !clrstack -p
Compared with step (4), the output now also shows the memory addresses of the parameters. To keep the article short I omit the output; the address of the string parameter is 0x1c75df0c. The !do command displays the contents of a managed object.
0:136> !do 0x1c75df0c
Name: System.String
MethodTable: 790fd8c4
EEClass: 790fd824
Size: 12900(0x3264) bytes
 (C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: /pages.aspx?***Id=***&***_id=4210281%3B%44%65%4C%43%61%52%45%20%40%53%20%4E%76%41%72%43%48%61%34%30%30%30%29%3B%53%65%54%20%40%53%3D%43%61%53%74%28%30%78%34%34%30%30%36%35%30%30%36%33%30%30%36%43%30%30%36%31%30%30%37%32%30%30%36%35%30%30%32%30%30%30%34%30%30%30%35%34%30%30%32%30%30%30%35%36%30%30%36%31%30%30%37%32%30%30%36%33%30%30%36%38%30%30%36%31%30%30%37%32%30%30%32%38%30%30%33%32%30%30%33%35%30%30%33%35%30%30%32%39%30%30%32%43%30%30%34%30%30%30%34%33%30%30%32%30%30%30%35%36%30%30%36%31%30%30%37%32%30%30%36%33%30%30%36%38%30%30%36%31%30%30%37%32%30%30%32%38%30%30%33%32%30%30%...... (rest omitted)
(6) This confirms the cause: the regular expression was processing an extremely long string. The developers responsible confirmed that this code analyzes user behavior by processing the request URL, and that it did not allow for very long URLs.
Now look back at this very long parameter. It is URL-encoded (percent-encoded). Google's query parameters use the same encoding, so we can have Google decode it for us. Searching Google for ff gives this URL:
http://www.google.cn/search?hl=zh-CN&newwindow=1&q=ff&meta=&aq=f&oq=
Replace ff with the string above, and the Google results page shows the decoded text.
The decoded content is: Declare @s nvarchar(4000);Set @s=cast(0x4400650063006c0061007200650020004000540....
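The parameter can also be decoded locally instead of through a search engine. The sketch below is a minimal Python illustration using two fragments copied from the string in step (5). It assumes the hex literal passed to cast is UTF-16LE text, which is inferred from the byte pattern (every ASCII byte is followed by 00) and is not stated in the original. The names OUTER_FRAGMENT, HEX_FRAGMENT and decode_hex_literal are made up for this example.

```python
from urllib.parse import unquote

# Fragment just before the hex literal, copied from the dumped string.
OUTER_FRAGMENT = "%3B%53%65%54%20%40%53%3D%43%61%53%74%28%30%78"

# First 40 percent-escapes of the hex literal itself, copied from the dumped string.
HEX_FRAGMENT = (
    "%34%34%30%30%36%35%30%30%36%33%30%30%36%43%30%30%36%31%30%30"
    "%37%32%30%30%36%35%30%30%32%30%30%30%34%30%30%30%35%34%30%30"
)

def decode_hex_literal(percent_encoded_hex):
    """Undo both layers: percent-encoding first, then hex-encoded UTF-16LE."""
    hex_text = unquote(percent_encoded_hex)  # e.g. "44006500..."
    # Assumption: the bytes are UTF-16LE, as the trailing 00 of each pair suggests.
    return bytes.fromhex(hex_text).decode("utf-16-le")

print(unquote(OUTER_FRAGMENT))
print(decode_hex_literal(HEX_FRAGMENT))
```

The first layer yields the SQL wrapper around the literal, and the second layer yields the start of the statement hidden inside it. The rest of the payload is elided in the article, so only these fragments are decoded here.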
This is classic Trojan activity: SQL injected through a query parameter. Google's search results also show that many websites have been infected this way. Our site was not infected, but processing these parameters still drove the server's CPU usage up. In future we need to take boundary conditions like this into account.
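One simple way to cover this boundary condition is to check the length of the URL before handing it to the regular expression. The article does not show the real pattern or the code that calls it, so the Python sketch below is purely illustrative: MAX_URL_LENGTH, TRACKING_PATTERN and extract_id are hypothetical names, and the limit of 2048 characters is an arbitrary assumption that would need tuning against real traffic.

```python
import re

# Hypothetical limit; pick a value that fits the site's legitimate URLs.
MAX_URL_LENGTH = 2048

# Placeholder pattern standing in for the real (unpublished) expression.
TRACKING_PATTERN = re.compile(r"[?&]id=(\d+)")

def extract_id(url):
    """Return the id parameter, or None if the URL is oversized or has no id."""
    if len(url) > MAX_URL_LENGTH:
        # Skip analysis rather than feed the regex an oversized input.
        return None
    match = TRACKING_PATTERN.search(url)
    return match.group(1) if match else None

print(extract_id("/pages.aspx?id=4210281"))
print(extract_id("/pages.aspx?id=4210281" + "%30" * 5000))
```

Rejecting the request outright or truncating the string are alternatives. Which one fits depends on whether the analysis can tolerate missing a few very long URLs.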