Solving an apparent dead loop caused by Java regular expression matching

Source: Internet
Author: User


Recently the load on one of our production applications has been very high, close to the point of an outage, and it started reporting the following exception:

    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$Ctype.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$Loop.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$Ctype.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$Loop.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$Ctype.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$Loop.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$Ctype.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$Loop.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$Ctype.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)

The stack trace led us to one of our utility methods, shown here:

    public static boolean checkSpecialChars(String inputStr, String regex) {
        // Null or empty input never matches
        if (inputStr == null || "".equals(inputStr)) {
            return false;
        }
        return Pattern.compile(regex).matcher(inputStr).matches();
    }

Nothing calls this method in a loop, yet the stack trace clearly looks like a dead loop (strictly speaking, the repeating frames show the regex matcher recursing into itself over and over). That made us dig further, and after a period of testing and analysis we finally found the cause. The method accepts a regular expression as a parameter, and the problem lies in the regular expression passed in, which simplifies to:

    String regex = "([a-z]|\\d)*";

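Testing showed that the method becomes unstable and starts reproducing the exception above once the input string is longer than 817 characters. With the loop bound of 309 used in the test, the concatenated numbers 0 to 308 give a string of exactly 817 characters (10 + 180 + 627). The exact threshold depends on the JVM's thread stack size, so it will differ between environments. The test code is as follows: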

    import java.util.regex.Pattern;

    /**
     * Created on 2010-11-9
     * <p>Title: Test regular expression dead loop</p>
     * @author shixing_11@sina.com
     * @version 1.0
     */
    public class RegexTest {

        public static void main(String[] args) {
            String regex = "([a-z]|\\d)*";
            String inputStr = "";
            // A loop bound >= 400 here immediately throws the exception
            for (int i = 0; i < 309; i++) {
                // Build the input string by repeated concatenation
                inputStr = inputStr.concat(String.valueOf(i));
            }
            System.out.println("String length is: " + inputStr.length());
            boolean flag = checkSpecialChars(inputStr, regex);
            System.out.println("Match result is: " + flag);
        }

        public static boolean checkSpecialChars(String inputStr, String regex) {
            if (inputStr == null || "".equals(inputStr)) {
                return false;
            }
            // Note: matches() is where the exception is thrown
            return Pattern.compile(regex).matcher(inputStr).matches();
        }
    }

It turns out that this is a JDK bug which was still not fixed as of JDK 1.6. For the bug details, see:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5050507 and

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6988218

See also: the article at http://www.99inf.net/SoftwareDev/Java/53834.htm is a very good reference.
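One commonly used workaround for this kind of failure, not necessarily the one applied in this case, is to avoid putting an alternation inside a repeated group. When every alternative is a single character, as in the simplified pattern above, the alternation can be folded into one character class, which java.util.regex can repeat without descending one level per matched character. The following is a minimal sketch under that assumption; the class name FlattenedRegexCheck and the constant FLATTENED_REGEX are made up for illustration.

```java
import java.util.regex.Pattern;

public class FlattenedRegexCheck {

    // Accepts the same strings as "([a-z]|\\d)*", but written as a single
    // character class instead of a repeated group containing an alternation.
    static final String FLATTENED_REGEX = "[a-z\\d]*";

    public static boolean checkSpecialChars(String inputStr, String regex) {
        // Same contract as the original helper: null or empty input is rejected
        if (inputStr == null || inputStr.isEmpty()) {
            return false;
        }
        return Pattern.compile(regex).matcher(inputStr).matches();
    }

    public static void main(String[] args) {
        // Build an input far longer than the one that broke the original pattern
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20000; i++) {
            sb.append(i);
        }
        String inputStr = sb.toString();
        System.out.println("String length is: " + inputStr.length());
        System.out.println("Match result is: "
                + checkSpecialChars(inputStr, FLATTENED_REGEX));
    }
}
```

This rewrite only applies when the alternatives are single characters; a pattern with longer alternatives would need a different approach.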

Working around the bug along those lines took care of the exception: after the change was deployed and the machines were restarted, it was no longer thrown. CPU usage did not improve, however, and the alarms kept firing. Careful investigation showed that five threads doing regex processing were consuming the CPU. Out of options, we finally switched the check to a different, non-regex implementation and removed the regular expressions entirely.
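The replacement code is not shown here, so the following is only an illustrative sketch of what a non-regex check can look like for the simplified pattern (lowercase ASCII letters and digits). The class name PlainCharCheck and the method name isLowerAlphaNumeric are hypothetical.

```java
public class PlainCharCheck {

    // Returns true only if every character is a lowercase ASCII letter or an
    // ASCII digit. Like the original helper, null and empty input yield false.
    public static boolean isLowerAlphaNumeric(String inputStr) {
        if (inputStr == null || inputStr.isEmpty()) {
            return false;
        }
        for (int i = 0; i < inputStr.length(); i++) {
            char c = inputStr.charAt(i);
            boolean lower = c >= 'a' && c <= 'z';
            boolean digit = c >= '0' && c <= '9';
            if (!lower && !digit) {
                return false; // stop at the first disallowed character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLowerAlphaNumeric("abc123"));  // true
        System.out.println(isLowerAlphaNumeric("abc-123")); // false
    }
}
```

A loop like this does a single pass over the input with no recursion and no backtracking, so its cost grows linearly with the input length.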

Summary: this production incident showed that regular expressions are a double-edged sword. For large-scale data validation it is best not to use them: they are very inefficient, and the CPU ends up spending its capacity on evaluating a handful of regular expressions. Note also that the problem only appeared after the project had been live for some time. That suggests regex performance drops off quickly once the data reaches a certain order of magnitude. In my case the initial data volume was small and nothing went wrong, but when traffic suddenly increased, the CPU load climbed very high within a short time. So regular expressions should not be used lightly in applications with large data volumes or highly concurrent access.
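Where a regular expression has to stay, one cost that can at least be avoided is recompiling it on every call, which is what the utility method above does with Pattern.compile. The sketch below caches compiled patterns by their source string. It assumes Java 8 or later and a small, fixed set of regexes, since the cache is unbounded; the class name CachedPatternCheck is hypothetical. It does not address the recursion problem itself.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.regex.Pattern;

public class CachedPatternCheck {

    // Compiled patterns keyed by regex source. Pattern objects are immutable
    // and safe to share between threads; Matcher objects are not, so a fresh
    // Matcher is created on every call.
    private static final ConcurrentMap<String, Pattern> CACHE =
            new ConcurrentHashMap<>();

    public static boolean checkSpecialChars(String inputStr, String regex) {
        if (inputStr == null || inputStr.isEmpty()) {
            return false;
        }
        // Compile each distinct regex once and reuse it afterwards
        Pattern pattern = CACHE.computeIfAbsent(regex, Pattern::compile);
        return pattern.matcher(inputStr).matches();
    }

    public static void main(String[] args) {
        System.out.println(checkSpecialChars("abc123", "[a-z\\d]*")); // true
        System.out.println(checkSpecialChars("ABC", "[a-z\\d]*"));    // false
    }
}
```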
