JavaScript: a regular expression that matches the contents of the innermost brackets

Source: Internet
Author: User
Now there is a string:

str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))'

Or

str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))'

I need a regular expression that matches the innermost parentheses in the string together with their contents (parentheses inside quotation marks must not count), i.e.:

str1 => (status_id = "C" OR level_id = "D")
str2 => (level_id = "D" AND subject_id = "(Cat)")

So how should this very complex regular expression be written?

If a regular expression cannot do it, how can it be done in JS?

In addition, for str1 I found that this regular expression produces the match:

\([^()]+\)

But for str2 I still have no solution, and I look forward to your answers!

Reply content:


For str2, I found that something like this works:

\([^()]*\"[^"]*\"[^()]*\)

Looking at the requirements, I would not even consider a regular expression; it seems too complicated. Go straight to the traditional approach:
You can borrow the idea used for evaluating operator precedence in arithmetic expressions, that is, use a stack to get the contents of the innermost brackets.
Key points:

    1. Match the innermost pair of parentheses

    2. Content inside quotation marks is not considered when matching

Design the algorithm along these lines:
The algorithm computes the startIndex and endIndex of the substring to match, then uses the substring() method to extract it.

    • When a "(" is encountered, push its index onto the stack; when the first ")" is encountered, pop the stack. The substring between the two indexes is the target string.

    • When a "\"" is encountered, stop matching parentheses until the closing "\"" is reached, then resume searching for "(".

I worked this algorithm out off the top of my head; corrections and additions are welcome.
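The stack idea described above can be sketched in JavaScript roughly as follows. This is only a minimal sketch: the function name findInnermostGroup is made up for illustration, and it assumes that quotation marks are always balanced and never escaped.

```javascript
// Hypothetical helper (not from the original thread): returns the first
// innermost "(...)" group, ignoring parentheses inside double quotes.
function findInnermostGroup(str) {
  const stack = [];      // indexes of "(" characters that are still open
  let inQuotes = false;  // true while scanning between a pair of double quotes

  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === '"') {
      // Assumption: quotes are balanced and never escaped.
      inQuotes = !inQuotes;
    } else if (!inQuotes && ch === '(') {
      stack.push(i);
    } else if (!inQuotes && ch === ')' && stack.length > 0) {
      // The first ")" outside quotes closes the most recent "(",
      // so the text between them contains no nested parentheses.
      const startIndex = stack.pop();
      const endIndex = i + 1;
      return str.substring(startIndex, endIndex);
    }
  }
  return null; // no complete pair of parentheses found
}

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(findInnermostGroup(str1));
console.log(findInnermostGroup(str2));
```

Only the most recently opened "(" is ever used here, so a single variable would also do; the stack is kept to follow the description in the reply.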

So, try
/\(([^\(\)]*?" [^\"\(\)]*([^\"\(\)]+\)[^\(\)]*?\"[^\(\)]*)+)| ([^\(\)]+\)/

Add:

Analyze the requirements > find a solution for each requirement > combine the solutions = problem solved

Requirements analysis:

    1. We need to match the form ( a )

    2. There are two possibilities for the characters a contains, denoted a1 and a2

      1. a1 contains one or more strings of the form b " c " b

        1. b is a string that does not contain " , ( or )

        2. c is a string that does not contain "

      2. a2 does not contain ( or )

Working backwards:

2.2 a2 = [^\(\)]*
2.1.1 b = [^\(\)\"]*
2.1.2 c = [^\"]*
2.1 a1 = (b\"c\"b)+ = (b\"c\")+b = ([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*
1 \(a\) = \(a1\)|\(a2\) = \(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)

Regular Expressions:

/\(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)/

Verify:

var reg = /\(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)/;

'(the (quick "brown" fox "jumps over, (the) lazy" dog ))'
    .match(reg)[0];
// "(quick "brown" fox "jumps over, (the) lazy" dog )"

'(the ("(quick)" brown fox "jumps (over, the)" lazy) dog )'
    .match(reg)[0];
// "("(quick)" brown fox "jumps (over, the)" lazy)"

'(the (quick brown fox (jumps "over", ((the) "lazy"))) dog )'
    .match(reg)[0];
// "(the)"
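As a further check, the same regular expression can be applied to the two strings from the question. The snippet below is only a sketch: the pattern is written without the optional backslashes inside the character classes (which does not change its meaning), and the names innermost1 and innermost2 are made up for illustration.

```javascript
// Same pattern as above; "(", ")" and '"' need no escaping inside [...].
const reg = /\(([^()"]*"[^"]*")+[^()"]*\)|\([^()]*\)/;

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

// match() without the g flag returns the leftmost match at index 0.
const innermost1 = str1.match(reg)[0];
const innermost2 = str2.match(reg)[0];

console.log(innermost1); // (status_id = "C" OR level_id = "D")
console.log(innermost2); // (level_id = "D" AND subject_id = "(Cat)")
```

The outer parentheses fail to match because neither alternative allows an unquoted "(" between the opening and closing parenthesis.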

So change the approach:

substr = str.match(/\([^()]+\)/g)[0]

This gets the innermost bracket and the value in it. Then check whether the character before it is " and whether the character after it is ":

index = str.indexOf(str.match(/\([^()]+\)/g)[0])
length = str.match(/\([^()]+\)/g)[0].length
str.substr(index + length, 1)
str.substr(index - 1, 1)

If the quotation marks are not there, this is the answer you need. If they are, first replace substr in str with a placeholder, then match again, and finally replace the placeholder back:

str.replace(substr, "&&&")
str.replace(substr, "&&&").match(/\([^()]+\)/g)[0]
str.replace(substr, "&&&").match(/\([^()]+\)/g)[0].replace("&&&", substr)
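The placeholder steps above could be wrapped into a loop along these lines. This is a hedged sketch, not the poster's code: the function name innermostByPlaceholder and the numbered placeholder format &&&0&&& are assumptions made here so that several quoted groups can be handled, and like the steps above it only recognises quoted text of the exact form "(...)".

```javascript
// Hypothetical helper: repeatedly finds the first "(...)" without nested
// parentheses; if it sits directly between double quotes it is swapped for
// a numbered placeholder and the search is repeated.
function innermostByPlaceholder(str) {
  const saved = [];  // quoted groups that were replaced by placeholders
  let work = str;

  for (;;) {
    const m = /\([^()]+\)/.exec(work);
    if (m === null) return null;

    const before = work.charAt(m.index - 1);           // '' at the string start
    const after = work.charAt(m.index + m[0].length);  // '' at the string end

    if (before !== '"' || after !== '"') {
      // Not wrapped in quotes: this is the answer; put saved groups back.
      return m[0].replace(/&&&(\d+)&&&/g, (_, n) => saved[Number(n)]);
    }

    // Wrapped in quotes: hide it behind a placeholder and search again.
    // Assumption: the input itself never contains text like &&&0&&&.
    saved.push(m[0]);
    work = work.slice(0, m.index) + '&&&' + (saved.length - 1) + '&&&' +
           work.slice(m.index + m[0].length);
  }
}

const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';
console.log(innermostByPlaceholder(str2));
```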

The difficulty of the problem is that the quotation marks have to be tracked recursively. For example, in

(level_id = "D AND subject_id = "(Cat)"")

(Cat) meets the requirements.

\([^()]*?\"((?:[^\"\"]|\"(?1)\")*+)\"[^()]*?\)|\([^()]*?\)

Cherish your life and stay away from regular expressions. A regular expression can meet your requirements, but it needs recursion: PHP can do it (PHP supports recursion), while Java and Python cannot.

I recommend another approach: find the index of ( and process the string by slicing it.

I am on my phone, where backslashes are hard to type.
If [^()] in the original poster's regular expression means "do not match ( or )", then keep building on it:
remove the condition that fails to match, and change the greedy + to *?

console.log('(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))'.match(/\([^()]*\)/))

Hope this helps.

Matching this directly with a regular expression gets rather complex. I recommend first replacing the interfering "(" and ")" inside the quoted strings with other characters such as "[" and "]", then matching with a simple regular expression, and finally changing them back.
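In JavaScript, the masking idea might look like the sketch below. The function name innermostByMasking is made up for illustration, and quotation marks are assumed to be balanced and unescaped. Because each parenthesis is replaced by exactly one character, the match position is the same in both strings, so this sketch slices the original string instead of literally converting "[" and "]" back.

```javascript
// Hypothetical helper: masks parentheses inside double-quoted text, runs the
// simple regular expression, then reads the result from the original string.
function innermostByMasking(str) {
  // Replace "(" and ")" inside each quoted string with "[" and "]".
  // Assumption: double quotes are balanced and never escaped.
  const masked = str.replace(/"[^"]*"/g, (quoted) =>
    quoted.replace(/\(/g, '[').replace(/\)/g, ']'));

  const m = masked.match(/\([^()]+\)/);
  if (m === null) return null;

  // Masking preserves length, so the same index range applies to str.
  return str.slice(m.index, m.index + m[0].length);
}

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(innermostByMasking(str1));
console.log(innermostByMasking(str2));
```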

A regular-expression implementation in Python is as follows:

import re

str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))'
str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))'

pat = re.compile(r"""(?<=[^"])
        \([^()]+?
        ("\(.+?\)")*
        \)
        (?=[^"])
        """, re.X)

print(pat.search(str1).group(0))
print(pat.search(str2).group(0))

The output is:

(status_id = "C" OR level_id = "D")
(level_id = "D" AND subject_id = "(Cat)")