Java Regular Expression Learning

Source: Internet
Author: User

Matching modes

The JDK provides three matching modes for quantifiers: greedy, reluctant, and possessive. Greedy is the default. A quantifier is made reluctant by appending ? to it (for example .*?), and possessive by appending + to it (for example .*+).

What do the three modes mean?

Greedy mode: match as much as possible while still allowing the overall expression to match.
Reluctant mode: match as little as possible while still allowing the overall expression to match.
Possessive mode: match as much as possible and never backtrack, even if giving characters back would let the rest of the expression match.

For example, there is a string like this:

/m/t/wd/nl/n/p/m/wd/nl/n/p/m/wd/nl/n/p/m/v/n

Greedy expression:

/m/t.*/nl/n/p/m  The match is /m/t/wd/nl/n/p/m/wd/nl/n/p/m/wd/nl/n/p/m

Reluctant expression:

/m/t/.*?/nl/n/p/m  The match is /m/t/wd/nl/n/p/m
/m/t/wdx+?/nl/n/p/m  This one does not match. The + requires at least one x, and that is still true in reluctant mode, but the string contains no x, so the match fails.

Possessive expression:

/m/t.*+/nl/n/p/m  No match. The .*+ consumes the rest of the string and never gives characters back, so nothing is left for the trailing /nl/n/p/m to match.
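The three results above can be checked with a short program. This is only a sketch: the class name QuantifierModes and the helper firstMatch are made up for illustration, and the helper uses Matcher.find(), so it reports the first substring the expression matches rather than requiring the whole string to match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierModes {
    // Hypothetical helper: returns the first substring matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "/m/t/wd/nl/n/p/m/wd/nl/n/p/m/wd/nl/n/p/m/v/n";

        // Greedy: .* runs to the end, then backtracks to the LAST /nl/n/p/m.
        System.out.println(firstMatch("/m/t.*/nl/n/p/m", s));

        // Reluctant: .*? grows one character at a time and stops at the FIRST /nl/n/p/m.
        System.out.println(firstMatch("/m/t/.*?/nl/n/p/m", s));

        // Reluctant + still needs at least one x; the input has none, so this prints null.
        System.out.println(firstMatch("/m/t/wdx+?/nl/n/p/m", s));

        // Possessive: .*+ keeps everything it consumed, so this prints null.
        System.out.println(firstMatch("/m/t.*+/nl/n/p/m", s));
    }
}
```

Using find() rather than matches() matters here: matches() would require the expression to cover the entire input, including the trailing /v/n, and none of the four expressions would succeed.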

Note: the reluctant and possessive suffixes only make a difference on a quantifier that can match a variable number of times. For example, X?? matches the character X as few times as possible (zero if the rest of the expression allows it), while X? is the default greedy form and matches X if it can. By contrast, X{n} must match exactly n occurrences of X, so adding a reluctant or possessive suffix to it changes nothing.
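A small sketch of the note above. The class name, the helper firstMatch, and the sample inputs "axx" and "xxxx" are illustrative assumptions, not part of the original text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalAndExactCounts {
    // Hypothetical helper: returns the first substring matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy x? takes the optional x when it is available.
        System.out.println(firstMatch("ax?", "axx"));    // prints "ax"

        // Reluctant x?? takes nothing, because the overall match already succeeds without it.
        System.out.println(firstMatch("ax??", "axx"));   // prints "a"

        // An exact count leaves no room to choose, so all three forms match the same text.
        System.out.println(firstMatch("x{2}", "xxxx"));  // prints "xx"
        System.out.println(firstMatch("x{2}?", "xxxx")); // prints "xx"
        System.out.println(firstMatch("x{2}+", "xxxx")); // prints "xx"
    }
}
```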

Lookaround (lookahead and lookbehind)

Lookaround is a more advanced topic, but it comes naturally once you start using it.
Lookaround is useful when you need to know whether the text before or after the matched part does, or does not, match a particular expression, without capturing (consuming) that text during the match.
If you do not use lookaround and put the expression directly into the pattern, the text it matches is inevitably consumed.

For example, suppose I want to split the string ILoveYou into words, with the rule that each capital letter starts a new word.

If you use this matching rule:

\p{Upper}\p{Lower}*[\p{Upper}]?

then the uppercase letter that starts the next word is consumed as part of the current match. The matches, concatenated, would be:

ILYou

This does not meet the requirements.

The solution is to use lookaround. The regular expression is:

\p{Upper}?\p{Lower}*(?=[\p{Upper}]?)

The output is:

ILoveYou
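The sketch below lists the individual matches of both expressions on ILoveYou. The class name CamelCaseWords and the helper allMatches are made up for illustration. Two caveats, stated as my own reading rather than the author's: the second expression can also match the empty string at the end of the input, so the helper drops empty matches; and because the body of its lookahead is optional, that lookahead always succeeds, so the third call shows a stricter variant (an assumption of mine, not from the original) that requires an uppercase letter or the end of input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CamelCaseWords {
    // Hypothetical helper: collects every non-empty match of regex in input, in order.
    static List<String> allMatches(String regex, String input) {
        List<String> words = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            if (!m.group().isEmpty()) {
                words.add(m.group());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String s = "ILoveYou";

        // Without lookahead: the L of "Love" is consumed by the first match.
        System.out.println(allMatches("\\p{Upper}\\p{Lower}*[\\p{Upper}]?", s)); // [IL, You]

        // With lookahead: the next capital is inspected but not consumed.
        System.out.println(allMatches("\\p{Upper}?\\p{Lower}*(?=[\\p{Upper}]?)", s)); // [I, Love, You]

        // Stricter variant (assumption): a word must be followed by a capital or the end.
        System.out.println(allMatches("\\p{Upper}\\p{Lower}*(?=\\p{Upper}|$)", s)); // [I, Love, You]
    }
}
```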

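There are four types of lookaround:

(?=X)  Zero-width positive lookahead. What follows must match the regular expression X. X is not consumed when the preceding part is matched, and it is not captured.
(?<=X)  Zero-width positive lookbehind. What precedes must match the regular expression X. X is not consumed when the following part is matched, and it is not captured.
(?!X)  Zero-width negative lookahead. What follows must not match the regular expression X. X is not consumed when the preceding part is matched, and it is not captured.
(?<!X)  Zero-width negative lookbehind. What precedes must not match the regular expression X. X is not consumed when the following part is matched, and it is not captured.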
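Here is one small example per lookaround type. The class name, the helper firstMatch, and the sample inputs are all illustrative assumptions. Note the extra \d inside the negative lookahead: without it, the engine could backtrack to a shorter run of digits and match the 1 of 10px.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundKinds {
    // Hypothetical helper: returns the first substring matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // (?=X): digits followed by "px"; "px" itself is not part of the match.
        System.out.println(firstMatch("\\d+(?=px)", "10px 20em"));           // prints "10"

        // (?<=X): digits preceded by a dollar sign; the sign is not part of the match.
        System.out.println(firstMatch("(?<=\\$)\\d+", "cost $42 and 17"));    // prints "42"

        // (?!X): a complete run of digits that is not followed by "px".
        System.out.println(firstMatch("\\d+(?!\\d|px)", "10px 20em"));        // prints "20"

        // (?<!X): a number starting at a word boundary and not preceded by a dollar sign.
        System.out.println(firstMatch("(?<!\\$)\\b\\d+", "cost $42 and 17")); // prints "17"
    }
}
```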

Non-capturing possessive group
(?>X)  I have not yet worked out how this one behaves.
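For what it is worth, the JDK documentation for java.util.regex.Pattern lists (?>X) as "X, as an independent, non-capturing group". The sketch below is my own illustration of what that means in practice, not something taken from the original text: once the group has matched, the engine does not backtrack into it, much like a possessive quantifier. The class name and the helper found are hypothetical.

```java
import java.util.regex.Pattern;

class AtomicGroupDemo {
    // Hypothetical helper: true if regex matches anywhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Ordinary greedy a*: takes "aaa", then gives one "a" back so that "ab" can match.
        System.out.println(found("a*ab", "aaab"));      // prints true

        // Independent group: (?>a*) keeps every "a" it took, so the following "a" never matches.
        System.out.println(found("(?>a*)ab", "aaab"));  // prints false

        // When no backtracking is needed, the independent group matches as usual.
        System.out.println(found("(?>a*)b", "aaab"));   // prints true
    }
}
```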
