iOS: Regular Expressions in Detail
1. Introduction:
Regular expressions are common in projects, for example when validating login accounts and passwords (mobile phone numbers, email addresses). The approach used here is to filter with a predicate object, NSPredicate.
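As a minimal sketch of the NSPredicate approach (the helper name StringMatchesPattern is hypothetical, not from the original article), the function below checks a string against a pattern. It assumes the pattern is valid. MATCHES compares the whole string, so a partial match is not enough.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when the WHOLE string matches the pattern.
// SELF MATCHES tests the entire string, so the pattern does not need
// explicit ^ and $ anchors to reject partial matches.
static BOOL StringMatchesPattern(NSString *string, NSString *pattern) {
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"SELF MATCHES %@", pattern];
    return [predicate evaluateWithObject:string];
}
```

For example, StringMatchesPattern(@"13812345678", @"\\d{11}") returns YES, while the same pattern rejects @"1381234567x".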
2. What is a regular expression:
A regular expression (often shortened to regex) is a logical formula for operating on strings. It can test whether a given string conforms to the logic we define, or extract a specific part of a string. It makes complex string handling quick to express in a very compact form.
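The second use, extracting a specific part of a string, can be sketched with NSRegularExpression. The helper name FirstMatch is hypothetical; it returns nil both when nothing matches and when the pattern cannot be compiled.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first substring of `text` that matches
// `pattern`, or nil when there is no match or the pattern is invalid.
static NSString *FirstMatch(NSString *text, NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return nil; // the pattern failed to compile; `error` holds the reason
    }
    NSRange range = [regex rangeOfFirstMatchInString:text
                                             options:0
                                               range:NSMakeRange(0, text.length)];
    if (range.location == NSNotFound) {
        return nil;
    }
    return [text substringWithRange:range];
}
```

For example, FirstMatch(@"order 2024 shipped", @"[0-9]+") returns @"2024".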
3. Syntax:
First, the special symbols '^' and '$'. They mark the start and the end of a string, respectively. For example:
"^one": matches strings that start with "one" ("one cat", "one123", ...); similar to - (BOOL)hasPrefix:(NSString *)aString;
"a dog$": matches strings that end with "a dog" ("it is a dog", ...); similar to - (BOOL)hasSuffix:(NSString *)aString;
"^apple$": matches the string that both starts and ends with "apple", that is, only "apple" itself;
"banana": matches any string that contains "banana"; similar to - (BOOL)containsString:(NSString *)aString, added in iOS 8 to search for a substring.

The symbols '*', '+' and '?' say how many times a character or a group of characters may repeat. They mean "zero or more" (any integer in [0, +∞)), "one or more" (any integer in [1, +∞)), and "zero or one" (0 or 1). For example:
"ab*": an a followed by zero or more b ("a", "ab", "abbb", ...);
"ab+": an a followed by one or more b ("ab", "abbb", ...);
"ab?": an a followed by zero or one b ("a", "ab");
"a?b+$": zero or one a followed by one or more b at the end of the string ("b", "ab", "bb", "abb", ...).

Braces ({}) give a specific repetition range. For example:
"ab{4}": an a followed by exactly four b ("abbbb");
"ab{1,}": an a followed by at least one b ("ab", "abb", "abbb", ...);
"ab{3,4}": an a followed by three to four b ("abbb", "abbbb").
So "*" can be written as {0,}, "+" as {1,}, and "?" as {0,1}. Note that the upper bound may be omitted, but the lower bound may not: "ab{,5}" is incorrect syntax.

"|" means "or":
"a|b": a string that contains "a" or "b";
"(a|bcd)ef": "aef" or "bcdef";
"(a|b)*c": a mixed run of "a" and "b" followed by a "c".

Square brackets "[]" match one character out of the set listed inside them:
"[ab]": an "a" or a "b" (equivalent to "a|b");
"[a-d]": one lowercase letter from 'a' to 'd' (equivalent to "a|b|c|d" or "[abcd]");
"^[a-zA-Z]": a string that starts with a letter;
"[0-9]a": a digit followed by an "a";
"[a-zA-Z0-9]$": a string that ends with a letter or a digit.

"." matches any single character except "\r" and "\n":
"a.[a-z]": an "a" followed by any character and then a lowercase letter;
"^.{5}$": any string of length 5.

"\num", where num is a positive integer, refers back to the text matched by group number num. For example, "(.)\1" matches two consecutive identical characters.

Some tools (POSIX basic syntax, as in grep) write the braces escaped: "10\{1,2\}" is the digit 1 followed by one or two 0 ("10", "100"), and "0\{3,\}" is at least three consecutive 0 ("000", "0000", ...). In the ICU syntax used by NSPredicate and NSRegularExpression the braces are written without the backslash ("10{1,2}", "0{3,}"); an escaped "\{" there is a literal brace.

Use '^' inside square brackets to list characters that must not appear; '^' has to be the first character in the brackets. "@[^a-zA-Z]4@": the character between the first "@" and the "4" must not be a letter.

"\d" matches a digit; for ASCII text it is equivalent to [0-9]. "\D" matches a non-digit; it is equivalent to [^0-9]. "\w" matches any word character, including the underscore; it is equivalent to "[A-Za-z0-9_]". "\W" matches any non-word character; it is equivalent to "[^A-Za-z0-9_]".

On iOS the pattern is written inside a string literal, so every backslash in the pattern needs one more "\" in front of it. For example, a string made up only of digits: @"^\\d+$"
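The rules above can be tried out with a short sketch. The helper name Found is hypothetical; it uses the NSString search API with NSRegularExpressionSearch, which looks for the pattern anywhere in the string, so the anchors ^ and $ actually matter here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when `pattern` matches somewhere in `string`.
// This is a search, not a whole-string test, so ^ and $ must be written
// explicitly when the start or end of the string is significant.
static BOOL Found(NSString *string, NSString *pattern) {
    NSRange range = [string rangeOfString:pattern
                                  options:NSRegularExpressionSearch];
    return range.location != NSNotFound;
}

// A few of the examples from the text, with the value each call returns.
static void DemoSyntax(void) {
    NSLog(@"%d", Found(@"one cat", @"^one"));         // 1: starts with "one"
    NSLog(@"%d", Found(@"it is a dog", @"a dog$"));   // 1: ends with "a dog"
    NSLog(@"%d", Found(@"apple pie", @"^apple$"));    // 0: not exactly "apple"
    NSLog(@"%d", Found(@"abbb", @"^ab{3,4}$"));       // 1: three b
    NSLog(@"%d", Found(@"abb", @"^ab{3,4}$"));        // 0: only two b
    NSLog(@"%d", Found(@"bcdef", @"^(a|bcd)ef$"));    // 1: "bcd" + "ef"
    NSLog(@"%d", Found(@"hello", @"(.)\\1"));         // 1: "ll" repeats a character
    NSLog(@"%d", Found(@"12345", @"^\\d+$"));         // 1: digits only
    NSLog(@"%d", Found(@"12a45", @"^\\d+$"));         // 0: contains a letter
}
```

Note the doubled backslashes in "(.)\\1" and "^\\d+$": the Objective-C compiler consumes one of them, and the regex engine sees \1 and \d.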
4. Common regular expressions (email address, phone number, ID card, password, and nickname):
// Email address
+ (BOOL)validateEmail:(NSString *)email {
    NSString *emailRegex = @"[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}";
    NSPredicate *emailTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", emailRegex];
    return [emailTest evaluateWithObject:email];
}

// Mobile phone number
+ (BOOL)validateMobile:(NSString *)mobile {
    // Starts with 13, 15 (but not 154), or 18, followed by eight digits.
    // The outer group makes ^ and $ apply to every alternative.
    NSString *phoneRegex = @"^((13[0-9])|(15[^4\\D])|(18[0-9]))\\d{8}$";
    NSPredicate *phoneTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", phoneRegex];
    return [phoneTest evaluateWithObject:mobile];
}

// License plate number
+ (BOOL)validateCarNo:(NSString *)carNo {
    // One Chinese character, one letter, four letters/digits/underscores,
    // then one letter, digit, underscore, or Chinese character.
    NSString *carRegex = @"^[\u4e00-\u9fa5]{1}[a-zA-Z]{1}[a-zA-Z_0-9]{4}[a-zA-Z_0-9_\u4e00-\u9fa5]$";
    NSPredicate *carTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", carRegex];
    NSLog(@"carTest is %@", carTest);
    return [carTest evaluateWithObject:carNo];
}

// Car model (Chinese characters only)
+ (BOOL)validateCarType:(NSString *)CarType {
    NSString *CarTypeRegex = @"^[\u4E00-\u9FFF]+$";
    NSPredicate *carTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", CarTypeRegex];
    return [carTest evaluateWithObject:CarType];
}

// User name (6 to 20 letters or digits)
+ (BOOL)validateUserName:(NSString *)name {
    NSString *userNameRegex = @"^[A-Za-z0-9]{6,20}$";
    NSPredicate *userNamePredicate = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", userNameRegex];
    return [userNamePredicate evaluateWithObject:name];
}

// Password (6 to 20 letters or digits)
+ (BOOL)validatePassword:(NSString *)passWord {
    NSString *passWordRegex = @"^[a-zA-Z0-9]{6,20}$";
    NSPredicate *passWordPredicate = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", passWordRegex];
    return [passWordPredicate evaluateWithObject:passWord];
}

// Nickname (4 to 8 Chinese characters)
+ (BOOL)validateNickname:(NSString *)nickname {
    NSString *nicknameRegex = @"^[\u4e00-\u9fa5]{4,8}$";
    NSPredicate *nicknamePredicate = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", nicknameRegex];
    return [nicknamePredicate evaluateWithObject:nickname];
}

// ID card number (15 or 18 characters; the last one may be x or X)
+ (BOOL)validateIdentityCard:(NSString *)identityCard {
    if (identityCard.length == 0) {
        return NO;
    }
    NSString *regex2 = @"^(\\d{14}|\\d{17})(\\d|[xX])$";
    NSPredicate *identityCardPredicate = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex2];
    return [identityCardPredicate evaluateWithObject:identityCard];
}
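Patterns that combine ^, $ and | deserve a quick test, because | has the lowest precedence: without a group, ^ belongs only to the first alternative and $ only to the last. The sketch below compares two hypothetical patterns (simplified from the phone-number idea, not taken from the article) under SELF MATCHES; the function name WholeMatch is also an assumption.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: whole-string test via NSPredicate, as in the
// validators above.
static BOOL WholeMatch(NSString *string, NSString *pattern) {
    return [[NSPredicate predicateWithFormat:@"SELF MATCHES %@", pattern]
            evaluateWithObject:string];
}

// Ungrouped: reads as (^13[0-9]) OR (18[0-9]\d{8}$).
static NSString *const kLoosePattern  = @"^13[0-9]|18[0-9]\\d{8}$";
// Grouped: a 13x or 18x prefix, then eight digits, for every alternative.
static NSString *const kStrictPattern = @"^(13[0-9]|18[0-9])\\d{8}$";

static void DemoAlternation(void) {
    NSLog(@"%d", WholeMatch(@"13012345678", kLoosePattern));  // 0
    NSLog(@"%d", WholeMatch(@"130", kLoosePattern));          // 1
    NSLog(@"%d", WholeMatch(@"13012345678", kStrictPattern)); // 1
    NSLog(@"%d", WholeMatch(@"130", kStrictPattern));         // 0
}
```

The ungrouped pattern accepts the three-character string "130" and rejects a full eleven-digit number, which is the opposite of what a validator wants; wrapping the alternatives in one group fixes both cases.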
The following table lists some regular expression metacharacters. It is a general-purpose reference: rows that mention RegExp, VBScript, or JScript describe other engines, so check the NSRegularExpression documentation before relying on them on iOS.
Metacharacter | Description
\ | Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a line feed. The sequence "\\" matches "\", and "\(" matches "(".
^ | Matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after "\n" or "\r".
$ | Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before "\n" or "\r".
* | Matches the preceding subexpression zero or more times. For example, "zo*" matches "z", "zo", and "zoo". * is equivalent to {0,}.
+ | Matches the preceding subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
? | Matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" or "does". ? is equivalent to {0,1}.
{n} | n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the two o in "food".
{n,} | n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but matches all the o in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{n,m} | Both m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o in "fooooood". "o{0,1}" is equivalent to "o?". Note that there must be no space between the comma and the two numbers.
? | When this character follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. Non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much as possible. For example, for the string "oooo", "o+?" matches a single "o", while "o+" matches all the "o".
. | Matches any single character except "\r" and "\n". To match any character including "\r" and "\n", use a pattern like "[\s\S]".
(pattern) | Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection: with the SubMatches collection in VBScript, or the $0...$9 properties in JScript. To match parenthesis characters, use "\(" or "\)".
(?:pattern) | Matches pattern but does not capture the match, that is, the match is not stored for later use. This is useful when combining parts of a pattern with the "|" character. For example, "industr(?:y|ies)" is a more compact expression than "industry|industries".
(?=pattern) | Positive lookahead: matches the search string at any position where a string matching pattern begins. This is a non-capturing match. For example, "Windows(?=95|98|NT|2000)" matches "Windows" in "Windows2000", but not "Windows" in "Windows3.1". Lookahead does not consume characters: after a match, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead.
(?!pattern) | Negative lookahead: matches the search string at any position where a string that does not match pattern begins. This is a non-capturing match. For example, "Windows(?!95|98|NT|2000)" matches "Windows" in "Windows3.1", but not "Windows" in "Windows2000".
(?<=pattern) | Positive lookbehind: like positive lookahead, but in the opposite direction. For example, "(?<=95|98|NT|2000)Windows" matches "Windows" in "2000Windows", but not "Windows" in "3.1Windows".
(?<!pattern) | Negative lookbehind: like negative lookahead, but in the opposite direction. For example, "(?<!95|98|NT|2000)Windows" matches "Windows" in "3.1Windows", but not "Windows" in "2000Windows".
x|y | Matches x or y. For example, "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food".
[xyz] | Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^xyz] | Negated character set. Matches any character that is not enclosed. For example, "[^abc]" matches "p", "l", "i", and "n" in "plain".
[a-z] | Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z". Note: a hyphen denotes a range only when it is inside a character set and sits between two characters. At the start of a set it stands for the hyphen itself.
[^a-z] | Negated character range. Matches any character that is not in the specified range. For example, "[^a-z]" matches any character that is not in the range "a" to "z".
\b | Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never", but not the "er" in "verb".
\B | Matches a non-word boundary. "er\B" matches the "er" in "verb", but not the "er" in "never".
\cx | Matches the control character specified by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in A-Z or a-z; otherwise, c is treated as a literal "c" character.
\d | Matches a digit. Equivalent to [0-9].
\D | Matches a non-digit character. Equivalent to [^0-9].
\f | Matches a form feed. Equivalent to \x0c and \cL.
\n | Matches a line feed. Equivalent to \x0a and \cJ.
\r | Matches a carriage return. Equivalent to \x0d and \cM.
\s | Matches any whitespace character, including spaces, tabs, and form feeds. Equivalent to [ \f\n\r\t\v].
\S | Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t | Matches a tab. Equivalent to \x09 and \cI.
\v | Matches a vertical tab. Equivalent to \x0b and \cK.
\w | Matches any word character, including the underscore. Equivalent to "[A-Za-z0-9_]".
\W | Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".
\xn | Matches n, where n is a hexadecimal escape value that must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions.
\num | Matches num, where num is a positive integer: a reference back to a captured match. For example, "(.)\1" matches two consecutive identical characters.
\n | Identifies an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm | Identifies an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither condition holds and n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml | If n is an octal digit (0-3) and m and l are octal digits (0-7), matches the octal escape value nml.
\un | Matches n, where n is a Unicode character written as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
\< \> | Match the start (\<) and the end (\>) of a word. For example, the regular expression \<the\> matches the "the" in the string "for the wise", but not the "the" in the string "otherwise". Note: this metacharacter is not supported by all software.
\( \) | Define the expression between \( and \) as a group and save the characters matched by it to a temporary area (a regular expression can save up to 9 groups). The groups can be referenced with \1 to \9.
| | Performs a logical OR on two matching conditions. For example, the regular expression (him|her) matches "it belongs to him" and "it belongs to her", but not "it belongs to them.". Note: this metacharacter is not supported by all software.
+ | Matches one or more of the character immediately before it. For example, the regular expression 9+ matches 9, 99, and 999. Note: this metacharacter is not supported by all software.
? | Matches zero or one of the character immediately before it. Note: this metacharacter is not supported by all software.
{i} {i,j} | Matches a specified number of occurrences of the preceding character. For example, the regular expression A[0-9]{3} matches the character "A" followed by exactly three digits, such as A123 and A348, but not A1234. The regular expression [0-9]{4,6} matches any four, five, or six consecutive digits.
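A few rows of the table (greedy versus non-greedy quantifiers, lookahead, capturing versus non-capturing groups, and \b) can be checked with NSRegularExpression. This is only a sketch; the helper name GroupOfFirstMatch is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: text of capture group `group` in the first match of
// `pattern` in `text` (group 0 is the whole match). Returns nil when there
// is no match, the group does not exist, or the pattern is invalid.
static NSString *GroupOfFirstMatch(NSString *text, NSString *pattern, NSUInteger group) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:text
                          options:0
                            range:NSMakeRange(0, text.length)];
    if (match == nil || group >= match.numberOfRanges) {
        return nil;
    }
    NSRange range = [match rangeAtIndex:group];
    if (range.location == NSNotFound) {
        return nil; // the group exists but did not take part in the match
    }
    return [text substringWithRange:range];
}

static void DemoMetacharacters(void) {
    // Greedy versus non-greedy.
    NSLog(@"%@", GroupOfFirstMatch(@"oooo", @"o+", 0));    // oooo
    NSLog(@"%@", GroupOfFirstMatch(@"oooo", @"o+?", 0));   // o
    // Positive lookahead: the version number is checked but not consumed.
    NSLog(@"%@", GroupOfFirstMatch(@"Windows2000", @"Windows(?=95|98|NT|2000)", 0)); // Windows
    NSLog(@"%@", GroupOfFirstMatch(@"Windows3.1", @"Windows(?=95|98|NT|2000)", 0));  // (null)
    // Capturing versus non-capturing group.
    NSLog(@"%@", GroupOfFirstMatch(@"industries", @"industr(y|ies)", 1));   // ies
    NSLog(@"%@", GroupOfFirstMatch(@"industries", @"industr(?:y|ies)", 1)); // (null)
    // Word boundary.
    NSLog(@"%@", GroupOfFirstMatch(@"never", @"er\\b", 0)); // er
    NSLog(@"%@", GroupOfFirstMatch(@"verb", @"er\\b", 0));  // (null)
}
```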
Original link
http://www.cnblogs.com/XYQ-208910/p/6056646.html
GitHub recommendation
GitHub: https://github.com/xiayuanquan/XYQRegexPattern