Chapter 4. String manipulation with regular expressions (5)

Source: Internet
Author: User

4.6 Introduction to regular expressions (from the book PHP & MySQL Web Development)
PHP supports two forms of regular expression syntax: POSIX and Perl.
Purpose: to perform complex pattern matching.
Difficulty: hard
4.6.1 Basic knowledge ******

Definition: A regular expression is a way to describe a pattern in a piece of text.

Analogy: it works like the strstr() function, which matches one string somewhere inside another; unless you say otherwise, a regular expression can match anywhere in the string.

Example: the string "shop" matches the regular expression "shop", and it also matches the regular expressions "h", "ho", and so on.

Other: in addition to matching exact characters, you can use special characters to give an expression a meta-meaning.
For example, with special characters you can specify that a pattern must occur at the start or end of a string, that part of a pattern can be repeated, or that a character in a pattern must be of a particular type. You can also match literal occurrences of the special characters themselves.
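The idea that a pattern matches anywhere in the string can be sketched as follows. This is only an illustration: it assumes PHP's Perl-compatible preg_match() rather than the POSIX functions covered later, and the helper name matchesAnywhere is hypothetical.

```php
<?php
// Hypothetical helper: true if $pattern matches anywhere inside $subject.
function matchesAnywhere(string $pattern, string $subject): bool
{
    // preg_match() expects delimiters around the pattern; '/' is used here.
    return preg_match('/' . $pattern . '/', $subject) === 1;
}

// The string "shop" is matched by the patterns "shop", "h" and "ho", but not by "x".
foreach (['shop', 'h', 'ho', 'x'] as $pattern) {
    $result = matchesAnywhere($pattern, 'shop') ? 'match' : 'no match';
    echo $pattern, ': ', $result, "\n";
}
```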

4.6.2 Character sets and classes ******

A character set can be used to match any character of a particular type; in effect it is a kind of wildcard.

For example, you can use a period (.) as a wildcard that stands for any single character except a newline (\n).
==> The regular expression .at matches "cat", "sat", "mat", and so on. (This kind of wildcard matching is often used for file names in operating systems.)
With regular expressions you can also be more specific about the type of character you want to match, by naming the set to which the character must belong.
For example, .at matches "cat", "sat", "mat", and also "#at". If you want to limit the first character to a letter between a and z, you can
declare ==> [a-z]at
Anything enclosed in [] is a character class: a set of characters to which the matched character must belong.
(The expression in square brackets matches only one character.)
==> [aeiou] represents the vowels; [a-zA-Z] represents any uppercase or lowercase letter;
[^a-z] indicates that the character does not belong to the set (that is, it matches any character that is not between a and z).
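A short sketch of the wildcard and the character classes described above. It assumes PHP's preg_grep() from the Perl-compatible function family; the word list and variable names are made up for illustration.

```php
<?php
$words = ['cat', 'sat', 'Mat', '#at'];

// "." stands for any single character except a newline.
$anyFirstChar = array_values(preg_grep('/.at/', $words));

// [a-z] restricts the first character to a lowercase letter.
$lowercaseFirst = array_values(preg_grep('/[a-z]at/', $words));

// [^a-z] matches one character that is NOT a lowercase letter.
$notLowercaseFirst = array_values(preg_grep('/[^a-z]at/', $words));

print_r($anyFirstChar);      // cat, sat, Mat, #at
print_r($lowercaseFirst);    // cat, sat
print_r($notLowercaseFirst); // Mat, #at
```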

Table 4-3 Character classes for POSIX-style regular expressions
————————————————————————————
Class          Matches
[[:alnum:]]    Alphanumeric characters: [:alpha:] and [:digit:]
[[:alpha:]]    Alphabetic characters: [:lower:] and [:upper:]
[[:lower:]]    Lowercase letters
[[:upper:]]    Uppercase letters
[[:digit:]]    Decimal digits: 0 1 2 3 4 5 6 7 8 9
[[:xdigit:]]   Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F
[[:punct:]]    Punctuation: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[[:blank:]]    Space and tab
[[:space:]]    Whitespace characters
[[:cntrl:]]    Control characters
[[:print:]]    All printable characters: [:alnum:], [:punct:], and space
[[:graph:]]    All printable characters except space: [:alnum:] and [:punct:]
————————————————————————————


4.6.3 Repetition **************

The "*" symbol indicates that the pattern can be repeated 0 or more times;

The "+" symbol indicates that the pattern can be repeated 1 or more times.
(Each symbol must be placed directly after the part of the expression it applies to.)

Example: [[:alnum:]]+ means "at least one alphanumeric character".
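The difference between "+" and "*" can be seen in a small sketch. It assumes preg_match(), whose pattern syntax also accepts POSIX class names such as [:alnum:] inside square brackets; the sample strings are invented.

```php
<?php
// "+" needs at least one alphanumeric character somewhere in the string.
$plusOnText = preg_match('/[[:alnum:]]+/', '--abc123--', $m);
echo $plusOnText, ' ', $m[0], "\n"; // the first run of alphanumerics is captured

// A string of punctuation contains no alphanumeric character, so "+" fails.
$plusOnPunct = preg_match('/[[:alnum:]]+/', '!!!');

// "*" also accepts zero repetitions, so it matches the empty string at position 0.
$starOnPunct = preg_match('/[[:alnum:]]*/', '!!!');

echo $plusOnPunct, ' ', $starOnPunct, "\n";
```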

4.6.4 Subexpressions *************

Splitting an expression into subexpressions lets you say, for example, "at least one of these strings, followed by exactly this string."

Use parentheses () to do this: (very )*large matches "large", "very large", "very very large", and so on.

4.6.5 Counted subexpressions ************

Use a numeric expression in {} to specify the number of times the content is allowed to repeat.

Example: {3} means repeat exactly 3 times, {2,4} means repeat 2 to 4 times, and {2,} means repeat at least 2 times.
==> (very ){1,3} matches "very ", "very very ", and "very very very ".
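Subexpressions and counts can be tried out with a sketch like the following. It assumes preg_grep(); the anchors ^ and $ (explained in the next section) are added so that the count applies to the whole string, and the phrase list is invented.

```php
<?php
$phrases = [
    'large',
    'very large',
    'very very large',
    'very very very very large',
];

// (very )* allows "very " zero or more times before "large".
$anyCount = array_values(preg_grep('/^(very )*large$/', $phrases));

// (very ){1,3} allows "very " between one and three times.
$oneToThree = array_values(preg_grep('/^(very ){1,3}large$/', $phrases));

print_r($anyCount);   // all four phrases
print_r($oneToThree); // only "very large" and "very very large"
```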

4.6.6 Anchoring to the beginning or end of a string ***********

The pattern [a-z] matches any string that contains a lowercase letter, whether the string is a single character long or contains just one matching character somewhere in a longer string.

You can also specify whether a particular subexpression must appear at the start, at the end, or at both.

The caret (^) is used at the start of a regular expression to indicate that the match must appear at the beginning of the searched string.

The dollar sign ($) is used at the end of a regular expression to indicate that the match must appear at the end of the searched string.

Example: ^bob matches bob at the beginning of a string;
com$ matches com at the end of a string;
^[a-z]$ matches a string that consists of exactly one character between a and z.
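The three anchored patterns can be checked with a sketch like this one. It assumes preg_match(); the helper name matches and the sample strings are hypothetical.

```php
<?php
// Hypothetical helper: true if $pattern (without delimiters) matches $subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match('/' . $pattern . '/', $subject) === 1;
}

var_dump(matches('^bob', 'bob smith'));    // starts with "bob"
var_dump(matches('^bob', 'hello bob'));    // "bob" is not at the start
var_dump(matches('com$', 'example.com'));  // ends with "com"
var_dump(matches('com$', 'comma'));        // "com" is not at the end
var_dump(matches('^[a-z]$', 'q'));         // exactly one lowercase letter
var_dump(matches('^[a-z]$', 'qq'));        // two letters, so no match
```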

4.6.7 Branching ******************

Use a vertical bar (|) in a regular expression to represent a choice.

Example: com|edu|net matches com, edu, or net.
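A minimal sketch of branching, assuming preg_match() and invented host names. The third argument receives the alternative that actually matched.

```php
<?php
// One of the three alternatives occurs in the first host name.
$foundEdu = preg_match('/com|edu|net/', 'www.example.edu', $m);
echo $foundEdu, ' ', $m[0], "\n";

// None of the alternatives occurs in the second host name.
$foundOrg = preg_match('/com|edu|net/', 'www.example.org');
echo $foundOrg, "\n";
```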

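4.6.8 Matching literal special characters **********

If you want to match a special character literally, for example ., {, or $, you must precede it with a backslash (\).

In PHP, you should enclose the regular expression pattern in a single-quoted string, because in a double-quoted string PHP itself interprets $ and some backslash sequences before the regular expression engine sees them.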
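Escaping can be illustrated with a short sketch. It assumes preg_match() and preg_quote() from the Perl-compatible functions; the sample strings are invented.

```php
<?php
// In single quotes the backslash reaches the regex engine unchanged,
// so \. matches only a literal dot.
$escapedDot   = preg_match('/example\.com/', 'exampleXcom');
// Without the backslash, "." is a wildcard and also matches "X".
$unescapedDot = preg_match('/example.com/', 'exampleXcom');

// preg_quote() puts a backslash in front of every special character.
$literal = '$5.00 {approx}';
$quoted  = preg_quote($literal);
$exact   = preg_match('/^' . $quoted . '$/', $literal);

echo $escapedDot, ' ', $unescapedDot, ' ', $quoted, ' ', $exact, "\n";
```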

4.6.9 Special characters at a glance ***********

Table 4-4 Summary of special characters used outside square brackets in POSIX regular expressions
————————————————————————————————————
Character   Meaning
\           Escape character
^           Match at the start of the string
$           Match at the end of the string
.           Match any character except a newline (\n)
|           Start of an alternative branch (read as OR)
(           Start of a subpattern
)           End of a subpattern
*           Repeat 0 or more times
+           Repeat 1 or more times
{           Start of a min/max quantifier
}           End of a min/max quantifier
?           Mark a subpattern as optional
——————————————————————————————————————

Table 4-5 Summary of special characters used inside square brackets in POSIX regular expressions
————————————————————————————————————
Character   Meaning
\           Escape character
^           NOT, only when used in the first position
-           Used to indicate a range of characters
————————————————————————————————————

4.6.10 Applying regular expressions in the smart form ************

In the smart form application, regular expressions can be used in two ways:

1. Find specific terms in the customer's feedback.

Previous practice: use the string function strstr(); to match "shop", "customer service", or "retail" you need 3 separate searches
==> strstr($string, 'shop'); strstr($string, 'customer service'); strstr($string, 'retail');

With a regular expression: one pattern matches "shop", "customer service", or "retail"
==> shop|customer service|retail

2. Verify the user's email address in the program.

Previous practice: use string functions.

With a regular expression: encode the standard format of an email address.
(some alphanumeric or punctuation characters + @ + a string of alphanumeric characters and hyphens + . + a string of alphanumeric characters, hyphens, and possibly more dots)

==> ^[a-zA-Z0-9_\-.]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$

Here ^[a-zA-Z0-9_\-.]+ means "start the string with at least one letter (either case), number, underscore, hyphen, or dot, or some combination of those";

Tip: inside a character class, a dot loses its special wildcard meaning and is just a literal dot.

@ matches the character @;

[a-zA-Z0-9\-]+ matches the first part of the hostname, made up of alphanumeric characters and hyphens;

\. matches a literal "." (a dot outside a character class must be escaped so that it matches only a dot);

[a-zA-Z0-9\-.]+$ matches the remainder of the domain name (com, net, and so on), which contains letters, numbers, hyphens, and, if necessary, more dots, up to the end of the string.
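The email pattern described above can be wrapped in a small function. This is a sketch only: it assumes preg_match(), the helper name looksLikeEmail and the sample addresses are hypothetical, and the pattern is a loose format check rather than full address validation.

```php
<?php
// Hypothetical helper built on the pattern from this section.
function looksLikeEmail(string $email): bool
{
    // local part, then @, then first host label, a literal dot, and the rest of the domain
    $pattern = '/^[a-zA-Z0-9_\-.]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$/';
    return preg_match($pattern, $email) === 1;
}

var_dump(looksLikeEmail('username@example.com'));         // well formed
var_dump(looksLikeEmail('user.name@mail.example.co.uk')); // extra dots are allowed
var_dump(looksLikeEmail('no-at-sign.example.com'));       // missing @
var_dump(looksLikeEmail('user@localhost'));               // no dot after the host
```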

4.7 Finding substrings with regular expressions

In PHP, the two functions for matching POSIX-style regular expressions are ereg() and eregi().

ereg(): int ereg(string $pattern, string $string [, array &$regs]);

Searches string, in a case-sensitive manner, for substrings that match the regular expression pattern.

If the pattern contains subpatterns in parentheses and the third parameter, regs, is supplied, the matched substrings are stored in the regs array.

$regs[1] contains the substring that starts at the first opening parenthesis, $regs[2] the one that starts at the second, and so on.

$regs[0] contains the entire matched string.

If a match for pattern is found in string, the length of the matched string is returned; false is returned if no match is found or an error occurs.

If the third argument is not passed, or if the length of the matched string is 0, the function returns 1.

eregi(): not case-sensitive; otherwise the same as ereg().
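The ereg() family was deprecated in PHP 5.3 and removed in PHP 7.0, so the capture behaviour described above is sketched here with preg_match() instead. Its third argument is filled in the same layout as regs, but its return value is 1, 0, or false rather than the match length. The sample address is invented.

```php
<?php
$matches = [];

// The "i" modifier makes the match case-insensitive, similar in spirit to eregi().
$found = preg_match('/^([a-z]+)@([a-z]+)\.com$/i', 'Username@Example.com', $matches);

echo $found, "\n";      // 1 when the pattern matches
echo $matches[0], "\n"; // the entire matched string
echo $matches[1], "\n"; // text captured by the first pair of parentheses
echo $matches[2], "\n"; // text captured by the second pair of parentheses
```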

The following is an improved version of the smart form:

    // Reject the submission if the address does not look like an email address
    if (!mb_eregi('^[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$', $email)) {
        echo "<p>That is not a valid email address.</p>" .
             "<p>Please return to the previous page and try again.</p>";
        exit;
    }

    $toaddress = "[email protected]"; // the default value

    // Route the feedback according to the keywords it contains
    if (mb_eregi("shop|customer service|retail", $feedback)) {
        $toaddress = "[email protected]";
    } else if (mb_eregi("deliver|fulfill", $feedback)) {
        $toaddress = "[email protected]";
    } else if (mb_eregi("bill|account", $feedback)) {
        $toaddress = "[email protected]";
    } else {
        $toaddress = "[email protected]";
    }

    // Mail from the big customer's domain overrides the keyword routing
    if (mb_eregi("bigcustomer\.com", $email)) {
        $toaddress = "[email protected]";
    }

4.8 Replacing substrings with regular expressions

Function: ereg_replace() / eregi_replace()

Prototype: string ereg_replace(string pattern, string replacement, string search);

Meaning: the function searches the string search for matches of the regular expression pattern and replaces each match with the string replacement.
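Because ereg_replace() is no longer available in PHP 7 and later, the same idea is sketched here with preg_replace(), which takes its arguments in the same order (pattern, replacement, subject). The sample text is invented.

```php
<?php
// Replace every substring that looks like NNN-NNNN with a mask.
$text   = 'Call 555-1234 or 555-9876.';
$masked = preg_replace('/[0-9]{3}-[0-9]{4}/', 'XXX-XXXX', $text);
echo $masked, "\n";

// The "i" modifier gives a case-insensitive replacement, similar to eregi_replace().
$softened = preg_replace('/very/i', 'quite', 'Very large, very good');
echo $softened, "\n";
```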

4.9 Using regular expressions to split a string

Function: split()

Prototype: array split(string pattern, string search[, int max]);

Meaning: the function splits the string search into substrings at each match of the regular expression pattern and returns the substrings in an array; max limits the number of elements placed in the array.

Example: Splitting an e-mail address

    // split()
    $address = "[email protected]";
    $arr = mb_split("\.|@", $address);
    foreach ($arr as $key => $value) {
        echo "<br />" . $value;
    }

The output differs from the book: the book shows each delimiter on a line of its own as well, but in practice the delimiters are removed.

Results:
username
example
com
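Splitting can also be sketched with preg_split(), the Perl-compatible counterpart that remains available in current PHP versions. The address below is an assumed sample value. The last call shows one way to keep the delimiters in the result, using the PREG_SPLIT_DELIM_CAPTURE flag together with a parenthesized pattern.

```php
<?php
$address = 'username@example.com'; // assumed sample address

// Split at every dot or @ sign; the delimiters themselves are dropped.
$parts = preg_split('/\.|@/', $address);

// A limit of 2 stops after the first split and keeps the rest intact.
$firstSplit = preg_split('/\.|@/', $address, 2);

// With a capturing group and PREG_SPLIT_DELIM_CAPTURE the delimiters are kept.
$withDelims = preg_split('/(\.|@)/', $address, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($parts);
print_r($firstSplit);
print_r($withDelims);
```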
