On regular expressions (PHP tutorial)

Source: Internet
Author: User

I. What is a regular expression?

Simply put, a regular expression is a small language for describing string matches.

A regular expression describes a string-matching pattern. It can be used to check whether a string contains a given substring, and to extract or replace the substrings that match.

II. Applications of regular expressions

Regular expressions are very useful in day-to-day development and can solve some complex string-processing problems quickly. Their applications fall into three broad categories:

The first type: data validation

For example, to verify that a string is a well-formed email address, telephone number, IP address, and so on, a regular expression is very convenient.
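As an illustration of data validation, here is a minimal sketch that checks whether a string looks like a dotted IPv4 address. The helper name isIPv4 and the exact pattern are assumptions made for this example, not something prescribed by the article.

```javascript
// Hypothetical helper (not from the article): checks a dotted-quad IPv4 address.
function isIPv4(text) {
  // One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
  const octet = "(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";
  // Four octets separated by literal dots, anchored to the whole string.
  const pattern = new RegExp("^" + octet + "(?:\\." + octet + "){3}$");
  return pattern.test(text);
}

console.log(isIPv4("192.168.1.1")); // true
console.log(isIPv4("256.1.1.1"));   // false
```

The ^ and $ anchors matter here: without them the pattern would accept any string that merely contains an address.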

The second type: content lookup

For example, to scrape the pictures from a web page, you must first find the <img> tags, which regular expressions can match precisely.
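A rough sketch of content lookup: collecting the src attribute of every image tag in a piece of HTML. The function name findImageSources is hypothetical, and the pattern assumes quoted attribute values; a regular expression is not a full HTML parser, so treat this as an illustration only.

```javascript
// Hypothetical helper: collects the src attribute of every <img> tag in an HTML string.
function findImageSources(html) {
  // g = find all tags, i = ignore case; group 1 captures the quoted src value.
  const pattern = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["'][^>]*>/gi;
  // matchAll yields one match per tag; index 1 holds the captured src value.
  return Array.from(html.matchAll(pattern), (m) => m[1]);
}

const page = '<p><img src="a.png" alt="x"><IMG class="c" src=\'b.jpg\'></p>';
console.log(findImageSources(page)); // [ 'a.png', 'b.jpg' ]
```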

The third type: content substitution

For example, to hide the middle four digits of a mobile phone number so that it appears as 123****4567, a regular expression is very convenient.
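The phone-number masking described here could be sketched as follows. The helper name maskPhone and the assumption of an 11-digit number are choices made for this example.

```javascript
// Hypothetical helper: hides the middle four digits of an 11-digit number.
function maskPhone(phone) {
  // (\d{3}) keeps the first three digits, \d{4} is dropped, (\d{4}) keeps the last four.
  // $1 and $2 in the replacement refer to the two captured groups.
  return phone.replace(/^(\d{3})\d{4}(\d{4})$/, "$1****$2");
}

console.log(maskPhone("12345674567")); // 123****4567
```

A string that does not match the pattern is returned unchanged, because replace only rewrites what the pattern matches.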

III. Regular expression syntax

Here is a brief description of regular expression syntax:

1. Several important concepts of regular expressions

    • Subexpression: the part of a regular expression enclosed in "()" is called a "subexpression".
    • Capture: the text matched by a subexpression is stored by the system in a buffer; this is called a "capture".
    • Backreference: "\n", where n is a number, refers back to the contents of the n-th capture buffer; this is called a "backreference".
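The three concepts above can be seen together in one small sketch that finds a repeated word. The function name findDoubledWord is hypothetical.

```javascript
// Hypothetical helper: finds the first word that appears twice in a row.
function findDoubledWord(text) {
  // (\w+) is a subexpression whose match is captured;
  // \1 is a backreference that must match the same text again.
  const match = /\b(\w+)\s+\1\b/.exec(text);
  return match ? { whole: match[0], word: match[1] } : null;
}

console.log(findDoubledWord("this is is a test")); // { whole: 'is is', word: 'is' }
console.log(findDoubledWord("no repeats here"));   // null
```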

2. Quantifiers

    • X+ means: 1 or more
    • X* means: 0 or more
    • X? means: 0 or 1
    • X{n} means: exactly n
    • X{n,} means: at least n
    • X{n,m} means: n to m

Quantifiers are greedy by default and match as many characters as possible. Adding a ? after a quantifier makes it non-greedy, so it matches as few characters as possible.

Note: X stands for the character (or subexpression) being matched.
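A short sketch of the difference between greedy and non-greedy quantifiers, plus a counted quantifier. The sample strings and variable names are made up for illustration.

```javascript
const markup = "<b>one</b><b>two</b>";

// Greedy: .+ grabs as much as possible, so the match runs to the LAST </b>.
const greedy = markup.match(/<b>.+<\/b>/)[0];
// Non-greedy: .+? grabs as little as possible, so the match stops at the FIRST </b>.
const lazy = markup.match(/<b>.+?<\/b>/)[0];

console.log(greedy); // <b>one</b><b>two</b>
console.log(lazy);   // <b>one</b>

// {n,m}: between 2 and 4 digits, anchored to the whole string.
const twoToFourDigits = /^\d{2,4}$/;
console.log(twoToFourDigits.test("123"));   // true
console.log(twoToFourDigits.test("12345")); // false
```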

3. Character classes

    • \d: matches a digit, [0-9]
    • \D: matches a non-digit, [^0-9]
    • \w: matches a word character, including the underscore, [0-9a-zA-Z_]
    • \W: matches any non-word character, [^0-9a-zA-Z_]
    • \s: matches any whitespace character: space, carriage return, tab, and so on
    • \S: matches any non-whitespace character
    • . means: matches any single character (by default, any character except a line break)

In addition, there are the following types:

Range characters: [a-z], [A-Z], [0-9], [0-9a-z], [0-9a-zA-Z]
Any one of the listed characters: [abcd], [1234]
Characters not included: [^a-z], [^0-9], [^abcd]
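The character classes and bracket expressions above can be tried out on a sample string. The string and variable names below are invented for this sketch.

```javascript
const sample = "Room 42, Block_B!";

const digits = sample.match(/\d/g);                     // every digit
const words = sample.match(/\w+/g);                     // runs of word characters (underscore included)
const capitals = sample.match(/[A-Z]/g);                // range class: uppercase letters only
const punctuation = sample.match(/[^a-zA-Z0-9_\s]/g);   // negated class: not a word character, not whitespace

console.log(digits);      // [ '4', '2' ]
console.log(words);       // [ 'Room', '42', 'Block_B' ]
console.log(capitals);    // [ 'R', 'B', 'B' ]
console.log(punctuation); // [ ',', '!' ]
```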

4. Anchors

    • ^ means: start of the string
    • $ means: end of the string
    • \b: word boundary
    • \B: non-word boundary
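A brief sketch of how the anchors behave, using the word "cat" inside a few made-up strings. The property names in the object are only labels for this example.

```javascript
const anchors = {
  startsWith: /^cat/.test("catalog"),        // ^ ties the match to the start
  endsWith: /cat$/.test("bobcat"),           // $ ties the match to the end
  exactly: /^cat$/.test("cats"),             // both anchors: the whole string must be "cat"
  wholeWord: /\bcat\b/.test("a cat sat"),    // \b on both sides: "cat" as a separate word
  insideWord: /\bcat\b/.test("concatenate"), // no word boundary around the inner "cat"
  notAtEdge: /\Bcat\B/.test("concatenate"),  // \B: "cat" surrounded by other word characters
};

console.log(anchors);
// { startsWith: true, endsWith: true, exactly: false,
//   wholeWord: true, insideWord: false, notAtEdge: true }
```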

5. Escape character

    • \ is placed before a special character, such as ., * or (, to match that character literally

6. Alternation

    • | matches any one of several alternatives

7. Special usage

    • (?=X): positive lookahead: matches a position that is followed by the specified content X
    • (?!X): negative lookahead: matches a position that is not followed by the specified content X
    • (?:X): non-capturing group: groups X without placing what it matches into a capture buffer
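The escape character, alternation, and the three special groups can be sketched on a couple of made-up strings. All variable names are assumptions for this example.

```javascript
const sizes = "100px 20em 3px";

// (?=px): keep only numbers that are followed by "px"; "px" itself is not consumed.
const pixelValues = sizes.match(/\d+(?=px)/g);
// (?!\d|px): keep only numbers that are NOT followed by "px".
// The \d alternative stops the engine from backtracking to a shorter number such as "10".
const otherValues = sizes.match(/\d+(?!\d|px)/g);

// (?:Mon|Tues) groups two alternatives without capturing, so the day number is capture 1.
const dayMatch = /(?:Mon|Tues)day (\d+)/.exec("Tuesday 14");

// \. matches a literal dot; an unescaped . would match any character.
const literalDot = /^\d+\.\d+$/;

console.log(pixelValues);              // [ '100', '3' ]
console.log(otherValues);              // [ '20' ]
console.log(dayMatch[1]);              // 14
console.log(literalDot.test("3.14"));  // true
console.log(literalDot.test("3x14"));  // false
```

The lookahead groups only inspect what follows; they never become part of the matched text, which is why "px" does not appear in the results.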

IV. Using regular expressions in JavaScript

There are two ways to use regular expressions in JavaScript:

The first method: use the RegExp class

The methods provided are:

    • test(str): tests whether the string contains a match for the pattern; returns true or false
    • exec(str): searches the string for a match; returns an array describing the match if there is one, otherwise null

If the regular expression contains subexpressions, the array returned by exec holds:

result[0] = the whole match, result[1] = the match of subexpression 1, and so on.
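A minimal sketch of test and exec on a date-like pattern. The pattern and the sample strings are invented for illustration.

```javascript
// A literal and the equivalent RegExp constructor call (backslashes doubled inside the string).
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const sameWithConstructor = new RegExp("(\\d{4})-(\\d{2})-(\\d{2})");

// test only answers yes or no.
const found = datePattern.test("Released 2015-06-01");

// exec returns the whole match at index 0 and one entry per subexpression after it.
const result = datePattern.exec("Released 2015-06-01");
const [whole, year, month, day] = result;

// exec returns null when nothing matches.
const missing = datePattern.exec("no date here");

console.log(found);                   // true
console.log(whole, year, month, day); // 2015-06-01 2015 06 01
console.log(missing);                 // null
```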

The second method: use the String class

The methods provided are:

    • search: returns the position of the first match of the pattern, or -1 if there is none
    • match: returns an array of the text matched by the pattern, or null if there is none
    • replace: replaces the text matched by the pattern
    • split: splits the string at each match of the pattern and returns an array
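The four String methods can be sketched on one sample string. The string and variable names are made up for this example.

```javascript
const fruit = "apple, banana;cherry";

const position = fruit.search(/banana/);          // index of the first match
const notFound = fruit.search(/kiwi/);            // -1 when nothing matches
const allWords = fruit.match(/[a-z]+/g);          // with the g flag: every match as an array
const noDigits = fruit.match(/\d+/);              // null when nothing matches
const piped = fruit.replace(/[,;]\s*/g, " | ");   // replace every separator
const parts = fruit.split(/[,;]\s*/);             // split on either separator

console.log(position); // 7
console.log(notFound); // -1
console.log(allWords); // [ 'apple', 'banana', 'cherry' ]
console.log(noDigits); // null
console.log(piped);    // apple | banana | cherry
console.log(parts);    // [ 'apple', 'banana', 'cherry' ]
```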

V. Using regular expressions in PHP

There are two families of regular expression functions in PHP:

The first is: Perl-compatible regular expression (PCRE) functions

The functions provided are:

    • preg_grep -- returns the array entries that match the pattern
    • preg_match_all -- performs a global regular expression match
    • preg_match -- performs a regular expression match
    • preg_quote -- escapes regular expression characters
    • preg_replace_callback -- performs a regular expression search and replace using a callback function
    • preg_replace -- performs a regular expression search and replace
    • preg_split -- splits a string by a regular expression

The second is: POSIX regular expression functions

The functions provided are:

    • ereg_replace -- replaces by regular expression
    • ereg -- regular expression match
    • eregi_replace -- case-insensitive replacement by regular expression
    • eregi -- case-insensitive regular expression match
    • split -- splits a string into an array by a regular expression
    • spliti -- splits a string into an array by a case-insensitive regular expression
    • sql_regcase -- generates a regular expression for case-insensitive matching

Note: the POSIX functions were deprecated in PHP 5.3 and removed in PHP 7, so new code should use the preg_* functions.

VI. Summary

As a tool, regular expressions have the following characteristics:

1. Powerful

Different combinations of metacharacters and quantifiers express different matching rules. A complex task sometimes requires a very long regular expression, and writing one that matches exactly is a real test of a programmer's skill.

2. Simple and convenient

Ordinary string lookup can only find one specific string, whereas a regular expression can perform fuzzy lookup. It is faster and more convenient, and all it takes is a pattern string.

3. Supported by virtually every language

Mainstream languages such as Java, PHP, JavaScript, C#, and C++ all support regular expressions.

4. Easy to learn, hard to master

The basics can be learned quickly, but writing efficient, accurate regular expressions in real-world development takes long practice and accumulated experience.


Regular expressions

Regular expressions are often used in JavaScript to validate phone numbers, email addresses, and so on, achieving powerful functionality in a simple way.

Symbol reference

Character  Description
\ marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character "n". '\n' matches a line break. The sequence '\\' matches "\" and '\(' matches "(".
^ matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$ matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
* matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+ matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
? matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.
{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the "o" in "Bob", but matches the two o's in "food".
{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the "o" in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m} m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
? when this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much as possible. For example, on the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
. matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
x|y matches x or y. For example, 'z ...

Regular expressions: an example

Regular expressions are now widely used in many kinds of software. Operating systems such as *nix (Linux, UNIX, and so on) and HP, development environments such as PHP, C#, and Java, and many applications all show traces of regular expressions.

Regular expressions achieve powerful functionality through a simple mechanism. The price of being compact and effective without losing power is that regular expression code is hard to read and not very easy to learn, so some effort is required. Once you have the basics and a reference at hand, however, they are relatively simple and effective to use.

Example: ^.+@.+\..+$
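Reading the example pattern as ^.+@.+\..+$ (something, an @, something, a literal dot, something), a quick sketch shows what it accepts and rejects. The variable name looseEmail is an assumption, and the pattern is only a loose email-shaped check, not a full validator.

```javascript
// Loose email-shaped check: text, "@", text, ".", text.
const looseEmail = /^.+@.+\..+$/;

console.log(looseEmail.test("user@example.com")); // true
console.log(looseEmail.test("user@example"));     // false: no dot after the @
console.log(looseEmail.test("@example.com"));     // false: nothing before the @
console.log(looseEmail.test("a b@c.d"));          // true: . also matches spaces, so the check is loose
```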

2. History of regular expressions

The "ancestors" of regular expressions can be traced back to early research on how the human nervous system works. Two neurophysiologists, Warren McCulloch and Walter Pitts, developed a mathematical way of describing these neural networks.
In 1956, the mathematician Stephen Kleene, building on the earlier work of McCulloch and Pitts, published a paper titled "Representation of Events in Nerve Nets and Finite Automata", which introduced the concept of regular expressions. Regular expressions were the expressions used to describe what he called "the algebra of regular sets", hence the term "regular expression".

Later, this work was found to be applicable to early research on computational search algorithms by Ken Thompson, the principal inventor of Unix. The first practical application of regular expressions was the qed editor.

As they say, the rest is history. Regular expressions have been an important part of text-based editors and search tools ever since.
3. Regular expression definitions
A regular expression describes a string-matching pattern. It can be used to check whether a string contains a given substring, to replace a matched substring, or to extract from a string the substrings that meet a certain condition.

When listing a directory, the *.txt in dir *.txt or ls *.txt is not a regular expression, because * there means something different from the * in a regular expression.
Regular expressions are text patterns made up of ordinary characters (for example, the characters a through z) and special characters (called metacharacters). A regular expression acts as a template that matches a character pattern against the string being searched.

3.1 Ordinary characters
Ordinary characters are all the printing and non-printing characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some symbols.

3.2 Non-printable characters

Character  Meaning
\cx matches the control character indicated by x. For example, \cM matches Control-M, the carriage return. The value of x must be one of A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\f matches a form feed (page break). Equivalent to \x0c and \cL.
\n matches a line break. Equivalent to \x0a and \cJ.
\r matches a carriage return. Equivalent to \x0d and \cM.
\s matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t matches a tab character. Equivalent to \x09 and \cI.
\v matches a vertical tab. Equivalent to \x0b and \cK.
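The equivalences in this table can be checked directly in JavaScript. The sketch below is only an illustration, and the names checks, pieces and noSpaces are invented for it.

```javascript
// Each entry tests the control-letter form and the hexadecimal form against the same character.
const checks = {
  carriageReturn: /\cM/.test("\r") && /\x0d/.test("\r"),
  lineFeed: /\cJ/.test("\n") && /\x0a/.test("\n"),
  tab: /\cI/.test("\t") && /\x09/.test("\t"),
  formFeed: /\cL/.test("\f") && /\x0c/.test("\f"),
  verticalTab: /\cK/.test("\v") && /\x0b/.test("\v"),
};

// \s treats tab, line break and space alike; \S is its complement.
const pieces = "a\tb\nc d".split(/\s/);
const noSpaces = /^\S+$/.test("no_spaces");

console.log(checks);   // every property is true
console.log(pieces);   // [ 'a', 'b', 'c', 'd' ]
console.log(noSpaces); // true
```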

3.3 Special characters

Special characters are characters that carry a special meaning, such as the * in "*.txt" mentioned above, simple ...
