This section introduces the basic syntax of regular expressions in PHP: delimiters and atoms. It covers the definition of delimiters and the definition and composition of atoms. Atoms can be combined flexibly to meet our needs when processing strings. Before that, we need to get to know preg_match(), a regular expression function that we will use for testing throughout the tutorial.
Let's start with the delimiter of a regular expression, the structure of a regular expression, and the preg_match() function:
1. The delimiter of the regular expression.
Any character except letters, digits, the backslash, and whitespace can be a delimiter, such as | |, / /, { }, ! !, and so on. Unless there is a special need, use the forward slash / / as the delimiter of the regular expression.
2. The structure of the regular expression.
Let's take a look at this formula: /atoms and metacharacters/pattern modifiers
That is to say, the atoms and metacharacters of the regular expression are placed between the delimiters, while the pattern modifiers are placed after the closing delimiter.
3. The preg_match() function.
We will explain it in detail later. Here we only use its return value to test whether a match succeeded: it returns 1 if the pattern matches, 0 if it does not, and false if an error occurs.
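The three points above can be tried out with a minimal sketch. The helper name matches() is hypothetical and exists only for this tutorial; it wraps preg_match() and reduces the result to true or false. The sketch writes the same atoms with three different delimiters and then adds the pattern modifier i (case-insensitive matching) after the closing delimiter.

```php
<?php
// Hypothetical helper for this tutorial: true when $pattern matches somewhere in $subject.
function matches(string $pattern, string $subject): bool
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match($pattern, $subject) === 1;
}

// The same atoms wrapped in three different delimiters.
foreach (['/abc/', '#abc#', '{abc}'] as $pattern) {
    echo $pattern, ' => ', var_export(matches($pattern, 'xxabcxx'), true), "\n";
}

// A pattern modifier follows the closing delimiter; "i" ignores case.
echo var_export(matches('/abc/', 'ABC'), true), "\n";  // false
echo var_export(matches('/abc/i', 'ABC'), true), "\n"; // true
```

Comparing against 1 with === keeps the error case (false) from being mistaken for a successful match.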
After learning about the above simple content, let's get started.
Atoms in regular expressions
What is an atom? An atom is the most basic unit of a regular expression, and every regular expression must contain at least one atom. Anything that can be used on its own as a regular expression is an atom.
This concept may seem vague, but that doesn't matter. Next we look at what atoms are made of.
Composition of atoms
1. All printable characters (everything that can be displayed on the screen) and non-printable characters (invisible ones, such as spaces and line breaks).
2. To use a character that has a special meaning as an ordinary atom, escape it with the escape character "\". For example: \. \* \+ \? \( \< \>
Note: the escape character "\" works in both directions. It removes the special meaning from a meaningful character, and it gives a special meaning to an ordinary character. For example, \d represents any decimal digit.
3. In a regular expression, you can directly use the range atoms provided by the system, as shown in the following table:
Range atom | Description | Equivalent custom atom table
\d | Any decimal digit | [0-9]
\D | Any character except a digit | [^0-9]
\s | Any whitespace character: space, \n, \r, \t, \f | [ \n\r\t\f]
\S | Any non-whitespace character | [^ \n\r\t\f]
\w | Any word character: a-z, A-Z, 0-9, _ | [a-zA-Z0-9_]
\W | Any non-word character, that is, any character except a-z, A-Z, 0-9, _ | [^a-zA-Z0-9_]
4. A custom atom table (written with square brackets []) matches any one of the atoms inside the square brackets.
In the preceding table, each range atom provided by the system is also written as an equivalent custom atom table. Since the system cannot provide every atom we need, we sometimes have to define an atom table ourselves. For example, to match a letter or a digit, we write the atom as [a-zA-Z0-9].
Note:
a. The symbol "-" represents a range. For example, [a-z] represents the lowercase letters a to z. Do not write a range whose ends are out of order, such as [a-9].
b. The symbol "^" indicates negation. It must be placed immediately after the opening square bracket. For example, to match a non-digit, the atom is [^0-9].
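As a rough sketch of the four kinds of atoms listed above, the following code runs a few patterns against a string and reports which ones match. The function name atomChecks() and the check labels are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical helper: returns the labels of the atoms that match somewhere in $subject.
function atomChecks(string $subject): array
{
    $checks = [
        'literal dot' => '/\./',      // escaped metacharacter, matches a real "."
        'digit'       => '/\d/',      // range atom, same as [0-9]
        'non-digit'   => '/[^0-9]/',  // negated custom atom table, same as \D
        'whitespace'  => '/\s/',      // range atom for whitespace characters
        'word char'   => '/\w/',      // range atom, same as [a-zA-Z0-9_]
        'vowel'       => '/[aeiou]/', // custom atom table
    ];

    $passed = [];
    foreach ($checks as $label => $pattern) {
        if (preg_match($pattern, $subject) === 1) {
            $passed[] = $label;
        }
    }
    return $passed;
}

print_r(atomChecks('a.1')); // literal dot, digit, non-digit, word char, vowel
print_r(atomChecks('   ')); // non-digit, whitespace
```

The patterns are written in single-quoted strings so that PHP passes the backslashes through to the regular expression engine unchanged.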
Let's take a look at an example that uses a regular expression atom. The code is as follows:
<?php
$pattern = '/\d/';      // digit atom; this is the regular expression pattern
$string = 'dsadsadsa';  // string to be matched
if (preg_match($pattern, $string)) {
    echo "The regular expression {$pattern} and the string {$string} matched successfully";
} else {
    echo "The regular expression {$pattern} and the string {$string} failed to match";
}
?>
Note: a custom atom table matches as soon as any one of its atoms appears in the string. Without the square brackets, the atoms must appear together as a sequence. For example, '/abc/' requires the substring abc to exist in the string being matched, while '/[abc]/' matches as long as the string contains any one of the characters a, b, or c.
You can modify the pattern in the example above (that is, the variable $pattern) to try out the atoms described in this section.
This section described the delimiters and atoms of regular expressions. After working through the exercises you should be able to use atoms in your own patterns. Next we will introduce the metacharacters in PHP regular expressions.
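The difference between '/abc/' and '/[abc]/' can be seen side by side in a short sketch. The function name compareSequenceAndTable() and the sample strings are hypothetical; the function returns the two preg_match() results as a pair.

```php
<?php
// Hypothetical helper: returns [result of /abc/, result of /[abc]/] for $subject.
function compareSequenceAndTable(string $subject): array
{
    $sequence = preg_match('/abc/', $subject);   // needs "abc" as one substring
    $anyOf    = preg_match('/[abc]/', $subject); // needs a, b, or c anywhere
    return [$sequence, $anyOf];
}

foreach (['xxabcxx', 'cab', 'xyz'] as $subject) {
    [$sequence, $anyOf] = compareSequenceAndTable($subject);
    echo "{$subject}: /abc/ => {$sequence}, /[abc]/ => {$anyOf}\n";
}
```

The string 'cab' contains all three letters but not in the order abc, so only the atom table matches it.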