Review Regular Expressions
1. \ (backslash): marks the next character as a special character, a literal character, a backreference, or an octal escape.
For example, \n matches a line feed, \\ matches a literal backslash, and \( matches a literal "(".
Backreferences:
When part of a pattern is enclosed in parentheses, the regular expression engine records the text matched by that part and stores it in a temporary buffer.
Parentheses also group part of a regular expression, so that operators such as the repetition operators can be applied to the entire group.
Each captured submatch is stored in the order in which it appears, from left to right, in the regular expression. The buffers are numbered starting from 1, up to a maximum of 99 captured subexpressions. Each buffer can be accessed with \n, where n is a one- or two-digit decimal number that identifies the buffer.
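As a small illustration of a backreference used inside the pattern itself, the sketch below finds doubled words. The sample sentence and the variable names are made up for the example; it uses \w and \b, which are described in items 21 and 33.

```php
<?php
// Hypothetical sample text containing two doubled words.
$text = 'This is is a test of the the backreference.';

// (\w+) captures a word into buffer 1; \1 must then match the same text again.
$pattern = '/\b(\w+)\s+\1\b/';

// Collect the contents of buffer 1 for every doubled word found.
preg_match_all($pattern, $text, $matches);
$doubled = $matches[1];

// Collapse each doubled word to a single copy; $1 refers to buffer 1.
$fixed = preg_replace($pattern, '$1', $text);

print_r($doubled);
echo $fixed, "\n";
```

Inside the pattern the buffer is written \1, while in the replacement string PHP accepts either \1 or $1.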
PHP code that filters HTML with regular expressions:
<?php
header('Content-Type: text/html; charset=UTF-8');
$content = "<table class='abc' style='abc'><tr><td>adfddi-</td></tr></table>";
$pattern1 = '/<\/?[^>]+>/';          // matches every HTML tag (filters all HTML)
$pattern2 = '/<([a-zA-Z]+)[^>]*>/';  // matches an opening tag and captures its name
// Replace each opening tag with just its name, which drops the attributes.
$newcontent = preg_replace($pattern2, '<\1>', $content);
echo $newcontent;
?>
2. ^: matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
3. $: matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
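The effect of multiline mode can be sketched in PHP, where the m modifier plays the role of the Multiline property mentioned above. The sample text is made up, and the example only uses "\n" as the line separator.

```php
<?php
// Hypothetical three-line input.
$text = "first line\nsecond line\nthird line";

// Without the m modifier, ^ matches only at the very start of the string.
$single = preg_match_all('/^\w+/', $text, $m1);

// With the m modifier, ^ also matches right after each "\n".
$multi = preg_match_all('/^\w+/m', $text, $m2);

// With the m modifier, $ also matches just before each "\n".
preg_match_all('/\w+$/m', $text, $m3);

print_r($m1[0]);
print_r($m2[0]);
print_r($m3[0]);
```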
4. *: matches the preceding expression zero or more times.
5. +: matches the preceding expression one or more times.
6. ?: matches the preceding expression zero or one time.
7. {n}: matches the preceding expression exactly n times.
8. {n,}: matches the preceding expression at least n times.
9. {n,m}: matches the preceding expression at least n times and at most m times.
10. ?: when this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match becomes non-greedy. A non-greedy match consumes as little of the searched string as possible, while the default greedy match consumes as much as possible.
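The difference between greedy and non-greedy matching, and the counted quantifiers, can be sketched as follows. The input strings are invented for the example.

```php
<?php
// Hypothetical input with two adjacent elements.
$html = '<b>one</b><b>two</b>';

// Greedy: .* takes as much as possible, so one match spans both elements.
preg_match('/<b>.*<\/b>/', $html, $greedy);

// Non-greedy: .*? takes as little as possible and stops at the first </b>.
preg_match('/<b>.*?<\/b>/', $html, $lazy);

// Counted quantifiers applied to the letter "a".
$exact   = preg_match('/^a{3}$/', 'aaa');     // exactly three
$atLeast = preg_match('/^a{3,}$/', 'aa');     // at least three, so no match
$range   = preg_match('/^a{2,4}$/', 'aaaaa'); // two to four, so no match

echo $greedy[0], "\n", $lazy[0], "\n";
```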
11. . (dot): matches any single character except '\n'.
12. (pattern): matches pattern and captures the match for use as a backreference.
13. (?:pattern): matches pattern but does not capture the match. That is, it is a non-capturing match and is not stored for later use.
14. (?=pattern): positive lookahead. Matches at any position where a string matching pattern begins. This is a non-capturing match, that is, the match is not stored for later use. For example, 'Windows(?=95|98|NT|2000)' matches "Windows" in "Windows2000" but does not match "Windows" in "Windows3.1". A lookahead does not consume characters: after a match occurs, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead. For the same reason, anything that follows the lookahead in the pattern must match at the position right after "Windows", so a pattern such as 'Windows(?=95|98|NT|2000)XP' can never match.
15. (?!pattern): negative lookahead. Matches at any position where a string that does not match pattern begins. Like the positive lookahead, it is non-capturing and consumes no characters.
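The grouping and lookahead forms in items 11 to 15 can be tried out with a short sketch like the one below, which reuses the "Windows" example from the text. The variable names are only illustrative.

```php
<?php
// "." matches any single character except "\n".
$dotLetter  = preg_match('/a.c/', 'abc');
$dotNewline = preg_match('/a.c/', "a\nc");

// (pattern) captures; (?:pattern) groups without capturing,
// so $groups holds the whole match and buffer 1 only.
preg_match('/(Windows)(?:95|98)/', 'Windows98', $groups);

// Positive lookahead: the version number is checked but not consumed.
$pattern = '/Windows(?=95|98|NT|2000)/';
preg_match($pattern, 'Windows2000', $ahead);
$noMatch = preg_match($pattern, 'Windows3.1');

// Negative lookahead: matches only when none of the alternatives follow.
preg_match('/Windows(?!95|98|NT|2000)/', 'Windows3.1', $notAhead);

print_r($groups);
echo $ahead[0], "\n", $notAhead[0], "\n";
```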
16. x|y: matches x or y. For example, 'z|food' matches "z" or "food", and '(z|f)ood' matches "zood" or "food".
17. [xyz]: character set (character class). Matches any one of the enclosed characters. For example, '[abc]' matches the "a" in "plain". Inside a character class most special characters lose their special meaning: asterisks, plus signs, and parentheses are ordinary characters. The backslash still escapes the character that follows it. A hyphen between two characters describes a character range; if it appears in the first position, it is an ordinary character.
18. [^xyz]: negated character set. Matches any character that is not listed.
19. [a-z]: character range. Matches any character in the specified range.
20. [^a-z]: negated character range. Matches any character that is not in the specified range.
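A brief sketch of alternation and character classes follows. The test strings other than "zood" and "plain" are made up for the example.

```php
<?php
// Alternation inside a group: "zood" or "food".
$alt = preg_match('/^(z|f)ood$/', 'zood');

// [abc] matches the first character from the set, here the "a" in "plain".
preg_match('/[abc]/', 'plain', $set);

// [^a-z] matches anything outside the range, so removing it keeps only a-z.
$letters = preg_replace('/[^a-z]/', '', 'a1-b2_c3');

// Inside a class, * and + are ordinary, and a leading hyphen is literal.
preg_match_all('/[-*+]/', '1+2*3-4', $ops);

echo $set[0], "\n", $letters, "\n";
print_r($ops[0]);
```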
21. \b: matches a word boundary, that is, the position between a word character and a non-word character such as a space.
22. \B: matches a non-word boundary.
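The two boundary assertions can be compared with a sketch such as this one. The sample sentences are invented.

```php
<?php
// \bcat\b matches "cat" only as a whole word:
// the standalone "cat" and the "cat" before the full stop.
$whole = preg_match_all('/\bcat\b/', 'cat concat cats cat.', $m1);

// \Bcat matches "cat" only when it does not start a word: the one in "concat".
$inner = preg_match_all('/\Bcat/', 'cat concat cats', $m2);

echo $whole, "\n", $inner, "\n";
```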
23. \cx: matches the control character specified by x. For example, \cM matches a Control-M, or carriage return, character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
24. \d: matches a digit character. Equivalent to [0-9].
25. \D: matches a non-digit character. Equivalent to [^0-9].
26. \f: matches a form feed (page break).
27. \n: matches a line feed.
28. \r: matches a carriage return.
29. \s: matches any whitespace character, including spaces, tabs, and form feeds.
30. \S: matches any non-whitespace character.
31. \t: matches a tab.
32. \v: matches a vertical tab.
33. \w: matches any word character, including the underscore. Equivalent to [A-Za-z0-9_].
34. \W: matches any non-word character. Equivalent to [^A-Za-z0-9_].
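The shorthand classes above can be exercised with a few calls like the following. The record line and the phone number are hypothetical sample data.

```php
<?php
// Hypothetical record: an id, a tab, then a name with a double space.
$line = "id_42\tJohn  Smith\n";

// \d+ picks out runs of digits.
preg_match_all('/\d+/', $line, $digits);

// \s+ splits on any run of whitespace (spaces and tabs here).
$fields = preg_split('/\s+/', trim($line));

// \w+ picks out runs of word characters, underscore included.
preg_match_all('/\w+/', $line, $words);

// \D and \W are the negated forms.
$onlyDigits = preg_replace('/\D/', '', '(555) 010-9999');
$onlyWords  = preg_replace('/\W/', '', 'a-b c!');

// \cM is Control-M, that is, a carriage return.
$controlM = preg_match('/\cM/', "\r");

print_r($fields);
echo $onlyDigits, "\n", $onlyWords, "\n";
```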
35. \xn: matches n, where n is a hexadecimal escape value that must be exactly two digits long. For example, '\x41' matches "A".
36. \num: a backreference to a captured submatch, where num is a positive integer starting from 1, up to a maximum of 99. It matches the same text that was matched by the num-th parenthesized subexpression of the regular expression.
37. \n, where n is a digit: matches either an octal escape value or a backreference. If at least n subexpressions have been captured before it, \n is a backreference; otherwise it is read as an octal escape value.
38. \un: matches n, where n is a Unicode character represented by four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
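The escapes in items 35 to 38 can be sketched in PHP as shown below. One caveat: PHP's PCRE engine does not accept the \un form, so the sketch uses the \x{...} form together with the u modifier as the closest equivalent. The sample strings are made up.

```php
<?php
// \x41 is the hexadecimal escape for "A".
$hex = preg_match('/\x41/', 'A');

// \101 is read as octal (65, "A") because the pattern has no group 101.
$octal = preg_match('/\101/', 'A');

// \1 is a backreference because one group has been captured before it.
$backref = preg_match('/(a)\1/', 'aa');
$noBackref = preg_match('/(a)\1/', 'ab');

// Code point U+00A9; "\xC2\xA9" is the copyright sign encoded as UTF-8.
$copyright = preg_match('/\x{00A9}/u', "\xC2\xA9 example");

echo $hex, $octal, $backref, $noBackref, $copyright, "\n";
```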
This article is from the blog of "Tiger brother". Please be sure to keep this source: http://7613577.blog.51cto.com/7603577/1531057