Introduction:
Regular expressions are widely used in text-processing programs, and mastering them is a basic skill. This article describes the principles and applications of regular expressions and gives detailed, scenario-based examples. Whether you use Vim, sed, awk, grep, or another program, you should find help here. In addition, IDEs such as VS2010 make it easy to replace a word for quick editing; this article shows how to do the same in Vim, and also covers Vim's more powerful replacement features, such as "swapping the first two parameters of a function". Proficiency with regular expressions helps you edit text quickly.
This article comes from: A concise tutorial on regular Expressions--grep Vim's find and replace instance
1. Basic Knowledge
What is a character?
Simply put, a character is an abstract symbol: a given symbol expresses a given meaning. Generally speaking, a character can carry one of three kinds of meaning: ordinary character, special character, or escape character. For example, in ASCII text, a denotes the letter a (ordinary character); \ escapes the character that follows it, and % introduces formatted output (special characters); \n represents a newline (escape character), so the n no longer carries its own meaning (whether ordinary or special). The escape character can therefore give an ordinary character a special meaning, as in \a, and it can also give a special character its ordinary meaning, as in \%. Different language systems have different special characters and different escape characters. For example, in Chinese, "。" marks the end of a sentence, but when it is placed inside quotation marks it only stands for the symbol "。" itself. (So, in theory, you could turn every escape sequence and every special formatting character into an ordinary character if you wished, but your symbol system would then need more characters to tell them apart.)
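The three kinds of meaning can be seen in a short Python sketch. Python and its `re` module are used here only for illustration (the article itself targets grep, Vim, sed, and awk), and the sample string is made up.

```python
import re

# Escape character in a string literal: backslash + n is ONE newline character.
newline_len = len("\n")   # the escape sequence collapses to a single character
raw_len = len(r"\n")      # a raw string keeps the backslash and the n separately

sample = "cat cut c.t d1 d2"  # hypothetical sample text

# Special character: "." means "any character", so all three words match.
dot_special = re.findall(r"c.t", sample)

# Escaping a special character gives it back its ordinary meaning.
dot_literal = re.findall(r"c\.t", sample)

# Ordinary character: "d" matches only the letter d.
d_ordinary = re.findall(r"d", sample)

# Escaping an ordinary character can give it a special meaning: "\d" is a digit.
d_escaped = re.findall(r"\d", sample)

print(newline_len, raw_len)
print(dot_special, dot_literal)
print(d_ordinary, d_escaped)
```

The same backslash plays both roles: it removes the special meaning of `.` and adds one to `d`.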
2. Regular expressions
As you know, a regular expression in the usual sense involves two parts: the regular expression engine (for example grep or C#) and the regular expression syntax. A pattern written in that syntax has to be fed to an engine before it can do any work; it is important to understand this. This article mainly covers the syntax of regular expressions.
Now let's look at the special and escape characters in regular expressions (note that different languages treat special characters differently; for example, $ is an ordinary character in C but not in regular expressions):
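As a minimal sketch of the split between syntax and engine, the snippet below uses Python's `re` module as the engine; this is an assumption made for illustration, since the article's own engines are grep, Vim, and similar tools. The pattern and the text are invented.

```python
import re

pattern_text = r"gr[ae]y"      # syntax: at this point just a string of characters
text = "gray grey groy"        # hypothetical input

# Without an engine, the pattern is only text; a plain substring search
# looks for the literal characters "gr[ae]y" and finds nothing.
plain_index = text.find(pattern_text)

# Handing the pattern to an engine gives the syntax its meaning.
engine = re.compile(pattern_text)
matches = engine.findall(text)

print(plain_index, matches)
```

The same pattern text may behave slightly differently in another engine, which is why the engine and the syntax are worth keeping apart in your head.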
2.1 Special characters:
\ [ ] ^ % $ * + . | ( ) -
Special character | Description |
$ | Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$. |
() | Mark the start and end of a subexpression. The subexpression can be captured for later use. To match these characters, use \( and \). |
* | Matches the preceding subexpression zero or more times. To match the * character, use \*. |
+ | Matches the preceding subexpression one or more times. To match the + character, use \+. |
. | Matches any single character except the newline character \n. To match the . character, use \. |
[] | Marks the beginning of a bracket expression. To match [, use \[. |
? | Matches the preceding subexpression zero or one time, or indicates a non-greedy qualifier. To match the ? character, use \?. |
\ | Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character 'n', '\n' matches a newline, the sequence '\\' matches '\', and '\(' matches '('. |
^ | Matches the starting position of the input string, unless it is used inside a bracket expression, where it negates the character set. To match the ^ character itself, use \^. |
{} | Marks the beginning of a quantifier expression. To match {, use \{. |
| | Indicates a choice between two items. To match |, use \|. |
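The table can be exercised with a small Python sketch. Python's `re` module stands in for the engine here (an assumption; the table's wording follows the RegExp object, and engines differ in details such as how $ treats a trailing newline), and all sample strings are made up.

```python
import re

nums = "10 20 30"

end_anchor = re.findall(r"\d+$", nums)                  # $ : only the last number
start_anchor = re.findall(r"^\d+", nums)                # ^ : only the first number
multiline_end = re.findall(r"\d+$", "10\n20", re.MULTILINE)  # $ at each line end

groups = re.search(r"(\w+)@(\w+)", "user@host").groups()     # () : subexpressions

star = re.findall(r"ab*", "a ab abb")                   # * : zero or more b
plus = re.findall(r"ab+", "a ab abb")                   # + : one or more b
optional = re.findall(r"colou?r", "color colour")       # ? : zero or one u

greedy = re.findall(r"<.+>", "<a><b>")                  # + alone is greedy
lazy = re.findall(r"<.+?>", "<a><b>")                   # ? after + makes it non-greedy

dot = re.findall(r"a.c", "abc a-c a\nc")                # . : anything but newline
bracket = re.findall(r"[aeiou]", "regex")               # [] : one of the listed characters
negated = re.findall(r"[^aeiou]", "regex")              # ^ inside [] negates the set
braces = re.findall(r"\d{2}", "1 22 333")               # {} : exactly two digits
choice = re.findall(r"cat|dog", "cat dog bird")         # | : either alternative
escaped = re.findall(r"\$\d+", "cost $5 or $10")        # \$ : a literal dollar sign

for name in ("end_anchor", "start_anchor", "multiline_end", "groups", "star",
             "plus", "optional", "greedy", "lazy", "dot", "bracket", "negated",
             "braces", "choice", "escaped"):
    print(name, "=", globals()[name])
```

Comparing `star` with `plus`, and `greedy` with `lazy`, shows how a single extra character changes what the engine accepts.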
2.2 Escape characters
\d
Represents a digit character, equivalent to [0-9].
\D
Represents a non-digit character, equivalent to [^0-9].
\f
Represents a form-feed character (Unix).
\n
Represents a line-feed character (starts a new line).
\r
Represents a carriage-return character (returns to the start of the line).
\s
Represents any whitespace character, including space, tab, form feed, and newline.
\S
Represents any non-whitespace character.
\t
Represents a tab.
\v
Represents a vertical tab (Unix).
\w
Represents any one of the display characters, including underscores (numbers,