First, Introduction
Regular expressions are used by many shell tools and commands. Understanding basic regular expressions, and the additional metacharacters that extended regular expressions provide, is essential for writing shell programs well.
A regular expression (RE) is a string made up of ordinary characters and metacharacters that describes a pattern matching one or more characters or strings in a text. Regular expressions are used for text searching and string manipulation, and for filtering data as it flows through a pipeline.
Second, Details
POSIX defines two kinds of regular expression: basic regular expressions (BRE) and extended regular expressions (ERE). Linux tools such as grep and sed use basic regular expressions by default.
1, Basic regular expressions
(1) A regular expression is built from ordinary characters and metacharacters. Characters such as a, b and 1 are ordinary characters and match themselves, while *, ^, [] and similar metacharacters carry a special meaning.
(2) Square brackets [] match any one character from a set. A "-" inside the brackets denotes a range running from the character on its left to the character on its right, for example [a-d]. Linux is case-sensitive, and for letters a range follows alphabetical order.
(3) The "^" symbol matches the beginning of a line. Inside "[]", however, a leading "^" no longer matches the beginning of the line but negates the set: for example, [^b-d] matches any character that is not in the range b to d (other letters, digits, spaces and so on). [A-Za-z][A-Za-z]* matches any English word.
(4) The "\<" and "\>" symbols match the beginning and the end of a word, so together they give an exact word match: \<the\> matches the word "the" but not words that merely contain those characters, such as "them" or "there".
(5) The "\{\}" family of symbols specifies how many times the preceding character repeats: \{n\} means exactly n times, \{n,\} at least n times, and \{n,m\} between n and m times. The "*" symbol, by contrast, can only mean zero or more repetitions.
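The basic metacharacters in items (1) to (5) can be tried out with grep, which uses basic regular expressions by default. The following is only a minimal sketch: the sample lines and variable names are invented for the illustration, LC_ALL=C is set so that ranges follow plain byte order, and \< and \> are assumed to be available (they are a GNU grep extension rather than part of POSIX).

```shell
# Use the C locale so that [a-o] means exactly the letters a to o.
export LC_ALL=C

# Made-up sample lines for this illustration.
sample='the cat
them all
over there
abc
xyz
fo
foo
fooo'

# ^ outside brackets anchors the match to the start of the line.
starts_with_the=$(printf '%s\n' "$sample" | grep -c '^the')

# ^ inside brackets negates the set: the first character is NOT in a-o.
not_a_to_o=$(printf '%s\n' "$sample" | grep -c '^[^a-o]')

# \< and \> match only at word boundaries (GNU grep extension).
whole_word_the=$(printf '%s\n' "$sample" | grep -c '\<the\>')

# \{2,3\} repeats the preceding "o" two or three times.
two_or_three_o=$(printf '%s\n' "$sample" | grep '^fo\{2,3\}$')

echo "lines starting with the:       $starts_with_the"
echo "lines not starting with a-o:   $not_a_to_o"
echo "lines with the whole word the: $whole_word_the"
echo "f followed by 2 or 3 o:"
echo "$two_or_three_o"
```

Only "the cat" contains "the" as a whole word, which is why the word-boundary count is lower than the count for '^the'.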
2, Extended regular expressions
awk, Perl and other Linux tools also support extended regular expressions, which add some further metacharacters.
(1) "?" matches the preceding character zero or one time. "+" matches it one or more times. "*" matches it zero or more times.
(2) "()" is usually combined with "|" to express a set of alternatives. When every alternative is a single character, "[]" can express the same set, so "()" is rarely needed in that case: re(a|e|o)d is equivalent to re[aeo]d.
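The extended metacharacters above can be checked with grep -E, which switches grep to extended regular expressions. This is a small sketch under stated assumptions: the word list and variable names are invented, and each pattern is anchored with ^ and $ so that it must match the whole line.

```shell
# Made-up word list for this illustration.
words='rd
red
reed
read
reod'

# ? : the preceding "e" appears zero or one time (rd, red).
optional_e=$(printf '%s\n' "$words" | grep -E -c '^re?d$')

# + : the preceding "e" appears one or more times (red, reed).
one_or_more_e=$(printf '%s\n' "$words" | grep -E -c '^re+d$')

# * : the preceding "e" appears zero or more times (rd, red, reed).
zero_or_more_e=$(printf '%s\n' "$words" | grep -E -c '^re*d$')

# (a|e|o) and [aeo] describe the same single-character choice.
with_alternation=$(printf '%s\n' "$words" | grep -E '^re(a|e|o)d$')
with_bracket_set=$(printf '%s\n' "$words" | grep -E '^re[aeo]d$')

echo "re?d: $optional_e  re+d: $one_or_more_e  re*d: $zero_or_more_e"
echo "re(a|e|o)d matches:"
echo "$with_alternation"
```

Both the alternation and the bracket set select reed, read and reod from this list.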
3, Wildcards
(1) The Bash shell does not apply regular expressions when it expands a command line; regular expressions are interpreted by commands and tools such as grep, sed and awk. The shell does, however, reuse some regular expression metacharacters to implement wildcards. The most common wildcard characters are ?, *, [], {} and ^, and their meanings are not exactly the same as in regular expressions.
As a wildcard, the * symbol no longer repeats the character in front of it; it stands for any string of any length, including the empty string. The ? symbol stands for any single character. The ^ symbol does not mark the beginning of a line; inside [] it negates the set.
(2) To make ls list the files whose names start with a letter from a to h and do not end in .txt, the wildcard [a-h]*.[^txt]* is used. "^" means negation, and "[]" has the same meaning as in regular expressions. Strictly speaking, [^txt] checks only one character, so this pattern excludes every extension that begins with t or x, not just .txt.
(3) In a basic regular expression, curly braces take effect only when preceded by the escape character: \{\} limits the number of repetitions. In a wildcard, {} instead encloses a comma-separated set of patterns. For example, {[a-h]*.txt,a?.log} matches all files that satisfy [a-h]*.txt or a?.log; the patterns inside {} are in an "or" relationship.
(4) The internal variable GLOBIGNORE holds the set of file name patterns to be ignored during wildcard expansion. The five symbols ?, *, [], {} and ^, together with the GLOBIGNORE variable, make up the shell's wildcard facilities.
(5) Wildcard expansion may have to search a large number of files or directories and output every match, which places high demands on CPU and memory. An attacker can deliberately submit file names containing wildcard characters so that the server performs expensive expansions over and over, which may amount to a denial-of-service attack. Servers therefore often limit how many times wildcard matching may be performed and which wildcard characters a user may submit in each request.
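The wildcard rules in items (1) to (4) can be tried out safely in a scratch directory. The following is a minimal sketch that assumes Bash, since brace expansion and GLOBIGNORE are Bash features; the file names and array names are invented for the illustration, and LC_ALL=C is set so that [a-h] and the sort order of the results follow plain byte order.

```shell
#!/usr/bin/env bash
export LC_ALL=C

# Work in a throwaway directory so that no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch apple.txt banana.log a1.log hello.c zebra.txt

# * matches any string, including the empty one.
txt_files=( *.txt )

# ? matches exactly one character.
a_one_log=( a?.log )

# [a-h] matches one character in the range; [^txt] matches one
# character that is not t or x (only the first one after the dot).
no_t_or_x_ext=( [a-h]*.[^txt]* )

# {} is expanded into separate patterns first, then each one is matched.
txt_or_log=( {[a-h]*.txt,a?.log} )

# GLOBIGNORE removes matching names from every wildcard result.
GLOBIGNORE='*.log'
without_logs=( * )
unset GLOBIGNORE

echo "*.txt               -> ${txt_files[*]}"
echo "a?.log              -> ${a_one_log[*]}"
echo "[a-h]*.[^txt]*      -> ${no_t_or_x_ext[*]}"
echo "{[a-h]*.txt,a?.log} -> ${txt_or_log[*]}"
echo "* with GLOBIGNORE   -> ${without_logs[*]}"

# Leave and remove the scratch directory.
cd - >/dev/null || exit 1
rm -rf -- "$demo_dir"
```

Storing each result in an array keeps the expansion in the current shell, so the effect of GLOBIGNORE is visible without starting another process.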
Third, Summary
(1) The shell itself does not interpret regular expressions when it expands a command line, and the meanings of its wildcard characters are not exactly the same as those of the corresponding regular expression symbols.
(2) This is only a brief introduction to shell programming; further topics will be covered in more detail later.