Python regular-expression learning
Learning resources from Ubuntu Wiki
Introduction to Regular expressions
Nearly every programming language has a regular-expression library. In essence, a regular expression (or RE) is a small, highly specialized programming language.
Simple patterns
Character matching
- Most letters and characters simply match themselves. For example, the regular expression test matches the string "test" exactly.
- Regular expressions use some metacharacters, which do not match themselves but change how the characters around them are matched:
"[" and "]"
: Used to specify a character class, that is, the set of characters you want to match. The characters can be listed individually, or a range can be given as two characters separated by "-", as in [a-z].
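\
: followed by certain characters, denotes one of the predefined character sets below
\d
: matches any decimal digit; equivalent to [0-9]
\D
: matches any non-digit character; equivalent to [^0-9]
\s
: matches any whitespace character; equivalent to [ \t\n\r\f\v]
\S
: matches any non-whitespace character; equivalent to [^ \t\n\r\f\v]
\w
: matches any alphanumeric character; equivalent to [a-zA-Z0-9_]
\W
: ...; equivalent to [^a-zA-Z0-9_]
In Python 3 these equivalences hold exactly for ASCII text; on Unicode strings \d, \s, and \w also match non-ASCII digits, whitespace, and word characters.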
For example, [\s,.] matches any whitespace character, a ",", or a ".".
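The character classes above can be tried out directly. The following is a minimal sketch, assuming a made-up sample string `text` (not from the original article) and using `re.findall` to list every character each class matches:

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "a1 b2,c3."

lowercase = re.findall(r"[a-z]", text)     # a range inside a character class
digits = re.findall(r"\d", text)           # same as [0-9] on ASCII text
non_digits = re.findall(r"\D", text)       # same as [^0-9] on ASCII text
separators = re.findall(r"[\s,.]", text)   # whitespace, "," or "."

print(lowercase)
print(digits)
print(non_digits)
print(separators)
```

Note that inside the brackets "." loses its special meaning and only matches a literal period.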
Repetition
Another feature of regular expressions is that you can specify the number of repetitions of a part of a regular expression.
"*"
- The first repetition metacharacter is "*". It does not match the literal character "*"; instead, it specifies that the previous character can be matched zero or more times. "ca*t" matches "ct", "cat", and "caaat".
"+"
- "+" matches one or more times, so the previous character must occur at least once. "ca+t" matches "cat" and "caaat" but not "ct".
- {m,n} (note that there must be no spaces inside the braces) means at least m and at most n repetitions.
{0,} is the same as *, {1,} is the same as +, and {0,1} is the same as ?, which matches zero or one time. When you can, prefer *, +, or ?, simply because they are shorter and easier to read.
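To make the difference between the repetition operators concrete, here is a small sketch. The helper `full_matches` and the word list are hypothetical names introduced for illustration; `re.fullmatch` is used so that a pattern only counts when it covers the whole word:

```python
import re

# Hypothetical test words, taken from the examples in the text.
candidates = ["ct", "cat", "caaat"]

def full_matches(pattern, words):
    """Return the words that the pattern matches in their entirety."""
    return [w for w in words if re.fullmatch(pattern, w)]

star = full_matches(r"ca*t", candidates)        # zero or more "a"
plus = full_matches(r"ca+t", candidates)        # one or more "a"
optional = full_matches(r"ca?t", candidates)    # zero or one "a"
ranged = full_matches(r"ca{1,2}t", candidates)  # one or two "a"

print(star)
print(plus)
print(optional)
print(ranged)
```

Replacing `ca*t` with `ca{0,}t`, or `ca+t` with `ca{1,}t`, should give the same lists, which is the equivalence described above.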
Using regular expressions in Python
Compiling regular expressions
Regular expressions are compiled into RegexObject instances (called re.Pattern in Python 3), which provide methods for different operations, such as pattern-matching searches or string substitutions.
>>> import re
>>> p = re.compile('ab*')
>>> print(p)
re.compile('ab*')
The backslash problem
- To match a literal backslash, the regular expression itself needs two backslashes, because the backslash is also the RE escape character. In an ordinary Python string literal each of those backslashes must be escaped again, so four backslashes are needed to match a single one.
- Raw strings avoid this: in a string literal prefixed with "r", backslashes are not handled in any special way. So r"\n" is a two-character string containing "\" and "n", while "\n" is a single character, a newline. Regular expressions in Python code are typically written as raw strings.
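The two points above can be checked with a short sketch. The sample string `path` is an assumption made for illustration; both patterns below describe the same regular expression, one written as an ordinary string literal and one as a raw string:

```python
import re

# An ordinary literal turns \n into one newline character;
# a raw literal keeps the backslash and the "n" as two characters.
newline_len = len("\n")
raw_len = len(r"\n")

# Hypothetical sample: the 7 characters C : \ t e m p
path = "C:\\temp"

plain = re.search("\\\\", path)  # four backslashes in an ordinary literal
raw = re.search(r"\\", path)     # two backslashes in a raw literal

print(newline_len, raw_len)
print(plain.span(), raw.span())
```

Both searches find the single backslash at the same position, so the raw form is simply the easier one to read.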
Performing matches
A few important methods of RegexObject:
match()
: Determines whether the RE matches at the beginning of the string
search()
: Scans the string for the first location where the RE matches
findall()
: Finds all substrings matching the RE and returns them as a list
finditer()
: Finds all substrings matching the RE and returns an iterator of match objects
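The four methods can be compared side by side. This is a sketch under assumptions: the pattern `[a-z]+` and the sample string `text` are chosen for illustration and are not part of the original article:

```python
import re

p = re.compile(r"[a-z]+")          # one or more lowercase letters
text = "123 tempo and rubato"      # hypothetical sample text

at_start = p.match(text)           # None: the string starts with digits
first = p.search(text)             # first match anywhere in the string
all_words = p.findall(text)        # list of matched substrings
spans = [m.span() for m in p.finditer(text)]  # one match object per hit

print(at_start)
print(first.group())
print(all_words)
print(spans)
```

Because match() and search() return None when nothing matches, it is usual to test the result before calling methods on it.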
Several important methods of MatchObject:
>>> p = re.compile('[a-z]+')
>>> m = p.match('tempo')
>>> print(m)
<re.Match object; span=(0, 5), match='tempo'>
group()
: Returns the string matched by the RE
start()
: Returns the position where the match starts
end()
: Returns the position where the match ends
span()
: Returns a tuple containing the (start, end) positions of the match
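As an illustration of these methods, the sketch below searches a made-up string (an assumption, not from the original article) in which the match does not begin at position 0, so start() and end() are easier to tell apart:

```python
import re

p = re.compile(r"[a-z]+")
m = p.search("::: tempo :::")   # hypothetical sample text

if m is not None:
    matched = m.group()         # the matched text
    start = m.start()           # index of the first matched character
    end = m.end()               # index just past the last matched character
    span = m.span()             # (start, end) as a tuple
    print(matched, start, end, span)
```

The end position is exclusive, so slicing the original string with the span gives back the matched text.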
An example:
import re
url = "http://www.oschina.net/?code=QlSJi2&state="
pattern = re.compile(r'code=\w+&')  # "code=", one or more word characters, then "&"
match = pattern.search(url)
if match:
    print(match.group(), match.span())