Python Regular Expression Tutorial: Basics
Preface
A while ago someone gave me a requirement that I thought was best solved with a regular expression. Until then I had only ever picked up regular expressions ad hoc, whenever a task needed them, so this time I studied them properly while completing the task. My reference was a video, Regular Expressions, from PyCon 2016.
I will summarize regular expressions over several articles.
This is the first part:
Basic
This section summarizes the most basic regular expression syntax. I (like most programmers) use most of it regularly, so I only list it here and illustrate just a few items with examples.
.        any character except a line break
^        beginning of line
$        end of line
[abcd]   any one of the characters a, b, c, d
[^abcd]  any character except a, b, c, d
[a-d]    equivalent to [abcd]
[a-dz]   equivalent to [abcdz]
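\b       word boundary
\w       a letter, digit, or underscore; equivalent to [a-zA-Z0-9_]
\W       the opposite of \w
\d       a digit; equivalent to [0-9]
\D       the opposite of \d
\s       a whitespace character; equivalent to [ \t\n\r\f\v]
\S       the opposite of \s
(These equivalences hold for ASCII text. With Python 3 str patterns, \w, \d, and \s also match non-ASCII word characters, digits, and whitespace unless re.ASCII is set.)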
{5}      the preceding part of the expression (written ~ below) appears exactly 5 times
{2,5}    ~ appears 2 to 5 times
{2,}     ~ appears 2 or more times
{,5}     ~ appears 0 to 5 times
*        ~ appears 0 or more times
?        ~ appears 0 or 1 time
+        ~ appears 1 or more times
ABC|DEF  matches ABC or DEF
\        escape character; for example, \* matches a literal * and \$ matches a literal $
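The quantifiers and character classes listed above can be tried out with a short script. This is only a sketch: the sample strings and variable names are made up for illustration and do not come from the original task.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = 'id=42, code=12345, zip=987654'

# \d{5} matches exactly five digits; \b on both sides stops it from
# matching five digits taken out of a longer run.
five_digits = re.findall(r'\b\d{5}\b', text)

# \d{2,5} matches runs of 2 to 5 digits, taking as many as it can.
two_to_five = re.findall(r'\d{2,5}', text)

# [a-d]+ matches one or more characters in the range a to d.
a_to_d = re.findall(r'[a-d]+', 'abcdxyzcab')

# [^a-d]+ matches one or more characters outside that range.
not_a_to_d = re.findall(r'[^a-d]+', 'abcdxyzcab')

# ABC|DEF matches either alternative.
alternatives = re.findall(r'ABC|DEF', 'xxABCyyDEFzz')

print(five_digits)    # ['12345']
print(two_to_five)    # ['42', '12345', '98765']
print(a_to_d)         # ['abcd', 'cab']
print(not_a_to_d)     # ['xyz']
print(alternatives)   # ['ABC', 'DEF']
```

Note that \d{2,5} takes only the first five digits of 987654; the leftover single 4 is too short to match on its own.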
\b and \ are illustrated briefly with the following examples:
\b:
>>> import re
>>> re.search(r'\bhello\b', 'hello')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search(r'\bhello\b', 'hello world')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search(r'\bhello\b', 'hello,world')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search(r'\bhello\b', 'hello_world')
>>>
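In practice, \b behaves much like \W: both mark the place where a word stops. The difference is that \b is a zero-width assertion. It consumes no character, so it also matches at positions with no visible character, such as the beginning and end of the string, where \W has nothing to match. In the last example the underscore counts as a word character, so there is no boundary after hello and the search returns None.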
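The difference between \b and \W can be seen by comparing what each one consumes. The following is a minimal sketch with made-up variable names:

```python
import re

# \b is zero-width: the match covers only 'hello'.
m_b = re.search(r'hello\b', 'hello,world')

# \W consumes the comma, so the comma becomes part of the match.
m_w = re.search(r'hello\W', 'hello,world')

# At the end of the string \b still matches, but \W finds no character.
end_b = re.search(r'hello\b', 'hello')
end_w = re.search(r'hello\W', 'hello')

print(m_b.group())    # hello
print(m_w.group())    # hello,
print(end_b.group())  # hello
print(end_w)          # None
```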
\:
>>> re.search(r'\$100', '$100')
<_sre.SRE_Match object; span=(0, 4), match='$100'>
>>> re.search(r'$100', '$100')
>>>
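To match a character that has a special meaning in regular expressions, such as $, ^, or *, escape it with a backslash. Unescaped, $ means end of line, so the pattern $100 asks for 100 after the end of the line and can never match.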
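When the text to match comes from a variable rather than being typed by hand, the standard library function re.escape can add the backslashes. This goes slightly beyond the examples above and is only a sketch; the variable names and sample text are invented for illustration.

```python
import re

# Hypothetical literal text that contains a regex metacharacter.
price = '$100'

# re.escape puts a backslash in front of the special character $.
pattern = re.escape(price)

m = re.search(pattern, 'total: $100')

print(pattern)     # \$100
print(m.group())   # $100
print(m.span())    # (7, 11)
```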
Raw string:
In the examples above, the pattern string is prefixed with r, which marks it as a raw string: the Python interpreter does not process backslash escapes inside it. The backslash has a special meaning both in Python string literals and in regular expressions, so without a raw string, matching one literal \ takes four backslashes. The Python interpreter first turns each pair \\ into a single \, and the regular expression engine then reads the remaining \\ as one escaped \. For example:
>>> re.search(r'\bhello\b', 'hello')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search('\bhello\b', 'hello')
>>> re.search('\\bhello\\b', 'hello')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search('\\\\hello\\\\', '\\hello\\')
<_sre.SRE_Match object; span=(0, 7), match='\\hello\\'>
>>> re.search(r'\\hello\\', '\\hello\\')
<_sre.SRE_Match object; span=(0, 7), match='\\hello\\'>
>>> print('\\hello\\')
\hello\
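Comparing the strings themselves, before they ever reach the re module, may make the two layers of escaping easier to see. This is a small sketch with made-up variable names:

```python
import re

# In an ordinary string, '\b' is one backspace character (\x08).
plain = '\bhello\b'

# In a raw string the backslash is kept, so the regex engine sees \b.
raw = r'\bhello\b'

# Doubling the backslashes in an ordinary string produces the same text.
doubled = '\\bhello\\b'

print(len(plain), len(raw), len(doubled))  # 7 9 9
print(plain[0] == '\x08')                  # True
print(raw == doubled)                      # True

# A regex that matches one literal backslash is \\ (two characters),
# which takes four backslashes in an ordinary string literal.
one_backslash_pattern = '\\\\'
print(one_backslash_pattern == r'\\')      # True

# 'a\\b' is the three-character text a\b; the backslash sits at index 1.
backslash_span = re.search(one_backslash_pattern, 'a\\b').span()
print(backslash_span)                      # (1, 2)
```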
Summary
That covers the basics of Python regular expressions. This is enough for everyday use; some special cases need more advanced features, which the following articles will cover. I hope this article helps you in your study or work. If you have any questions, you can leave a message.