Regular expressions are an extremely useful text-processing technology, but they can be difficult to use. Python's re module makes many useful improvements over basic regular expressions, which makes it valuable for programmers who need to process text.
This article is a quick-start tutorial for readers who are not familiar with regular expressions. It should also help readers who already know regular expressions from other languages, because it shows what is specific to regular expressions in Python.
1. What is a regular expression?
When writing a program or web page that processes strings, you often need to find strings that conform to certain complex rules (or patterns). Regular expressions are the tool used to describe these rules (or patterns). In other words, a regular expression is code that records a rule about text. Once the required text is found, you can modify it accordingly.
Recall the wildcards used for file searches on the Windows command line, namely * and ?. To find all the PDF files in a directory, you only need to search for *.pdf. Here, * is interpreted as any string. Like wildcards, Python regular expressions are a tool for matching text, but they can describe what you need far more precisely than wildcards can, for example, finding all the phone numbers on a web page.
Telephone numbers generally have a fixed format: area code, hyphen, telephone number. That is, a 0 followed by 2 or 3 digits, then a hyphen "-", then 7 or 8 digits (for example, 010-12345678 or 0634-1234567).
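The phone-number rule described above can be sketched as a regular expression. This is only a minimal illustration: the names PHONE and page and the sample text are made up for this example, and the pattern uses \d (any digit) and {m,n} (repetition counts), which have not been introduced yet.

```python
import re

# A literal 0, then 2-3 digits, a hyphen, then 7-8 digits.
# PHONE and page are hypothetical names used only for this sketch.
PHONE = re.compile(r"0\d{2,3}-\d{7,8}")

page = "Call 010-12345678 or 0634-1234567; order no. 12-345 is not a phone number."

# findall returns every non-overlapping match as a list of strings.
print(PHONE.findall(page))
```

A pattern this simple would also match inside a longer run of digits, so real pages may need a stricter expression.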
2. The simplest regular expression
The best way to learn regular expressions is to start with concrete examples and experiment with them yourself. The following are some simple examples with detailed descriptions. To look for to in a string, you can use the regular expression to. This is almost the simplest regular expression there is. It matches exactly this string: two characters, the first of which is t and the second of which is o.
For demonstration, we provide a function re_show(), which can be regarded as a thin wrapper around the re module. It matches a given string against a regular expression and marks the parts that match. (Saying that a string matches a regular expression usually means that one or more parts of the string satisfy the conditions the expression describes.)
We will not describe this function in further detail. It is enough to know that the first parameter of re_show() is a regular expression and the second parameter is the string to search, and that every match found is enclosed in curly brackets. The source code is as follows:
- import re
- def re_show(pat, s):
-     # Wrap every match of pat in s in curly brackets and print the result.
-     print(re.compile(pat, re.M).sub(r"{\g<0>}", s.rstrip()), '\n')
- s = '''Python runs on Windows, Linux/Unix,
- Mac OS X, OS/2, Amiga, Palm Handhelds, and Nokia mobile phones.
- Python has also been ported to the Java and .NET virtual machines.'''
- re_show("to", s)
Specifically, the call re_show("to", s) checks whether the string s contains the substring to, in other words, whether s matches the Python regular expression to. Every match is enclosed in curly brackets. The code above produces the following output:
- Python runs on Windows, Linux/Unix,
- Mac OS X, OS/2, Amiga, Palm Handhelds, and Nokia mobile phones.
- Python has also been ported {to} the Java and .NET virtual machines.
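The same search can be sketched with the re module directly, without re_show(). This is a minimal illustration: the variable names and the second sample sentence are made up for this example. It also shows that a literal pattern such as to matches anywhere it occurs, including inside longer words.

```python
import re

text = "Python has also been ported to the Java and .NET virtual machines."

# re.search returns a match object for the first match, or None if there is none.
match = re.search("to", text)
if match:
    # start() and end() give the position of the match within text.
    print(match.group(), match.start(), match.end())

# A literal pattern is not limited to whole words:
# it also matches inside "tomorrow" and "store".
hits = re.findall("to", "tomorrow I go to the store")
print(hits)
```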