Author: Vamei. Source: http://www.cnblogs.com/vamei. Reprints are welcome; please keep this statement. Thank you!
I'll start the tour of Python's standard library with regular expressions. Regular expressions are a common tool in text processing, and using them requires no additional knowledge of the operating system. The system-related packages are covered later in the series.
The primary purpose of a regular expression is to search a string for what you want by describing it with a pattern.
Syntax
Previously, we introduced the string-processing functions. Those functions are enough for simple searches, such as finding the substring "you" in the string "I love you". But sometimes we have only a vague idea of what we are looking for and cannot name it as exactly as "you". For example, we may want to find the digits contained in a string, and each digit can be anything from 0 to 9. Such a vague target can be written as a regular expression and passed to Python, so that Python knows what we are looking for.
(Official documentation)
Using regular expressions in Python requires the re package from the standard library.
import re
m = re.search('[0-9]', 'abcd4ef')
print(m.group(0))
re.search() takes two arguments. The first, '[0-9]', is the regular expression. It tells Python: "I'm looking for one digit character, from 0 to 9, in the string."
If re.search() finds a matching substring in the second argument, it returns an object m, and you can view the result of the search with the m.group() method. If no matching substring is found, re.search() returns None.
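Because a failed search returns None, calling group() on the result without checking it raises an AttributeError. A minimal sketch of checking first; the helper name first_digit is made up for this illustration.

```python
import re

def first_digit(text):
    """Return the first digit character in text, or None if there is none."""
    m = re.search('[0-9]', text)
    if m is None:        # nothing matched: re.search() returned None
        return None
    return m.group(0)    # the matched substring

found = first_digit('abcd4ef')
missing = first_digit('abcdef')
print(found)    # 4
print(missing)  # None
```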
If you are familiar with Linux or Perl, regular expressions should look familiar. In a Linux shell, we can use similar patterns (the shell's wildcard syntax is closely related to, but not identical with, regular expressions) to select the files we want to find or delete, for example:
$ rm book[0-9][0-9].txt
This deletes files with names like book02.txt. The pattern book[0-9][0-9].txt describes a file name that begins with "book", followed by two digit characters, followed by ".txt". File names that do not fit the pattern, such as:
bo12.txt
book1.txt
book99.text
will not be selected.
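The same selection can be reproduced with the re package. This is only a sketch: the file names are held in a plain list instead of being read from disk, and the pattern is a hand-written regular expression counterpart of the shell pattern. Note that in a regular expression the dot means "any character", so the literal dot before txt is escaped.

```python
import re

# Regex counterpart of the shell pattern book[0-9][0-9].txt
pattern = r'book[0-9][0-9]\.txt'

names = ['book02.txt', 'bo12.txt', 'book1.txt', 'book99.text']
# fullmatch() requires the whole file name to fit the pattern
selected = [name for name in names if re.fullmatch(pattern, name)]
print(selected)  # ['book02.txt']
```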
Perl has regular expressions built into the language itself, and its regular expression support is said to be the strongest of all regular expression systems. This is one reason why Perl is a favorite tool of system administrators.
Regular expression functions
m = re.search(pattern, string)  # search the whole string until a matching substring is found
m = re.match(pattern, string)   # check whether the string fits the pattern from its beginning; the match must start at the first character
Either of these two functions can be used to search. In the example above, re.match() would return None, because the string starts with 'a', which does not fit '[0-9]'.
To read the result from the returned m, we call m.group(). (m.group() is explained in more detail later.)
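The difference between the two functions can be seen side by side. A small sketch using the same string as before; the variable names are chosen only for this illustration.

```python
import re

text = 'abcd4ef'
m_search = re.search('[0-9]', text)  # scans the whole string
m_match = re.match('[0-9]', text)    # only tries at the first character

print(m_search.group(0))  # 4
print(m_match)            # None

# match() succeeds when the string itself starts with a digit
m_start = re.match('[0-9]', '4abcd')
print(m_start.group(0))   # 4
```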
We can also replace the substrings that a search finds:
new_string = re.sub(pattern, replacement, string)
# Search string for the regular expression pattern and replace each matching substring with the string replacement. Returns the resulting string.
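As a concrete case, re.sub() takes the pattern, the replacement, and the string to search, in that order. A minimal sketch that masks every digit; the sample string and the "#" replacement are arbitrary choices.

```python
import re

# Replace every digit character with "#"
masked = re.sub('[0-9]', '#', 'abcd4ef56')
print(masked)  # abcd#ef##
```

The original string is left unchanged; re.sub() returns a new string.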
In addition, commonly used regular expression functions include:
re.split()    # split a string wherever the regular expression matches, and return all the pieces in a list
re.findall()  # search a string with a regular expression, and return all the matching substrings in a list
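A short sketch of both functions on the same sample string, which is an arbitrary choice: split() returns what lies between the matches, while findall() returns the matches themselves.

```python
import re

parts = re.split('[0-9]', 'ab1cd2ef')     # cut the string at every digit
print(parts)   # ['ab', 'cd', 'ef']

digits = re.findall('[0-9]', 'ab1cd2ef')  # collect every digit
print(digits)  # ['1', '2']
```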
(Once you are familiar with the functions above, you can look at re.compile(), which can improve search efficiency.)
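A rough sketch of how re.compile() is typically used: the pattern is compiled once into a pattern object, whose search() method is then called for each string. The list of sample lines is made up for the illustration. Since the re module also caches recently used patterns, the gain in practice may be modest; the compiled form mostly helps when one pattern is reused many times.

```python
import re

digit = re.compile('[0-9]')   # compile the pattern once

lines = ['abcd4ef', 'no digits', 'x9']
hits = []
for line in lines:
    m = digit.search(line)    # same behavior as re.search('[0-9]', line)
    if m:
        hits.append(m.group(0))
print(hits)  # ['4', '9']
```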
Writing a regular expression
The key is to write the information you have into a regular expression. Let's look at the common syntax of regular expressions:
1) Single characters:
.       any single character (except a newline, by default)
a|b     the character a or the character b
[afg]   one character that is a, f, or g
[0-4]   one character in the range 0-4
[a-f]   one character in the range a-f
[^m]    one character that is not m
\s      one whitespace character (space, tab, newline, etc.)
\S      one non-whitespace character
\d      [0-9]
\D      [^0-9]
\w      [0-9a-zA-Z_]
\W      [^0-9a-zA-Z_]
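A few of these single-character patterns can be tried with re.findall(). This is a small sketch on an arbitrary sample string; note that in Python \w also matches the underscore.

```python
import re

text = 'a1 b_2'
digits = re.findall(r'\d', text)         # digit characters
spaces = re.findall(r'\s', text)         # whitespace characters
word_chars = re.findall(r'\w', text)     # letters, digits, underscore
not_lower = re.findall(r'[^a-z]', text)  # anything except a-z
a_or_b = re.findall(r'a|b', text)        # the character a or b

print(digits)      # ['1', '2']
print(spaces)      # [' ']
print(word_chars)  # ['a', '1', 'b', '_', '2']
print(not_lower)   # ['1', ' ', '_', '2']
print(a_or_b)      # ['a', 'b']
```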
2) Repetition
Placed right after a single character, these indicate how many times that character repeats:
*       repeat >= 0 times
+       repeat >= 1 times
?       repeat 0 or 1 times
{m}     repeat exactly m times. For example, a{4} is equivalent to aaaa, and [1-3]{2} is equivalent to [1-3][1-3]
{m,n}   repeat m to n times. For example, a{2,5} means a repeated 2 to 5 times. Fewer than m repetitions, or more than n, do not fit the pattern. (Do not put a space after the comma: Python's re reads a{2, 5} as literal text.)
Regular expression    Example of a matching string
[0-9]{3,5}            9678
a?b                   b
a+b                   aaaaab
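The examples in the table can be checked in Python. In this sketch, the helper name fits is hypothetical, and re.fullmatch() is used so that the whole string has to fit the pattern.

```python
import re

def fits(pattern, text):
    """True if the whole of text fits pattern."""
    return re.fullmatch(pattern, text) is not None

print(fits('[0-9]{3,5}', '9678'))  # True
print(fits('[0-9]{3,5}', '96'))    # False: fewer than 3 digits
print(fits('a?b', 'b'))            # True: a may appear 0 times
print(fits('a+b', 'aaaaab'))       # True
print(fits('a+b', 'b'))            # False: + needs at least one a
print(fits('a{2,5}', 'aaaaaa'))    # False: six a's is more than 5
```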
3) Position
^       the start of the string
$       the end of the string
Regular expression    Matching string    Non-matching string
^ab.*c$               abeec              cabeec (re.search() will not find a match here)
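The effect of the two anchors can be seen by running the same search with and without them. A minimal sketch using the strings from the table; the variable names are illustrative.

```python
import re

anchored = '^ab.*c$'
hit = re.search(anchored, 'abeec')    # starts with ab and ends with c
miss = re.search(anchored, 'cabeec')  # does not start with ab
print(hit.group(0))  # abeec
print(miss)          # None

# Without the anchors, search() finds the pattern inside the longer string
inner = re.search('ab.*c', 'cabeec')
print(inner.group(0))  # abeec
```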
4) Controlling what is returned
We can pick out a part of the search result. Take the following regular expression:
output_(\d{4})
Here the parentheses () enclose a smaller regular expression, \d{4}. This smaller expression selects the part of the result that we want (here, the four digits). A part of a regular expression enclosed in parentheses is called a group.
We can query a group with m.group(number). group(0) is the result for the entire regular expression, group(1) is the first group, and so on.
import re
m = re.search(r"output_(\d{4})", "output_1986.txt")  # raw string keeps the backslash intact
print(m.group(1))
We can also name a group, which makes the m.group() query more readable:
import re
m = re.search(r"output_(?P<year>\d{4})", "output_1986.txt")  # (?P<name>...) gives the group a name
print(m.group("year"))
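A pattern can hold several groups at once. The sketch below adds a second named group, ext, which is not part of the original example, and shows the match object's groups() and groupdict() methods alongside group().

```python
import re

m = re.search(r"output_(?P<year>\d{4})\.(?P<ext>\w+)", "output_1986.txt")
print(m.group(0))      # output_1986.txt  (the whole match)
print(m.group(1))      # 1986             (first group, by number)
print(m.group("ext"))  # txt              (second group, by name)
print(m.groups())      # ('1986', 'txt')
print(m.groupdict())   # {'year': '1986', 'ext': 'txt'}
```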
Exercise
There is a file named output_1981.10.21.txt. Using Python, read the date from the file name and work out which day of the week it falls on. Then rename the file to output_YYYY-MM-DD-W.txt (YYYY: four-digit year, MM: two-digit month, DD: two-digit day, W: one-digit day of the week, assuming Monday is the first day of the week).
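One possible approach, offered as a sketch rather than the intended solution, is shown below. It assumes the file name is spelled output_1981.10.21.txt in lowercase, uses a made-up helper name new_name, and only computes the new name without touching any file.

```python
import re
from datetime import date

def new_name(filename):
    """Build the renamed file name; return None if filename does not fit."""
    m = re.fullmatch(r"output_(\d{4})\.(\d{2})\.(\d{2})\.txt", filename)
    if m is None:
        return None
    year, month, day = (int(g) for g in m.groups())
    # isoweekday() counts Monday as 1 and Sunday as 7
    w = date(year, month, day).isoweekday()
    return "output_%04d-%02d-%02d-%d.txt" % (year, month, day, w)

result = new_name("output_1981.10.21.txt")
print(result)  # output_1981-10-21-3.txt
```

To actually rename a file on disk, the old and new names could then be passed to os.rename().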
Summary
re.search(), re.match(), re.sub(), re.findall()
How to compose a regular expression
Python standard library 01: regular expressions (the re package)