Python for Informatics, Chapter 11: Regular Expressions (2)
Note: The original article is from Python for Informatics by Dr Charles Severance.
11.1 Character matching in regular expressions
We can use many other special characters to build more powerful regular expressions. The most commonly used special character is the period ("."), which matches any single character. In the following example, the regular expression "F..m:" matches any of the strings "From:", "Fxxm:", "F12m:", or "F!@m:", because each period in the expression matches any character.
import re

hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    # '^' anchors at the start of the line; each '.' matches any one character
    if re.search('^F..m:', line):
        print(line)
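To check the claim about the period without needing the mbox-short.txt file, here is a minimal sketch that runs the same pattern over the sample strings from the text. The last two entries, 'Fm:' and 'XFrom:', are assumed counterexamples added for illustration; they are not from the original.

```python
import re

# Sample strings from the text, plus two assumed non-matching cases.
candidates = ['From:', 'Fxxm:', 'F12m:', 'F!@m:', 'Fm:', 'XFrom:']

# '^' anchors the match at the start of the string, and each '.'
# must consume exactly one character (any character).
matches = [s for s in candidates if re.search('^F..m:', s)]

print(matches)
```

'Fm:' fails because there are no two characters between "F" and "m:", and 'XFrom:' fails because the string does not start with "F".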
Regular expressions become especially powerful when you indicate that a character can be repeated any number of times, using the asterisk ("*") or the plus sign ("+"). The asterisk means that the preceding character may appear zero or more times in the searched string, while the plus sign means that it must appear one or more times.
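The difference between the two repetition characters shows up only when the repeated character is absent. The following is a small sketch; the sample strings and the patterns 'ab*c' and 'ab+c' are hypothetical and were chosen only for illustration.

```python
import re

# Hypothetical sample strings chosen for this illustration.
samples = ['ac', 'abc', 'abbbc']

# 'b*' allows zero or more 'b' characters between 'a' and 'c'.
star_matches = [s for s in samples if re.search('ab*c', s)]

# 'b+' requires at least one 'b' between 'a' and 'c'.
plus_matches = [s for s in samples if re.search('ab+c', s)]

print(star_matches)
print(plus_matches)
```

Only the asterisk version accepts 'ac', where the letter "b" does not appear at all.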
In the following example, we use a repeated wildcard to narrow down the lines we match:
import re

hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    # '.+' requires at least one character between 'From:' and '@'
    if re.search('^From:.+@', line):
        print(line)
The search string "^From:.+@" matches lines that start with "From:", followed by one or more arbitrary characters, followed by an "@" character. It therefore matches lines like the following:
From: stephen.marquard@uct.ac.za
Here the ".+" wildcard expands to match all the characters between the colon and the @ character.
From:.+@
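To see exactly which part of the line the pattern covers, one option is to inspect the match object returned by re.search. This is a minimal sketch using the sample line from the text; the variable names are illustrative.

```python
import re

line = 'From: stephen.marquard@uct.ac.za'

# re.search returns a match object on success, or None otherwise.
match = re.search('^From:.+@', line)

if match:
    # group() with no arguments returns the whole matched substring.
    matched_text = match.group()
    print(matched_text)
```

The matched text runs from "From:" through the @ character, and the rest of the address is left outside the match.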
You can think of the plus sign and the asterisk as "pushy" (greedy): they push the match outward to cover as much of the string as possible. For example, in the following line the ".+" expands all the way to the last @ character:
From: stephen.marquard@uct.ac.za, csev@umich.edu, and cwen@iupui.edu
It is possible to make the asterisk and the plus sign less greedy by adding another character after them; in Python's re module this is a question mark, as in "*?" and "+?". See the detailed documentation for more on turning off the greedy behavior.
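The greedy and non-greedy behaviors can be compared side by side. The sketch below assumes the extra symbol is the question mark, which is how Python's re module marks a non-greedy quantifier; the variable names are illustrative.

```python
import re

line = 'From: stephen.marquard@uct.ac.za, csev@umich.edu, and cwen@iupui.edu'

# Greedy: '.+' grabs as much as it can, so the match ends at the LAST '@'.
greedy = re.search('^From:.+@', line).group()

# Non-greedy: '.+?' grabs as little as it can, so the match ends at the FIRST '@'.
lazy = re.search('^From:.+?@', line).group()

print(greedy)
print(lazy)
```

Both patterns match the same line; they differ only in how much of it the match covers.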