"Main Content" 1. Formatting; 2. Regular expressions
"Basic Knowledge"
First, formatting
(i) Old-style formatting with %
1. Basic form: string % data
The string holds the text plus placeholders for the values to interpolate; each placeholder consists of % followed by a letter. Conversion types:
%s string
%d decimal integer   %x hexadecimal integer   %o octal integer
%f decimal floating-point number   %e floating-point number in scientific notation
%% literal percent sign
"Example" >>> "Our cat %s weighs %s pounds." % (cat, weight)
2. Format settings
%10s: minimum field width of 10 characters, right-aligned, padded with spaces on the left
%-10s: minimum field width of 10 characters, left-aligned, padded with spaces on the right
%10.4s: minimum field width of 10 with a maximum of 4 characters; a longer string is truncated to 4 characters (for a floating-point number, .4 is the precision: 4 digits after the decimal point, such as 123.2000)
"Summary" for %A.BC (C is the conversion type):
A is the minimum field width (right-aligned by default, left-aligned when negative), and B is the maximum number of characters (for a floating-point number, the number of digits after the decimal point)
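The conversion types and the %A.BC summary above can be tried out with a short sketch. The values of n, f and s are sample data chosen only for illustration.

```python
# Sample values, chosen only for illustration.
n = 42
f = 123.2
s = "string cheese"

# Conversion types
as_dec = "%d" % n       # '42'
as_hex = "%x" % n       # '2a'
as_oct = "%o" % n       # '52'
as_float = "%f" % f     # '123.200000' (six decimals by default)
as_exp = "%e" % f       # '1.232000e+02'
percent = "%d%%" % n    # '42%' (%% produces a literal percent sign)

# Width, alignment and precision (%A.BC)
right = "%10s" % "cat"  # '       cat' (padded on the left)
left = "%-10s" % "cat"  # 'cat       ' (padded on the right)
cut = "%10.4s" % s      # '      stri' (truncated to 4, padded to 10)
prec = "%10.4f" % f     # '  123.2000' (4 digits after the decimal point)

for text in (as_dec, as_hex, as_oct, as_float, as_exp, percent,
             right, left, cut, prec):
    print(repr(text))
```

Printing with repr() makes the padding spaces visible.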
(ii) New-style formatting with {} and format
1. Arguments in order
>>> n = 42
>>> f = 7.03
>>> s = "string cheese"
>>> '{}{}{}'.format(n, f, s)
'427.03string cheese'
2. Arguments by position
>>> '{2}{0}{1}'.format(n, f, s)
'string cheese427.03'
3. Arguments by name
>>> '{n}{f}{s}'.format(n=42, f=9.443, s="string cheese")
'429.443string cheese'
Format specifiers: the specifier goes after ":" inside {}.
{0:d} integer
{0:10d} minimum field width of 10 (numbers are right-aligned by default)
{0:>10d} right-align   {0:<10d} left-align   {0:^10d} center
{0:!>10d} pad with ! instead of spaces
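A small sketch of the specifiers listed above, applied to one sample integer (the value 42 and the variable names are only for illustration):

```python
n = 42  # sample value

plain = "{0:d}".format(n)        # '42'
wide = "{0:10d}".format(n)       # '        42' (right-aligned by default)
left = "{0:<10d}".format(n)      # '42        '
centered = "{0:^10d}".format(n)  # '    42    '
filled = "{0:!>10d}".format(n)   # '!!!!!!!!42'

for text in (plain, wide, left, centered, filled):
    print(repr(text))
```

The fill character, when given, comes first, followed by the alignment sign and then the width.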
Second, matching with regular expressions
Note: import re is required
Pattern syntax:
. matches any single character;
* matches any number of repetitions of the preceding character (including 0);
.* matches any number of characters (including 0);
? makes the preceding character optional (0 or 1 occurrences).
Example: .*frank matches frank preceded by any number of characters
n. matches n followed by exactly one character
n.? matches n followed by zero or one character (the character is optional)
(i) re.match(pattern, source): checks whether source starts with pattern
>>> import re
>>> source = "Young Frankenstein"
>>> m = re.match("You", source)
>>> if m:
...     print(m.group())
Output: You (nothing is printed if source does not start with the pattern)
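As a minimal sketch of what happens when the pattern is not at the start of the source, and how a leading .* works around it (the variable names are illustrative):

```python
import re

source = "Young Frankenstein"

# "Frank" is in the source, but not at the start, so match returns None.
m_start = re.match("Frank", source)

# .* absorbs the leading "Young ", so the pattern now matches from the start.
m_any = re.match(".*Frank", source)

print(m_start)        # None
print(m_any.group())  # Young Frank
```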
(ii) re.search(pattern, source): checks whether source contains pattern anywhere and returns the first match
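A short sketch of search on the same sample string. Unlike match, it finds the pattern anywhere in the source.

```python
import re

source = "Young Frankenstein"

m = re.search("Frank", source)
if m:
    print(m.group())  # Frank
    print(m.span())   # (6, 11), the start and end offsets of the match
```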
(iii) re.findall(pattern, source): finds all non-overlapping substrings of source that match pattern
"1"
>>> m = re.findall("n", source)
>>> if m:
...     print(m)
['n', 'n', 'n', 'n']
"2"
>>> m = re.findall("n.", source)
>>> if m:
...     print(m)
['ng', 'nk', 'ns']
"3"
>>> m = re.findall("n.?", source)
>>> if m:
...     print(m)
['ng', 'nk', 'ns', 'n']
(iv) re.split(pattern, source): splits source at every match of pattern and returns the pieces as a list
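A minimal sketch of split on the same sample string, using "n" as the separator pattern:

```python
import re

source = "Young Frankenstein"

parts = re.split("n", source)
print(parts)  # ['You', 'g Fra', 'ke', 'stei', '']
```

The trailing empty string appears because the source ends with the separator.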
(v) re.sub(pattern, replacement, source): replaces every match of pattern in source with replacement
m = re.sub('n', '?', source)  # replace every n in source with ?
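For reference, a sketch of the result on the sample string, plus the optional count argument of re.sub that limits the number of replacements:

```python
import re

source = "Young Frankenstein"

replaced_all = re.sub("n", "?", source)
replaced_one = re.sub("n", "?", source, count=1)

print(replaced_all)  # You?g Fra?ke?stei?
print(replaced_one)  # You?g Frankenstein
```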
Complete reference of regular expression symbols
Character |
Description |
\ |
Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline. The sequence '\\' matches "\" and '\(' matches "(". |
^ |
Matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'. |
$ |
Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. |
* |
Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}. |
+ |
Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}. |
? |
Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}. |
{n} |
n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the "o" in "Bob", but matches the two o's in "food". |
{n,} |
n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the "o" in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'. |
{n,m} |
m and n are non-negative integers with n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers. |
? |
When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much as possible. For example, on the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's. |
. |
Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]' (or, in Python, pass the re.DOTALL flag). |
(pattern) |
Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection: the SubMatches collection in VBScript, the $0...$9 properties in JScript (in Python, the group() method of the match object). To match literal parentheses, use '\(' or '\)'. |
(?:pattern) |
Matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for later use. This is useful when combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more compact expression than 'industry|industries'. |
(?=pattern) |
Positive lookahead: matches the search string at any point where a string matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows(?=95|98|NT|2000)' matches "Windows" in "Windows2000" but not "Windows" in "Windows3.1". A lookahead does not consume characters: after a match occurs, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead. |
(?!pattern) |
Negative lookahead: matches the search string at any point where a string not matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows(?!95|98|NT|2000)' matches "Windows" in "Windows3.1" but not "Windows" in "Windows2000". A lookahead does not consume characters: after a match occurs, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead. |
x|y |
Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food". |
[xyz] |
A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the "a" in "plain". |
[^xyz] |
A negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the "p" in "plain". |
[a-z] |
A character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from "a" to "z". |
[^a-z] |
A negated character range. Matches any character that is not in the specified range. For example, '[^a-z]' matches any character outside the range "a" to "z". |
\b |
Matches a word boundary, that is, the position between a word character and a non-word character such as a space, or the start or end of the string. For example, 'er\b' matches the "er" in "never" but not the "er" in "verb". |
\B |
Matches a non-word boundary. 'er\B' matches the "er" in "verb" but not the "er" in "never". |
\cx |
Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z; otherwise, c is treated as a literal "c" character. |
\d |
Matches a digit. Equivalent to [0-9]. |
\D |
Matches a non-digit character. Equivalent to [^0-9]. |
\f |
Matches a form feed. Equivalent to \x0c and \cL. |
\n |
Matches a newline. Equivalent to \x0a and \cJ. |
\r |
Matches a carriage return. Equivalent to \x0d and \cM. |
\s |
Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v]. |
\S |
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
\t |
Matches a tab. Equivalent to \x09 and \cI. |
\v |
Matches a vertical tab. Equivalent to \x0b and \cK. |
\w |
Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'. |
\W |
Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'. |
\xn |
Matches n, where n is a hexadecimal escape value. A hexadecimal escape value must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions. |
\num |
Matches num, where num is a positive integer. A backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters. |
\n |
Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value. |
\nm |
Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither condition holds and both n and m are octal digits (0-7), \nm matches the octal escape value nm. |
\nml |
If n is an octal digit (0-3) and both m and l are octal digits (0-7), matches the octal escape value nml. |
\un |
Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©). |
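Several rows of the table can be checked directly with Python's re module. The following is a sketch using the table's own examples; the variable names and the test string "balloon" are illustrative additions.

```python
import re

# Greedy vs. non-greedy quantifiers on the string "oooo".
greedy = re.search(r"o+", "oooo").group()
lazy = re.search(r"o+?", "oooo").group()

# A non-capturing group lets findall return the whole matches.
words = re.findall(r"industr(?:y|ies)", "industry and industries")

# Lookahead checks what follows without consuming it.
pos = re.search(r"Windows(?=95|98|NT|2000)", "Windows2000")
neg_hit = re.search(r"Windows(?!95|98|NT|2000)", "Windows3.1")
neg_miss = re.search(r"Windows(?!95|98|NT|2000)", "Windows2000")

# The backreference \1 repeats whatever group 1 captured.
double = re.search(r"(.)\1", "balloon").group()

# Word boundary and non-word boundary.
end_of_word = re.search(r"er\b", "never")
inside_word = re.search(r"er\b", "verb")
not_boundary = re.search(r"er\B", "verb")

print(greedy, lazy)               # oooo o
print(words)                      # ['industry', 'industries']
print(pos.group(), pos.end())     # Windows 7
print(neg_hit.group(), neg_miss)  # Windows None
print(double)                     # ll
print(bool(end_of_word), bool(inside_word), bool(not_boundary))  # True False True
```

The patterns are written as raw strings (r"...") so that Python passes \b, \B and \1 to the regex engine instead of interpreting them as string escapes.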
Python Language and Its Applications, Chapter 7: Handling Data Like a Master