String
Contains constants and classes for working with text. The string module dates from the earliest versions of Python. In version 2.0, many functions that had previously been implemented only in the module were moved to methods of str objects. Those functions remained available in later Python 2 releases, but they were deprecated and were removed in Python 3.0. The string module still contains useful constants and classes for working with strings, and the following discussion focuses on them.
Functions
- string.capwords(s, sep=None): Splits the argument into words using str.split(), capitalizes each word using str.capitalize(), and joins the capitalized words using str.join(). If the optional second argument sep is absent or None, runs of whitespace characters are replaced by a single space and leading and trailing whitespace is removed; otherwise sep is used to split and join the words.
Example
capwords() capitalizes the first letter of every word in a string.

    import string

    s = 'The quick brown fox jumped over the lazy dog.'

    print(s)
    print(string.capwords(s))

The result is the same as calling split(), capitalizing the first letter of each word in the resulting list, and then calling join() to combine the words.

    The quick brown fox jumped over the lazy dog.
    The Quick Brown Fox Jumped Over The Lazy Dog.
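The split, capitalize, and join sequence, along with the effect of the optional sep argument, can be sketched as follows. The helper name manual_capwords is made up for this illustration and is not part of the string module; it only mirrors the behavior described above.

```python
import string


def manual_capwords(s, sep=None):
    # Hypothetical helper: split, capitalize each word, then join.
    return (sep or ' ').join(word.capitalize() for word in s.split(sep))


text = '  the quick   brown fox  '

# With no sep, runs of whitespace collapse and the ends are trimmed.
default_result = string.capwords(text)
print(repr(default_result))
print(default_result == manual_capwords(text))

# With an explicit sep, that string is used for both splitting and joining.
dashed = string.capwords('one-two-three', sep='-')
print(dashed)
```

Passing sep is mainly useful when the words are separated by something other than whitespace, since the same string is reused to join the result.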
Templates
String templates were added as part of PEP 292 as an alternative to the built-in interpolation syntax. With string.Template interpolation, a variable is identified by prefixing its name with $ (for example, $var). If the name needs to be set apart from the surrounding text, it can also be wrapped in curly braces (for example, ${var}).
class string.Template(template): The constructor takes a single argument, the template string.
- substitute(mapping, **kwds): Performs the template substitution and returns a new string. mapping is any dictionary-like object whose keys match the placeholders in the template. Alternatively, you can provide keyword arguments, where the keywords are the placeholders. When both mapping and kwds are given and there are duplicates, the placeholders from kwds take precedence.
- safe_substitute(mapping, **kwds): Like substitute(), except that if a placeholder is missing from mapping and kwds, the original placeholder appears intact in the resulting string instead of a KeyError being raised. Also, unlike substitute(), any other appearance of $ simply returns $ instead of raising ValueError.
While other exceptions may still occur, this method is called "safe" because the substitution always tries to return a usable string instead of raising an exception. In another sense, safe_substitute() may be anything but safe, because it silently ignores malformed templates that contain dangling delimiters, unmatched braces, or placeholders that are not valid Python identifiers.
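A short sketch of the differences described above may help: keyword arguments overriding the mapping, a missing placeholder, and a dangling delimiter. The template text and the variable names are invented for the illustration.

```python
import string

t = string.Template('$who likes $what')

# Keyword arguments take precedence over duplicate keys in the mapping.
combined = t.substitute({'who': 'tim', 'what': 'tea'}, what='pie')
print(combined)

# A missing placeholder raises KeyError in substitute() but is left
# intact by safe_substitute().
missing_key = None
try:
    t.substitute(who='tim')
except KeyError as err:
    missing_key = err.args[0]
partial = t.safe_substitute(who='tim')
print(missing_key, '|', partial)

# A dangling delimiter raises ValueError in substitute() but passes
# through silently in safe_substitute().
bad = string.Template('cost: 5 $ for $item')
try:
    bad.substitute(item='tea')
    raised = False
except ValueError:
    raised = True
lenient = bad.safe_substitute(item='tea')
print(raised, '|', lenient)
```

The last case is the one the "anything but safe" remark refers to: the malformed template produces output without any warning.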
Example
This example compares a simple template with similar string interpolation using the % operator and with the newer format string syntax using str.format().

    import string

    values = {'var': 'foo'}

    t = string.Template("""
    Variable        : $var
    Escape          : $$
    Variable in text: ${var}iable
    """)

    print('TEMPLATE:', t.substitute(values))

    s = """
    Variable        : %(var)s
    Escape          : %%
    Variable in text: %(var)siable
    """

    print('INTERPOLATION:', s % values)

    s = """
    Variable        : {var}
    Escape          : {{}}
    Variable in text: {var}iable
    """

    print('FORMAT:', s.format(**values))

In the first two cases, the trigger character ($ or %) is escaped by repeating it twice. For the format syntax, both { and } need to be escaped by repeating them.

    TEMPLATE:
    Variable        : foo
    Escape          : $
    Variable in text: fooiable

    INTERPOLATION:
    Variable        : foo
    Escape          : %
    Variable in text: fooiable

    FORMAT:
    Variable        : foo
    Escape          : {}
    Variable in text: fooiable
A key difference between templates and string interpolation or formatting is that the type of the arguments is not taken into account. Each value is converted to a string, and the string is inserted into the result. No formatting options are available. For example, there is no way to control the number of digits used to represent a floating-point value.
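The following sketch illustrates the point about floating-point values. The pre-formatting step shown here is only one possible workaround, not something the Template class provides itself.

```python
import string

t = string.Template('pi is about $pi')

# The float is converted with str(), so every digit appears.
raw = t.substitute(pi=3.14159265)
print(raw)

# Assumed workaround: format the value before substituting it.
rounded = t.substitute(pi=format(3.14159265, '.2f'))
print(rounded)

# str.format() can express the same precision inside the format string.
formatted = 'pi is about {pi:.2f}'.format(pi=3.14159265)
print(formatted)
```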
One benefit of using templates is the safe_substitute() method, which makes it possible to avoid exceptions when not all of the values needed by the template are provided as arguments.

    import string

    values = {'var': 'foo'}

    t = string.Template("$var is here but $missing is not provided")

    try:
        print('substitute()     :', t.substitute(values))
    except KeyError as err:
        print('ERROR:', str(err))

    print('safe_substitute():', t.safe_substitute(values))

Because there is no value for missing in the values dictionary, substitute() raises a KeyError. Instead of raising the error, safe_substitute() catches it and leaves the variable expression alone in the text.

    ERROR: 'missing'
    safe_substitute(): foo is here but $missing is not provided
Advanced Templates
Advanced usage: you can derive subclasses of Template to customize the placeholder syntax, the delimiter character, or the entire regular expression used to parse template strings. To do this, you can override these class attributes:
- delimiter: This is the literal string that introduces a placeholder. The default value is $. Note that this should not be a regular expression, because the implementation will call re.escape() on it as needed.
- idpattern: This is the regular expression that describes the pattern for non-braced placeholders (the braces are added automatically as needed). The default value is the regular expression [_a-z][_a-z0-9]*.
- flags: The regular expression flags that are applied when compiling the regular expression used to recognize substitutions. The default value is re.IGNORECASE. Note that re.VERBOSE is always added to the flags, so a custom idpattern must follow the conventions for verbose regular expressions.
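As a rough illustration of the flags attribute, the sketch below turns off re.IGNORECASE so that only lowercase names are treated as placeholders. The class name LowerOnlyTemplate is hypothetical and exists only for this example.

```python
import string


class LowerOnlyTemplate(string.Template):
    # Hypothetical subclass: drop re.IGNORECASE so that the default
    # idpattern matches lowercase names only.
    flags = 0


t = LowerOnlyTemplate('$name and $NAME')

# safe_substitute() is used because $NAME no longer counts as a valid
# placeholder and would make substitute() raise ValueError.
result = t.safe_substitute(name='lower', NAME='upper')
print(result)
```

Because re.VERBOSE is always added when the pattern is compiled, setting flags to 0 removes only the case-insensitive matching.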
Alternatively, you can provide the entire regular expression pattern by overriding the class attribute pattern. If you do this, the value must be a regular expression object with four named capturing groups. The capturing groups correspond to the rules given above, along with the invalid placeholder rule:
- escaped: This group matches the escape sequence, such as $$ in the default pattern.
- named: This group matches the unbraced placeholder name; it should not include the delimiter in the capturing group.
- braced: This group matches the placeholder name enclosed in braces; it should not include the delimiter or the braces in the capturing group.
- invalid: This group matches any other delimiter pattern (usually a single delimiter), and it should appear last in the regular expression.
Example
If the default syntax of string.Template does not meet your needs, you can adjust the regular expression patterns used to find variable names in the template body. A simple way to do that is to change the delimiter and idpattern class attributes.

    import string


    class MyTemplate(string.Template):
        delimiter = '%'
        idpattern = '[a-z]+_[a-z]+'


    template_text = '''
      Delimiter : %%
      Replaced  : %with_underscore
      Ignored   : %notunderscored
    '''

    d = {
        'with_underscore': 'replaced',
        'notunderscored': 'not replaced',
    }

    t = MyTemplate(template_text)
    print('Modified ID pattern:')
    print(t.safe_substitute(d))

In this example, the substitution rules are changed so that the delimiter is % instead of $, and variable names must contain an underscore somewhere in the middle. The pattern %notunderscored is not replaced by anything because it does not contain an underscore character.

    Modified ID pattern:

      Delimiter : %
      Replaced  : replaced
      Ignored   : %notunderscored
For more complex changes, you can override the pattern attribute and define an entirely new regular expression. The pattern provided must contain four named groups that capture the escaped delimiter, the named variable, a braced version of the variable name, and invalid delimiter patterns.

    import string

    t = string.Template('$var')
    print(t.pattern.pattern)

Because t.pattern is a compiled regular expression, the original string is available through its pattern attribute.

    \$(?:
      (?P<escaped>\$) |                # Escape sequence of two delimiters
      (?P<named>[_a-z][_a-z0-9]*)    | # delimiter and a Python identifier
      {(?P<braced>[_a-z][_a-z0-9]*)} | # delimiter and a braced identifier
      (?P<invalid>)                    # Other ill-formed delimiter exprs
    )
To create a new type of template that uses, for example, {{var}} as the variable syntax, you can use a pattern like this:

    import string


    class MyTemplate(string.Template):
        delimiter = '{{'
        pattern = r'''
        \{\{(?:
        (?P<escaped>\{\{)|
        (?P<named>[_a-z][_a-z0-9]*)\}\}|
        (?P<braced>[_a-z][_a-z0-9]*)\}\}|
        (?P<invalid>)
        )
        '''


    t = MyTemplate('''
    {{{{
    {{var}}
    ''')

    print('MATCHES:', t.pattern.findall(t.template))
    print('SUBSTITUTED:', t.safe_substitute(var='replacement'))

The named and braced patterns must both be provided separately, even though they are the same. Here is the output:

    MATCHES: [('{{', '', '', ''), ('', 'var', '', '')]
    SUBSTITUTED:
    {{
    replacement
Formatter
The Formatter class implements the same layout specification language as the format() method of str. Its features include type coercion, alignment, attribute and field references, named and positional template arguments, and type-specific formatting options. Most of the time the format() method is a more convenient interface to these features, but Formatter is provided as a way to build subclasses for cases where variations are needed.
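As a minimal sketch of the kind of variation a subclass can introduce, the example below overrides get_value() so that missing keyword fields render as a marker instead of raising KeyError. The class name DefaultingFormatter and the '<missing>' marker are assumptions made for this illustration.

```python
import string


class DefaultingFormatter(string.Formatter):
    # Hypothetical subclass: missing keyword fields render as a marker.
    def get_value(self, key, args, kwargs):
        if isinstance(key, str) and key not in kwargs:
            return '<missing>'
        return super().get_value(key, args, kwargs)


# The base class accepts the same format specs as str.format().
plain = string.Formatter().format(
    '{name:>6}|{value:.2f}', name='pi', value=3.14159)
print(plain)

# The subclass fills in a marker for the field that was not supplied.
lenient = DefaultingFormatter().format('{name} {other}', name='pi')
print(lenient)
```

Overriding get_value() leaves the parsing and format-spec handling to the base class, so only the lookup behavior changes.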
Constants
The string module includes a number of constants related to the ASCII and numerical character sets.

    import inspect
    import string


    def is_str(value):
        return isinstance(value, str)


    for name, value in inspect.getmembers(string, is_str):
        if name.startswith('_'):
            continue
        print('%s=%r\n' % (name, value))

These constants are useful when working with ASCII data, but their application is limited because it is increasingly common to encounter non-ASCII text in some form of Unicode.

    ascii_letters='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

    ascii_lowercase='abcdefghijklmnopqrstuvwxyz'

    ascii_uppercase='ABCDEFGHIJKLMNOPQRSTUVWXYZ'

    digits='0123456789'

    hexdigits='0123456789abcdefABCDEF'

    octdigits='01234567'

    printable='0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

    punctuation='!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

    whitespace=' \t\n\r\x0b\x0c'
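Two typical uses of these constants with ASCII data are sketched below: checking that a string consists only of hexadecimal digits, and stripping ASCII punctuation with str.translate(). The helper name is_hex and the sample strings are invented for the example.

```python
import string


def is_hex(text):
    # Hypothetical helper: true when every character is a hex digit.
    return bool(text) and all(ch in string.hexdigits for ch in text)


# Build a translation table that deletes ASCII punctuation.
strip_punct = str.maketrans('', '', string.punctuation)
cleaned = 'Hello, world! (ASCII only)'.translate(strip_punct)
print(cleaned)

print(is_hex('c0ffee'), is_hex('coffee'))
```

Both checks cover ASCII characters only, which is the limitation noted above for non-ASCII text.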