A string is an ordered collection of characters used to store and represent text, and it is one of Python's built-in data types. Because a string is a sequence of characters, it supports the standard sequence operations.
The supported operations are:
1. Access by index
>>> s = 'This is a test string'
>>> len(s)  # len() returns the length of the string
21
After you get the length of the string, you can access the contents of the string based on the index.
>>> s[0]  # forward index, counted from the left starting at 0
'T'
>>> s[-1]  # reverse index, counted from right to left starting at -1
'g'
2. Slicing
>>> s[0:4]
'This'
>>> s[:4]  # no left bound: start from the first character
'This'
>>> s[0:]  # no right bound: run from the start position to the last character
'This is a test string'
>>> s[::2]  # step of 2: take every second character
'Ti sats tig'
3. Concatenation of two strings with +
>>> s = s + ' xyz'
>>> s
'This is a test string xyz'
Strings have one related property: they are immutable. Once a string has been created, the characters inside it cannot be changed in place.
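Immutability can be sketched as follows. This is a minimal illustration using the same sample string as above; the variable names are chosen for the example only.

```python
s = 'This is a test string'

# Item assignment is not supported by str, so this raises TypeError.
try:
    s[0] = 't'
    error = None
except TypeError as exc:
    error = exc

# To "change" a string, build a new one from pieces of the old one.
t = 't' + s[1:]

print(type(error).__name__)  # TypeError
print(t)                     # this is a test string
print(s)                     # This is a test string (the original is untouched)
```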
Common string methods
Method | Description
find | Finds a substring in a longer string and returns the leftmost index where it occurs; returns -1 if it is not found
join | Joins the elements of a sequence, using the string as the separator; ','.join(seq) inserts a comma between the elements of seq, and ',' can be replaced with any other separator
lower | Returns a lowercase copy of the string
replace | Returns a copy of the string with all occurrences of a substring replaced
split | Splits a string into a list
strip | Returns a copy of the string with whitespace removed from both ends
rstrip | Returns a copy of the string with whitespace removed from the right end
translate | Like replace, but works on single characters and can make several substitutions at once
ord | Built-in function (not a string method): converts a character to its code, which is the ASCII value for ASCII characters
chr | Built-in function (not a string method): converts an integer to the corresponding character
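The entries above can be tried out with a short script. This is only a sketch; the sample strings are made up for illustration.

```python
s = '  Hello, World  '

found = s.find('World')      # leftmost index of the substring
missing = s.find('xyz')      # -1 when the substring is absent
lowered = s.lower()
stripped = s.strip()         # whitespace removed from both ends
right_stripped = s.rstrip()  # whitespace removed from the right end only
replaced = stripped.replace('l', 'L')

parts = 'a,b,c'.split(',')   # ['a', 'b', 'c']
joined = '-'.join(parts)     # separator goes between the elements

# translate() applies several single-character substitutions in one pass.
table = str.maketrans('lo', '10')
translated = 'hello world'.translate(table)

print(found, missing)     # 9 -1
print(replaced)           # HeLLo, WorLd
print(joined)             # a-b-c
print(translated)         # he110 w0r1d
print(ord('A'), chr(65))  # 65 A
```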
In addition to the methods above, the re module in the Python standard library supports more advanced pattern-based string processing, and the standard library also offers higher-level text processing tools such as XML parsers.
String types
Single quotes, double quotes, and triple quotes all create the same kind of string
Escape characters: the common ones are \n, \t, \\, \' and \"
Raw string: r"C:\new\test.spam". The r prefix suppresses escape processing, so every character in the string is taken literally. However, a raw string cannot end with a single \.
Bytes string: b'sp\x01am'
Unicode string in Python 2: u'eggs\u0020spam'. In Python 3 the default string is already a Unicode string.
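A short sketch of these literal forms, assuming Python 3; the variable names are illustrative only.

```python
# The three quoting styles produce equal str objects.
single = 'spam'
double = "spam"
triple = '''spam'''
same = single == double == triple

# Without the r prefix each backslash has to be doubled.
escaped = 'C:\\new\\test.spam'
raw = r'C:\new\test.spam'

# A raw string cannot end in a single backslash, so append it separately.
folder = r'C:\new' + '\\'

data = b'sp\x01am'        # bytes literal: \x01 is one byte with value 1
text = 'eggs\u0020spam'   # \u0020 is the Unicode escape for a space

print(same)            # True
print(escaped == raw)  # True
print(len(folder))     # 7
print(len(data), data[2])  # 5 1
print(text)            # eggs spam
```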
Conversion between strings and numbers:
int() converts a string to an integer
str() converts a number to a string
ord(c) converts a single character to its ASCII code
chr(n) converts an ASCII code to the corresponding character
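These conversions can be sketched as below. The base argument to int() and the ValueError case are not mentioned above; they are included as standard behavior of the built-in.

```python
number = int('42')          # string -> integer
total = number + 1          # arithmetic now works
text = str(42) + '1'        # number -> string, then string concatenation
binary = int('101010', 2)   # optional second argument gives the base

code = ord('a')             # character -> integer code
char = chr(97)              # integer code -> character

# int() rejects text that is not a valid number.
try:
    int('4x2')
    error = None
except ValueError as exc:
    error = exc

print(total)         # 43
print(text)          # 421
print(binary)        # 42
print(code, char)    # 97 a
print(type(error).__name__)  # ValueError
```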
String formatting (the most important topic)
1. Formatting expressions (printf style, using the % operator). The basic form is:
"%[(name)][flags][width][.precision]typecode" % values
The fields have the following meanings:
(name) looks the value up by name: the values are given as a dictionary, and the entry whose key is name is displayed
Example:
>>> print('%(n)d, %(x)s' % {"n": 1, "x": "spam"})
1, spam
flags: conversion flags. - means left-aligned, + means the converted value is preceded by a sign, and 0 means the value is zero-filled when it is narrower than the width.
width: the minimum width of the output
.precision: the number of digits after the decimal point when a floating-point number is output. When precision is *, the number of digits is read from the next value in the tuple.
Example:
>>> print('%f, %.2f, %.*f' % (1/3.0, 1/3.0, 4, 1/3.0))
0.333333, 0.33, 0.3333
typecode: printf-style type codes for string formatting
Conversion type | Meaning
d, i | Signed decimal integer
o | Octal integer
u | Obsolete; identical to d
x | Hexadecimal integer (lowercase)
X | Hexadecimal integer (uppercase)
e | Floating-point number in scientific notation (lowercase)
E | Floating-point number in scientific notation (uppercase)
f, F | Decimal floating-point number
g | Same as e if the exponent is less than -4 or not less than the precision; otherwise the same as f
G | Same as E if the exponent is less than -4 or not less than the precision; otherwise the same as F
c | Single character (accepts an integer or a single-character string)
r | String (converts any Python object using repr)
s | String (converts any Python object using str)
a | String (converts any Python object using ascii; Python 3)
% | Literal % character (written as %%)
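A few of the flags and type codes from the table can be exercised like this. The numbers are arbitrary sample values chosen for the sketch.

```python
left = '%-6d|' % 42          # '-' flag: left-aligned in a field of width 6
signed = '%+d' % 42          # '+' flag: always show the sign
zero = '%06d' % 42           # '0' flag: pad with zeros up to the width

bases = '%x %X %o' % (255, 255, 255)
sci = '%e' % 1234.5
general = '%g %g' % (0.00001234, 1234.5)  # e style below 1e-4, f style otherwise
chars = '%c%c' % (65, 'b')   # integer or single-character string
reprs = '%r %s' % ('spam', 'spam')
percent = '%d%%' % 50        # %% produces a literal percent sign

print(left)     # 42    |
print(signed)   # +42
print(zero)     # 000042
print(bases)    # ff FF 377
print(sci)      # 1.234500e+03
print(general)  # 1.234e-05 1234.5
print(chars)    # Ab
print(reprs)    # 'spam' spam
print(percent)  # 50%
```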
(name) makes it possible to format a string from a dictionary. Dictionary-based formatting is useful when:
1. The same content is repeated several times in the output
2. A lot of content needs to be formatted
name corresponds to a key in the dictionary, so the field outputs the equivalent of dict[name]
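As a rough illustration of the first case, one dictionary entry can be reused several times in a template. The template text and the record values here are invented for the example.

```python
record = {'name': 'Han', 'age': 25}

# %(name)s appears twice but the value is supplied only once.
template = '%(name)s is %(age)d years old. Happy birthday, %(name)s!'
message = template % record

print(message)  # Han is 25 years old. Happy birthday, Han!
```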
2. Formatting method calls: the format() method
Inside the braces, the colon is followed by the fill character, alignment, output width, and so on. The standard form is:
"{[field_name]![conversion]:[[fill]align][sign][#][0][width][,][.precision][typecode]}".format('test')
field_name takes two forms: a number (0, 1, 2, ...) selects an argument by its position, and a keyword selects an argument by its name, much like a key in dictionary formatting.
conversion is the conversion applied before formatting. Three values are supported, s | r | a, which convert the value with str, repr, and ascii respectively.
fill: the padding character, which can be any character except '{' and '}'
align: the alignment
'<' left-aligned
'>' right-aligned
'=' padding is placed between the sign and the digits (numeric types only)
'^' centered
sign: valid only for numeric types
'+' both positive and negative numbers are shown with a sign
'-' only negative numbers are shown with a sign (the default)
space: positive numbers are preceded by a space, negative numbers by '-'
, is used as the thousands separator, as in 1,234.5
width: the minimum field width
.precision: the number of digits shown when a floating-point number is output
typecode: the output format of the object. It takes the following values, whose meanings are largely the same as in % formatting:
"b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
Example 1: values given directly in the format specification
>>> '{0!s:-^40}'.format('test')
'------------------test------------------'
>>> '{0!r:-^40}'.format('test')
"-----------------'test'-----------------"
Example 2: 0 as the fill character
>>> '{0!s:0^40}'.format('test')
'000000000000000000test000000000000000000'
>>> '{0!r:0^40}'.format('test')
"00000000000000000'test'00000000000000000"
Example 3: format specification built from variables
>>> '{0:{fill}{align}{width}}'.format('test', fill='-', align='^', width=40)
'------------------test------------------'
Example 4: precision, with , as the thousands separator (the result is padded with spaces to a width of 40)
>>> '{0:<40,.5}'.format(1234.5678)
'1,234.6                                 '
Example 5: typecode b displays the number in binary
>>> '{0:<10b}'.format(42)
'101010    '
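The sign options, the = alignment, and the # and % codes are not covered by the examples above, so here is a small sketch of them. The numbers are arbitrary sample values.

```python
plus = '{:+d}'.format(42)           # sign shown for positive numbers too
space = '{: d}'.format(42)          # space in place of the plus sign
minus = '{:-d}'.format(-42)         # default: sign only for negatives

padded = '{:=+8d}'.format(42)       # '=': padding between sign and digits
zeros = '{:08.3f}'.format(3.14159)  # zero fill, width 8, 3 decimals
hexed = '{:#x}'.format(255)         # '#': adds the 0x prefix
grouped = '{:,}'.format(1234567)    # ',' thousands separator
ratio = '{:.1%}'.format(0.256)      # '%': multiply by 100 and add a percent sign

print(plus, repr(space), minus)  # +42 ' 42' -42
print(repr(padded))              # '+     42'
print(zeros)                     # 0003.142
print(hexed)                     # 0xff
print(grouped)                   # 1,234,567
print(ratio)                     # 25.6%
```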
Basic usage:
1. '{0},{1}'.format('value0', 'value1')
The 0 and 1 inside {} select the values to output, corresponding to value0 and value1 in the format() call. If {} contains no number, the arguments of format() are output in order. Changing the order of the numbers changes the output accordingly: for example, '{1},{0}'.format('value0', 'value1') produces value1,value0.
2. '{name},{age}'.format(name='Han', age='25')
The names inside {} correspond to the keyword arguments of format(), so swapping the order of the arguments has no effect:
'{name},{age}'.format(age=25, name='Han')
produces the same output as above.
Precautions:
When using format(), try to state explicitly which argument each field refers to, which makes the output easier to control
Positional arguments must come before keyword arguments.
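The points above can be checked with a short sketch. The compile() call is used here only so that the syntax error from a misplaced positional argument can be caught at run time; it is an illustration device, not something the text prescribes.

```python
auto = '{},{}'.format('value0', 'value1')       # filled in order
swapped = '{1},{0}'.format('value0', 'value1')  # explicit positions
mixed = '{0}, {1}, {name}'.format('a', 'b', name='Han')

# Automatic and explicit numbering cannot be mixed in one template.
try:
    '{} {0}'.format('a')
    numbering_error = None
except ValueError as exc:
    numbering_error = exc

# A positional argument after a keyword argument is rejected by the parser.
try:
    compile("'{0} {name}'.format(name='Han', 'a')", '<example>', 'eval')
    syntax_error = None
except SyntaxError as exc:
    syntax_error = exc

print(auto)     # value0,value1
print(swapped)  # value1,value0
print(mixed)    # a, b, Han
print(type(numbering_error).__name__)  # ValueError
print(type(syntax_error).__name__)     # SyntaxError
```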
Bytes (binary data; a separate type only in Python 3.x, whereas Python 2.x does not distinguish between bytes and str)
The difference between bytes and str is that bytes is a sequence of bytes, while str is a sequence of Unicode characters
b = s.encode()    s = b.decode()
Bytes precautions:
encode() and decode() accept the name of the encoding to convert with, such as decode('utf-8') or decode('gbk'). This is typically needed when fetching web pages: the downloaded content is converted to Unicode str so that it can be processed uniformly. Because different pages use different encodings, a third-party library is often used to guess the encoding of a page.
In network programming, sockets transmit data as bytes. Before sending, convert the data to bytes with encode(); after receiving, convert the received bytes back to str with decode(), naming the encoding that was used (for example UTF-8). The process is as follows:
str data -> encode -> bytes -> network transmission -> bytes -> decode -> str data
In Python 2 only, the default encoding was sometimes changed as follows (neither the built-in reload nor sys.setdefaultencoding exists in Python 3):
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
Unicode is a specification, comparable to the HTTP protocol, while encodings such as GBK and UTF-8 are comparable to Apache and Nginx: they are the concrete character encodings that turn characters into bytes. Bytes is generally used for network data transmission, and str is generally used for processing data inside a program. Therefore, in network programming, str is normally converted to bytes before the data is sent, and after receiving the bytes the server uses decode() to convert them back to str.
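The encode/decode flow can be sketched without a real socket. In this illustration a plain variable assignment stands in for the network transmission, and the sample text is made up.

```python
message = '中文 text'

payload = message.encode('utf-8')   # str -> bytes before sending
received = payload                  # stand-in for the network transmission
decoded = received.decode('utf-8')  # bytes -> str after receiving

# The same characters produce different bytes under different encodings.
utf8_bytes = '中文'.encode('utf-8')
gbk_bytes = '中文'.encode('gbk')

print(len(message), len(payload))       # 7 11
print(decoded == message)               # True
print(len(utf8_bytes), len(gbk_bytes))  # 6 4
print(gbk_bytes.decode('gbk'))          # 中文
```

The decode step has to name the encoding that was used to encode; the sketch keeps both sides on the same codec for that reason.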
bytearray (similar to the bytes type, but the content of a bytearray is mutable, while the data in a bytes object is immutable)
bytearray is used to change the content of bytes, as follows:
>>> b1 = b'1234'
>>> b2 = bytearray(b1)
>>> b2[0] = int(b'6'.hex(), 16)
>>> bytes(b2)
b'6234'
The result is a modified copy of b1; b1 itself is unchanged.
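A slightly longer sketch of the same idea, assuming Python 3. It uses ord() as a shorter way to obtain the byte value and adds slice assignment and append(), which the text above does not show.

```python
b1 = b'1234'

# bytes is immutable: item assignment raises TypeError.
try:
    b1[0] = ord('6')
    error = None
except TypeError as exc:
    error = exc

b2 = bytearray(b1)   # mutable copy of the data
b2[0] = ord('6')     # elements are integers in the range 0..255
b2[1:2] = b'ab'      # slice assignment can change the length
b2.append(0x21)      # 0x21 is the code for '!'

result = bytes(b2)   # convert back to immutable bytes

print(type(error).__name__)  # TypeError
print(result)                # b'6ab34!'
print(b1)                    # b'1234'
```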
This article is from the "No Flying World" blog, please be sure to keep this source http://hf1208.blog.51cto.com/8957433/1881648
Python second week string