Python Basics: Strings
Python version: 3.6.2 Operating System: Windows Author: SmallWZQ
In Python, a string is a data type, but it is more complex than most other data types. Why? Because a string can contain not only English letters but also characters from many other languages, such as Chinese. Since strings can hold text in different languages, character encoding is also involved.
In Python 3.x, strings are Unicode. That is to say, Python strings support multiple languages.
The sample code is as follows:
# Print a string (Chinese text works the same way)
>>> print('I love my motherland! I love my country!')
I love my motherland! I love my country!
Strings can be concatenated with the + operator.
# Concatenate strings
>>> x = "Hello, "
>>> y = 'world!'
>>> x + y
'Hello, world!'
>>> print(x + y)
Hello, world!
In Python, a value can be converted to a string in two ways:
1. str(), which converts the value to a string in a form that is convenient for people to read;
2. repr(), which returns a string that represents the value, where possible as a valid Python expression.
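The difference between the two is easiest to see on a string that contains a special character. The following is a minimal sketch; the variable names are chosen here for illustration only, and ast.literal_eval is used simply to show that the repr() result is a valid Python literal.

```python
import ast

# A string with an embedded newline makes the difference visible.
text = 'Hello,\nworld!'

str_text = str(text)     # the string itself; the newline stays a real newline
repr_text = repr(text)   # a quoted literal; the newline is shown as \n

print(str_text)
print(repr_text)         # 'Hello,\nworld!'

# The repr() result is a valid Python expression that evaluates
# back to an equal string.
round_trip = ast.literal_eval(repr_text)
print(round_trip == text)  # True
```

For numbers such as 1000, str() and repr() return the same text; the difference matters mainly for strings and for objects that define their own representations.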
For a single character, Python provides the ord() function, which returns the integer code point of a character, and the chr() function, which converts a code point to the corresponding character:
# ord() and chr()
>>> ord('A')
65
>>> ord('中')
20013
>>> chr(66)
'B'
>>> chr(25991)
'文'
If you know the integer code point of a character, you can also write the str with hexadecimal escapes:
# Hexadecimal escapes in a string
>>> '\u4e2d\u6587'
'中文'
The two ways of writing the string, '\u4e2d\u6587' and '中文', are completely equivalent.
Because the Python string type is str, which is represented in memory as Unicode, one character may correspond to several bytes.
If you want to transfer a string over the network or save it to disk, you need to convert the str to bytes.
Python writes bytes data as single or double quotation marks with a b prefix:
# bytes literals
>>> s = b'acv'
>>> print(s)
b'acv'
>>> s
b'acv'
Be sure to distinguish 'ABC' from b'ABC'. The former is a str. The latter is displayed in much the same way, but it is bytes, in which each character occupies only one byte.
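One way to see the difference in practice is to index into both objects. This is a small sketch with illustrative variable names: indexing a str gives a one-character str, while indexing bytes gives an integer between 0 and 255.

```python
text = 'ABC'    # str: a sequence of Unicode characters
data = b'ABC'   # bytes: a sequence of integers in the range 0-255

print(type(text).__name__, type(data).__name__)  # str bytes
print(text[0])      # A
print(data[0])      # 65
print(list(data))   # [65, 66, 67]

# The two only become comparable after an explicit conversion.
print(data.decode('ascii') == text)  # True
```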
A Unicode str can be encoded as bytes in a specified encoding through the encode() method. For example:
# Encoding a str as ASCII and as UTF-8
>>> 'abc'.encode('ascii')
b'abc'
>>> '中文'.encode('utf-8')
b'\xe4\xb8\xad\xe6\x96\x87'
>>> '中文'.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
Note that a str can be encoded with any encoding that is able to represent all of its characters. A str made up only of English letters can be encoded as ASCII bytes, but a str containing Chinese characters cannot: Chinese characters fall outside the ASCII range, so Python raises an error.
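If such an error has to be handled rather than avoided, it can be caught like any other exception. The sketch below goes slightly beyond the text above: it uses the optional errors argument of encode(), which the examples so far do not mention, to show two lossy fallbacks. The variable names are illustrative.

```python
text = 'abc中文'

# Encoding to ASCII fails on the two Chinese characters.
failed_range = None
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    # start and end give the slice of the str that could not be encoded.
    failed_range = (exc.start, exc.end)
    print('cannot encode characters at', failed_range)  # (3, 5)

# Lossy fallbacks: drop the characters, or replace each one with '?'.
ignored = text.encode('ascii', errors='ignore')    # b'abc'
replaced = text.encode('ascii', errors='replace')  # b'abc??'

# UTF-8 can represent every character, so nothing is lost.
utf8 = text.encode('utf-8')

print(ignored, replaced, utf8)
```

Both fallbacks lose information, so in most cases it is simpler to encode with UTF-8 in the first place.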
Conversely, if we read a byte stream from the network or from disk, the data we get is bytes. To convert bytes to str, use the decode() method:
# decode() usage
>>> b'abc'.decode('ascii')
'abc'
>>> b'\xe4\xb8\xad\xe6\x96\x87'.decode('utf-8')
'中文'
To count the characters in a str, use the len() function:
>>> len('abc')
3
>>> len('中文')
2
For a str, len() counts characters. For bytes, len() counts bytes:
>>> len(b'abc')
3
>>> len(b'\xe4\xb8\xad\xe6\x96\x87')
6
>>> len('中文'.encode('utf-8'))
6
When working with strings, we often need to convert between str and bytes.
To avoid garbled characters, you should consistently use UTF-8 for these conversions.
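The following sketch shows why using the same encoding in both directions matters. Decoding UTF-8 bytes with a different codec (latin-1 is picked here purely as an example) does not raise an error, but the result is garbled.

```python
text = '中文'

# Encode for storage or transmission, then decode with the same codec.
data = text.encode('utf-8')
restored = data.decode('utf-8')

# Decoding with a mismatched codec silently produces garbled text.
garbled = data.decode('latin-1')

print(data)                  # b'\xe4\xb8\xad\xe6\x96\x87'
print(len(text), len(data))  # 2 6
print(restored == text)      # True
print(garbled == text)       # False
```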
String formatting
The last common problem is how to output formatted strings. We often output strings such as 'Dear xxx, hello! Your phone bill for month xx is xx, and your balance is xx', where the xxx parts change according to variables. A simple way to format strings is therefore needed.
In C, % is used to control the output format, and Python works in a similar way.
The sample code is as follows:
# String formatting with %
>>> 'Hello, %s' % 'world'
'Hello, world'
>>> 'Hi, %s, you have $%d.' % ('Michael', 1000000)
'Hi, Michael, you have $1000000.'
The % operator is used to format strings.
Inside the string, %s is replaced with a string and %d is replaced with an integer. There must be as many variables or values after the % operator as there are %? placeholders, and they must be in the same order. If there is only one %?, the parentheses can be omitted.
Common placeholders
Placeholder | Replaced with |
%d | Integer |
%f | Floating-point number |
%s | String |
%x | Hexadecimal integer |
If you are not sure which placeholder to use, %s always works, because it converts any data type to a string:
>>> 'Age: %s. Gender: %s' % (25, True)
'Age: 25. Gender: True'
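The common placeholders can be tried side by side. This is a short sketch with made-up values; it also shows that %d and %f accept an optional width and precision, a detail the table does not list.

```python
count = 255
price = 3.14159
name = 'Michael'

as_int = '%d' % count            # '255'
as_float = '%f' % price          # '3.141590' (six decimal places by default)
as_short_float = '%.2f' % price  # '3.14'
as_hex = '%x' % count            # 'ff'
as_str = '%s' % name             # 'Michael'

# Width and padding: right-aligned, left-aligned, zero-padded.
padded = '%5d|%-5d|%05d' % (42, 42, 42)  # '   42|42   |00042'

print(as_int, as_float, as_short_float, as_hex, as_str)
print(padded)
```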
Some readers may wonder: what should I do if the string itself needs to contain a %?
This is simple. Escape it by writing %%, which represents a single %.
# Escaping %
>>> 'growth rate: %d %%' % 7
'growth rate: 7 %'
Another way to format strings is the format() method.
It replaces the placeholders {0}, {1}, ... in the string with the arguments passed in, in order. However, this approach is more cumbersome to write than %.
The format() syntax is as follows:
# format() syntax
>>> 'Hello, {0}. The score is increased by {1:.1f}%'.format('Xiaoming', 17.125)
'Hello, Xiaoming. The score is increased by 17.1%'
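A few variations on format() may help here. The sketch below uses illustrative names and values: a positional index can be reused, arguments can be passed by keyword instead of by position, and the part after the colon is a format specification (here .1f for one decimal place, and > or < with a width for alignment).

```python
name = 'Xiaoming'
improvement = 17.125

# Positional placeholders with a format specification.
by_index = 'Hello, {0}. Score up by {1:.1f}%'.format(name, improvement)

# The same index may appear more than once.
reused = '{0} and {1} and {0} again'.format('a', 'b')

# Keyword arguments instead of positions.
by_keyword = 'Hello, {name}. Score up by {delta:.1f}%'.format(
    name=name, delta=improvement)

# Right- and left-alignment within a width of 6.
aligned = '[{0:>6}] [{0:<6}]'.format('abc')

print(by_index)   # Hello, Xiaoming. Score up by 17.1%
print(reused)     # a and b and a again
print(aligned)    # [   abc] [abc   ]
```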
If a string contains both lowercase and uppercase letters, how can the uppercase letters be converted to lowercase?
In Python, strings provide the lower() method for this.
# The string lower() method
>>> x = "HelLO wOrLd!"
>>> x.lower()
'hello world!'
Strings also provide many other methods, such as find(), join(), replace(), split(), strip(), upper(), title(), lstrip(), and rstrip(). These methods exist to make our work easier, and they let us get the most out of strings.
find(): searches for a substring in the string and returns the leftmost index at which it is found, or -1 if it is not found.
join(): joins the elements of a sequence into one string.
replace(): returns a new string in which all matches in the string have been replaced.
split(): splits a string into a list.
strip(): returns a string with the whitespace removed from both ends (internal whitespace is kept).
......
......
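The methods described above can be combined on a single piece of text. This is a minimal sketch on a made-up input string; each call returns a new object and leaves the original string unchanged.

```python
line = '  apple, banana, cherry  '

trimmed = line.strip()                  # 'apple, banana, cherry'
index = trimmed.find('banana')          # 7
missing = trimmed.find('durian')        # -1, not found
fruits = trimmed.split(', ')            # ['apple', 'banana', 'cherry']
joined = ' | '.join(fruits)             # 'apple | banana | cherry'
replaced = trimmed.replace('banana', 'mango')

print(trimmed)
print(index, missing)
print(fruits)
print(joined)
print(replaced)                         # apple, mango, cherry
```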
String methods
Method | Description |
string.capitalize() | Returns a copy of the string with the first character in uppercase and the rest in lowercase. |
string.center(width[, fillchar]) | Returns a string of length max(len(string), width) with a copy of the string in the center. Both sides are padded with fillchar (a space by default). |
string.count(sub[, start[, end]]) | Counts the number of times the substring sub appears. The search can optionally be limited to string[start:end]. |
string.find(sub[, start[, end]]) | Returns the first index of the substring sub, or -1 if it is not found. The search can optionally be limited to string[start:end]. |
string.isalnum() | Checks whether the string consists only of letters and digits. |
string.isalpha() | Checks whether the string consists only of letters. |
string.isdigit() | Checks whether the string consists only of digits. |
string.islower() | Checks whether all cased characters in the string are lowercase. |
string.isspace() | Checks whether the string consists only of whitespace. |
string.istitle() | Checks whether every cased character that follows an uncased character is uppercase and all other cased characters are lowercase. |
string.isupper() | Checks whether all cased characters in the string are uppercase. |
string.join(sequence) | Returns a string in which the string elements of sequence are joined, with string as the separator. |
string.lower() | Returns a copy of the string with all cased characters in lowercase. |
string.replace(old, new[, max]) | Returns a copy of the string in which occurrences of old are replaced by new. Optionally, at most max occurrences are replaced. |
string.split([sep[, maxsplit]]) | Returns a list of the words in the string, using sep as the separator (whitespace by default). maxsplit optionally limits the number of splits. |
string.strip([chars]) | Returns a copy of the string with all chars removed from the start and end (by default, all whitespace characters such as spaces, tabs, and line breaks). |
string.title() | Returns a copy of the string in which every word starts with an uppercase letter. |
string.upper() | Returns a copy of the string with all cased characters in uppercase. |
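A few of the methods in the table can be checked quickly in code. The sample strings below are made up for illustration.

```python
title = 'python string methods'

capitalized = title.capitalize()     # 'Python string methods'
titled = title.title()               # 'Python String Methods'
uppered = title.upper()              # 'PYTHON STRING METHODS'
t_count = title.count('t')           # 3
centered = 'abc'.center(7, '*')      # '**abc**'

# The is...() checks return True or False.
checks = ('abc123'.isalnum(), 'abc123'.isalpha(), '123'.isdigit())
title_checks = ('Hello World'.istitle(), 'Hello world'.istitle())

# The optional count arguments limit how much work is done.
split_once = 'a-b-c'.split('-', 1)           # ['a', 'b-c']
replace_once = 'a-b-c'.replace('-', '+', 1)  # 'a+b-c'

print(capitalized, titled, uppered, sep='\n')
print(t_count, centered)
print(checks)        # (True, False, True)
print(title_checks)  # (True, False)
print(split_once, replace_once)
```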
There are many other string methods, so I will not go through them all here.