One, character encoding
A computer can only handle numbers, so text must be converted to numbers before it can be processed. The earliest computers were designed with 8 bits to a byte, so the largest integer a single byte can represent is 255 (binary 11111111 = decimal 255). To represent larger integers, more bytes are needed: two bytes can represent integers up to 65535, and four bytes up to 4294967295.
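The limits above follow from the fact that n bytes hold 8n bits, so the largest unsigned value is 2^(8n) - 1. A minimal sketch to check the arithmetic; the helper name max_unsigned is made up for this illustration:

```python
def max_unsigned(num_bytes):
    """Largest unsigned integer that fits in num_bytes bytes of 8 bits each."""
    return 2 ** (8 * num_bytes) - 1


for n in (1, 2, 4):
    print(n, "byte(s):", max_unsigned(n))

# The one-byte maximum written in binary is eight 1 bits.
print(bin(max_unsigned(1)))
```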
ASCII (American Standard Code for Information Interchange) represents each character in 1 byte. It uses the values 0~127 to represent the characters on a computer keyboard, plus some special values called control codes that coordinate the sending and receiving of information. The uppercase letters A~Z are represented by the values 65~90, and the lowercase letters a~z by the values 97~122.
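These values can be checked in Python with the built-in ord() and chr() functions, which convert between a character and its number. A small sketch (the variable names are only for illustration):

```python
# ord() gives the numeric code of a character, chr() goes the other way.
upper_range = (ord("A"), ord("Z"))
lower_range = (ord("a"), ord("z"))
print("A~Z:", upper_range)
print("a~z:", lower_range)

# Rebuild the uppercase alphabet from its numeric codes.
alphabet = "".join(chr(code) for code in range(65, 91))
print(alphabet)
```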
Unicode unifies the characters of all languages into a single set of codes. A character is most commonly represented in two bytes (four bytes if it is a very rare character). Unicode is supported directly by modern operating systems and most programming languages.
If everything used Unicode, the garbled-text problem would disappear. However, if the text is almost entirely English, Unicode requires more storage space than ASCII, which is wasteful for storage and transmission. This led to UTF-8, a "variable-length encoding" of Unicode.
UTF-8 encodes a Unicode character into 1 to 4 bytes depending on the size of its code number (the original design allowed sequences of up to 6 bytes, but the current standard limits UTF-8 to 4). Common English letters are encoded in 1 byte, Chinese characters usually in 3 bytes, and only very rare characters in 4 bytes. If the text you want to transmit contains a large number of English characters, encoding it as UTF-8 saves space. In fact, ASCII can be regarded as a subset of UTF-8.
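The variable length is easy to observe by encoding a few characters and counting the bytes. A minimal sketch; the sample characters are chosen only for illustration, and ascii() is used when printing so the output is safe on any terminal:

```python
# One sample from each UTF-8 length class: Latin letter, accented letter,
# Chinese character, emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
for ch, n in zip(samples, byte_lengths):
    print(ascii(ch), "->", n, "byte(s)")

# Pure ASCII text produces exactly the same bytes under ASCII and UTF-8.
ascii_is_subset = "Good luck".encode("ascii") == "Good luck".encode("utf-8")
print("ASCII bytes equal UTF-8 bytes:", ascii_is_subset)
```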
That covers the relationship between ASCII, Unicode and UTF-8. To summarize how character encoding commonly works in current computer systems: in memory, text is handled uniformly as Unicode; when it needs to be saved to disk or transmitted, it is converted to UTF-8.
In Python 3, strings are Unicode, which means a Python string supports multiple languages. When manipulating strings we often need to convert between str and bytes. To avoid garbled text, always use UTF-8 for these conversions. A str can be encoded to bytes in a specified encoding with the encode() method; conversely, bytes can be turned back into a str with the decode() method.
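The round trip between str and bytes can be sketched as follows. The sample text is arbitrary, and the failed ASCII decode at the end is included only to show what goes wrong when the two sides disagree about the encoding:

```python
text = "Python \u5b57\u7b26\u4e32"  # "Python" followed by three Chinese characters

data = text.encode("utf-8")      # str -> bytes
restored = data.decode("utf-8")  # bytes -> str

print(type(data).__name__, "characters:", len(text), "bytes:", len(data))
print("round trip ok:", restored == text)

# Decoding with the wrong codec fails instead of silently producing garbage.
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    decode_error = exc.reason
    print("cannot decode as ASCII:", decode_error)
```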
Two, string operations
The + operator constructs a new string by joining two strings together:
str1 = "Good"
str2 = "Luck"
str3 = str1 + str2
print(str3)
The result is:
GoodLuck
The * operator constructs a new string by repeating a string a given number of times:
str1 = "Good"
str2 = "Luck"
str3 = str1 * 2
print("str3:", str3)
str4 = 3 * str2
print("str4:", str4)
The result is:
str3: GoodGood
str4: LuckLuckLuck
The len() function gives the number of characters the string contains:
s = "Goodluck"
print("length of s:", len(s))
The result is:
length of s: 8
A string is a sequence of characters, and a single character can be accessed by its index. In a string of n characters, indexes start at 0 and end at n-1. Python also allows negative indexes, which count from the right-hand end of the string. The general form of a string index is:
<string>[<expr>]
s = "Goodluck"
print("length of s:", len(s))
print("first character of s:", s[0])
print("third character of s:", s[2], "fifth character of s:", s[4])
print("last character of s:", s[len(s) - 1])
print("last character of s:", s[-1])
print("second-to-last character of s:", s[-2])
The result is:
length of s: 8
first character of s: G
third character of s: o fifth character of s: l
last character of s: k
last character of s: k
second-to-last character of s: c
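The relationship between negative and non-negative indexes can be checked directly: index -k refers to the same character as index n-k. A short sketch, which also shows that an index outside the valid range raises IndexError (the flag name out_of_range is only for illustration):

```python
s = "Goodluck"
n = len(s)

# For every k from 1 to n, s[-k] and s[n - k] are the same character.
pairs_match = all(s[-k] == s[n - k] for k in range(1, n + 1))
print("negative indexes mirror n - k:", pairs_match)

# Valid indexes run from -n to n - 1; anything else raises IndexError.
try:
    s[n]
except IndexError:
    out_of_range = True
else:
    out_of_range = False
print("s[n] is out of range:", out_of_range)
```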
Indexing returns a single character of a string. To access a contiguous run of characters (a substring), use slicing. A slice has the form:
<string>[<start>: <end>]
The slice produces a substring from start until (but not including) the end position.
s = "Hi Jack,good luck!"
print("s[0:3]:", s[0:3])
print("s[3:14]:", s[3:14])
print("s[:8]:", s[:8])
print("s[8:]:", s[8:])
print("s[:]:", s[:])
The result is:
s[0:3]: Hi 
s[3:14]: Jack,good l
s[:8]: Hi Jack,
s[8:]: good luck!
s[:]: Hi Jack,good luck!
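Because the end position is excluded, a slice from start to end has end - start characters, and splitting a string at any position with s[:i] and s[i:] loses nothing. A brief sketch of these properties, using the same sample string; it also shows that slice bounds past the end are clipped rather than raising an error:

```python
s = "Hi Jack,good luck!"

# Splitting at any position i and gluing the halves back gives the original.
splits_rejoin = all(s[:i] + s[i:] == s for i in range(len(s) + 1))
print("s[:i] + s[i:] == s for every i:", splits_rejoin)

# The slice length is end - start when both bounds are inside the string.
middle = s[3:14]
print(repr(middle), len(middle))

# Unlike indexing, an out-of-range slice bound is silently clipped.
clipped = s[8:100]
print(repr(clipped))
```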
Three, built-in string methods
capitalize()
Converts the first character to uppercase and the remaining characters to lowercase. If the first character is not a letter, it is left unchanged and the remaining characters are still converted to lowercase.
center(width[, fillchar])
Returns the string centered in a new string of length width. fillchar is the padding character; the default is a space.
count(sub, start=0, end=len(string))
Counts the number of times the substring sub occurs in the string. The optional parameters are the start and end positions of the search.
endswith(suffix[, start[, end]])
Determines whether the string ends with the specified suffix; returns True if it does, otherwise False. The optional parameters start and end are the starting and ending positions of the portion of the string that is checked.
find(str, beg=0, end=len(string))
Checks whether the string contains the substring str. If beg (start) and end are specified, only that range is searched. If the substring is found, returns the index at which it starts; returns -1 if it is not found.
index(str, beg=0, end=len(string))
Same as find(), except that it raises an exception (ValueError) if str is not in the string.
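The difference between find() and index() only shows up when the substring is missing. A minimal sketch with an arbitrary sample string; the variable names are made up for illustration:

```python
s = "Hi Jack,good luck!"

found_at = s.find("good")    # index where the match starts
second_k = s.find("k", 7)    # search only from position 7 onwards
not_found = s.find("bad")    # -1 signals "no match"
print(found_at, second_k, not_found)

# index() reports a missing substring with an exception instead of -1.
try:
    s.index("bad")
except ValueError:
    index_raised = True
else:
    index_raised = False
print("index() raised ValueError:", index_raised)

# count() and endswith() answer "how many" and "does it end with".
print(s.count("o"), s.endswith("luck!"))
```

Since a match at position 0 is falsy and the -1 result is truthy, the return value of find() should be compared with -1 rather than used directly in an if test; when only a yes/no answer is needed, the in operator is simpler.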
isalnum()
Returns True if the string has at least one character and all characters are letters or numbers, otherwise False.
isalpha()
Returns True if the string has at least one character and all characters are letters, otherwise False.
islower()
Returns True if the string contains at least one cased character and all cased characters are lowercase, otherwise False.
isnumeric()
Returns True if the string has at least one character and contains only numeric characters, otherwise False.
isspace()
Returns True if the string has at least one character and contains only whitespace, otherwise False.
istitle()
Returns True if the string is title-cased (see title()), otherwise False.
isupper()
Returns True if the string contains at least one cased character and all cased characters are uppercase, otherwise False.
join(seq)
Merges all the elements of seq (which must be strings) into a new string, using the string on which the method is called as the separator.
len(string)
Returns the length of the string.
ljust(width[, fillchar])
Returns the original string left-justified in a new string of length width, padded with fillchar; the default fillchar is a space.
lower()
Converts all uppercase characters in the string to lowercase.
lstrip([chars])
Removes leading whitespace from the string, or the leading characters given in chars.
max(str)
Returns the largest character in the string str.
min(str)
Returns the smallest character in the string str.
replace(old, new[, max])
Replaces occurrences of old in the string with new. If max is specified, at most max replacements are made.
rfind(str, beg=0, end=len(string))
Similar to find(), but searches from the right.
rindex(str, beg=0, end=len(string))
Similar to index(), but searches from the right.
rjust(width[, fillchar])
Returns the original string right-justified in a new string of length width, padded with fillchar (a space by default).
rstrip([chars])
Removes trailing whitespace from the string, or the trailing characters given in chars.
split(str="", num=string.count(str))
Splits the string using str as the delimiter and returns a list of substrings. If str is omitted, the string is split on runs of whitespace. If num is specified, at most num splits are made, so the result has at most num + 1 elements.
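split(), join() and replace() are often used together. A short sketch with made-up sample data, showing in particular how the optional count limits the work done:

```python
line = "a,b,c,d"

parts = line.split(",")        # split at every comma
limited = line.split(",", 2)   # at most 2 splits, so 3 pieces
print(parts, limited)

# join() is called on the separator and takes the list of pieces.
joined = "-".join(parts)
print(joined)

# replace() with a count changes only the first occurrences.
replaced = line.replace(",", ";", 2)
print(replaced)

# With no delimiter, split() breaks on runs of whitespace and drops
# leading and trailing blanks.
words = "  Good   luck  ".split()
print(words)
```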
splitlines([keepends])
Splits the string at line boundaries ('\r', '\r\n', '\n') and returns a list with the lines as elements. If keepends is False (the default), the line breaks are not included; if it is True, they are preserved.
strip([chars])
Performs both lstrip() and rstrip() on the string.
swapcase()
Converts uppercase letters in the string to lowercase and lowercase letters to uppercase.
title()
Returns a title-cased string, meaning that every word starts with an uppercase letter and the remaining letters are lowercase (see istitle()).
upper()
Converts all lowercase letters in the string to uppercase.
zfill(width)
Returns a string of length width, with the original string right-aligned and padded on the left with zeros.
isdecimal()
Checks whether the string contains only decimal characters; returns True if so, otherwise False.
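isdecimal() and isnumeric() agree for ordinary digits but differ for other numeric characters, and zfill() has one detail worth seeing in action: a leading sign stays in front of the padding. A small sketch; the sample characters are chosen only for illustration:

```python
# (description, character) pairs: a plain digit, superscript two, one half.
samples = [("digit five", "5"),
           ("superscript two", "\u00b2"),
           ("fraction one half", "\u00bd")]

results = {name: (ch.isdecimal(), ch.isnumeric()) for name, ch in samples}
for name, (is_dec, is_num) in results.items():
    print(name, "isdecimal:", is_dec, "isnumeric:", is_num)

# zfill() pads with zeros on the left and keeps a sign at the front.
padded = "42".zfill(5)
padded_negative = "-42".zfill(5)
print(padded, padded_negative)
```

When validating input that will be passed to int(), isdecimal() is the stricter and usually the safer check of the two.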
Day 3 python string