Python string details
This article describes Python strings in detail: what strings are, their main features, raw strings, Unicode strings, common string operations, and the list of built-in string methods.
Introduction
Strings are sequences used to represent and store text. Strings in Python cannot be changed once they are created.
A string literal is usually enclosed in single quotation marks ('), double quotation marks ("), or triple quotation marks (''' or """).
A triple-quoted string can span multiple lines, which makes it a quick syntax for writing multi-line text. It is commonly used for documentation strings (docstrings), which are placed at specific locations in a file, and it is also a convenient way to write a multi-line comment.
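As a minimal sketch of the two uses of triple quotes described here, the following defines a hypothetical function greet with a docstring and a multi-line literal named poem. Both names are invented for illustration.

```python
def greet(name):
    """Return a greeting for name.

    This triple-quoted string is the function's docstring.
    """
    return 'Hello, ' + name


# A triple-quoted literal keeps the line breaks typed inside it.
poem = '''first line
second line'''
```

The docstring can be read back through greet.__doc__ or help(greet).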
Python actually has three kinds of strings:
1. Ordinary string (str)
2. Raw string, written with an r or R prefix, as in r''. Backslashes in it are not treated as escape characters.
3. Unicode string, written with a u prefix, as in u''. In Python 2, str and unicode are both subclasses of basestring.
In Python, strings are "immutable sequences":
1. Immutable
2. Support the basic sequence operations: position-based access (indexing) and slicing
String
1. Get help:
The code is as follows:
>>> help(str)
>>> dir(str)
>>> help(str.replace)
2. Immutable
After a string is created, its characters cannot be changed in place (the same as in Java): you cannot modify a character by assigning to a specific position. A string is an immutable sequence: its characters are ordered from left to right and cannot be modified in place. A Python string behaves like an immutable list of characters; once it is created, every character is fixed.
This means that to change a string you must create a new one!
The code is as follows:
>>> s = 'spam'
>>> s[0] = 'K'  # TypeError
# As in Java, build the modified string and assign it again
>>> s = 'K' + s[1:]
Raw string
A raw string constant such as r"abcd" (prefix r or R) turns off the backslash escape mechanism, so \ no longer introduces an escape sequence.
Usage:
1. Regular expressions
Raw strings reduce the number of backslashes needed when writing regular expressions.
The code is as follows:
import re
p4search = re.compile(r'\s*')
2. System paths
Raw strings make it easy to write system paths.
The code is as follows:
path = r'e:\book'
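A short sketch of the difference the r prefix makes. It compares an ordinary literal with a raw one and writes the same regular expression both ways. The sample text and variable names are made up for illustration.

```python
import re

# An ordinary literal interprets \n as one newline character;
# the raw literal keeps the backslash and the letter n as two characters.
ordinary = '\n'
raw = r'\n'
lengths = (len(ordinary), len(raw))

# The same pattern written both ways: digits followed by one literal backslash.
pattern_raw = re.compile(r'\d+\\')
pattern_plain = re.compile('\\d+\\\\')

sample = 'dir42\\file'  # hypothetical text containing a single backslash
match_raw = pattern_raw.search(sample).group()
match_plain = pattern_plain.search(sample).group()
```

Both patterns compile to the same expression; the raw form simply needs half as many backslashes.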
Unicode string
Unicode is a standard way of writing international text.
Python lets you work with Unicode text: you only need to add the prefix u or U before the string, for example u"This is a Unicode string."
Best practice: use Unicode strings when processing text files, especially when you know the file contains text written in languages other than English.
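The following sketch shows a Unicode string being encoded to bytes and decoded again. It assumes Python 3.3 or later, where the u prefix is still accepted and every str is already Unicode. The sample word is only an illustration.

```python
# Assumption: Python 3.3+, where u'...' and '...' are the same type.
text = u'caf\u00e9'                 # four characters; the last one is e with an acute accent
encoded = text.encode('utf-8')      # bytes, as they would be stored in a file
decoded = encoded.decode('utf-8')   # back to a Unicode string
```

The character count and the byte count differ because the accented letter takes two bytes in UTF-8.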
Common Operations
1. Basic operations
The code is as follows:
+: string1 + string2  # concatenation: appends the second string to the end of the first
Python does not allow other types in a + expression with a string; you must convert them manually (this is different from Java): 'abc' + str(9)
*: string * n  # creates a new string that repeats the original string n times
[]: string[n]  # gets the character at position n
[:]: string[n:m]  # slices the string from n up to, but not including, m; string[:m] runs from the start to m, and string[n:] runs from n to the end
in: char in string  # tests whether a character is in the string; returns True if it is
not in: char not in string  # tests whether a character is not in the string; returns True if it is not
r/R: r'string'  # raw string: escape sequences are not interpreted, so every character keeps its literal meaning
len(): len(s)  # returns the length of the string
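The operators listed above can be tried together in a few lines. This is only an illustrative sketch; the variable names are arbitrary.

```python
s = 'spam'
joined = s + str(9)      # the number must be converted explicitly
repeated = s * 2         # repeat the string twice
first = s[0]             # character at position 0
middle = s[1:3]          # positions 1 and 2
has_p = 'p' in s
no_z = 'z' not in s
length = len(s)

error_name = None
try:
    s + 9                # mixing str and int is rejected
except TypeError as exc:
    error_name = type(exc).__name__
```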
2. Type conversion
String and number conversion
String to number: int/float/long (long exists only in Python 2)
Number to string: str
The code is as follows:
>>> int(42)
42
>>> int('42')
42
>>> str(42)
'42'
>>> float('42.0')
42.0
>>> str(42.0)
'42.0'
Or use the functions of the string module (Python 2 only).
s: the string to be converted. base: optional; the numeric base of s.
The code is as follows:
import string
string.atoi(s[, base])  # convert to int. base defaults to 10. If base is 0, the base is chosen from the prefix of s, so s can look like 012 (octal) or 0x23 (hexadecimal). If base is 16, a leading 0x or 0X is accepted but not required
string.atol(s[, base])  # convert to long
string.atof(s)  # convert to float
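In Python 3 the string module no longer has these functions, and the same conversions are done with the built-in int() and float(). The sketch below assumes Python 3 and shows how the base argument behaves.

```python
# Assumption: Python 3, where int() and float() replace the string module functions.
decimal = int('42')              # base defaults to 10
hex_prefixed = int('0x23', 16)   # a leading 0x is accepted with base 16
hex_plain = int('23', 16)        # but it is not required
auto_hex = int('0x23', 0)        # base 0: the prefix selects the base
auto_octal = int('0o12', 0)      # Python 3 writes octal literals as 0o12
as_float = float('42.0')
```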
String and list conversion
String to list:
The code is as follows:
s = 'spam'
l = list(s)  # ['s', 'p', 'a', 'm']
l2 = "hello world".split()  # ['hello', 'world']
List to string:
The code is as follows:
k = ''.join(l)  # 'spam'
Note: join raises a TypeError if the list contains items that are not strings.
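One common way around the restriction in the note is to convert every item with str() before joining. The list below is a made-up example.

```python
items = ['a', 1, 2.5]   # hypothetical list mixing strings and numbers

try:
    ''.join(items)      # fails: 1 and 2.5 are not strings
except TypeError:
    joined_directly = False
else:
    joined_directly = True

# Convert each item to a string first, then join.
joined = ','.join(str(item) for item in items)
```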
3. Modify the string
The code is as follows:
s = s + 'A'  # append
s = s[3:] + 'B'  # drop the first three characters, then append
s = s.replace('pl', 'pa')  # replace a substring
a = ''  # assign an empty string
del a  # delete the variable itself
4. Indexing and slicing
Index: s[i]
The code is as follows:
s[0]  # the first character
s[-1]  # the last character, the same as s[len(s)-1]
Slice: s[i:j]
The code is as follows:
# The upper bound is excluded: s[1:3] takes positions 1 to 2
s[1:]  # from position 1 to the end
s[:3]  # from the start to position 2
s[:-1]  # from the start up to, but not including, the last character
s[:]  # from start to end, equivalent to a copy
s[1:10:2]  # positions 1 to 9 with a step of 2
s[a:b:-2]  # the step is negative, so the bounds are reversed: runs from a down to b+1 with a step of -2
s = 'abcdefg'
s[5:1:-1]  # gives 'fedc'
s[i:j:k]  # the same as s[slice(i, j, k)], using the built-in function slice
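The indexing and slicing rules above can be checked on a single sample string. This is a small sketch; the variable names are arbitrary.

```python
s = 'abcdefg'
first, last = s[0], s[-1]
head = s[:3]                     # positions 0 to 2
tail = s[1:]                     # position 1 to the end
without_last = s[:-1]            # everything except the last character
every_other = s[1:6:2]           # positions 1, 3 and 5
backwards = s[5:1:-1]            # positions 5 down to 2
reversed_copy = s[::-1]          # the whole string in reverse order
same_slice = s[slice(1, 6, 2)]   # slice object equivalent to s[1:6:2]
```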
String formatting
Only basic string formatting is described here; a later section will expand on it. The common conversion codes are:
%c single character
%d decimal integer
%o octal integer
%s string
%x hexadecimal integer, lowercase letters
%X hexadecimal integer, uppercase letters
The code is as follows:
>>> fmt = "so %s a day!"
>>> fmt % 'beautiful'
'so beautiful a day!'
>>> '{0} is {1}'.format('A', 'B')
'A is B'
>>> template = "{0}, {1} and {2}"
>>> template.format('A', 'B', 'C')
'A, B and C'
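To illustrate the conversion codes, the sketch below applies each one to a small value. The values are arbitrary.

```python
char = '%c' % 65            # the character with code 65
decimal = '%d' % 42
octal = '%o' % 8
text = '%s items' % 3       # %s converts the value with str()
hex_lower = '%x' % 255
hex_upper = '%X' % 255
combined = '%s = %d (0x%X)' % ('answer', 42, 42)   # several values go in a tuple
```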
Built-in function list
[String methods are the No. 1 tool for text processing in Python]
string.capitalize()
Capitalizes the first character of the string.
string.center(width[, fill])
Centers the original string in a field of length width, padded with fill (a space by default).
string.count(str, beg=0, end=len(string))
Returns the number of times the substring str occurs in string. The search range can be limited with beg and end.
string.decode(encoding='utf-8', errors='strict')
Decodes the string. A failure raises ValueError by default, unless errors is 'ignore' or 'replace'.
string.encode(encoding='utf-8', errors='strict')
Encodes the string. Errors are handled as in decode.
string.endswith(suffix, beg=0, end=len(string))
Checks whether the string ends with suffix.
string.expandtabs(tabsize=8)
Converts each tab in the string to spaces. The default tab size is 8.
string.find(str, beg=0, end=len(string))
Checks whether str occurs in string. Returns the start index if it does; otherwise returns -1.
string.index(str, beg=0, end=len(string))
Same as find, except that ValueError is raised if str is not found.
string.isalnum()
True if the string contains at least one character and all characters are letters or digits. Checks whether a string contains only 0-9, A-Z, and a-z.
string.isalpha()
True if the string contains at least one character and all characters are letters. Checks whether a string contains only letters.
string.isdecimal()
True if the string contains only decimal digits.
string.isdigit()
True if the string contains only digits. Checks whether a string contains only numbers.
string.islower()
True if the string contains at least one cased character and all cased characters are lowercase. Checks whether the string is in lowercase.
string.isnumeric()
True if the string contains only numeric characters.
string.isspace()
True if the string contains only whitespace. Checks whether the string is blank.
string.istitle()
True if the string is titlecased. Checks whether each word in the string starts with a capital letter.
string.isupper()
True if the string contains at least one cased character and all cased characters are uppercase. Checks whether the string is in uppercase.
string.join(seq)
Using string as the separator, merges all elements of seq into a new string. The separator is inserted between every two adjacent elements of seq.
string.ljust(width)
Returns the original string left-aligned and padded with spaces to length width.
string.lower()
Converts all uppercase characters in the string to lowercase.
string.lstrip()
Removes whitespace on the left.
string.partition(str)
A combination of find and split. Splits string at the first occurrence of str and returns the tuple (pre_str, str, after_str). If str is not found, pre_str == string.
string.replace(str1, str2, num=string.count(str1))
Replaces str1 with str2, at most num times if num is given. This can be used to implement a simple template.
string.rfind(str, beg=0, end=len(string))
Same as find, but searches from the right.
string.rindex(str, beg=0, end=len(string))
Same as index, but searches from the right.
string.rjust(width)
Right-aligns the string, padded with spaces.
string.rpartition(str)
Same as partition, but searches from the right.
string.rstrip([chars])
Removes whitespace on the right, including line breaks, and returns the processed string.
string.split(str="", maxsplit=string.count(str))
Splits string using str as the delimiter and returns a list. The number of splits can be limited with maxsplit. The default delimiter is whitespace.
string.splitlines([keepends])
Splits the string at line breaks and returns a list of lines. The line breaks are kept in the result only if keepends is true.
string.startswith(obj, beg=0, end=len(string))
Checks whether the string starts with the substring obj.
string.strip([obj])
Applies both lstrip and rstrip to string.
string.swapcase()
Swaps the case of the string: uppercase letters become lowercase and lowercase letters become uppercase.
string.title()
Returns a titlecased string: the first letter of each word is uppercase and the other letters are lowercase.
string.translate(str, del="")
Converts the characters of string according to the translation table str. Characters to be removed are given in the del parameter.
string.upper()
Converts all lowercase characters in the string to uppercase.
string.zfill(width)
Returns a string of length width. The original string is right-aligned and padded with 0 on the left.
len(string)
Returns the length of a string.
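A few of the methods in the list can be exercised on small inputs, as in the sketch below. It covers only a sample of the list, and the input strings are invented for illustration.

```python
s = '  hello world  '
stripped = s.strip()                     # whitespace removed on both sides
title = stripped.title()
found = stripped.find('world')           # start index of the match
missing = stripped.find('xyz')           # find reports a miss with -1

index_raised = False
try:
    stripped.index('xyz')                # index raises instead of returning -1
except ValueError:
    index_raised = True

parts = stripped.partition(' ')          # split at the first space
no_sep = stripped.partition('-')         # separator not present
words = 'a,b,c'.split(',', 1)            # at most one split
lines = 'one\ntwo\n'.splitlines()        # line breaks are dropped by default
padded = '42'.zfill(5)
centered = 'ab'.center(6, '*')
replaced = 'aaa'.replace('a', 'b', 2)    # replace at most two occurrences
```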
Best practices
1. Using the length in a loop
The code is as follows:
i = 0
while i < len(stri):  # len() is called again on every iteration
    i += 1
# Better: compute the length once
i = 0
size = len(stri)
while i < size:
    i += 1
2. String appending
The code is as follows:
l = ['A', 'B']
result = ''
for i in l:
    result += i
# Better: join the list in one step
result = ''.join(l)
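The two appending styles give the same result, which the sketch below confirms on a larger, made-up list. The function names are hypothetical. No timing is shown, since that depends on the interpreter and the input.

```python
def build_with_plus(parts):
    """Concatenate piece by piece; each += may create a new string."""
    result = ''
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    """Build the whole result in a single join call."""
    return ''.join(parts)


parts = ['A', 'B'] * 1000   # hypothetical input
same = build_with_plus(parts) == build_with_join(parts)
```

To compare the speed of the two functions on real data, the standard timeit module can be used.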
Others
1. Escape characters
Frequently used:
\n line feed, \\ backslash
\t tab, \' single quote
\r carriage return, \" double quote
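Each escape sequence stands for a single character, which the following sketch makes visible by checking lengths. The sample strings are arbitrary.

```python
newline = 'a\nb'       # three characters: a, line feed, b
tab = 'a\tb'           # three characters: a, tab, b
backslash = 'a\\b'     # the doubled backslash is one character
quote = 'it\'s'        # same text as "it's"
lines = newline.splitlines()
```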
To be expanded later
Detailed description of string encoding
String formatting
Regular expressions
Common modules related to strings (serialization, text wrapping, etc.)