Python's built-in string method analysis


This article introduces Python's built-in string methods, grouped into the following categories: case conversion, formatted output, searching and replacing, joining and splitting, conditional tests, and encoding.

String processing is a very common task, but Python has so many built-in string methods that they are easy to forget. For quick reference, this article gives examples for each built-in method, written against Python 3.5.1 and organized by category for easy lookup.


Uppercase and lowercase conversions

str.capitalize()

Converts the first character to uppercase and the remaining characters to lowercase. Note that if the first character has no uppercase form, it is left unchanged.

'adi dog'.capitalize()
# 'Adi dog'

'abcd 徐'.capitalize()
# 'Abcd 徐'

'徐 abcd'.capitalize()
# '徐 abcd'

'ß'.capitalize()
# 'SS'   (Python 3.5.1; Python 3.8 and later return 'Ss')

str.lower()

Converts all cased characters in the string to lowercase. It uses the plain Unicode lowercase mapping, so a character such as 'ß', which is already lowercase, is left as is.

'DOBI'.lower()
# 'dobi'

'ß'.lower()   # 'ß' is a German lowercase letter with an alternative lowercase spelling 'ss'; lower() does not convert it
# 'ß'

'徐 ABCD'.lower()
# '徐 abcd'

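str.casefold()

Converts a string to its case-folded form, intended for caseless matching. It is more aggressive than lower() and also converts characters whose Unicode case folding differs from their lowercase form.

'DOBI'.casefold()
# 'dobi'

'ß'.casefold()   # the German lowercase letter 'ß' is equivalent to lowercase 'ss' (its uppercase form is 'SS')
# 'ss'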
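As a rough illustration of why casefold() rather than lower() is the usual choice for caseless comparison, the sketch below contrasts the two. The helper name caseless_equal is invented for this example.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings without regard to case, using casefold()."""
    return a.casefold() == b.casefold()


# lower() keeps 'ß', so the two spellings do not match
lower_match = 'Straße'.lower() == 'STRASSE'.lower()

# casefold() folds 'ß' to 'ss', so they do match
fold_match = caseless_equal('Straße', 'STRASSE')

print(lower_match, fold_match)
```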

str.swapcase()

Swaps the case of every letter in the string.

'徐Dobi a123 ß'.swapcase()
# '徐dOBI A123 SS'   here 'ß' becomes 'SS', its uppercase form

Note, however, that s.swapcase().swapcase() == s is not necessarily true:

u'\xb5'
# 'µ'

u'\xb5'.swapcase()
# 'Μ'

u'\xb5'.swapcase().swapcase()
# 'μ'

hex(ord(u'\xb5'.swapcase().swapcase()))
# '0x3bc'

Here the lowercase form of 'Μ' (Greek capital mu, not the Latin M) is 'μ' (U+03BC), which looks exactly like the micro sign 'µ' (U+00B5) but is a different code point.

str.title()

Capitalizes the first letter of each word in the string. "Words" are delimited by spaces and punctuation, so the result can be wrong for English contractions, possessives, and all-capital abbreviations.

'hello world'.title()
# 'Hello World'

'中文abc def 12gh'.title()
# '中文Abc Def 12Gh'

# But this method is not perfect:
"they're bill's friends from the UK".title()
# "They'Re Bill'S Friends From The Uk"
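One possible workaround, adapted from the regular-expression recipe in the official documentation for str.title(), treats an apostrophe as part of the word. The function name titlecase is only a label for this sketch, it handles ASCII letters only, and abbreviations such as "UK" are still not preserved.

```python
import re


def titlecase(s: str) -> str:
    """Title-case ASCII words, treating an apostrophe as part of the word."""
    return re.sub(
        r"[A-Za-z]+('[A-Za-z]+)?",
        lambda mo: mo.group(0)[0].upper() + mo.group(0)[1:].lower(),
        s,
    )


result = titlecase("they're bill's friends from the UK")
print(result)
```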

str.upper()

Converts all cased letters in the string to uppercase; characters that have no uppercase form are left unchanged.

'中文abc def 12gh'.upper()
# '中文ABC DEF 12GH'

Note that s.upper().isupper() is not necessarily True, for example when s contains no cased characters at all.
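A minimal sketch of such a case, using an arbitrary string with no cased characters:

```python
s = '123 \u5f90'          # digits, a space and a CJK character: nothing is cased
upper_s = s.upper()       # upper() leaves the string unchanged

# isupper() requires at least one cased character, so this prints False
print(upper_s.isupper())
```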

String format output

str.center(width[, fillchar])

Centers the string within the given width, padding both sides with the fill character (an ASCII space by default). If the given width is less than or equal to the length of the string, the original string is returned.

'12345'.center(10, '*')
# '**12345***'

'12345'.center(10)
# '  12345   '
str.ljust(width[, fillchar]); str.rjust(width[, fillchar])

Returns a string of the specified width with the original content left-justified (ljust) or right-justified (rjust). If the width is less than the length of the string, the original string is returned. The default padding is an ASCII space; a different fill character can be specified.

'dobi'.ljust(10)
# 'dobi      '

'dobi'.ljust(10, '~')
# 'dobi~~~~~~'

'dobi'.ljust(3, '~')
# 'dobi'

'dobi'.ljust(3)
# 'dobi'
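
A typical use is lining up columns in plain-text output. The sketch below is only an illustration; the rows and the column widths are invented.

```python
rows = [('apple', 3), ('kiwi', 12), ('watermelon', 1)]  # sample data (made up)

lines = []
for name, qty in rows:
    # left-justify names in 12 columns (dot-filled), right-justify counts in 4
    lines.append(name.ljust(12, '.') + str(qty).rjust(4))

print('\n'.join(lines))
```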
str.zfill(width)

Pads the string on the left with '0' up to the specified width. A leading sign ('+' or '-') stays in front of the padding. If the width is less than the length of the string, the original string is returned.

"42".zfill(5)
# '00042'
"-42".zfill(5)
# '-0042'

'dd'.zfill(5)
# '000dd'

'--'.zfill(5)
# '-000-'

' '.zfill(5)
# '0000 '

''.zfill(5)
# '00000'

'dddddddd'.zfill(5)
# 'dddddddd'
str.expandtabs(tabsize=8)

Replaces each horizontal tab with spaces so that the text following it starts at the next column that is a multiple of tabsize.

tab = '1\t23\t456\t7890\t1112131415\t161718192021'

tab.expandtabs()
# '1       23      456     7890    1112131415      161718192021'
# '123456781234567812345678123456781234567812345678'   note how the column count relates to the output positions above

tab.expandtabs(4)
# '1   23  456 7890    1112131415  161718192021'
# '12341234123412341234123412341234'
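
To make the column rule concrete, here is a simplified model of the tab-stop arithmetic. It is a sketch only: the function name expand is invented, tabsize is assumed to be positive, and the input is assumed to be a single line (the real method also resets the column at newlines).

```python
def expand(s: str, tabsize: int = 8) -> str:
    """Simplified model of str.expandtabs() for single-line input."""
    out = []
    col = 0
    for ch in s:
        if ch == '\t':
            # pad up to the next multiple of tabsize; a tab that already
            # sits on a tab stop still adds a full tabsize of spaces
            pad = tabsize - (col % tabsize)
            out.append(' ' * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return ''.join(out)


tab = '1\t23\t456\t7890\t1112131415\t161718192021'
print(expand(tab, 4) == tab.expandtabs(4))
```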
str.format(*args, **kwargs)

The format string syntax is extensive, and the official documentation already gives detailed examples, so they are not repeated here. Readers who want the details can go directly to the "Format examples" section of the official documentation.
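
For quick orientation, a few common forms are sketched below. This is a brief illustration with arbitrary values, not a substitute for the official examples.

```python
# positional and keyword fields
a = '{0} + {1} = {total}'.format(1, 2, total=3)

# alignment and fill: '<' left, '>' right, '^' center
b = '{:*^10}'.format('dobi')

# number formatting: fixed precision and thousands separator
c = '{:.2f} | {:,}'.format(3.14159, 1234567)

print(a, b, c, sep='\n')
```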

str.format_map(mapping)

Similar to str.format(*args, **kwargs), except that mapping is a dictionary (mapping) object that is used directly.

people = {'name': 'john', 'age': 56}

'My name is {name},i am {age} old'.format_map(people)
# 'My name is john,i am 56 old'
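
Because the mapping is used directly, a dict subclass can customize how missing keys are handled. The sketch below follows the example in the official documentation for str.format_map(); the class name Default comes from there.

```python
class Default(dict):
    """Dict that returns the key itself for missing entries."""

    def __missing__(self, key):
        return key


text = '{name} was born in {country}'.format_map(Default(name='Guido'))
print(text)
```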

String search positioning and substitution

str.count(sub[, start[, end]])

text = 'outer protective covering'

text.count('e')
# 4

text.count('e', 5, 11)
# 1

text.count('e', 5, 10)
# 0
str.find(sub[, start[, end]]); str.rfind(sub[, start[, end]])

text = 'outer protective covering'

text.find('er')
# 3

text.find('to')
# -1

text.find('er', 3)
# 3

text.find('er', 4)
# 20

text.find('er', 4, 21)
# -1

text.find('er', 4, 22)
# 20

text.rfind('er')
# 20

text.rfind('er', 20)
# 20

text.rfind('er', 20, 21)
# -1
str.index(sub[, start[, end]]); str.rindex(sub[, start[, end]])

Similar to find() and rfind(), except that a ValueError is raised if the substring is not found.
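
A minimal sketch of the difference, reusing the sample text from the find() examples:

```python
text = 'outer protective covering'

# find() signals "not found" with -1
pos_find = text.find('to')

# index() raises ValueError instead
try:
    pos_index = text.index('to')
except ValueError:
    pos_index = None

print(pos_find, pos_index)
```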

str.replace(old, new[, count])

'dog wow wow jiao'.replace('wow', 'wang')
# 'dog wang wang jiao'

'dog wow wow jiao'.replace('wow', 'wang', 1)
# 'dog wang wow jiao'

'dog wow wow jiao'.replace('wow', 'wang', 0)
# 'dog wow wow jiao'

'dog wow wow jiao'.replace('wow', 'wang', 2)
# 'dog wang wang jiao'

'dog wow wow jiao'.replace('wow', 'wang', 3)
# 'dog wang wang jiao'
str.lstrip([chars]); str.rstrip([chars]); str.strip([chars])

chars is a set of characters to remove, not a prefix or suffix; if it is omitted, whitespace is removed.

'  dobi'.lstrip()
# 'dobi'
'db.kun.ac.cn'.lstrip('dbk')
# '.kun.ac.cn'

' dobi   '.rstrip()
# ' dobi'
'db.kun.ac.cn'.rstrip('acn')
# 'db.kun.ac.'

'   dobi   '.strip()
# 'dobi'
'db.kun.ac.cn'.strip('db.c')
# 'kun.ac.cn'
'db.kun.ac.cn'.strip('cbd.un')
# 'kun.a'
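
Since the argument is treated as a set of characters, these methods are not suitable for removing an exact prefix. The sketch below contrasts the two behaviors; strip_prefix is a hypothetical helper written for this example (Python 3.9 and later provide str.removeprefix() for the same purpose).

```python
def strip_prefix(s: str, prefix: str) -> str:
    """Remove prefix once if s starts with it (hypothetical helper)."""
    if prefix and s.startswith(prefix):
        return s[len(prefix):]
    return s


# lstrip() removes any leading run of 'w' and '.' characters
as_set = 'www.wow.com'.lstrip('w.')

# the helper removes only the exact prefix 'www.'
as_prefix = strip_prefix('www.wow.com', 'www.')

print(as_set, as_prefix)
```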
static str.maketrans(x[, y[, z]]); str.translate(table)

maketrans is a static method that builds a translation table for use by translate.
If maketrans is given only one argument, it must be a dictionary whose keys are either Unicode code points (integers) or strings of length 1, and whose values are arbitrary strings, None, or Unicode code points.

a = 'dobi'
ord('o')
# 111

ord('a')
# 97

hex(ord('狗'))
# '0x72d7'

b = {'d': 'dobi', 111: ' is ', 'b': 97, 'i': '\u72d7\u72d7'}
table = str.maketrans(b)

a.translate(table)
# 'dobi is a狗狗'

If maketrans is given two arguments, they must be strings of equal length, and each character of the first is mapped to the character at the same position in the second. If there is a third argument, it must also be a string, and each of its characters is mapped to None, that is, deleted from the result:

a = 'dobi is a dog'

table = str.maketrans('dobi', 'alph')

a.translate(table)
# 'alph hs a alg'

table = str.maketrans('dobi', 'alph', 'o')

a.translate(table)
# 'aph hs a ag'
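
A common practical use of the three-argument form is deleting a whole class of characters. The sketch below strips ASCII punctuation; the sample sentence is invented.

```python
import string

# empty first and second arguments: nothing is replaced;
# third argument: every ASCII punctuation character is mapped to None
table = str.maketrans('', '', string.punctuation)

cleaned = 'Hello, world! (dobi is a dog.)'.translate(table)
print(cleaned)
```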

Joining and splitting strings

str.join(iterable)

Joins the elements of an iterable of strings into one string, using the string on which the method is called as the separator.

'-'.join(['2012', '3', '12'])
# '2012-3-12'

'-'.join([2012, 3, 12])
# TypeError: sequence item 0: expected str instance, int found

'-'.join(['2012', '3', b'12'])   # bytes is not a string
# TypeError: sequence item 2: expected str instance, bytes found

'-'.join(['2012'])
# '2012'

'-'.join([])
# ''

'-'.join([None])
# TypeError: sequence item 0: expected str instance, NoneType found

'-'.join([''])
# ''

','.join({'dobi': 'dog', 'polly': 'bird'})   # iterating a dict yields its keys
# 'dobi,polly'

','.join({'dobi': 'dog', 'polly': 'bird'}.values())
# 'dog,bird'
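
Since join() accepts only strings, non-string items have to be converted first. A minimal sketch of two common ways to do that:

```python
parts = [2012, 3, 12]

# convert each element to str before joining
date = '-'.join(map(str, parts))

# a generator expression also works and allows per-item formatting
padded = '-'.join('{:02d}'.format(n) for n in parts)

print(date, padded)
```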
str.partition(sep); str.rpartition(sep)

'dog wow wow jiao'.partition('wow')
# ('dog ', 'wow', ' wow jiao')

'dog wow wow jiao'.partition('dog')
# ('', 'dog', ' wow wow jiao')

'dog wow wow jiao'.partition('jiao')
# ('dog wow wow ', 'jiao', '')

'dog wow wow jiao'.partition('ww')
# ('dog wow wow jiao', '', '')

'dog wow wow jiao'.rpartition('wow')
# ('dog wow ', 'wow', ' jiao')

'dog wow wow jiao'.rpartition('dog')
# ('', 'dog', ' wow wow jiao')

'dog wow wow jiao'.rpartition('jiao')
# ('dog wow wow ', 'jiao', '')

'dog wow wow jiao'.rpartition('ww')
# ('', '', 'dog wow wow jiao')
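
Because partition() always returns a 3-tuple, it is convenient for parsing "key = value" style text without length checks. The function name parse_setting and the sample lines are invented for this sketch.

```python
def parse_setting(line: str):
    """Split 'key = value' at the first '='; return None if there is none."""
    key, sep, value = line.partition('=')
    if not sep:
        # separator not found: partition() returned (line, '', '')
        return None
    return key.strip(), value.strip()


print(parse_setting('path = /tmp/a=b'))
print(parse_setting('no separator here'))
```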
str.split(sep=None, maxsplit=-1); str.rsplit(sep=None, maxsplit=-1)

'1,2,3'.split(','), '1, 2, 3'.rsplit()
# (['1', '2', '3'], ['1,', '2,', '3'])

'1,2,3'.split(',', maxsplit=1), '1,2,3'.rsplit(',', maxsplit=1)
# (['1', '2,3'], ['1,2', '3'])

'1 2 3'.split(), '1 2 3'.rsplit()
# (['1', '2', '3'], ['1', '2', '3'])

'1 2 3'.split(maxsplit=1), '1 2 3'.rsplit(maxsplit=1)
# (['1', '2 3'], ['1 2', '3'])

'   1   2   3   '.split()
# ['1', '2', '3']

'1,2,,3,'.split(','), '1,2,,3,'.rsplit(',')
# (['1', '2', '', '3', ''], ['1', '2', '', '3', ''])

''.split()
# []
''.split('a')
# ['']
'bcd'.split('a')
# ['bcd']
'bcd'.split(None)
# ['bcd']
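
The difference between omitting sep and passing an explicit space is easy to overlook. A minimal sketch with an arbitrary input string:

```python
s = ' 1  2 3 '

# sep=None: runs of whitespace count as one separator, edges are trimmed
by_whitespace = s.split()

# explicit sep: every single space splits, producing empty strings
by_space = s.split(' ')

print(by_whitespace, by_space)
```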
str.splitlines([keepends])

Splits the string into a list at line boundaries. When keepends is True, the line-boundary characters are kept in the resulting items. The full list of recognized line-boundary characters is given in the official documentation.

'ab c\n\nde fg\rkl\r\n'.splitlines()
# ['ab c', '', 'de fg', 'kl']
'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
# ['ab c\n', '\n', 'de fg\r', 'kl\r\n']

"".splitlines(), ''.split('\n')   # note the difference between the two
# ([], [''])
"One line\n".splitlines(), 'Two lines\n'.split('\n')
# (['One line'], ['Two lines', ''])

String conditional tests

str.endswith(suffix[, start[, end]]); str.startswith(prefix[, start[, end]])

text = 'outer protective covering'

text.endswith('ing')
# True

text.endswith(('gin', 'ing'))   # a tuple of suffixes: True if any of them matches
# True
text.endswith('ter', 2, 5)
# True

text.endswith('ter', 2, 4)
# False
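
Passing a tuple is handy for checks such as file extensions. The file names and the suffix list below are made up for this sketch.

```python
IMAGE_SUFFIXES = ('.png', '.jpg', '.gif')   # example suffix list

files = ['a.png', 'notes.txt', 'b.JPG', 'c.gif']

# lower() first so that the check ignores case
images = [f for f in files if f.lower().endswith(IMAGE_SUFFIXES)]
print(images)
```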

str.isalnum()

Returns True if the string is non-empty and consists only of letters and numeric characters in any combination. In short:

If any one of c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric() is True for a character c, then c.isalnum() is True.

'dobi'.isalnum()
# True

'dobi123'.isalnum()
# True

'123'.isalnum()
# True

'徐'.isalnum()
# True

'dobi_123'.isalnum()
# False

'dobi 123'.isalnum()
# False

'%'.isalnum()
# False
str.isalpha()

Returns True if the string is non-empty and every character is defined as a "Letter" in the Unicode character database, that is, its general category is one of "Lm", "Lt", "Lu", "Ll", or "Lo". This is different from the "Alphabetic" property defined in the Unicode Standard.

'dobi'.isalpha()
# True

'do bi'.isalpha()
# False

'dobi123'.isalpha()
# False

'徐'.isalpha()
# True
str.isdecimal(); str.isdigit(); str.isnumeric()

The three methods differ in the range of Unicode general categories for which they return True (roughly; the exact rules are based on the Unicode numeric type properties):

isdecimal: Nd
isdigit: No, Nd
isnumeric: No, Nd, Nl

The difference between digit and decimal is that some numeric characters, such as the superscript '²', are digits but not decimals.

num = '\u2155'
print(num)
# ⅕
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, False, True)

num = '\u00b2'
print(num)
# ²
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, True, True)

num = "1"   # unicode
num.isdecimal(), num.isdigit(), num.isnumeric()
# (True, True, True)

num = "Ⅶ"
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, False, True)

num = "十"
num.isdecimal(), num.isdigit(), num.isnumeric()
# (False, False, True)

num = b"1"   # bytes
num.isdigit()     # True
num.isdecimal()   # AttributeError: 'bytes' object has no attribute 'isdecimal'
num.isnumeric()   # AttributeError: 'bytes' object has no attribute 'isnumeric'
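
One practical consequence: isdecimal() is the safer check before calling int(), because int() rejects characters such as '²' even though isdigit() accepts them. The helper to_int below is a hypothetical name for this sketch, and it deliberately does not handle signs or surrounding whitespace.

```python
def to_int(s: str):
    """Return int(s) if s consists only of decimal characters, else None."""
    return int(s) if s.isdecimal() else None


print(to_int('42'))              # plain ASCII digits
print(to_int('\uff14\uff12'))    # fullwidth digits are decimals too
print(to_int('\u00b2'))          # superscript two: a digit but not a decimal
```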
str.isidentifier()

Determines whether the string is a valid identifier. Note that reserved keywords such as 'def' also return True.

'def'.isidentifier()
# True

'with'.isidentifier()
# True

'False'.isidentifier()
# True

'dobi_123'.isidentifier()
# True

'dobi 123'.isidentifier()
# False

'123'.isidentifier()
# False
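
Since keywords such as 'def' pass isidentifier(), a check for names that can actually be assigned to usually combines it with keyword.iskeyword() from the standard library. The function name is_usable_name is invented for this sketch.

```python
import keyword


def is_usable_name(s: str) -> bool:
    """True if s is a valid identifier and not a reserved keyword."""
    return s.isidentifier() and not keyword.iskeyword(s)


print(is_usable_name('dobi_123'), is_usable_name('def'), is_usable_name('123'))
```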
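str.islower()

Returns True if all cased characters in the string are lowercase and there is at least one cased character.

'徐'.islower()
# False

'ẞ'.islower()   # German capital letter (U+1E9E)
# False

'a徐'.islower()
# True

'ss'.islower()
# True

'All'.islower()
# False

'Ab'.islower()
# False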

str.isprintable()

Returns True if all characters in the string are printable or the string is empty. Characters in the Unicode "Other" and "Separator" categories are non-printable, with the exception of the ASCII space (0x20).

'dobi123'.isprintable()
# True

'dobi123\n'.isprintable()
# False

'dobi 123'.isprintable()
# True

'dobi.123'.isprintable()
# True

''.isprintable()
# True

str.isspace()

Returns True if the string contains at least one character and all characters are whitespace.

'\r\n\t'.isspace()
# True

''.isspace()
# False

' '.isspace()
# True

str.istitle()

Returns True if the string is title-cased: the first letter of each word is uppercase and the remaining letters are lowercase. Non-alphabetic characters are ignored, but the string must contain at least one cased character.

'How Python Works'.istitle()
# True

'How Python WORKS'.istitle()
# False

'how python works'.istitle()
# False

'How Python  Works'.istitle()
# True

' '.istitle()
# False

'A'.istitle()
# True

'a'.istitle()
# False

'甩掉Abc Def 123'.istitle()
# True
str.isupper()

'徐'.isupper()
# False

'DOBI'.isupper()
# True

'Dobi'.isupper()
# False

'DOBI123'.isupper()
# True

'DOBI 123'.isupper()
# True

'DOBI\t 123'.isupper()
# True

'DOBI_123'.isupper()
# True

'_123'.isupper()
# False

String encoding

str.encode(encoding="utf-8", errors="strict")

fname = '徐'

fname.encode('ascii')
# UnicodeEncodeError: 'ascii' codec can't encode character '\u5f90'...

fname.encode('ascii', 'replace')
# b'?'

fname.encode('ascii', 'ignore')
# b''

fname.encode('ascii', 'xmlcharrefreplace')
# b'&#24464;'

fname.encode('ascii', 'backslashreplace')
# b'\\u5f90'
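
For completeness, a short sketch of the lossless round trip with the default UTF-8 encoding, using the same character that appears in the error message above:

```python
fname = '\u5f90'                 # the character from the error message above

data = fname.encode('utf-8')     # str -> bytes
back = data.decode('utf-8')      # bytes -> str

print(data, back == fname)
```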

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
