Case:
Strip extra whitespace and other unwanted characters from user input:
' ++++abc123---'
Filter out the '\r' in text edited on Windows:
'Hello world\r\n'
Remove Unicode combining characters (tone marks) from text:
"Zhào Qián sūn lǐzhōu wúzhèng Wáng"
How to solve the above problem?
Remove characters from the ends of a string: strip(), rstrip(), lstrip()
#!/usr/bin/python3
s = ' -----abc123++++ '
# Remove whitespace from both ends
print(s.strip())
# Remove whitespace from the right end
print(s.rstrip())
# Remove whitespace from the left end
print(s.lstrip())
# Remove whitespace, '-' and '+' from both ends
print(s.strip().strip('-+'))
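Applied to the first case above, a minimal sketch might look like the following. The helper name `clean_input` and its default set of extra characters are assumptions made for illustration, not part of the original article.

```python
import string


def clean_input(text, extra='+-'):
    """Strip whitespace and any character in `extra` from both ends."""
    # strip() treats its argument as a set of characters, not as a
    # prefix or suffix, so one call handles any mix of them.
    return text.strip(string.whitespace + extra)


print(clean_input(' ++++abc123---'))
```

Characters in the middle of the string are left alone; strip() only works inward from each end until it meets a character that is not in the set.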
Delete a single character at a fixed position: slicing + concatenation
#!/usr/bin/python3
s = 'abc:123'
# Remove the colon by slicing around it and joining the two pieces
new_s = s[:3] + s[4:]
print(new_s)
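The same slicing idea can be wrapped in a small function so the position is not hard-coded. This is only a sketch; `remove_at` is a hypothetical helper name, and the explicit bounds check is an assumption about the behaviour one would want.

```python
def remove_at(text, index):
    """Return `text` without the character at position `index`."""
    if not 0 <= index < len(text):
        raise IndexError('index out of range')
    # Strings are immutable, so build a new one from the two halves.
    return text[:index] + text[index + 1:]


print(remove_at('abc:123', 3))
```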
Remove characters at any position: replace(), re.sub()
#!/usr/bin/python3
# Remove every occurrence of one character from the string
s = '\tabc\t123\tisk'
print(s.replace('\t', ''))

#!/usr/bin/python3
import re
# Remove the \r, \n and \t characters
s = '\r\nabc\t123\nxyz'
print(re.sub('[\r\n\t]', '', s))
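For the Windows case listed at the top, either approach can drop the carriage return while keeping the line feed. The two function names below are hypothetical and only serve to show both variants side by side.

```python
import re


def drop_cr(text):
    """Remove every carriage return with str.replace()."""
    return text.replace('\r', '')


def drop_cr_regex(text):
    """Same result with re.sub(); useful once the pattern grows."""
    return re.sub(r'\r', '', text)


print(repr(drop_cr('Hello world\r\n')))
```

replace() is enough for a single fixed character; re.sub() becomes worthwhile once several characters or a pattern have to be matched.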
Delete or map several different characters at once: translate(), with the mapping table built by str.maketrans() in Python 3
#!/usr/bin/python3
s = 'abc123xyz'
# a -> x, b -> y, c -> z and back again: a simple character-mapping cipher
print(str.maketrans('abcxyz', 'xyzabc'))
# translate() applies the mapping table to the string
print(s.translate(str.maketrans('abcxyz', 'xyzabc')))
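The example above maps characters to other characters. To delete characters instead, str.maketrans() accepts a third argument listing characters that should map to None. A minimal sketch, with `delete_chars` as a hypothetical helper name:

```python
def delete_chars(text, unwanted):
    """Delete every character of `unwanted` from `text` in one pass."""
    # The third argument of str.maketrans() maps each listed
    # character to None, which translate() treats as "remove".
    table = str.maketrans('', '', unwanted)
    return text.translate(table)


print(delete_chars('\r\nabc\t123\nxyz', '\r\n\t'))
```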
#!/usr/bin/python3
import sys
import unicodedata

s = "Zhào Qián sūn lǐzhōu wúzhèng Wáng"
remap = {
    # ord() returns the Unicode code point of a character
    ord('\t'): '',
    ord('\f'): '',
    ord('\r'): None
}
# Remove \t, \f and \r
a = s.translate(remap)
'''
dict.fromkeys() builds a dictionary whose keys are all the Unicode
combining characters and whose values are all None.
unicodedata.normalize() then converts the input to its decomposed form (NFD).
sys.maxunicode: an integer giving the largest Unicode code point,
that is, 1114111 (0x10FFFF in hexadecimal).
unicodedata.combining(chr): returns the canonical combining class assigned
to the character chr as an integer; returns 0 if no combining class is defined.
'''
# It is easier to understand this expression by splitting it into steps
cmb_chrs = dict.fromkeys(c for c in range(sys.maxunicode + 1)
                         if unicodedata.combining(chr(c)))
b = unicodedata.normalize('NFD', a)
'''
Call translate() to delete all the accents
'''
print(b.translate(cmb_chrs))
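Building a table of every combining character scans the whole Unicode range. As an alternative sketch, and not the method used above, the decomposed string can be filtered directly with unicodedata.combining(). The name `strip_accents` is an assumption for illustration.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks (for example tone marks) from `text`."""
    # NFD splits each accented letter into a base letter followed by
    # one or more combining marks.
    decomposed = unicodedata.normalize('NFD', text)
    # combining() returns 0 for base characters and non-zero for marks.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Zhào Qián sūn lǐzhōu wúzhèng Wáng"))
```

The translate() version pays the cost of building the table once and is then fast for many strings; this filter version avoids the table and may be simpler for occasional use.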
Python: How do I get rid of unwanted characters in a string?