How to remove unwanted characters from a string in Python?

Source: Internet
Author: User
Tags: translate, function

Problem:

Strip unwanted leading and trailing whitespace from user input:

'   +++abc123---   '

Filter out '\r' from text that was edited in a Windows environment:

'hello world \r\n'

Remove Unicode combining characters (tone marks) from text:

"Zhào Qián Sūn Lǐ Zhōu Wú Zhèng Wáng"

How to solve the above problems?

Remove characters at both ends: strip(), rstrip(), and lstrip()

    #!/usr/bin/python3
    s = '   -----abc123++   '

    # Remove whitespace from both ends
    print(s.strip())
    # Remove whitespace from the right end
    print(s.rstrip())
    # Remove whitespace from the left end
    print(s.lstrip())
    # Remove whitespace, then '-' and '+', from both ends
    print(s.strip().strip('-+'))
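One detail worth illustrating: the argument to strip() is treated as a set of characters, not as a substring, so whitespace and symbols can be removed in a single call. The sketch below assumes a hypothetical helper name, clean_ends, and a default junk set of '-+'; neither comes from the original article.

```python
def clean_ends(text, junk='-+'):
    """Strip whitespace and any character in `junk` from both ends of text.

    `clean_ends` and its default `junk` set are illustrative assumptions.
    """
    # str.strip(chars) removes any leading/trailing character found in `chars`
    return text.strip(junk + ' \t\r\n')


result = clean_ends('   -----abc123++   ')
print(result)  # abc123

# Characters in the middle of the string are left alone
print(clean_ends('- + a-b + -'))  # a-b
```

Unlike the chained s.strip().strip('-+'), this variant also handles ends where whitespace and symbols are interleaved.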

Delete a single character at a fixed position: slicing + concatenation

    #!/usr/bin/python3
    s = 'abc:8080'

    # Remove the colon (index 3) by joining the slices on either side of it
    new_s = s[:3] + s[4:]
    print(new_s)
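The slicing approach can be generalized so the position does not have to be hard-coded. This is a minimal sketch; remove_at is a hypothetical helper name introduced here for illustration.

```python
def remove_at(text, index):
    """Return `text` without the character at `index` (hypothetical helper)."""
    if not 0 <= index < len(text):
        raise IndexError('index out of range')
    # Join everything before the index with everything after it
    return text[:index] + text[index + 1:]


s = 'abc:8080'
# str.index() finds the position of the first colon
colon_removed = remove_at(s, s.index(':'))
print(colon_removed)  # abc8080
```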

Delete a character at any position, or several different characters at once: replace(), re.sub()

    #!/usr/bin/python3
    import re

    # Remove every occurrence of a single character
    s = '\tabc\t123\tisk'
    print(s.replace('\t', ''))

    # Remove the \r, \n and \t characters in one pass
    s = '\r\nabc\t123\nxyz'
    print(re.sub('[\r\n\t]', '', s))
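Applied to the Windows line-ending problem from the start of the article, the same two tools might be used as follows. The function names and the CONTROL_CHARS constant are assumptions made for this sketch, not part of the original.

```python
import re

# Compile once when the same pattern is applied to many strings
CONTROL_CHARS = re.compile(r'[\r\n\t]')


def drop_control_chars(text):
    """Remove every \\r, \\n and \\t from text (illustrative helper)."""
    return CONTROL_CHARS.sub('', text)


def to_unix_newlines(text):
    """Turn Windows '\\r\\n' line endings into '\\n' (illustrative helper)."""
    return text.replace('\r\n', '\n')


cleaned = drop_control_chars('\r\nabc\t123\nxyz')
unix = to_unix_newlines('hello world\r\n')
print(cleaned)      # abc123xyz
print(repr(unix))   # 'hello world\n'
```

replace() is enough when one fixed substring is involved; re.sub() is the more convenient choice once a set of characters has to be matched.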

Map or delete several characters at once: translate() with str.maketrans() (Python 3)

    #!/usr/bin/python3
    s = 'abc123xyz'

    # a -> x, b -> y, c -> z (and back): build a character mapping table.
    # Both arguments must have the same length.
    print(str.maketrans('abcxyz', 'xyzabc'))
    # Apply the table to the string
    print(s.translate(str.maketrans('abcxyz', 'xyzabc')))
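The example above maps characters but does not delete any. For deletion, str.maketrans() accepts an optional third argument listing characters that translate() should drop. A short sketch (the variable names are illustrative):

```python
# Third argument of str.maketrans(): characters to delete
delete_table = str.maketrans('', '', '\t\r\n')

s = '\r\nabc\t123\nxyz'
deleted = s.translate(delete_table)
print(deleted)  # abc123xyz

# Mapping and deleting can share one table
swap_table = str.maketrans('abcxyz', 'xyzabc')
swapped = 'abc123xyz'.translate(swap_table)
print(swapped)  # xyz123abc
```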

Remove tone marks from Unicode characters

    #!/usr/bin/python3
    import sys
    import unicodedata

    s = "Zhào Qián Sūn Lǐ Zhōu Wú Zhèng Wáng"

    # ord() returns the Unicode code point of a character.
    # A value of '' or None in the table deletes the character.
    remap = {
        ord('\t'): '',
        ord('\f'): '',
        ord('\r'): None,
    }
    # Remove \t, \f, \r
    a = s.translate(remap)

    # Build a dictionary with dict.fromkeys() whose keys are the code points
    # of all Unicode combining characters and whose values are all None.
    # sys.maxunicode is the largest Unicode code point, 1114111 (0x10FFFF).
    # unicodedata.combining(chr) returns the canonical combining class of chr
    # as an integer, or 0 if no combining class is defined.
    # (This expression is easier to understand if you split it into steps.)
    cmb_chrs = dict.fromkeys(
        c for c in range(sys.maxunicode) if unicodedata.combining(chr(c))
    )

    # Normalize the input to decomposed form: base letter + combining mark
    b = unicodedata.normalize('NFD', a)

    # Call translate() to delete all combining marks
    print(b.translate(cmb_chrs))
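If building a table over the whole Unicode range is not wanted, the combining marks can also be filtered out character by character after normalization. This is an alternative sketch, not the article's method; strip_tone_marks is a hypothetical name.

```python
import unicodedata


def strip_tone_marks(text):
    """Drop combining marks after NFD decomposition (illustrative helper)."""
    # NFD splits each accented letter into base letter + combining mark
    decomposed = unicodedata.normalize('NFD', text)
    # combining() returns 0 for ordinary characters, non-zero for marks
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


plain = strip_tone_marks('Zhào Qián Sūn Lǐ')
print(plain)  # Zhao Qian Sun Li
```

The translate() version pays a one-time cost to build the table and is then fast on long inputs; the generator version has no setup cost and is easier to read for occasional use.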

That is all for this article. I hope it is helpful for your learning.
