Example code: determining character types in Python from Unicode code points

Source: Internet
Author: User
Tags: ustring

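This example shows how Python code can determine whether a Unicode character is a Chinese character, a digit, or an English letter by comparing it against code point ranges. It also converts between half-width and full-width characters and splits a string into runs of Chinese characters, letters, and digits.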
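The range checks in the listing rely on one fact: comparing two one-character strings in Python compares their code points. The short sketch below illustrates this; `in_range` is a hypothetical helper introduced only for this illustration and is not part of the original code.

```python
def in_range(ch, low, high):
    """Return True if the code point of ch lies in [low, high] (hypothetical helper)."""
    return low <= ord(ch) <= high


# Comparing characters directly gives the same answer as comparing ord() values.
for ch in ["中", "7", "q", "!"]:
    by_char = "\u4e00" <= ch <= "\u9fa5"
    by_code = in_range(ch, 0x4E00, 0x9FA5)
    print(ch, hex(ord(ch)), by_char, by_code)
```

Both columns agree for every character, which is why the functions in the listing can compare characters against literals such as `'\u4e00'` without calling `ord()`.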

Example code:

# Written for Python 3: chr() is used where the original Python 2 listing called unichr().


def is_chinese(uchar):
    """Determine whether a Unicode character is a Chinese character."""
    return '\u4e00' <= uchar <= '\u9fa5'


def is_number(uchar):
    """Determine whether a Unicode character is a digit."""
    return '\u0030' <= uchar <= '\u0039'


def is_alphabet(uchar):
    """Determine whether a Unicode character is an English letter."""
    return ('\u0041' <= uchar <= '\u005a') or ('\u0061' <= uchar <= '\u007a')


def is_other(uchar):
    """Determine whether a character is not a Chinese character, digit, or English letter."""
    return not (is_chinese(uchar) or is_number(uchar) or is_alphabet(uchar))


def B2Q(uchar):
    """Convert a half-width character to full-width."""
    inside_code = ord(uchar)
    if inside_code < 0x0020 or inside_code > 0x7e:
        # Not a half-width character: return it unchanged.
        return uchar
    if inside_code == 0x0020:
        # The space is the exception; for every other character,
        # half-width = full-width - 0xfee0.
        inside_code = 0x3000
    else:
        inside_code += 0xfee0
    return chr(inside_code)


def Q2B(uchar):
    """Convert a full-width character to half-width."""
    inside_code = ord(uchar)
    if inside_code == 0x3000:
        inside_code = 0x0020
    else:
        inside_code -= 0xfee0
    if inside_code < 0x0020 or inside_code > 0x7e:
        # The result is not a half-width character: return the original.
        return uchar
    return chr(inside_code)


def stringQ2B(ustring):
    """Convert every full-width character in a string to half-width."""
    return "".join([Q2B(uchar) for uchar in ustring])


def uniform(ustring):
    """Normalize a string: full-width to half-width, then uppercase to lowercase."""
    return stringQ2B(ustring).lower()


def string2List(ustring):
    """Split ustring into runs of Chinese characters, letters, and digits."""
    retList = []
    utmp = []
    for uchar in ustring:
        if is_other(uchar):
            if len(utmp) == 0:
                continue
            else:
                retList.append("".join(utmp))
                utmp = []
        else:
            utmp.append(uchar)
    if len(utmp) != 0:
        retList.append("".join(utmp))
    return retList
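To make the width-conversion formula (half-width = full-width - 0xfee0, with the space as a special case) and the splitting step concrete, here is a small self-contained sketch for Python 3. It is not the listing above: `to_halfwidth` and `split_runs` are hypothetical names, and the splitting uses a regular expression as a swapped-in technique that should give the same runs as the loop-based approach for the same character ranges.

```python
import re

FULLWIDTH_OFFSET = 0xFEE0  # full-width code point minus half-width code point

# Same ranges as the article: CJK U+4E00..U+9FA5, ASCII letters, ASCII digits.
RUN_PATTERN = re.compile(r"[\u4e00-\u9fa5A-Za-z0-9]+")


def to_halfwidth(text):
    """Map U+3000 and the full-width forms U+FF01..U+FF5E to half-width."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:              # ideographic space -> ordinary space
            code = 0x0020
        elif 0xFF01 <= code <= 0xFF5E:  # full-width ASCII variants
            code -= FULLWIDTH_OFFSET
        out.append(chr(code))
    return "".join(out)


def split_runs(text):
    """Return the runs of Chinese characters, letters, and digits in text."""
    return RUN_PATTERN.findall(text)


# Full-width "Python", an ideographic space, and a full-width "3".
sample = "\uff30\uff59\uff54\uff48\uff4f\uff4e\u3000\uff13"
print(to_halfwidth(sample))            # Python 3
print(split_runs("Hello,世界 2024!"))  # ['Hello', '世界', '2024']
```

For example, full-width `Ｐ` is U+FF30, and 0xFF30 - 0xFEE0 = 0x50, which is the ordinary `P`.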

Summary

The functions above classify a Unicode character as a Chinese character, a digit, or an English letter by comparing it against code point ranges. They also convert between half-width and full-width forms, normalize a string to half-width lowercase, and split a string into runs of Chinese characters, letters, and digits.
