Python example: determining character type from Unicode code points
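This example shows how Python can classify a character as a Chinese character, a digit, an English letter, or something else by comparing its Unicode code point against fixed ranges. It also converts characters between full-width and half-width forms.
Example code: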
def is_chinese(uchar):
    """Determine whether a Unicode character is a Chinese character."""
    return '\u4e00' <= uchar <= '\u9fa5'


def is_number(uchar):
    """Determine whether a Unicode character is a digit."""
    return '\u0030' <= uchar <= '\u0039'


def is_alphabet(uchar):
    """Determine whether a Unicode character is an English letter."""
    return ('\u0041' <= uchar <= '\u005a') or ('\u0061' <= uchar <= '\u007a')


def is_other(uchar):
    """Determine whether a character is not a Chinese character, digit, or English letter."""
    return not (is_chinese(uchar) or is_number(uchar) or is_alphabet(uchar))


def B2Q(uchar):
    """Convert a half-width character to full-width."""
    inside_code = ord(uchar)
    if inside_code < 0x0020 or inside_code > 0x7e:
        # Not a half-width character: return the original character.
        return uchar
    if inside_code == 0x0020:
        # The space is the exception: it maps to U+3000.
        inside_code = 0x3000
    else:
        # For all other characters: half-width = full-width - 0xfee0.
        inside_code += 0xfee0
    # chr() is the Python 3 equivalent of Python 2's unichr().
    return chr(inside_code)


def Q2B(uchar):
    """Convert a full-width character to half-width."""
    inside_code = ord(uchar)
    if inside_code == 0x3000:
        inside_code = 0x0020
    else:
        inside_code -= 0xfee0
    if inside_code < 0x0020 or inside_code > 0x7e:
        # The result is not a half-width character: return the original character.
        return uchar
    return chr(inside_code)


def stringQ2B(ustring):
    """Convert every full-width character in a string to half-width."""
    return "".join([Q2B(uchar) for uchar in ustring])


def uniform(ustring):
    """Normalize a string: full-width to half-width, then uppercase to lowercase."""
    return stringQ2B(ustring).lower()


def string2List(ustring):
    """Split ustring into runs of Chinese characters, letters, and digits."""
    retList = []
    utmp = []
    for uchar in ustring:
        if is_other(uchar):
            # Any other character ends the current run.
            if len(utmp) == 0:
                continue
            else:
                retList.append("".join(utmp))
                utmp = []
        else:
            utmp.append(uchar)
    if len(utmp) != 0:
        retList.append("".join(utmp))
    return retList
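The same two ideas (the fixed 0xfee0 offset between full-width and half-width characters, and splitting on anything that is not a Chinese character, digit, or English letter) can also be sketched more compactly in Python 3. This is only an alternative formulation, not the code above: the names FULL_TO_HALF, to_halfwidth, and split_runs are made up for illustration, and the sketch assumes that a translation table and a regular expression are acceptable substitutes for the character-by-character loops.

```python
import re

# Hypothetical names, for illustration only.
# Full-width forms U+FF01..U+FF5E sit exactly 0xFEE0 above their
# half-width counterparts U+0021..U+007E. The ideographic space
# U+3000 is the exception and maps to the ordinary space U+0020.
FULL_TO_HALF = {code: code - 0xFEE0 for code in range(0xFF01, 0xFF5F)}
FULL_TO_HALF[0x3000] = 0x0020


def to_halfwidth(text):
    """Convert full-width characters to half-width, then lowercase."""
    return text.translate(FULL_TO_HALF).lower()


# One run = consecutive Chinese characters (U+4E00..U+9FA5),
# ASCII digits, or ASCII letters; everything else is a separator.
_RUN = re.compile(r'[\u4e00-\u9fa50-9A-Za-z]+')


def split_runs(text):
    """Return the runs of Chinese characters, digits, and letters in text."""
    return _RUN.findall(text)


print(to_halfwidth("ＡＢＣ　１２３"))   # abc 123
print(split_runs("你好, world 123!"))  # ['你好', 'world', '123']
```

Building the table once and calling str.translate avoids a Python-level function call per character, which may matter for long strings. The per-character functions remain easier to adapt if the ranges need to change.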
Summary
That is all the code for this example of determining character types from Unicode in Python. I hope you find it helpful. If you notice any shortcomings, please leave a comment.