Converting Chinese characters to pinyin in Python

Source: Internet
Author: User

Reprinted from: https://www.cnblogs.com/code123-cc/p/4822886.html

Recently, while working on a Python project, I needed to convert Chinese characters into their corresponding pinyin. I found a ready-made program on GitHub.

Python Chinese characters to pinyin

Example usage:

from pinyin import PinYin

test = PinYin()
test.load_word()
print test.hanzi2pinyin(string='钓鱼岛是中国的')
print test.hanzi2pinyin_split(string='钓鱼岛是中国的', split="-")

Output:

['diao', 'yu', 'dao', 'shi', 'zhong', 'guo', 'de']
'diao-yu-dao-shi-zhong-guo-de'

The hanzi2pinyin function returns a list. The hanzi2pinyin_split function returns a list when the split argument is empty, and a string when it is not.

However, the program has two problems. First, English letters and digits are lost when the input mixes Chinese and English. Second, hanzi2pinyin_split returns a list in some cases and a string in others, which is confusing.

For example:

Test.hanzi2pinyin_split (string= ' Diaoyu Islands is China's code123 ', split= "")

The expected result is:

u'diaoyudaoshizhongguodecode123'

But the actual result is:

u'diaoyudaoshizhongguode'
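
The loss can be traced to the `[:-1]` slice in the original lookup. It is meant to strip the trailing tone digit from a reading, but a character missing from the dictionary falls back to itself, and slicing the last character off a one-character string leaves nothing. A minimal Python 3 sketch of that effect, where the reading `DIAO4` is an assumed sample of the dictionary format rather than data taken from the project:

```python
# Assumed sample reading in the dictionary's format: pinyin plus a tone digit.
reading = "DIAO4"
stripped = reading.split()[0][:-1].lower()
print(stripped)  # the tone digit is removed

# A character missing from the dictionary falls back to itself...
fallback = "c"
# ...and the same slice then removes the only character it has.
lost = fallback.split()[0][:-1].lower()
print(repr(lost))
```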

For this reason, I made the following changes to the original program.

1. Modifying the hanzi2pinyin function

The original hanzi2pinyin function:

def hanzi2pinyin(self, string=""):
    result = []
    if not isinstance(string, unicode):
        string = string.decode("utf-8")

    for char in string:
        key = '%X' % ord(char)
        result.append(self.word_dict.get(key, char).split()[0][:-1].lower())

    return result

The modified hanzi2pinyin function:

def hanzi2pinyin(self, string=""):
    result = []
    if not isinstance(string, unicode):
        string = string.decode("utf-8")

    for char in string:
        key = '%X' % ord(char)
        if not self.word_dict.get(key):
            # Not in the dictionary (for example an English letter or digit): keep it unchanged.
            result.append(char)
        else:
            result.append(self.word_dict.get(key, char).split()[0][:-1].lower())

    return result

The modified hanzi2pinyin function keeps English text from being lost when the input mixes English and Chinese.

2. Modifying the hanzi2pinyin_split function to always return a string

The original hanzi2pinyin_split function:

def hanzi2pinyin_split(self, string="", split=""):
    result = self.hanzi2pinyin(string=string)
    if split == "":
        return result
    else:
        return split.join(result)

The modified hanzi2pinyin_split function (it returns a string whether or not the split argument is empty):

def hanzi2pinyin_split(self, string="", split=""):
    result = self.hanzi2pinyin(string=string)
    # if split == "":
    #     return result
    # else:
    return split.join(result)
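
The change works because `str.join` accepts an empty separator, so the special case for `split == ""` is not needed. A small Python 3 sketch of that behavior, using a hypothetical helper named `join_syllables` and a hand-written syllable list in place of the real lookup:

```python
def join_syllables(syllables, split=""):
    """Join syllables into one string, whether or not a separator is given."""
    return split.join(syllables)


syllables = ["zhong", "guo", "c", "o", "d", "e"]
print(join_syllables(syllables))             # zhongguocode
print(join_syllables(syllables, split="-"))  # zhong-guo-c-o-d-e
```

With a single return type, callers no longer have to check whether they received a list or a string.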
