Converting Chinese to Pinyin (with Tones and Polyphonic Character Recognition)

Source: Internet
Author: User

Converting Chinese to Pinyin
----- With Tones and Polyphonic Character Recognition

1, Background

A few years ago I saw programs on the Internet that convert Chinese characters to pinyin. Most of them convert character by character, based on each character's encoding. If you search for articles on "Chinese character to pinyin", you will find many of them, and they all use basically the same algorithm. Ports exist in various languages, but they share a common drawback: they do not support polyphonic characters. The character 重, for example, is read "chong" in "Chongqing" but "zhong" in the word for "weight", and these programs cannot tell the two apart. This is a serious defect in many applications, and tone support is out of the question. (From http://sunli.cnblogs.com)

2, Solution

I looked through a lot of material. Microsoft Office Word 2003 can convert Chinese characters to pinyin with tones, and the result is much better than the encoding-based character matching found online (although it is also sometimes inaccurate; for example, the character 传 in "Water Margin" (水浒传) is converted to "chuan" instead of "zhuan"). At first I did not know how to call this feature from a program. Later I found a way to call it through an API, but it was not good enough: its biggest drawback is that it is tied to the Windows platform. I therefore decided to implement the Chinese-character-to-pinyin algorithm myself.

 

3, Chinese Character to Pinyin Algorithm Implementation

Chinese word segmentation

Recognizing polyphonic characters requires word segmentation. Once the text is split into words, the pinyin of each whole word can be looked up, which resolves both the polyphonic readings and the tones. Source code and algorithms for Chinese word segmentation can be found on the Internet, so they are not described in detail here.
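The article does not say which segmentation algorithm it uses. As a rough illustration only, the sketch below uses forward maximum matching, one of the simplest dictionary-based approaches. The names `segment`, `VOCABULARY`, and `max_word_len`, and the tiny vocabulary itself, are assumptions made for this example.

```python
def segment(text, vocabulary, max_word_len=4):
    """Split text into words by forward maximum matching (illustrative only)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a known word matches.
        for size in range(min(max_word_len, len(text) - i), 1, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary:
                words.append(candidate)
                i += size
                break
        else:
            # No multi-character word starts here: emit a single character.
            words.append(text[i])
            i += 1
    return words


# Hypothetical toy vocabulary; a real one would hold many thousands of words.
VOCABULARY = {"重庆", "重量", "的确"}

print(segment("重庆的重量", VOCABULARY))  # ['重庆', '的', '重量']
```

A production segmenter would also need to handle ambiguous splits, which greedy matching does not attempt.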

Pinyin

Build a lookup database that maps Chinese words to their pinyin. Each word maps to one pinyin string, so the reading of a word can be looked up directly. One more step is needed: segmentation also leaves many single characters that are not part of any word, so a pinyin table for single characters is required as well. This table differs from the ones commonly found on the Internet: it records the reading a character has when it stands alone (not combined into a word). For example, the pinyin of 的 ("of") is "de" rather than "di", because "di" is its reading only in the word 的确 ("indeed").
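The two tables can be pictured as two dictionaries, with the word table consulted first. This is a minimal sketch with a handful of hand-picked entries; the names `WORD_PINYIN`, `CHAR_PINYIN`, and `lookup` are hypothetical, and the article's actual database layout is not known.

```python
# Word-level table: the reading of each character is fixed by the word.
WORD_PINYIN = {
    "重庆": ["chóng", "qìng"],
    "重量": ["zhòng", "liàng"],
    "的确": ["dí", "què"],
}

# Single-character table: the reading a character has when it stands alone.
CHAR_PINYIN = {
    "重": "zhòng",
    "的": "de",
    "确": "què",
}


def lookup(token):
    """Return the pinyin syllables for one segmented token."""
    if token in WORD_PINYIN:
        return list(WORD_PINYIN[token])
    # Fall back to per-character readings; unknown characters pass through.
    return [CHAR_PINYIN.get(ch, ch) for ch in token]


print(lookup("的"))    # ['de']
print(lookup("的确"))  # ['dí', 'què']
```

Keeping the standalone reading in the character table is what lets 的 come out as "de" on its own while 的确 still yields "dí què".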

Pinyin combination

Match each token produced by word segmentation against the lookup tables to get pinyin with tones. With a small amount of extra processing you can also get pinyin without tones.
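The combination step and the "small amount of processing" might look like the sketch below. It assumes the text has already been segmented and that tones are stored as diacritics; `to_pinyin`, `strip_tones`, and the sample tables are hypothetical names and data for this example. Tone removal here relies on Unicode decomposition from Python's standard `unicodedata` module and removes only the four tone marks, so the dots on "ü" are preserved.

```python
import unicodedata

# Combining macron, acute, caron, and grave: the four pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin):
    """Remove tone marks but keep other diacritics such as the dots on 'ü'."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def to_pinyin(tokens, word_table, char_table):
    """Join the pinyin of already-segmented tokens, preferring word readings."""
    syllables = []
    for token in tokens:
        if token in word_table:
            syllables.extend(word_table[token])
        else:
            syllables.extend(char_table.get(ch, ch) for ch in token)
    return " ".join(syllables)


# Hypothetical sample data for illustration.
WORDS = {"重庆": ["chóng", "qìng"], "重量": ["zhòng", "liàng"]}
CHARS = {"的": "de", "重": "zhòng"}

toned = to_pinyin(["重庆", "的", "重量"], WORDS, CHARS)
print(toned)               # chóng qìng de zhòng liàng
print(strip_tones(toned))  # chong qing de zhong liang
```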

4, Summary

The keys to converting Chinese characters to pinyin are word segmentation and the word-to-pinyin lookup table that is applied to the segmented words.

Converting Chinese characters to pinyin is useful in many situations, such as:

1. Pinyin suggestions in the drop-down box of a search field

Demo: http://www.google.cn

2. Intelligent error correction for input. For example, if you enter a wrongly written homophone of "Liu Dehua" (Andy Lau), the system suggests the correct name.
Entering the pinyin "liudehua" also suggests "Liu Dehua" (Andy Lau).

3. Search by pinyin
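To make these use cases more concrete, here is a minimal sketch of pinyin-based suggestion. It assumes the toneless pinyin of each entry has already been produced by the converter; `build_index`, `suggest`, and the sample entries are hypothetical and are not taken from the demo.

```python
def build_index(entries):
    """Map each space-free, lowercase pinyin key to the names that share it."""
    index = {}
    for name, pinyin in entries:
        key = pinyin.replace(" ", "").lower()
        index.setdefault(key, []).append(name)
    return index


def suggest(query, index):
    """Return every name whose pinyin key starts with the typed query."""
    key = query.replace(" ", "").lower()
    return [
        name
        for pinyin_key, names in sorted(index.items())
        if pinyin_key.startswith(key)
        for name in names
    ]


# Hypothetical entries: (name, toneless pinyin from the converter).
ENTRIES = [("刘德华", "liu de hua"), ("重庆", "chong qing")]
INDEX = build_index(ENTRIES)

print(suggest("Liudehua", INDEX))  # ['刘德华']
print(suggest("chong", INDEX))     # ['重庆']
```

Because homophones share the same key, a misspelled name that sounds the same would land on the same index entry, which is the idea behind the error-correction case. A linear scan is fine for a sketch; a real search box would more likely use a sorted structure or a trie.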

5, Demo

http://dev.oswind.com/pinyin/
(The server is a PC on an ADSL connection)
