Recently I wanted HanLP to support pinyin conversion and Simplified/Traditional Chinese conversion, so I studied a few open-source Java implementations, then optimized and integrated them.

stuxuhai/jpinyin

Principle: This is the most-starred of these on GitHub. Its main principle is to use a hash table to look up the pinyin of each character one by one. While scanning, it also combines the current Chinese character with the following 3, 2, and 1 characters in turn to check whether they form a polyphone phrase. In other words, it supports polyphone correction for phrases of up to 4 characters. The sequential scanning and combining give the complexity a rather high constant factor (approximately O(4n)). Multiplied by the cost of each hash table lookup, this does not feel like a very efficient implementation.

Dictionary format: jpinyin has a total of 3 tables, ...
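The scanning scheme described above can be illustrated with a minimal sketch. This is not jpinyin's actual code: the class name `PinyinScanSketch`, the two toy dictionaries, and the choice to consume a whole phrase once it matches are all assumptions made for illustration. It also assumes every character fits in a single Java `char`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PinyinScanSketch {
    // Hypothetical toy dictionaries; a real library loads much larger tables.
    private static final Map<String, String> CHAR_PINYIN = new HashMap<>();
    private static final Map<String, String[]> PHRASE_PINYIN = new HashMap<>();
    // Longest polyphone phrase considered: current character plus 3 following.
    private static final int MAX_PHRASE_LEN = 4;

    static {
        CHAR_PINYIN.put("重", "zhong");
        CHAR_PINYIN.put("庆", "qing");
        CHAR_PINYIN.put("很", "hen");
        CHAR_PINYIN.put("要", "yao");
        // Polyphone phrase: 重 is read "chong" here instead of "zhong".
        PHRASE_PINYIN.put("重庆", new String[] {"chong", "qing"});
    }

    public static List<String> convert(String text) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            boolean matched = false;
            // Try the current character with the following 3, 2, then 1 characters.
            int maxLen = Math.min(MAX_PHRASE_LEN, text.length() - i);
            for (int len = maxLen; len >= 2; len--) {
                String[] phrase = PHRASE_PINYIN.get(text.substring(i, i + len));
                if (phrase != null) {
                    for (String p : phrase) {
                        result.add(p);
                    }
                    i += len; // assumption: skip past the whole matched phrase
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                // Fall back to the single-character table; keep unknown characters as-is.
                String ch = text.substring(i, i + 1);
                result.add(CHAR_PINYIN.getOrDefault(ch, ch));
                i++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(convert("重庆很重要")); // prints [chong, qing, hen, zhong, yao]
    }
}
```

In the worst case each position costs three phrase lookups plus one single-character lookup, which is where the constant factor of roughly 4 comes from.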
Continue reading: 码农场 » Java implementation of Chinese character to pinyin conversion and Simplified/Traditional Chinese conversion
original link : http://www.hankcs.com/nlp/java-chinese-characters-to-pinyin-and-simplified-conversion-realization.html