the maximum-length candidate string, and match that candidate against the word entries in the dictionary. If no match is found, the candidate is shortened by one Chinese character and matched again, until a corresponding word is found in the dictionary. The matching direction is from left to right.
Backward Maximum Matching method (BMM): the matching direction is opposite to that of the MM method, from right to left. Experiments show that the backward maximum matching method is more effective than the forward maximum matching method for Chinese.
Bi-directio
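The forward and backward matching procedures described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the `max_len` limit, and the toy dictionary are assumptions made for the example and do not come from any particular library.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Scan left to right, always taking the longest dictionary word."""
    words = []
    i = 0
    while i < len(text):
        # Start with the longest candidate and shorten it from the right.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is accepted as a fallback (assumption).
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def backward_max_match(text, dictionary, max_len=4):
    """Scan right to left, always taking the longest dictionary word."""
    words = []
    j = len(text)
    while j > 0:
        # Start with the longest candidate and shorten it from the left.
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                j -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


# Toy dictionary, made up for this illustration.
TOY_DICT = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", TOY_DICT))   # ['研究生', '命', '起源']
print(backward_max_match("研究生命起源", TOY_DICT))  # ['研究', '生命', '起源']
```

With this toy dictionary the forward scan greedily takes 研究生 and leaves a stray single character, while the backward scan recovers the intended reading. This is one small example of the kind of case in which backward matching does better.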
, the original string can be divided into smaller strings that are then segmented mechanically, thereby reducing the matching error rate. Another method is to combine word segmentation with part-of-speech tagging: the rich part-of-speech information helps the segmentation decisions, and the segmentation results are in turn checked and adjusted during tagging, which greatly improves segmentation accuracy. PHPAnalysis first performs a rough segmentation of the text to be segmented, and then, for the short se
, using these words as breakpoints, you can divide the original string into smaller strings and then perform mechanical word segmentation on them, which reduces the matching error rate. Another method is to combine word segmentation with part-of-speech tagging and use the rich part-of-speech information to help word segmentation decisions. The word segmentation results are in turn verified and adjusted during the tagging process, which greatly improves segmentation accuracy. PHPAnalysis performs rough segmenta
), in fewer than 1% of sentences either the forward and backward maximum matching methods give the same segmentation but it is wrong, or the two methods give different segmentations and neither is correct (ambiguity detection fails). This is why the bidirectional maximum matching method is widely used in practical Chinese processing systems.
1.3 Segmentation mark method
Collect the segmentation marks, in the automatic
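The ambiguity detection provided by bidirectional maximum matching, as discussed above, amounts to running both directions and comparing the results. The sketch below is a hedged illustration, not a reference implementation: the function names, `max_len`, the toy dictionary, and in particular the tie-breaking rule (fewer words, then fewer single-character words, then the backward result) are assumptions chosen for the example.

```python
def max_match(text, dictionary, max_len=4, backward=False):
    """Greedy longest-word matching in either direction."""
    words = []
    remaining = text
    while remaining:
        for size in range(min(max_len, len(remaining)), 0, -1):
            candidate = remaining[-size:] if backward else remaining[:size]
            # A single character is accepted as a fallback (assumption).
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                remaining = remaining[:-size] if backward else remaining[size:]
                break
    return words[::-1] if backward else words


def bidirectional_max_match(text, dictionary, max_len=4):
    """Return (segmentation, ambiguous) after running both directions."""
    forward = max_match(text, dictionary, max_len)
    backward = max_match(text, dictionary, max_len, backward=True)
    if forward == backward:
        return forward, False  # both directions agree: no ambiguity detected

    # Assumed tie-break: fewer words, then fewer single-character words,
    # and the backward result when the two are otherwise equal.
    def cost(words):
        return (len(words), sum(len(w) == 1 for w in words))

    best = forward if cost(forward) < cost(backward) else backward
    return best, True


# Toy dictionary, made up for this illustration.
TOY_DICT = {"研究", "研究生", "生命", "起源"}
print(bidirectional_max_match("研究生命起源", TOY_DICT))
print(bidirectional_max_match("生命起源", TOY_DICT))
```

When the two directions disagree, the flag signals that the sentence needs further disambiguation. When they agree, the result is accepted as is, which is exactly the case that fails silently when both directions are wrong in the same way.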
making, and the word segmentation results are in turn checked and adjusted during tagging, which greatly improves segmentation accuracy.
PHPAnalysis first performs a rough segmentation of the text to be segmented, then applies a second pass of reverse maximum matching (RMM) to the resulting short sentences, optimizes the segmentation result, and only then produces the final word segmentation.
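The two-stage idea described above, a rough split into short sentences followed by reverse maximum matching on each piece, can be sketched as follows. This is not PHPAnalysis code: the delimiter set, the function names, `max_len`, and the toy dictionary are all hypothetical, and the final optimization step is omitted because the text does not describe it.

```python
import re

# Hypothetical delimiter set for the rough pass; not taken from PHPAnalysis.
DELIMITERS = r"[，。！？；、,.!?;\s]+"


def rough_split(text):
    """Cut the text into short sentences at punctuation and whitespace."""
    return [piece for piece in re.split(DELIMITERS, text) if piece]


def reverse_max_match(text, dictionary, max_len=4):
    """Scan right to left, always taking the longest dictionary word."""
    words = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            # A single character is accepted as a fallback (assumption).
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                j -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


def segment(text, dictionary, max_len=4):
    """Rough split first, then RMM on each short sentence."""
    words = []
    for sentence in rough_split(text):
        words.extend(reverse_max_match(sentence, dictionary, max_len))
    return words


# Toy dictionary, made up for this illustration.
TOY_DICT = {"研究", "研究生", "生命", "起源"}
print(segment("研究生命起源，生命起源。", TOY_DICT))
```

Splitting at punctuation first keeps each matching pass short and prevents a candidate word from spanning a sentence boundary.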