Http://blog.csdn.net/wen718/article/details/5960666
Three indicators are commonly used to evaluate Chinese word segmentation performance: precision (P), recall (R), and the combined F-measure (F). Precision is the proportion of words in the system's segmentation output that are correct. Recall is the proportion of words in the reference (gold-standard) segmentation that the system segments correctly. In other words, precision describes how much of what the system produced is right, while recall describes how much of what it should have produced was actually found. The formulas are as follows:
P = number of correctly segmented words / total number of words in the system output
R = number of correctly segmented words / total number of words in the reference segmentation
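The two ratios above can be computed by comparing word boundaries. The following is a minimal sketch, not a standard scoring tool: it assumes both segmentations cover the same character string, and it counts a system word as correct only when its start and end offsets both match a reference word. The function names `to_spans` and `precision_recall` and the example sentence are illustrative assumptions.

```python
def to_spans(words):
    """Convert a list of words into a set of (start, end) character offsets."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def precision_recall(gold_words, system_words):
    """Return (P, R) for a system segmentation against a reference one.

    Assumption: both word lists concatenate to the same character string.
    """
    gold = to_spans(gold_words)
    system = to_spans(system_words)
    correct = len(gold & system)  # words whose boundaries match exactly
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: the system merges two reference words into one.
gold = ["我", "喜欢", "自然", "语言", "处理"]
system = ["我", "喜欢", "自然语言", "处理"]
p, r = precision_recall(gold, system)
print(p, r)  # 0.75 0.6
```

Here three of the four system words match the reference exactly, so P = 3/4, and three of the five reference words are recovered, so R = 3/5.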
When actually evaluating a system, P and R must be considered together, but comparing two values at once makes it hard to see at a glance which system is better. A single value that combines the two is therefore often used. The F-measure is one such combined indicator. It is calculated as follows:
F = (β² + 1) × P × R / (β² × P + R)
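Here, β determines whether the emphasis falls on P or on R; it is usually set to 1, 2, or 1/2. When β is 1, P and R are treated as equally important, and the formula reduces to F = 2 × P × R / (P + R). A value of β greater than 1 gives more weight to R, and a value less than 1 gives more weight to P.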
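As a small sketch of how β shifts the balance, the function below evaluates the F formula for the three common settings. The function name `f_measure` and the input values P = 0.75 and R = 0.6 are illustrative assumptions, not results from any real system.

```python
def f_measure(precision, recall, beta=1.0):
    """F = (beta^2 + 1) * P * R / (beta^2 * P + R); returns 0.0 if undefined."""
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0
    return (beta_sq + 1) * precision * recall / denominator


# Illustrative values only.
p, r = 0.75, 0.6
print(round(f_measure(p, r, beta=1), 3))    # 0.667
print(round(f_measure(p, r, beta=2), 3))    # 0.625
print(round(f_measure(p, r, beta=0.5), 3))  # 0.714
```

With β = 2 the score moves toward the recall value (0.6), and with β = 1/2 it moves toward the precision value (0.75).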
Excerpted from "Word-Based Word Location Tagging Chinese Word Segmentation".