Beginner Information Retrieval 5: Precision, Recall, and Search Engine Evaluation

Source: Internet
Author: User

This article briefly introduces how search engines are evaluated. From the user's perspective, the best measure of search performance is the number of documents a user has to browse before finding a satisfactory one. In practice, however, both queries and documents vary endlessly, so this measure is not feasible. Instead, the following concepts were proposed and an evaluation standard was built on them.

There are three common concepts: precision, accuracy, and recall.

Precision (P for short) is defined as: P = number of relevant documents in the returned results / number of returned results.

Accuracy (A for short) is defined as: A = number of documents judged correctly / number of all documents.

Recall (R for short) is defined as: R = number of relevant documents in the returned results / number of all relevant documents.

                                                  Actually relevant    Actually irrelevant
Returned (search engine considers relevant)       TP                   FP
Not returned (search engine considers irrelevant) FN                   TN

Based on the definitions of precision, accuracy, and recall:

P = TP/(TP + FP)

A = (TP + TN)/(TP + FP + FN + TN)

R = TP/(TP + FN)
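
The three formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `evaluate`, the example counts, and the choice to report an undefined ratio as 0.0 are all assumptions made for the example.

```python
def evaluate(tp, fp, fn, tn):
    """Return (precision, accuracy, recall) from the four table counts."""
    returned = tp + fp         # documents the engine returned
    relevant = tp + fn         # documents that are actually relevant
    total = tp + fp + fn + tn  # every document in the collection

    # Assumption of this sketch: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / returned if returned else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / relevant if relevant else 0.0
    return precision, accuracy, recall


# Hypothetical counts: 8 relevant documents returned, 2 irrelevant returned,
# 12 relevant documents missed, 978 irrelevant documents correctly left out.
p, a, r = evaluate(tp=8, fp=2, fn=12, tn=978)
print(f"P = {p:.3f}, A = {a:.3f}, R = {r:.3f}")  # P = 0.800, A = 0.986, R = 0.400
```

Note that accuracy is high here even though the engine missed more than half of the relevant documents, because TN dominates the total.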

Precision and accuracy are two different concepts.

Why not evaluate a search engine by accuracy? Accuracy is used to evaluate binary classifiers, where it works well, but it is unsuitable for evaluating search. For a binary classifier, accuracy measures the fraction of items classified correctly, whereas search evaluation measures how much of what the user wants is delivered. In a typical document collection, more than 99% of the documents are irrelevant to a given query. A system that simply judges every document irrelevant therefore achieves an accuracy above 99% while returning none of the roughly 1% of documents the user actually wants. For this reason accuracy cannot be used.

Why not use recall alone? That does not work either. A system that simply returns every document achieves 100% recall, yet only about 1% of what it returns is what the user wants. Precision and recall are therefore used together to evaluate a search engine. The best-known measure of this kind is the 11-point precision-recall (P-R) curve.
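
The two degenerate strategies described above (return nothing, return everything) can be checked numerically. The sketch below assumes a hypothetical collection of 1000 documents of which 10 (1%) are relevant to the query. The helper `metrics` and both constants are illustrative names, and an undefined precision is reported as 0.0 by assumption.

```python
TOTAL_DOCS = 1000  # hypothetical collection size
RELEVANT = 10      # hypothetical: 1% of the collection is relevant


def metrics(tp, fp, fn, tn):
    """Return (precision, accuracy, recall); undefined ratios become 0.0 (an assumption)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, accuracy, recall


# Strategy 1: judge every document irrelevant and return nothing.
return_nothing = metrics(tp=0, fp=0, fn=RELEVANT, tn=TOTAL_DOCS - RELEVANT)

# Strategy 2: return the whole collection.
return_everything = metrics(tp=RELEVANT, fp=TOTAL_DOCS - RELEVANT, fn=0, tn=0)

for name, (p, a, r) in [("return nothing", return_nothing),
                        ("return everything", return_everything)]:
    print(f"{name:18s} P = {p:.2f}  A = {a:.2f}  R = {r:.2f}")
```

Under these assumed counts, the first strategy scores 99% accuracy with zero recall, and the second scores 100% recall with 1% precision, which is why neither number is meaningful on its own.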

The curve shows the precision of the results as recall ranges from 0% to 100%. It is used to compare search engines: for example, a system ir2 whose curve lies below that of ir1 performs worse than ir1. How is this curve obtained?

To obtain it, standard test sets have been built, each containing a fixed set of documents, a set of queries, and relevance judgments between the queries and the documents.

Test-set = <D, Q, R<q, d>>, where Test-set is the test set, D is the sample document set, Q is the sample query set, and R<q, d> is the relevance judgment between each query q and each document d, which must be made manually in advance.

The system then processes each sample query and, based on its retrieval model, returns a ranked list of documents. Because the relevance between each query and each document is known in advance, one can go through the returned documents in ranked order and compute the precision at different recall levels. This yields the curve.
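
The procedure just described can be sketched for a single query as follows. The article does not say how precision is read off at the 11 recall levels (0%, 10%, ..., 100%), so this sketch assumes the commonly used interpolation rule: the precision at recall level r is the highest precision observed at any recall greater than or equal to r. The function name, document ids, and relevance judgments are all hypothetical.

```python
def eleven_point_curve(ranking, relevant):
    """ranking: doc ids in returned order; relevant: set of ids judged relevant."""
    # Walk the ranking; at each relevant hit record (recall, precision) so far.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))

    # Assumed interpolation rule: best precision at any recall >= the level.
    curve = []
    for i in range(11):
        level = i / 10
        candidates = [p for r, p in points if r >= level - 1e-12]
        curve.append((level, max(candidates, default=0.0)))
    return curve


# Hypothetical ranked result list and manual relevance judgments for one query.
ranking = ["d3", "d7", "d1", "d9", "d4", "d2", "d8", "d5", "d6", "d10"]
relevant = {"d3", "d1", "d4", "d5", "d10"}

curve = eleven_point_curve(ranking, relevant)
for level, precision in curve:
    print(f"recall {level:.1f}  precision {precision:.3f}")
```

To compare systems over a whole test set, the per-query curves are typically averaged at each of the 11 recall levels across all queries in Q.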

In practice, a search engine is judged not only by precision and recall but also by its response latency and the friendliness of its interface. These indicators all reflect the user's perspective. System-side indicators, such as the cost of building and updating the index, measure system performance instead. They are not covered in this article.
