Precision and recall

Source: Internet
Author: User

Many concepts explained on the Internet are not easy to understand. Take the basic concepts of "recall" and "precision": each time I read about them I think I understand, but when I actually need to use them I still have to stop and think it through. So I gathered some material and sorted out these two concepts, hoping to become more fluent with them.

Recall and precision are two important concepts and metrics in the design of search engines (and other retrieval systems).
Recall: also known as the "completeness rate".
Precision: sometimes also translated as "accuracy" or "correctness rate".

When retrieving documents from a large data set, you can divide all the documents in the collection into four groups:

                 Relevant    Irrelevant
Retrieved           A            B
Not retrieved       C            D

A: retrieved and relevant (found, and actually wanted)
B: retrieved but irrelevant (found, but useless)
C: not retrieved but relevant (not found, but actually wanted)
D: not retrieved and irrelevant (not found, and not wanted anyway)
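
The four groups can be sketched with Python sets. This is only an illustration: the document IDs, the `relevant` set, and the `retrieved` set below are made-up values, not data from any real system.

```python
# Hypothetical document IDs; none of these values come from a real system.
all_docs = set(range(1, 11))          # the whole collection: IDs 1..10
relevant = {1, 2, 3, 4}               # documents the user actually wants
retrieved = {1, 2, 5, 6, 7}           # documents the system returned

A = retrieved & relevant              # retrieved and relevant
B = retrieved - relevant              # retrieved but irrelevant
C = relevant - retrieved              # relevant but not retrieved
D = all_docs - retrieved - relevant   # neither retrieved nor relevant

print(sorted(A), sorted(B), sorted(C), sorted(D))
```

Every document falls into exactly one of the four groups, so their sizes always add up to the size of the collection.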

We usually hope that as many of the relevant documents in the database as possible are retrieved. This is the pursuit of completeness, that is, recall = A/(A + C); the larger the value, the better.
At the same time, we hope that the retrieved documents contain as many relevant ones and as few irrelevant ones as possible. This is the pursuit of precision, that is, A/(A + B); the larger the value, the better.
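
As a minimal sketch, the two ratios can be computed directly from the counts A, B, and C defined earlier. The function name `precision_recall` and the example counts are assumptions made for illustration, and returning 0.0 when a denominator is zero is a convention chosen here, not something the text specifies.

```python
from typing import Tuple


def precision_recall(a: int, b: int, c: int) -> Tuple[float, float]:
    """Return (precision, recall) from the group sizes.

    a: retrieved and relevant
    b: retrieved but irrelevant
    c: relevant but not retrieved
    """
    # Guard against division by zero; 0.0 is a convention for this sketch.
    precision = a / (a + b) if (a + b) else 0.0
    recall = a / (a + c) if (a + c) else 0.0
    return precision, recall


# Made-up counts: 30 relevant hits, 10 irrelevant hits, 20 relevant misses.
p, r = precision_recall(a=30, b=10, c=20)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

Note that D, the documents that are neither retrieved nor relevant, does not appear in either ratio.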

Summarized as follows:
Recall: the retrieved relevant documents as a fraction of all relevant documents in the database.
Precision: the retrieved relevant documents as a fraction of all retrieved documents.

Although the formulas above show no necessary relationship between recall and precision, in large data sets the two metrics constrain each other.
Because no search strategy is perfect, relaxing the strategy in order to retrieve more relevant documents usually lets in some irrelevant results as well, which lowers precision.
Conversely, tightening the search strategy in order to remove irrelevant documents from the results causes some relevant documents to be missed, which lowers recall.
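
The trade-off can be illustrated with a small sketch in which the search strategy is modeled as a score threshold: a document is retrieved when its score reaches the threshold. The scores, labels, thresholds, and the helper `evaluate` are all hypothetical, chosen only to show the effect.

```python
# Hypothetical (score, is_relevant) pairs; all values are made up.
scored_docs = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.50, True), (0.40, False), (0.30, False),
]


def evaluate(threshold):
    """Retrieve every document scoring >= threshold; return (precision, recall)."""
    retrieved = [label for score, label in scored_docs if score >= threshold]
    a = sum(retrieved)                                  # retrieved and relevant
    total_relevant = sum(label for _, label in scored_docs)
    precision = a / len(retrieved) if retrieved else 0.0
    recall = a / total_relevant if total_relevant else 0.0
    return precision, recall


strict = evaluate(0.85)   # strict strategy: only the top-scoring documents
loose = evaluate(0.45)    # relaxed strategy: many more documents retrieved

print("strict: precision=%.2f recall=%.2f" % strict)  # precision=1.00 recall=0.50
print("loose:  precision=%.2f recall=%.2f" % loose)   # precision=0.67 recall=1.00
```

With this toy data, the strict threshold returns only relevant documents but misses half of them, while the relaxed threshold finds all relevant documents at the cost of admitting irrelevant ones.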

Any retrieval or selection over a large data set involves both recall and precision. Because the two metrics constrain each other, we usually tune the search strategy to a level appropriate for our needs, neither too strict nor too loose, seeking a balance between recall and precision. Where that balance lies depends on the specific requirements.

In fact, precision is the easier of the two to understand; with "recall" it is often hard to grasp the meaning quickly. I think this has to do with the wording: the meaning cannot be read directly off the word "recall".
I think the Chinese term for "recall rate" is not a good translation. In Chinese, the word chosen for "recall" means to call something back. For example, if Sony batteries have a problem, the manufacturer will recall them.
Since the translation is not good, let's look back at the English word "recall" behind "recall rate". Besides the meaning "to order something to return" mentioned above, it also means "to remember".

Recall: the ability to remember something that you have learned or something that has happened in the past.

This is the sense intended here, and with it the meaning of "recall rate" is easier to understand.
When we query a system for all the details of an event (the input query), recall refers to how many of those details the system can "recall"; in other words, it is the system's "ability to remember". The number of details it can recall, divided by all the details the system knows about the event, is the "memory rate", that is, the recall rate.

Understood this way, it is much easier.

The concepts apply in a similar way to detection in computer vision.
