# Precision and recall

**Precision and recall** are two measures used to evaluate the performance of a classification algorithm. Precision is the proportion of true positives among all predicted positives, so it shows how reliable the algorithm's positive predictions are. Recall is the proportion of true positives among all actual positives, so it shows how completely the algorithm captures the positives. In management, precision and recall can be used to assess the effectiveness of a model that predicts customer behavior, customer satisfaction, or other metrics. A model with high precision raises few false alarms, while a model with high recall misses few of the cases it is meant to find.

## Example of precision and recall

**Example 1**: A search engine uses precision and recall to assess the relevance of its search results. Precision is the percentage of returned results that are relevant, while recall is the percentage of all relevant results that were returned. For example, if a user searches for "shoe stores in Boston," a high-precision result list contains almost only shoe stores in Boston, while a high-recall result list includes nearly all shoe stores in Boston, even if some irrelevant results are returned along with them.

**Example 2**: A marketing team uses precision and recall to analyze the effectiveness of its campaigns. Precision measures the accuracy of the team's targeting efforts: the proportion of contacted customers who were actually interested in the product. Recall, on the other hand, measures the ability of the campaign to capture all potential customers: the proportion of all interested customers whom the team actually reached.

**Example 3**: An AI system for autonomous vehicles uses precision and recall to measure the accuracy and safety of its hazard detection. Precision is the proportion of the hazards flagged by the system that were real hazards. Recall is the proportion of real hazards that the system detected.
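To make the search engine example concrete, the following worked calculation uses hypothetical numbers that are assumptions for illustration only: the engine returns 20 results, 15 of them are relevant, and 30 relevant pages exist in total.

$$\textbf{Precision} = \frac{15}{20} = 0.75 \qquad \textbf{Recall} = \frac{15}{30} = 0.50$$

Under these assumed numbers, three out of four returned results are useful, but the engine finds only half of what it could have found.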

## Formula of precision and recall

Precision is defined as the fraction of correctly identified positive predictions out of all positive predictions made. Mathematically, it is expressed as the following:

$$\textbf{Precision} = \frac{TP}{TP + FP}$$

Where TP stands for true positives and FP stands for false positives. True positives are the cases where the model correctly predicted a positive outcome, while false positives are the cases where the model incorrectly predicted a positive outcome.

Recall, on the other hand, is defined as the fraction of actual positive cases that the model correctly identified. Mathematically, it is expressed as the following:

$$\textbf{Recall} = \frac{TP}{TP + FN}$$

Where FN stands for false negatives. False negatives are the cases where the model incorrectly predicted a negative outcome, while true positives are the cases where the model correctly predicted a positive outcome.

Together, these two measures show how trustworthy the model's positive predictions are (precision) and how many of the actual positive cases it finds (recall).
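The two formulas above can be sketched in a few lines of Python. The function names, the example counts, and the choice to return 0.0 when a denominator is zero are assumptions made for this illustration, not part of the definitions.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP)."""
    predicted_positives = tp + fp
    # Assumption: report 0.0 when the model made no positive predictions.
    return tp / predicted_positives if predicted_positives else 0.0


def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN)."""
    actual_positives = tp + fn
    # Assumption: report 0.0 when the data contains no actual positives.
    return tp / actual_positives if actual_positives else 0.0


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
tp, fp, fn = 8, 2, 8
print(precision(tp, fp))  # 0.8
print(recall(tp, fn))     # 0.5
```

With these counts the model is right about most of the cases it flags, yet it finds only half of the actual positives.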

## When to use precision and recall

Precision and recall are most useful when the positive class is the one that matters and the costs of false positives and false negatives differ. Typical applications include:

- **Fraud detection**: precision and recall can be used to assess the effectiveness of a model in identifying fraudulent transactions or activities.
- **Customer segmentation**: precision and recall can be used to check how reliably a model assigns customers to a target segment, which helps to direct marketing efforts more effectively.
- **Text classification**: precision and recall can be used to evaluate text classification algorithms, such as sentiment analysis.
- **Image recognition**: precision and recall can be used to evaluate image recognition algorithms, such as facial recognition.
- **Recommendation systems**: precision and recall can be used to evaluate recommendation systems, such as product recommendations.

## Types of precision and recall

Precision and recall can be computed in several ways, depending on the number of classes and on how the per-class results are combined:

- **Binary Precision and Recall**: Binary precision and recall measure the accuracy and completeness of a model’s predictions when the outcome is either positive or negative. Precision indicates how many of the model’s positive predictions were actually correct, while recall indicates how many of the actual positives were captured.
- **Multiclass Precision and Recall**: Multiclass precision and recall are used when the outcome is one of multiple classes. Each class is treated in turn as the positive class: precision indicates how many of the model’s predictions for that class were correct, while recall indicates how many of the actual members of that class were captured.
- **Micro Precision and Recall**: Micro-averaging combines the results of all classes by summing the true positives, false positives, and false negatives over all classes and then applying the formulas once to these totals. Every sample counts equally, so large classes dominate the result.
- **Macro Precision and Recall**: Macro-averaging computes precision and recall separately for each class and then takes the unweighted average. Every class counts equally, regardless of its size. If the per-class values are instead weighted by the number of samples in each class, the result is called a weighted average.
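The difference between micro- and macro-averaging can be illustrated with a minimal sketch. The helper names and the ten-item label lists below are hypothetical and chosen only so that the two averages differ; only precision is shown, and recall follows the same pattern with FN in place of FP.

```python
def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)}, treating each class in turn as positive."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        counts[label] = (tp, fp, fn)
    return counts


def macro_precision(counts):
    """Unweighted mean of the per-class precision values."""
    per_class = [tp / (tp + fp) if tp + fp else 0.0
                 for tp, fp, _ in counts.values()]
    return sum(per_class) / len(per_class)


def micro_precision(counts):
    """Precision computed once from counts summed over all classes."""
    tp_sum = sum(tp for tp, _, _ in counts.values())
    fp_sum = sum(fp for _, fp, _ in counts.values())
    return tp_sum / (tp_sum + fp_sum) if tp_sum + fp_sum else 0.0


# Hypothetical data: class "a" is common, class "b" is rare.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 7 + ["b"] + ["b", "a"]

counts = per_class_counts(y_true, y_pred)
print(counts)                   # {'a': (7, 1, 1), 'b': (1, 1, 1)}
print(macro_precision(counts))  # 0.6875
print(micro_precision(counts))  # 0.8
```

In this example the macro average is lower because the weak result on the rare class "b" counts as much as the strong result on the common class "a".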

## Steps of precision and recall

Precision and recall are calculated by comparing the algorithm's predictions with the actual outcomes. The steps are as follows:

1. **Count true positives (TP)**: the number of items that are correctly classified as positive by the algorithm.
2. **Count false positives (FP)**: the number of items that are incorrectly classified as positive by the algorithm.
3. **Count true negatives (TN)**: the number of items that are correctly classified as negative by the algorithm. This count is not needed for precision or recall, but it completes the confusion matrix.
4. **Count false negatives (FN)**: the number of items that are incorrectly classified as negative by the algorithm.
5. **Calculate precision**: the ratio of true positives to the total number of predicted positives, which is TP/(TP + FP).
6. **Calculate recall**: the ratio of true positives to the total number of actual positives, which is TP/(TP + FN).
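The steps above can be sketched as a short Python example for the binary case. The function name, the `positive` parameter, and the label lists are assumptions introduced for this illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN by comparing each prediction with the actual label."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


# Hypothetical labels: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, fp, tn, fn)  # 3 1 5 1
print(precision)       # 0.75
print(recall)          # 0.75
```

This sketch assumes that at least one positive prediction and one actual positive exist; otherwise the divisions would need a guard against a zero denominator.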

## Advantages of precision and recall

Precision and recall show how reliably a classification algorithm predicts positive outcomes and how much of the relevant data it captures. Specifically, the advantages of precision and recall are:

- Precision measures the accuracy of the algorithm's positive predictions, showing how often a positive prediction can be trusted, which overall accuracy alone does not reveal.
- Recall measures the proportion of true positives out of all actual positives, allowing users to identify areas where the algorithm is missing important data and make adjustments accordingly.
- Precision and recall enable users to make informed decisions about their algorithms, as together they provide a more comprehensive measure of performance than accuracy alone, especially when positive cases are rare.
- Precision and recall can also help to reveal bias in algorithms when they are compared across classes or groups, as large differences show where an algorithm performs poorly.
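The advantage over accuracy is easiest to see on unbalanced data. The sketch below uses a hypothetical fraud dataset (10 fraudulent transactions out of 1,000) and a hypothetical model that never flags anything; both are assumptions made for illustration.

```python
# Hypothetical data: 1 = fraud, 0 = legitimate.
y_true = [1] * 10 + [0] * 990
# Hypothetical model that predicts "legitimate" for every transaction.
y_pred = [0] * 1000

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
recall = tp / (tp + fn)

print(accuracy)  # 0.99
print(recall)    # 0.0
```

Accuracy looks excellent, while recall shows that the model finds none of the fraudulent transactions. Precision is not computed here because the model makes no positive predictions, which leaves its denominator at zero.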

## Limitations of precision and recall

Precision and recall are important measures of a classification algorithm's performance, but they have some limitations. These include:

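- They do not account for true negatives, or cases where a classifier correctly identifies that no event occurred. Neither formula contains TN, so the two measures say nothing about how well the negative class is handled.
- They depend on the class balance of the dataset. Precision in particular changes with the proportion of positive cases, so values measured on an unbalanced dataset may not carry over to data with a different balance.
- They are not always a reliable measure of a model's generalizability, as precision and recall can be artificially inflated by overfitting when they are measured on the data used for training.
- Precision and recall can be difficult to interpret and compare across different models, because they are two separate numbers and improving one often lowers the other.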
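One limitation follows directly from the formulas: the number of true negatives (TN) appears in neither of them. The minimal sketch below compares two hypothetical confusion matrices that differ only in TN; the names and counts are assumptions made for illustration.

```python
def scores(c):
    """Return (precision, recall, accuracy) for a dict of confusion-matrix counts."""
    precision = c["tp"] / (c["tp"] + c["fp"])
    recall = c["tp"] / (c["tp"] + c["fn"])
    accuracy = (c["tp"] + c["tn"]) / sum(c.values())
    return precision, recall, accuracy


# Hypothetical counts: identical except for the number of true negatives.
few_negatives = {"tp": 40, "fp": 10, "fn": 20, "tn": 30}
many_negatives = {"tp": 40, "fp": 10, "fn": 20, "tn": 9930}

p1, r1, a1 = scores(few_negatives)
p2, r2, a2 = scores(many_negatives)

print(p1 == p2, r1 == r2)  # True True
print(a1, a2)              # 0.7 0.997
```

Precision and recall are identical in both cases, although the second model correctly rejects far more negative cases, which only accuracy reflects.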

Apart from precision and recall, there are several other approaches used to evaluate the performance of a classification algorithm. These include:

- **F1 score**: The F1 score is the harmonic mean of precision and recall. It summarizes both measures in a single number and is high only when both precision and recall are high.
- **Confusion matrix**: A confusion matrix is a table that is used to evaluate the performance of a classification algorithm by showing true positives, false positives, true negatives, and false negatives.
- **Receiver Operating Characteristic (ROC) curve**: A ROC curve is a graphical representation of the performance of a classification algorithm that plots the true positive rate (which is the same as recall) against the false positive rate at different decision thresholds.
- **Area Under the Curve (AUC)**: AUC is the area under the ROC curve. It summarizes in a single number how well the model separates positive from negative cases across all thresholds.
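As a minimal sketch of the F1 score, the harmonic mean of precision and recall can be computed as follows. The function name, the zero-handling convention, and the example values are assumptions made for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2 * P * R / (P + R)."""
    if precision + recall == 0:
        # Assumption: define F1 as 0.0 when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values.
balanced = f1_score(0.75, 0.5)
lopsided = f1_score(1.0, 0.1)

print(round(balanced, 3))  # 0.6
print(round(lopsided, 3))  # 0.182
```

The second result shows why the harmonic mean is used: perfect precision cannot compensate for very low recall, whereas the arithmetic mean of 1.0 and 0.1 would be 0.55.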

In summary, precision and recall are two important measures used to evaluate the performance of a classification algorithm. There are several other approaches used to evaluate the performance of a model, such as F1 score, confusion matrix, ROC curve, and AUC.

## Suggested literature

- Buckland, M., & Gey, F. (1994). *The relationship between recall and precision*. Journal of the American Society for Information Science, 45(1), 12-19.
- Melamed, I. D., Green, R., & Turian, J. (2003). *Precision and recall of machine translation*. In Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers (pp. 61-63).
- Torgo, L., & Ribeiro, R. (2009). *Precision and recall for regression*. In Discovery Science: 12th International Conference, DS 2009, Porto, Portugal, October 3-5, 2009 (pp. 332-346). Springer Berlin Heidelberg.