
Classification Model Evaluation Metrics in One Article

王林 · Released 2024-01-07

Model evaluation is a crucial part of machine learning and deep learning, used to measure a model's performance. This article breaks down the confusion matrix, accuracy, precision, recall, and F1 score step by step.


## Confusion Matrix

The confusion matrix is used to evaluate a model's performance on classification problems. It is a table showing how the model classifies samples: rows represent actual categories and columns represent predicted categories. For a binary classification problem, the confusion matrix is structured as follows:

|                     | Predicted Positive | Predicted Negative |
|---------------------|--------------------|--------------------|
| **Actual Positive** | TP                 | FN                 |
| **Actual Negative** | FP                 | TN                 |

  • True Positive (TP): the number of samples that are actually positive and that the model predicts as positive. It reflects the model's ability to correctly identify positive instances; a higher TP is usually desirable.
  • False Negative (FN): the number of samples that are actually positive but that the model predicts as negative. Depending on the application, these errors may be critical (for example, failing to detect a security threat).
  • False Positive (FP): the number of samples that are actually negative but that the model predicts as positive. It highlights cases where the model predicts positive when it should not, which may have application-dependent consequences (e.g. unnecessary treatment in medical diagnosis).
  • True Negative (TN): the number of samples that are actually negative and that the model predicts as negative. It reflects the model's ability to correctly identify negative instances; a higher TN is usually desirable.
This may seem confusing to beginners, but it's actually quite simple: the Positive/Negative at the end is the model's predicted value, and the True/False at the front indicates whether that prediction matches the actual value. For example, True Negative means the model predicted negative and the prediction is consistent with the actual value, i.e. the prediction is correct. Here is a simple confusion matrix:

```python
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Example predictions and true labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# Create a confusion matrix
cm = confusion_matrix(y_true, y_pred)

# Visualize the confusion matrix as a heatmap
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=["Predicted 0", "Predicted 1"],
            yticklabels=["Actual 0", "Actual 1"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
```

Use TP and TN when you want to emphasize correct predictions and overall accuracy. Use FP and FN when you want to understand the types of errors your model makes. For example, in applications where the cost of false positives is high, minimizing false positives may be critical.

As an example, consider a spam classifier. The confusion matrix helps us understand how many spam emails the classifier correctly identified and how many legitimate emails it incorrectly marked as spam.
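To make this concrete, here is a minimal sketch that extracts the four counts from a scikit-learn confusion matrix. It reuses the toy labels from the code above, treating 1 = spam purely for illustration:

```python
from sklearn.metrics import confusion_matrix

# Toy labels from the example above; 1 = spam is an assumption for illustration
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# For binary labels, ravel() flattens the 2x2 matrix into (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}  FN={fn}  FP={fp}  TN={tn}")
# TP=4 spam caught, FN=1 spam missed, FP=2 legitimate mails wrongly flagged
```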

Based on the confusion matrix, many other evaluation metrics can be calculated, such as accuracy, precision, recall, and the F1 score.

## Accuracy

Accuracy = (TP + TN) / (TP + TN + FP + FN)

As our summary above suggests, accuracy is the proportion of samples that are predicted correctly. The numerator, TP + TN, covers the two "True" cells, i.e. the total number of correct predictions made by the model.
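A minimal sketch, reusing the same toy labels, that computes accuracy by hand from the confusion-matrix counts and checks it against scikit-learn's built-in helper:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Accuracy = (TP + TN) / (TP + TN + FP + FN)
manual = (tp + tn) / (tp + tn + fp + fn)
print(manual)                          # 0.7
print(accuracy_score(y_true, y_pred))  # 0.7 -- same result
```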

## Precision

Precision = TP / (TP + FP)

As the formula shows, precision is computed over the predicted positives: of all the samples the model predicted as positive, how many are actually positive. This is why precision is also called the positive predictive value.

Precision becomes important in situations where false positives have significant consequences or costs. Taking a medical diagnosis model as an example, high precision helps ensure that only those who really need treatment receive it.
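A short sketch of precision on the same toy labels, assuming class 1 is the positive class (scikit-learn's default for binary labels):

```python
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# Precision = TP / (TP + FP): of everything flagged positive, how much was right?
print(precision_score(y_true, y_pred))  # 4 / (4 + 2) = 0.666...
```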

## Recall

Recall, also known as sensitivity or the true positive rate, refers to the model's ability to capture all positive instances.

Recall = TP / (TP + FN)

As the formula shows, recall is computed over the actual positives: of all the samples that are actually positive, how many did the model capture. This is why recall is also called the recall rate.
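The same toy labels again, this time scoring recall:

```python
from sklearn.metrics import recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# Recall = TP / (TP + FN): of all actual positives, how many did we catch?
print(recall_score(y_true, y_pred))  # 4 / (4 + 1) = 0.8
```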

## F1 Score

The F1 score is calculated as: F1 = 2 * (Precision * Recall) / (Precision + Recall), where precision is the proportion of samples predicted as positive that are actually positive, and recall is the proportion of actual positive samples that the model correctly predicts as positive. The F1 score is the harmonic mean of precision and recall, so it takes both the model's exactness and its completeness into account when evaluating performance.

The F1 score is important because it provides a trade-off between precision and recall. When you want to strike a balance between the two, or for general-purpose applications, the F1 score is a good choice.
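A minimal sketch that verifies the harmonic-mean formula against scikit-learn's f1_score on the toy labels:

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

p = precision_score(y_true, y_pred)  # 0.666...
r = recall_score(y_true, y_pred)     # 0.8

# Harmonic mean of precision and recall
manual_f1 = 2 * (p * r) / (p + r)
print(manual_f1)                 # 0.727...
print(f1_score(y_true, y_pred))  # same result
```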

## Summary

In this article, we introduced the confusion matrix, accuracy, precision, recall, and F1 score in detail, and showed how these metrics can be used to evaluate and improve model performance.
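As a closing sketch, scikit-learn's classification_report gathers all of these per-class metrics in a single call, here applied to the toy labels used throughout this article:

```python
from sklearn.metrics import classification_report

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# Reports precision, recall, F1, and support per class, plus overall accuracy
print(classification_report(y_true, y_pred))
```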


Source: 51cto.com