Table of Contents
- An explanation you will never forget (unless you use a neuralizer)
- 0.Execution example
- 1. recall
- 1.1 🥱Boring definition
- 1.2 👽Interesting (contextual) definitions
- 1.3 📈What does high recall mean?
- 1.4 📉What does low recall mean?
- 1.5 💵Meaningful real world examples (involving money)
- 2. Precision
- 2.1 🥱Boring definition
- 2.2 👽Interesting (contextual) definitions
- 2.3 📈What does high precision mean?
- 2.4 📉What does low precision mean?
- 2.5 💵Meaningful real world examples (involving money)
- 3. F value
- 4. Shouldn’t we just pursue the accuracy rate simply and clearly?
- 5. Summary
An explanation you will never forget (unless you use a neuralizer)
Disclaimer: All opinions expressed are mine.
I don’t know about you, but when I come across the concept of precision and recall, I understand it perfectly then… but the next day it suddenly becomes difficult to explain. It’s as if a neuralizer was used to erase it from memory.
So I took a hint from the scene where Will Smith’s character shoots a little girl to pass the MIB (short for “Men In Black”) exams, and the concepts of precision and recall. I thought of an easy-to-understand example that I could understand and remember. Please read below.
Let’s say you’re an agent of the Men in Black, a secret agency dedicated to protecting humanity from aliens disguised as humans. You got word that a Halloween party was invaded by aliens. Your mission is to identify and capture disguised aliens (oops, not in the movies!).
In machine learning terms, these tasks are alien identification/classification problems. Given a dataset of real humans and aliens disguised as humans, we want to identify aliens.
You and your fellow agents head to a party to capture suspected aliens. Some were identified correctly, others were identified incorrectly. Let us now evaluate the ability to identify aliens disguised as humans using recall and precision.
How many of the aliens disguised as humans did you recognize correctly?
1.1 🥱Boring definition
1.2 👽Interesting (contextual) definitions
When I broke into a party to determine who was an alien and who was a human, while I was able to identify the aliens correctly, some aliens were mistaken for humans and missed. Recall is a measure of how accurately the aliens were able to pick out the aliens from the humans they were actually disguised as. This index can be said to be a measure of the degree to which aliens were not overlooked at the party.
1.3 📈What does high recall mean?
High recall means that disguised aliens were less likely to be mistaken for humans.
A high recall rate, on the other hand, can result in judging too many humans as aliens in disguise. If everyone at the party were identified as aliens, the recall might be perfect (everyone is considered a “positive” case, so there are zero false negatives). Therefore, many of the real people you have captured may not be very comfortable with unnecessary interrogation. But if your priority is catching as many real aliens as possible and you don’t care too much about accidentally catching real humans, then recall might be a good metric for you. They may end up getting mad at the humans (who they accidentally caught), but they are safe humans!
Mixing matrix with 100% recall for 100 party participants (30 aliens/70 humans)
|alien (predicted value)||human (predicted value)|
|alien (real value)||30||0|
|human (actual value)||70||0|
While the recall is 100%, the false positives are 70 . In other words, while they caught all the aliens, there were 70 humans who mistook them for aliens .
1.4 📉What does low recall mean?
Conversely, low recall means that the ability to pick aliens out of real aliens was low. You should get more training.
1.5 Meaningful real world examples 💵
In the field of online transactions, high recall is often required in scenarios for fraud detection. Some transactions may be falsely identified as fraudulent, but a high recall rate makes it more certain that the majority of fraudulent transactions will be caught. Some customers may feel a little frustrated because their transactions are considered fraudulent, but the chances of the customer or company suffering unjustified losses are reduced.
How many of the humans you thought were aliens actually turned into humans?
2.1 🥱Boring definition
2.2 👽Interesting (contextual) definitions
If you identify and capture humans thinking they are aliens, some of the captured humans are aliens and innocent humans. Precision rate is an index that shows how many people who were thought to be aliens actually turned out to be aliens. This index can be said to be a measure of the high ability to avoid erroneously recognizing real humans as aliens.
2.3 📈What does high precision mean?
A high precision means that real humans were misidentified as aliens in fewer cases.
After locating and capturing a single human thought to be an alien, it turns out that the individual is actually an alien in disguise. In that case, numerically the precision is perfect. The downside is that you may leave many disguised aliens at your party. But don’t forget that the MIB is a secret agency. You, the MIB agent, would think you wouldn’t want to accidentally arrest a real person and jeopardize the confidentiality of the MIB, or the fact that aliens are lurking among us in disguise. . In such scenarios, precision is the criterion to be adhered to.
2.4 📉What does low precision mean?
Conversely, if the precision rate is low, there is a possibility of catching too many real humans, mistaking them for aliens. In such a case, before the existence of aliens and MIB is known to the world, there is no choice but to erase the memory using a neuralizer.
2.5 Meaningful real world examples 💵
In the banking arena, the problem of identifying loan defaulters requires a high precision rate. By mistakenly identifying too many customers as loan delinquents, banks will not be able to lend enough people. That would reduce the bank’s income from interest paid by borrowers, which would be bad for the bank’s bottom line.
3. F value
We value both precision and recall, so we want to balance them – Boss of MIB
Let’s say there are only a few spots left for this year. That means it’s also time for his boss’ year-end performance review. To ensure that year-end bonuses are paid out fairly, managers must align the performance of all MIB agents against the agency’s overall goals. MIB has two goals. The MIB must successfully capture the aliens, but at the same time must keep the aliens’ existence secret from the rest of the world. Should your boss use precision or recall?
One possible solution is to use the F value (sometimes called the F1 score). The F-measure helps balance precision and recall.
The F value is the harmonic mean of precision and recall .
4. Shouldn’t we just pursue the accuracy rate simply and clearly?
I know what you mean. I love the simplicity of machine learning, but for some problems it may not be wise to use accuracy as a measure of classifier performance. As you know, we are talking about the problem of disproportionate number of taxonomy members. For example, suppose the aforementioned Halloween party had 100 attendees, only 5 of whom were aliens in disguise. In this case, you’d get a 95% accuracy rate if you identified all 100 people as humans, but if your fellow agents with high F-scores were contributing to the MIB’s core objectives, you’d be better off at the end of the year. You won’t get the bonus. Therefore, consideration of precision, recall, and F-value are viable alternatives for measuring classification performance.
Mixed matrix with 95% accuracy rate for 100 party participants (5 aliens/95 humans)
|alien (predicted value)||human (predicted value)|
|alien (real value)||0||Five|
|human (actual value)||0||95|
In the above, the recall rate is 0%, that is , the accuracy rate is 95% even though no aliens are detected .
- When the number of class members is imbalanced, evaluation based on precision rate, recall rate, and F value is a wiser choice than accuracy rate.
- Recall is used when the emphasis is on finding as many true cases as possible.
- The precision rate is used when emphasizing whether or not the case judged to be positive by oneself is correct.