ExamGecko
Question 107 - Professional Machine Learning Engineer discussion

You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company's logo. In the dataset, 96% of examples don't have the logo, so the dataset is very skewed. Which metrics would give you the most confidence in your model?

A. F-score where recall is weighed more than precision
B. RMSE
C. F1 score
D. F-score where precision is weighed more than recall
Suggested answer: A

Explanation:

Option A is correct because an F-score where recall is weighed more than precision is a suitable metric for binary classification with imbalanced data. The F-score is a weighted harmonic mean of precision and recall, which measure how accurate and how complete the model's predictions for the positive class are [1]. Precision is the fraction of true positives among all predicted positives, while recall is the fraction of true positives among all actual positives [1]. In imbalanced data the positive class is usually the minority class and the class of interest. Here the positive class is the images that contain the company's logo: they make up only 4% of the dataset but are the ones that matter. Weighing recall more than precision emphasizes finding all the positive examples, at the cost of accepting some false positives [2].
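For reference, the weighted F-score is commonly written with a parameter β. The symbol is an assumption here, since the explanation above does not name it. Under that convention, β > 1 weighs recall more, β < 1 weighs precision more, and β = 1 gives the F1 score. TP, FP, and FN stand for the counts of true positives, false positives, and false negatives:

```latex
F_\beta = (1 + \beta^2)\,
  \frac{\text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}},
\qquad
\text{precision} = \frac{TP}{TP + FP},
\qquad
\text{recall} = \frac{TP}{TP + FN}
```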

Option B is incorrect because RMSE (root mean squared error) is not an appropriate metric for binary classification with imbalanced data. RMSE is the square root of the mean squared difference between the predicted and actual values [3]. It is suited to regression problems, where the target variable is continuous, not to classification problems, where the target variable is discrete [4].
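A small sketch can show why an error-magnitude metric is misleading on skewed labels. The 100-image dataset and the always-negative "model" below are hypothetical, chosen only to mirror the 96% / 4% split in the question:

```python
import math

# Hypothetical labels mirroring the question: 4 images with the logo, 96 without.
y_true = [1] * 4 + [0] * 96
# A degenerate "model" that always predicts "no logo".
y_pred = [0] * 100

# RMSE and accuracy computed over all 100 examples.
rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall only looks at the actual positives.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
recall = tp / (tp + fn)

print(f"RMSE={rmse:.2f} accuracy={accuracy:.2f} recall={recall:.2f}")
```

The error looks small and the accuracy looks high, yet recall is zero: this model never finds a single logo.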

Option C is incorrect because the F1 score is not the best metric for this problem. The F1 score is the special case of the F-score in which precision and recall are weighted equally [1]. Equal weighting is appropriate when a false positive and a false negative cost about the same [5]. Here the positive class is rare and is the class of interest, and a missed logo is treated as costlier than a false alarm, so the F1 score may not reflect the performance of the model well [2].

Option D is incorrect because an F-score where precision is weighed more than recall rewards the wrong trade-off for this problem. Weighing precision more than recall emphasizes minimizing false positives, even if some true positives are missed [2]. With so few positive examples, each missed logo (a false negative) matters more than an occasional false positive, so this metric may not reflect the performance of the model well [2].
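To make the difference between options A, C, and D concrete, here is a minimal sketch that scores two models under three weightings. The helper `f_beta`, the model names, and the confusion counts are all hypothetical; the counts assume 1,000 images of which 40 (4%) contain the logo:

```python
def f_beta(tp, fp, fn, beta):
    """F-beta from confusion counts; beta > 1 favors recall, beta < 1 favors precision."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical model that finds most logos but raises more false alarms
# (precision 0.60, recall 0.90).
wide = dict(tp=36, fp=24, fn=4)
# Hypothetical model that rarely raises false alarms but misses many logos
# (precision about 0.92, recall 0.60).
strict = dict(tp=24, fp=2, fn=16)

# beta=0.5 weighs precision more (option D), 1.0 is F1 (option C),
# 2.0 weighs recall more (option A).
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: wide={f_beta(**wide, beta=beta):.3f} "
          f"strict={f_beta(**strict, beta=beta):.3f}")
```

With these counts the two models are nearly tied on F1, the precision-weighted score prefers the model that misses 16 of 40 logos, and the recall-weighted score prefers the model that misses only 4.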

References:

[1] Precision, recall, and F-measure
[2] F-score for imbalanced data
[3] RMSE
[4] Regression vs classification
[5] F1 score
[6] Imbalanced classification
[7] Binary classification

asked 18/09/2024
ERIK BURDETT