Question 45 - DP-100 discussion

You are building a binary classification model by using a supplied training set.

The training set is imbalanced between two classes.

You need to resolve the data imbalance.

What are three possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A. Penalize the classification
B. Resample the dataset using undersampling or oversampling
C. Normalize the training feature set
D. Generate synthetic samples in the minority class
E. Use accuracy as the evaluation metric of the model
Suggested answer: A, B, D

Explanation:

A: Try Penalized Models

You can use the same algorithms but give them a different perspective on the problem.

Penalized classification imposes an additional cost on the model for making classification mistakes on the minority class during training. These penalties can bias the model to pay more attention to the minority class.
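As a rough illustration of the penalty idea, the sketch below weights each class inversely to its frequency and uses those weights in a binary log loss. The function names (balanced_class_weights, weighted_log_loss), the 90/10 toy labels, and the n / (k * count) weighting heuristic are assumptions chosen for this example, not something the question prescribes.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def weighted_log_loss(labels, probs, weights):
    """Weighted binary cross-entropy; probs are predicted P(label == 1)."""
    eps = 1e-12
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        w = weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1 - p))
        weight_sum += w
    return total / weight_sum


# Hypothetical toy data: 90 majority (0) and 10 minority (1) labels.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)  # minority weight is 5.0

# A model that leans toward the majority class for every instance.
probs = [0.1] * len(labels)
plain = weighted_log_loss(labels, probs, {0: 1.0, 1: 1.0})
penalized = weighted_log_loss(labels, probs, weights)
```

With these weights, mistakes on the minority class contribute more to the loss, so the majority-leaning predictions above score worse under the penalized loss than under the unweighted one.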

B: You can change the dataset that you use to build your predictive model to have more balanced data.

This change is called resampling, and there are two main ways to even up the classes: under-sampling removes instances from the over-represented class, and over-sampling adds copies of instances from the under-represented class.

Consider testing under-sampling when you have a lot of data (tens or hundreds of thousands of instances or more).

Consider testing over-sampling when you don't have a lot of data (tens of thousands of records or fewer).
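The two resampling strategies can be sketched with the standard library alone. The helper names (random_undersample, random_oversample), the fixed seed, and the 90/10 toy data are hypothetical; this is a minimal sketch of random resampling, not a recommended production routine.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    return by_class


def random_undersample(rows, labels, seed=0):
    """Drop random rows from larger classes until all match the smallest."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out = []
    for y, group in by_class.items():
        out.extend((row, y) for row in rng.sample(group, target))
    rng.shuffle(out)
    return [r for r, _ in out], [y for _, y in out]


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows from smaller classes until all match the largest."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out = []
    for y, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out.extend((row, y) for row in group + extra)
    rng.shuffle(out)
    return [r for r, _ in out], [y for _, y in out]


# Hypothetical toy data: 90 majority (0) and 10 minority (1) rows.
rows = list(range(100))
labels = [0] * 90 + [1] * 10

under_rows, under_labels = random_undersample(rows, labels)
over_rows, over_labels = random_oversample(rows, labels)
```

Resampling should normally be applied to the training split only, so that evaluation data keeps the original class distribution.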

D: Try Generating Synthetic Samples

A simple way to generate synthetic samples is to randomly sample the attributes from instances in the minority class.
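The simple approach described above can be sketched as follows: each synthetic row takes every attribute from a randomly chosen minority instance. The function name synthesize_minority, the seed, and the toy rows are assumptions made for illustration.

```python
import random


def synthesize_minority(minority_rows, n_new, seed=0):
    """Build n_new rows; each attribute is drawn from a random minority row."""
    rng = random.Random(seed)
    n_features = len(minority_rows[0])
    synthetic = []
    for _ in range(n_new):
        # Pick the donor row independently for every attribute position.
        row = tuple(rng.choice(minority_rows)[j] for j in range(n_features))
        synthetic.append(row)
    return synthetic


# Hypothetical minority-class rows with one numeric and one categorical attribute.
minority = [(1.0, "red"), (1.5, "blue"), (2.0, "red")]
new_rows = synthesize_minority(minority, n_new=5)
```

Because each attribute is sampled independently, every generated value has been observed in the minority class, but relationships between attributes are not preserved.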

Reference:

https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/

asked 02/10/2024
Anton Khodyakov