Question 297 - MLS-C01 discussion


A machine learning (ML) engineer is preparing a dataset for a classification model. The ML engineer notices that some continuous numeric features have a significantly greater value than most other features. A business expert explains that the features are independently informative and that the dataset is representative of the target distribution.

After training, the model's inference accuracy is lower than expected.

Which preprocessing technique will result in the GREATEST increase of the model's inference accuracy?

A. Normalize the problematic features.

B. Bootstrap the problematic features.

C. Remove the problematic features.

D. Extrapolate synthetic features.

Suggested answer: A

Explanation:

In a classification model, features with significantly larger scales can dominate the model training process, leading to poor performance. Normalization scales the values of continuous features to a uniform range, such as [0, 1], which prevents large-value features from disproportionately influencing the model. This is particularly beneficial for algorithms sensitive to the scale of input data, such as neural networks or distance-based algorithms.

Given that the problematic features are informative and representative of the target distribution, removing or bootstrapping these features is not advisable. Normalization will bring all features to a similar scale and improve the model's inference accuracy without losing important information.
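As a minimal sketch of the technique in option A: min-max normalization rescales each feature independently to [0, 1]. The feature values below are illustrative, not taken from the question; scikit-learn's `MinMaxScaler` performs the equivalent computation.

```python
import numpy as np

# Two continuous features on very different scales: before scaling, the
# second column would dominate distance-based or gradient-based training.
X = np.array([
    [0.5, 10_000.0],
    [0.2, 55_000.0],
    [0.9, 98_000.0],
])

# Min-max normalization: (x - min) / (max - min), computed per feature.
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)

# After scaling, every feature spans exactly [0, 1], so no single
# feature dominates purely because of its magnitude.
print(X_scaled.min(axis=0))  # each column's minimum is now 0.0
print(X_scaled.max(axis=0))  # each column's maximum is now 1.0
```

Because the columns are scaled independently, the relative ordering of values within each feature is preserved, so no information is lost.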

asked 31/10/2024
Alvaro Peralta