
Question 43 - DSA-C02 discussion

Which method is used for detecting data outliers in Machine learning?

A. Scaler
B. Z-Score
C. BOXI
D. CMIYC
Suggested answer: B

Explanation:

What are outliers?

Outliers are values that differ markedly from the other values in the data. In a plot of the data they can appear at either extreme of the distribution.

Reasons for outliers in data

Errors during data entry or a faulty measuring device (a faulty sensor may result in extreme readings).

Natural occurrence (for example, salaries of junior-level employees vs. C-level employees).

Problems caused by outliers

Outliers in the data may cause problems during model fitting (especially for linear models).

Outliers may inflate error metrics that give higher weight to large errors (for example, mean squared error and RMSE).
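
To illustrate the point about error metrics, here is a minimal sketch that compares mean absolute error (MAE) with RMSE on a made-up list of prediction errors, with and without one extreme error. The helper names and the numbers are assumptions chosen for illustration only; they do not come from the question.

```python
from math import sqrt

def mae(errors):
    """Mean absolute error: every error contributes in proportion to its size."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean squared error: squaring gives large errors extra weight."""
    return sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical prediction errors (illustrative values only).
clean_errors = [1, -1, 2, -2, 1]
errors_with_outlier = [1, -1, 2, -2, 20]  # last error comes from an outlier

# How much each metric grows once the outlier is included.
mae_growth = mae(errors_with_outlier) / mae(clean_errors)
rmse_growth = rmse(errors_with_outlier) / rmse(clean_errors)

print(f"MAE grows by a factor of {mae_growth:.2f}")
print(f"RMSE grows by a factor of {rmse_growth:.2f}")
```

In this sketch RMSE grows by a larger factor than MAE, because the single large error is squared before averaging.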

The Z-score method is one of the methods for detecting outliers. It is generally used when a variable's distribution looks close to Gaussian. The Z-score is the number of standard deviations by which a value lies away from the variable's mean. A common rule of thumb treats values with an absolute Z-score above 3 as outliers.

Z-Score = (X-mean) / Standard deviation
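
The formula above can be sketched in a few lines of Python using only the standard library. The function name, the sample readings, and the threshold of 3 are assumptions for illustration; the threshold is a common rule of thumb rather than part of the formula, and the population standard deviation is used here as one possible choice.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [x for x in values if abs((x - mu) / sigma) > threshold]

# Hypothetical sensor readings with one faulty value (95).
readings = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
            12, 10, 11, 13, 12, 11, 10, 12, 11, 95]

print(zscore_outliers(readings))
```

Note that with very small samples a single extreme value also inflates the mean and standard deviation, so its Z-score may never reach the threshold; the method works best with a reasonable number of observations.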

The IQR method and box plots are further examples of methods used to detect outliers in data science.
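
As a rough sketch of the IQR method, the snippet below flags values that fall more than 1.5 times the interquartile range outside the first or third quartile, which is the same rule box plots conventionally use for their whiskers. The function name, the multiplier `k=1.5`, and the salary figures are assumptions for illustration; quartiles are computed with the default method of `statistics.quantiles`, and other quartile definitions can give slightly different fences.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [x for x in values if x < lower or x > upper]

# Hypothetical salaries in thousands: nine junior-level and one C-level.
salaries = [40, 42, 45, 47, 50, 52, 55, 58, 60, 400]

print(iqr_outliers(salaries))
```

Unlike the Z-score, this rule relies on quartiles rather than the mean and standard deviation, so it does not assume a Gaussian-like distribution.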

asked 23/09/2024
saharat pinsaran