A data engineer at a bank is evaluating a new tabular dataset that includes customer data. The data engineer will use the customer data to create a new model to predict customer behavior. After creating a correlation matrix for the variables, the data engineer notices that many of the 100 features are highly correlated with each other.
Which steps should the data engineer take to address this issue? (Choose two.)

Liusel Herrera Garcia · Accepted Answer

Apply principal component analysis (PCA).

Liusel Herrera Garcia · Accepted Answer

Remove a portion of highly correlated features from the dataset.

Liusel Herrera Garcia · Answer

Use a linear-based algorithm to train the model.

Liusel Herrera Garcia · Answer

Apply min-max feature scaling to the dataset.

Liusel Herrera Garcia · Answer

Apply one-hot encoding to category-based variables.
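
For anyone wondering what the two accepted answers look like in practice, here is a minimal sketch using only NumPy. Everything in it is an assumption made for illustration: the synthetic data, the helper names `drop_correlated` and `pca_reduce`, the 0.9 correlation threshold, and the 95% variance target are not part of the question. In a real project, scikit-learn's `PCA` class would be the more usual choice for the second step.

```python
import numpy as np


def drop_correlated(X, threshold=0.9):
    """Greedy filter: keep a column only if its absolute Pearson
    correlation with every already-kept column is <= threshold.
    Returns the indices of the kept columns."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep


def pca_reduce(X, var_to_keep=0.95):
    """Standardize the columns, then project onto the smallest number
    of principal components that explains at least var_to_keep of the
    total variance. Returns the projected data and the variance ratios."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by variance.
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    ratio = S**2 / np.sum(S**2)
    k = int(np.searchsorted(np.cumsum(ratio), var_to_keep)) + 1
    k = min(k, len(S))  # guard against floating-point round-off
    return Xs @ Vt[:k].T, ratio[:k]


# Hypothetical data: 5 independent signals, each copied 4 times with
# a little noise, giving 20 features where many are highly correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 5))
X = np.hstack([base + 0.01 * rng.normal(size=base.shape) for _ in range(4)])

keep = drop_correlated(X)      # option: remove redundant features
X_filtered = X[:, keep]

Z, ratio = pca_reduce(X)       # option: PCA
print(X.shape, X_filtered.shape, Z.shape)
```

Both approaches shrink the 20 correlated columns down to 5. The filter keeps original, interpretable features, while PCA produces new uncorrelated components that are linear combinations of all the inputs. The other three options (a linear algorithm, min-max scaling, one-hot encoding) do not reduce the redundancy between features, and linear models in particular tend to be sensitive to multicollinearity.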
