You work for a retailer that sells clothes to customers around the world. You have been tasked with ensuring that ML models are built in a secure manner. Specifically, you need to protect sensitive customer data that might be used in the models. You have identified four fields containing sensitive data that are being used by your data science team: AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE. What should you do with the data before it is made available to the data science team for training purposes?

Question

marco damone · Accepted Answer

Coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into single precision. The other two fields are already as coarse as possible.
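For readers who want to see what this kind of coarsening might look like in practice, here is a minimal sketch using only the Python standard library. The helper names (`age_to_quantile`, `coarsen_lat_lon`), the sample ages, the choice of four buckets, and the reading of "single precision" as rounding to one decimal place are all assumptions made for illustration; none of them come from the question.

```python
import bisect
import statistics


def age_to_quantile(ages, n_bins=4):
    """Map each age to the index (0 .. n_bins - 1) of its quantile bucket."""
    # statistics.quantiles returns the n_bins - 1 cut points between buckets.
    cuts = statistics.quantiles(ages, n=n_bins)
    return [bisect.bisect_right(cuts, age) for age in ages]


def coarsen_lat_lon(lat, lon, decimals=1):
    """Round coordinates to fewer decimal places (assumed meaning of 'coarsen')."""
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical sample values, for illustration only.
ages = [18, 22, 25, 31, 38, 44, 52, 67]
age_buckets = age_to_quantile(ages)
coarse_location = coarsen_lat_lon(48.858370, 2.294481)
```

The model then sees only a bucket index and an approximate location, so individual customers are harder to single out while the ordering of ages and the rough geography are preserved.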

marco damone · Answer

Tokenize all of the fields using hashed dummy values to replace the real values.
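As a rough illustration of what hash-based tokenization could look like, the sketch below replaces each value with a keyed hash (HMAC-SHA256) from the Python standard library. The `tokenize` helper, the `SECRET_KEY` placeholder, and the sample record are hypothetical; a real deployment would presumably load the key from a managed secret store.

```python
import hashlib
import hmac

# Hypothetical placeholder key; assumed to come from a secret manager in practice.
SECRET_KEY = b"replace-with-a-managed-secret"


def tokenize(field_name, value, key=SECRET_KEY):
    """Return a deterministic hex token for one field value."""
    # Including the field name keeps equal values in different fields distinct.
    message = f"{field_name}={value}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# Hypothetical sample record, for illustration only.
record = {
    "AGE": 34,
    "IS_EXISTING_CUSTOMER": True,
    "LATITUDE_LONGITUDE": "48.858370,2.294481",
    "SHIRT_SIZE": "M",
}
tokenized = {name: tokenize(name, value) for name, value in record.items()}
```

One consequence worth keeping in mind: the tokens are deterministic, so equal values still match, but numeric ordering is lost. An AGE of 34 and an AGE of 35 become two unrelated categories.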

marco damone · Answer

Use principal component analysis (PCA) to reduce the four sensitive fields to one PCA vector.

marco damone · Answer

Remove all sensitive data fields, and ask the data science team to build their models using non-sensitive data.

Related questions

Question 136 - Professional Machine Learning Engineer discussion

Suggested answer: C
