Question 112 - DEA-C01 discussion


A company analyzes data in a data lake every quarter to perform inventory assessments. A data engineer uses AWS Glue DataBrew to detect any personally identifiable information (PII) about customers within the data. The company's privacy policy considers some custom categories of information to be PII. However, the categories are not included in standard DataBrew data quality rules.

The data engineer needs to modify the current process to scan for the custom PII categories across multiple datasets within the data lake.

Which solution will meet these requirements with the LEAST operational overhead?

A. Manually review the data for custom PII categories.

B. Implement custom data quality rules in DataBrew. Apply the custom rules across datasets.

C. Develop custom Python scripts to detect the custom PII categories. Call the scripts from DataBrew.

D. Implement regex patterns to extract PII information from fields during extract, transform, and load (ETL) operations into the data lake.
Suggested answer: B

Explanation:

The data engineer needs to detect custom categories of PII within the data lake using AWS Glue DataBrew. While DataBrew provides standard data quality rules, the solution must support custom PII categories.

Option B (implement custom data quality rules in DataBrew and apply them across datasets) is the most efficient choice. DataBrew allows the creation of custom data quality rules that detect specific data patterns, including custom PII categories, and the same rule definitions can then be applied to each dataset in the data lake. Detection stays inside the DataBrew process the engineer already uses, which minimizes operational overhead while meeting the company's privacy requirements.

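The other options all add operational effort compared to using DataBrew's built-in capabilities: Option A relies on manual review, Option C requires writing and maintaining custom Python scripts, and Option D requires building and maintaining regex logic inside the ETL pipelines.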
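As a rough illustration of Option B, the sketch below builds one DataBrew ruleset request per dataset, all sharing the same custom rule, in the shape accepted by the boto3 `databrew` client's `create_ruleset` call. The `loyalty-id` category, its pattern, the dataset ARNs, and the helper names are hypothetical. The `not_matches` condition, the quoting of the substitution value, and the exact `CheckExpression` syntax when column selectors are used are assumptions that should be checked against the DataBrew documentation before use.

```python
# Hypothetical custom PII category and pattern, for illustration only.
CUSTOM_PII_PATTERNS = {"loyalty-id": r"LOY-\d{8}"}


def build_ruleset_requests(dataset_arns, patterns=CUSTOM_PII_PATTERNS):
    """Build one create_ruleset request per dataset, all sharing the same rules."""
    requests = []
    for arn in dataset_arns:
        # Use the last ARN segment as a readable dataset name.
        dataset_name = arn.rsplit("/", 1)[-1]
        rules = []
        for category, regex in patterns.items():
            rules.append({
                "Name": f"no-{category}",
                # ASSUMPTION: condition name and expression syntax; verify in the docs.
                "CheckExpression": "not_matches :val1",
                # ASSUMPTION: literal values are passed as quoted strings.
                "SubstitutionMap": {":val1": f"'{regex}'"},
                # Require every row in each selected column to pass the check.
                "Threshold": {
                    "Value": 100.0,
                    "Type": "GREATER_THAN_OR_EQUAL",
                    "Unit": "PERCENTAGE",
                },
                # Select every column in the dataset.
                "ColumnSelectors": [{"Regex": ".*"}],
            })
        requests.append({
            "Name": f"custom-pii-{dataset_name}",
            "TargetArn": arn,
            "Rules": rules,
        })
    return requests


def create_rulesets(client, requests):
    """Send each request through a DataBrew client, e.g. boto3.client('databrew')."""
    for request in requests:
        client.create_ruleset(**request)
```

Because the rule definitions live in one place and are stamped onto each dataset, adding a new custom PII category means editing a single dictionary rather than touching each dataset by hand. The rulesets would still need to be attached to DataBrew profile jobs to be evaluated.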

AWS Glue DataBrew Documentation

asked 29/10/2024
Tillmon, Quinton