Question 129 - DEA-C01 discussion

A data engineer needs to build an enterprise data catalog based on the company's Amazon S3 buckets and Amazon RDS databases. The data catalog must include storage format metadata for the data in the catalog.

Which solution will meet these requirements with the LEAST effort?

A.

Use an AWS Glue crawler to scan the S3 buckets and RDS databases and build a data catalog. Use data stewards to inspect the data and update the data catalog with the data format.

B.

Use an AWS Glue crawler to build a data catalog. Use AWS Glue crawler classifiers to recognize the format of data and store the format in the catalog.

C.

Use Amazon Macie to build a data catalog and to identify sensitive data elements. Collect the data format information from Macie.

D.

Use scripts to scan data elements and to assign data classifications based on the format of the data.

Suggested answer: B

Explanation:

The lowest-effort way to build an enterprise data catalog that includes storage format metadata is an AWS Glue crawler. A crawler can scan Amazon S3 buckets directly and Amazon RDS databases through a JDBC connection, then populate the AWS Glue Data Catalog with table definitions that include the schema and a classification such as CSV or Parquet. Classifiers handle the format recognition: the crawler runs its built-in classifiers automatically, and custom classifiers can be added for formats the built-in ones do not recognize. The recognized format is stored in the catalog with each table.
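As a rough illustration of the crawler setup described above, here is a minimal boto3 sketch. The crawler name, IAM role ARN, database name, S3 path, JDBC connection name, and JDBC include path are all hypothetical placeholders, and the Glue connection to the RDS database is assumed to exist already. The request is built as a plain dictionary so it can be inspected before anything is sent to AWS.

```python
def build_crawler_request(name, role_arn, database, s3_paths,
                          jdbc_connection, jdbc_path, classifiers=None):
    """Build the keyword arguments for the boto3 Glue create_crawler call."""
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {
            # One entry per S3 location to crawl.
            "S3Targets": [{"Path": path} for path in s3_paths],
            # RDS is reached through an existing Glue JDBC connection.
            "JdbcTargets": [
                {"ConnectionName": jdbc_connection, "Path": jdbc_path}
            ],
        },
    }
    # Custom classifiers are optional; built-in classifiers run regardless.
    if classifiers:
        request["Classifiers"] = list(classifiers)
    return request


def create_and_start_crawler(request):
    """Create the crawler and start it (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# All values below are hypothetical placeholders, not taken from the question.
request = build_crawler_request(
    name="enterprise-catalog-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="enterprise_catalog",
    s3_paths=["s3://example-bucket/sales/"],
    jdbc_connection="example-rds-connection",
    jdbc_path="exampledb/%",
)
```

Calling `create_and_start_crawler(request)` would submit the request. No `Classifiers` entry is needed for common formats, since the built-in classifiers are applied by default.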

Option B meets the requirements with the least effort because Glue crawlers automate the discovery and cataloging of data from multiple sources, including S3 and RDS, and their classifiers recognize the data format and record it in the catalog without manual work.
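To see where the format ends up after a crawl, the sketch below pulls the format-related fields out of a table definition as returned by the Glue `get_tables` API. The sample table is a hypothetical, trimmed-down entry for a Parquet dataset, not output from a real catalog, and the database name passed to `list_formats` would be whatever the crawler wrote to.

```python
def storage_format(table):
    """Extract format-related metadata from one Glue table definition."""
    params = table.get("Parameters", {})
    descriptor = table.get("StorageDescriptor", {})
    return {
        "table": table["Name"],
        # The classification is set by the classifier that matched the data.
        "classification": params.get("classification"),
        "input_format": descriptor.get("InputFormat"),
        "serde": descriptor.get("SerdeInfo", {}).get("SerializationLibrary"),
    }


def list_formats(database):
    """Yield format metadata for every table in a catalog database.

    Needs boto3 and AWS credentials; not called in this sketch.
    """
    import boto3

    glue = boto3.client("glue")
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page["TableList"]:
            yield storage_format(table)


# Hypothetical sample shaped like one entry of a get_tables "TableList".
sample_table = {
    "Name": "sales",
    "Parameters": {"classification": "parquet"},
    "StorageDescriptor": {
        "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
        "SerdeInfo": {
            "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
        },
    },
}

summary = storage_format(sample_table)
print(summary["table"], summary["classification"])
```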

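The other options either take more effort or do not fit the requirement. Option A relies on data stewards to inspect the data and enter the format manually. Option C uses Amazon Macie, which is designed to discover sensitive data in Amazon S3, not to build a data catalog or record storage formats. Option D requires writing and maintaining custom scripts.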

AWS Glue Crawler Documentation

AWS Glue Classifiers

asked 29/10/2024
Rehan Malik