Question 81 - DAS-C01 discussion


A healthcare company ingests patient data from multiple data sources and stores it in an Amazon S3 staging bucket. An AWS Glue ETL job transforms the data, which is written to an S3-based data lake to be queried using Amazon Athena. The company wants to match patient records even when the records do not have a common unique identifier. Which solution meets this requirement?

A. Use Amazon Macie pattern matching as part of the ETL job
B. Train and use the AWS Glue PySpark filter class in the ETL job
C. Partition tables and use the ETL job to partition the data on patient name
D. Train and use the AWS Glue FindMatches ML transform in the ETL job
Suggested answer: D

Explanation:


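The FindMatches transform enables you to identify duplicate or matching records in your dataset, even when the records do not have a common unique identifier and no fields match exactly. The other options do not perform record matching: Amazon Macie discovers sensitive data in S3, the Glue Filter transform only keeps records that satisfy a predicate and is not something you train, and partitioning on patient name only changes how the data is laid out. Reference: https://docs.aws.amazon.com/glue/latest/dg/machine-learning.html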
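
To make the requirement concrete, here is a toy sketch of why an exact comparison fails when there is no shared identifier, and how a similarity score can still link two records. This is not the FindMatches algorithm (FindMatches is a trained ML transform that runs inside AWS Glue); it swaps in plain string similarity from Python's standard library. The field names, sample records, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def records_match(r1: dict, r2: dict,
                  fields=("name", "dob", "address"),
                  threshold: float = 0.8) -> bool:
    """Treat two records as the same patient when the mean field similarity
    reaches the threshold. The fields and threshold are illustrative assumptions."""
    scores = [similarity(r1[f], r2[f]) for f in fields]
    return sum(scores) / len(scores) >= threshold


# Hypothetical records from two sources with no common patient ID.
clinic = {"name": "Jonathan Smith", "dob": "1980-04-12", "address": "12 Oak Street"}
lab = {"name": "Jonathon Smith", "dob": "1980-04-12", "address": "12 Oak St."}
other = {"name": "Maria Garcia", "dob": "1975-11-30", "address": "98 Pine Avenue"}

if __name__ == "__main__":
    print("exact equality:", clinic == lab)
    print("clinic vs lab:", records_match(clinic, lab))
    print("clinic vs other:", records_match(clinic, other))
```

With these sample values, the clinic and lab records differ on exact comparison but are linked by the similarity check, while the unrelated record is not. A hand-tuned threshold like this is brittle on real data, which is the gap a transform trained on labeled examples is meant to fill.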

asked 16/09/2024
lagwendon Scott