Question 244 - MLS-C01 discussion


A retail company stores 100 GB of daily transactional data in Amazon S3 at periodic intervals. The company wants to identify the schema of the transactional data. The company also wants to perform transformations on the transactional data that is in Amazon S3.

The company wants to use a machine learning (ML) approach to detect fraud in the transformed data.

Which combination of solutions will meet these requirements with the LEAST operational overhead? (Select THREE.)

A. Use Amazon Athena to scan the data and identify the schema.
B. Use AWS Glue crawlers to scan the data and identify the schema.
C. Use Amazon Redshift stored procedures to perform data transformations.
D. Use AWS Glue workflows and AWS Glue jobs to perform data transformations.
E. Use Amazon Redshift ML to train a model to detect fraud.
F. Use Amazon Fraud Detector to train a model to detect fraud.

Suggested answer: B, D, F

Explanation:

To meet the requirements with the least operational overhead, the company should use AWS Glue crawlers, AWS Glue workflows and jobs, and Amazon Fraud Detector. AWS Glue crawlers can scan the data in Amazon S3 and identify the schema, which is then stored in the AWS Glue Data Catalog. AWS Glue workflows and jobs can perform data transformations on the data in Amazon S3 using serverless Spark or Python scripts. Amazon Fraud Detector can train a model to detect fraud using the transformed data and the company's historical fraud labels, and then generate fraud predictions using a simple API call.
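As a rough illustration of the pieces described above, the following Python sketch builds the request parameters for an AWS Glue crawler and for an Amazon Fraud Detector prediction, then sends them with boto3. The crawler name, IAM role ARN, database name, bucket path, detector ID, event type, entity type, and variable names are all hypothetical placeholders, not values from the question. It assumes boto3 is installed, AWS credentials are configured, and the detector and event type already exist.

```python
from datetime import datetime, timezone


def crawler_request(name, role_arn, database, s3_path, schedule=None):
    """Build keyword arguments for the Glue CreateCrawler API call."""
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Data Catalog database that receives the inferred tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if schedule is not None:
        request["Schedule"] = schedule  # Glue cron syntax, e.g. "cron(0 2 * * ? *)"
    return request


def prediction_request(detector_id, event_type, event_id, entity_id, variables, when=None):
    """Build keyword arguments for the Fraud Detector GetEventPrediction API call."""
    when = when or datetime.now(timezone.utc)
    return {
        "detectorId": detector_id,
        "eventId": event_id,
        "eventTypeName": event_type,
        # "customer" is a hypothetical entity type; it must match the detector's event type.
        "entities": [{"entityType": "customer", "entityId": entity_id}],
        "eventTimestamp": when.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # The API expects every variable value as a string.
        "eventVariables": {key: str(value) for key, value in variables.items()},
    }


def run(region="us-east-1"):
    """Send both requests; all identifiers below are placeholders."""
    import boto3  # imported here so the builders above work without the AWS SDK

    glue = boto3.client("glue", region_name=region)
    crawl = crawler_request(
        name="transactions-crawler",
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
        database="retail_transactions",
        s3_path="s3://example-bucket/transactions/",
        schedule="cron(0 2 * * ? *)",
    )
    glue.create_crawler(**crawl)
    glue.start_crawler(Name=crawl["Name"])

    fraud = boto3.client("frauddetector", region_name=region)
    return fraud.get_event_prediction(
        **prediction_request(
            detector_id="example_detector",
            event_type="example_transaction",
            event_id="txn-0001",
            entity_id="customer-42",
            variables={"order_amount": 129.99, "payment_method": "card"},
        )
    )
```

The request builders are kept separate from the API calls so the parameters can be inspected or unit tested without touching an AWS account. The transformation step (AWS Glue jobs and workflows) is omitted here because its script depends entirely on the shape of the transactional data.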

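Option A is incorrect because Amazon Athena is a serverless query service that analyzes data in Amazon S3 using standard SQL against a table schema that has already been defined, typically in the AWS Glue Data Catalog. Athena does not crawl the data to infer the schema on its own, so the schema would have to be defined manually or by a crawler anyway.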

Option C is incorrect because Amazon Redshift is a cloud data warehouse. Using stored procedures for the transformations would require loading the data from Amazon S3 into Amazon Redshift and provisioning and managing a cluster (or a Redshift Serverless workgroup), which adds operational overhead compared with running AWS Glue jobs directly against the data in Amazon S3.

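Option E is incorrect because Amazon Redshift ML is a feature that allows users to create, train, and deploy machine learning models using SQL commands in Amazon Redshift. Using it would require loading the data from Amazon S3 into Amazon Redshift, which adds complexity and cost. Also, Amazon Redshift ML provides general-purpose model training rather than a purpose-built fraud detection service, so the company would have to select the features, train, and evaluate the model itself, whereas Amazon Fraud Detector is a managed service designed for this use case.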

References:

AWS Glue Crawlers

AWS Glue Workflows and Jobs

Amazon Fraud Detector

asked 16/09/2024
Eric Hebert