Question 127 - MLS-C01 discussion


A Machine Learning Specialist needs to move and transform data in preparation for training. Some of the data needs to be processed in near-real time, and other data can be moved hourly. There are existing Amazon EMR MapReduce jobs that clean the data and perform feature engineering on it.

Which of the following services can feed data to the MapReduce jobs? (Select TWO.)

A. AWS DMS
B. Amazon Kinesis
C. AWS Data Pipeline
D. Amazon Athena
E. Amazon ES
Suggested answer: B, C

Explanation:

Amazon Kinesis and AWS Data Pipeline can both feed data to the Amazon EMR MapReduce jobs, and each one matches one of the two latency requirements in the question.

Amazon Kinesis ingests, processes, and analyzes streaming data in real time. It can be integrated with Amazon EMR so that MapReduce jobs run on streaming data sources such as web logs, social media, IoT devices, and clickstreams. This covers the data that needs to be processed in near-real time, for example for anomaly detection, fraud detection, or dashboarding.

AWS Data Pipeline orchestrates and automates data movement and transformation across AWS services and on-premises data sources. It can be integrated with Amazon EMR so that MapReduce jobs run on batch data sources such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon Redshift. This covers the data that can be moved hourly, for example for data warehousing, reporting, or machine learning.
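To make the two paths concrete, here is a minimal Python sketch, not a definitive implementation. The first function pushes one event into a Kinesis stream through a client object that is passed in (in practice a boto3 Kinesis client, which exposes `put_record`). The second builds a fragment of an hourly AWS Data Pipeline definition in the key/value shape that boto3's `put_pipeline_definition` accepts. The stream name, object ids, and step string are hypothetical, and the fragment leaves out the roles, cluster settings, and default object that a real pipeline would need.

```python
import json


def put_event(kinesis_client, stream_name, event, partition_key):
    """Send one JSON event to a Kinesis stream (the near-real-time path).

    kinesis_client is assumed to behave like boto3.client("kinesis").
    """
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,
    )


def _field(key, value, ref=False):
    # Data Pipeline fields hold either a literal string or a reference
    # to another pipeline object.
    return {"key": key, ("refValue" if ref else "stringValue"): value}


def hourly_emr_pipeline_objects(step):
    """Build an illustrative, incomplete pipeline definition (the hourly path).

    The ids below are hypothetical; `step` is the EMR step string
    (a jar location followed by comma-separated arguments).
    """
    return [
        {"id": "HourlySchedule", "name": "HourlySchedule", "fields": [
            _field("type", "Schedule"),
            _field("period", "1 hour"),
            _field("startAt", "FIRST_ACTIVATION_DATE_TIME"),
        ]},
        {"id": "CleaningCluster", "name": "CleaningCluster", "fields": [
            _field("type", "EmrCluster"),
            _field("schedule", "HourlySchedule", ref=True),
        ]},
        {"id": "CleaningActivity", "name": "CleaningActivity", "fields": [
            _field("type", "EmrActivity"),
            _field("runsOn", "CleaningCluster", ref=True),
            _field("schedule", "HourlySchedule", ref=True),
            _field("step", step),
        ]},
    ]
```

With real credentials, the list returned by `hourly_emr_pipeline_objects` would be passed as `pipelineObjects` to `put_pipeline_definition` after adding the missing configuration. The functions themselves make no AWS calls beyond the one on the client that is handed in, so they can be exercised with a stub.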

AWS Database Migration Service (AWS DMS) migrates data from various database sources to various targets. It is a migration and replication service, not one designed to feed or schedule Amazon EMR MapReduce jobs.

Amazon Athena queries data stored in Amazon S3 using standard SQL, but it does not feed data to Amazon EMR or run MapReduce jobs.

Amazon ES is a service that provides a fully managed Elasticsearch cluster, which can be used for search, analytics, and visualization, but it does not feed data to Amazon EMR or run MapReduce jobs.

References:

Using Amazon Kinesis with Amazon EMR - Amazon EMR

AWS Data Pipeline - Amazon Web Services

Using AWS Data Pipeline to Run Amazon EMR Jobs - AWS Data Pipeline

asked 16/09/2024
Giulia Maggio