A company receives datasets from partners at various frequencies. The datasets include baseline data and incremental data. The company needs to merge and store all the datasets without reprocessing the data.
Which solution will meet these requirements with the LEAST development effort?

Eric Zarghami · Accepted Answer

Use an AWS Glue job with job bookmarks enabled to process the datasets. Store the data in Amazon S3.
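To illustrate why bookmarks avoid reprocessing, here is a minimal conceptual sketch in plain Python. It is not the AWS Glue API: the `BookmarkedJob` class, the file keys, and the merge-by-`id` rule are all hypothetical, chosen only to show the idea of persisting which inputs a previous run already handled. In an actual Glue job, bookmarks are turned on with the `--job-bookmark-option job-bookmark-enable` job parameter, and the script sets a `transformation_ctx` on its sources and calls `job.commit()` at the end.

```python
from dataclasses import dataclass, field


@dataclass
class BookmarkedJob:
    """Toy stand-in (hypothetical) for a job that remembers processed inputs."""
    processed_keys: set = field(default_factory=set)  # the "bookmark" state
    store: dict = field(default_factory=dict)         # merged output, keyed by record id

    def run(self, available: dict) -> list:
        """Process only inputs not seen in earlier runs; return the keys handled."""
        new_keys = sorted(k for k in available if k not in self.processed_keys)
        for key in new_keys:
            for record in available[key]:
                # Assumed merge rule: a later record replaces an earlier one
                # that has the same id.
                self.store[record["id"]] = record
        # Advance the bookmark only after every new input has been handled.
        self.processed_keys.update(new_keys)
        return new_keys


job = BookmarkedJob()

# Run 1: only the baseline dataset has arrived.
inbox = {"baseline/2024-01.json": [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]}
first_run = job.run(inbox)

# Run 2: an incremental dataset arrives; the baseline is still in the inbox
# but is skipped because the bookmark already covers it.
inbox["incremental/2024-02.json"] = [{"id": 2, "value": "b2"}, {"id": 3, "value": "c"}]
second_run = job.run(inbox)
```

The second run touches only the incremental file, which is the behavior the question asks for. With Glue, the service keeps this state for you, so none of the tracking logic above has to be written or maintained.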

Eric Zarghami · Answer

Use an AWS Glue job with a temporary table to process the datasets. Store the data in an Amazon RDS table.

Eric Zarghami · Answer

Use an Apache Spark job in an Amazon EMR cluster to process the datasets. Store the data in EMR File System (EMRFS).

Eric Zarghami · Answer

Use an AWS Lambda function to process the datasets. Store the data in Amazon S3.
