Question 49 - BDS-C00 discussion


A city has been collecting data on its public bicycle share program for the past three years. The 5 PB dataset currently resides on Amazon S3. The data contains the following data points:

Bicycle origination points

Bicycle destination points

Mileage between the points

Number of bicycle slots available at the station (which is variable based on the station location)

Number of slots available and taken at a given time

The program has received additional funds to increase the number of bicycle stations available. All data is regularly archived to Amazon Glacier. The new bicycle stations must be located to provide the most riders access to bicycles. How should this task be performed?

A.
Move the data from Amazon S3 into Amazon EBS-backed volumes and use an EC2-based Hadoop cluster with spot instances to run a Spark job that performs a stochastic gradient descent optimization.
B.
Use the Amazon Redshift COPY command to move the data from Amazon S3 into Redshift and perform a SQL query that outputs the most popular bicycle stations.
C.
Persist the data on Amazon S3 and use a transient EMR cluster with spot instances to run a Spark streaming job that will move the data into Amazon Kinesis.
D.
Keep the data on Amazon S3 and use an Amazon EMR-based Hadoop cluster with spot instances to run a Spark job that performs a stochastic gradient descent optimization over EMRFS.
Suggested answer: B
asked 16/09/2024
Arvin Lee