Question 200 - Professional Data Engineer discussion


You are designing a cloud-native historical data processing system to meet the following conditions:

The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools including Cloud Dataproc, BigQuery, and Compute Engine.

A streaming data pipeline stores new data daily.

Performance is not a factor in the solution.

The solution design should maximize availability.

How should you design data storage for this solution?

A. Create a Cloud Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.
B. Store the data in BigQuery. Access the data using the BigQuery Connector on Cloud Dataproc and Compute Engine.
C. Store the data in a regional Cloud Storage bucket. Access the bucket directly using Cloud Dataproc, BigQuery, and Compute Engine.
D. Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Cloud Dataproc, BigQuery, and Compute Engine.
Suggested answer: D
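
The reasoning behind the suggested answer can be sketched as a small elimination exercise. The snippet below is only an illustration, not an official scoring method: the `StorageOption` class, its fields, and the redundancy levels are hypothetical names and simplified assumptions about each option. It first drops options that cannot hold all three formats or cannot be read directly by all three tools, then picks the survivor with the widest geographic redundancy, since availability is the stated goal and performance is not a factor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StorageOption:
    """Simplified, assumed properties of one answer option (illustrative only)."""
    label: str
    description: str
    stores_all_formats: bool        # CSV, Avro, and PDF can all be kept as-is
    direct_access_all_tools: bool   # Dataproc, BigQuery, and Compute Engine read it directly
    redundancy: Optional[int]       # 0 = tied to one cluster, 1 = one region,
                                    # 2 = multiple regions, None = not compared here


OPTIONS = [
    # HDFS lives on the cluster's disks; BigQuery cannot query it directly.
    StorageOption("A", "HDFS on an HA Dataproc cluster", True, False, 0),
    # BigQuery is a tabular warehouse; PDF files are not a table format.
    StorageOption("B", "BigQuery", False, False, None),
    # Object storage holds any file type and is readable by all three tools.
    StorageOption("C", "Regional Cloud Storage bucket", True, True, 1),
    StorageOption("D", "Multi-regional Cloud Storage bucket", True, True, 2),
]


def eligible(options: List[StorageOption]) -> List[StorageOption]:
    """Keep only options that meet the hard format and access requirements."""
    return [o for o in options if o.stores_all_formats and o.direct_access_all_tools]


def choose(options: List[StorageOption]) -> StorageOption:
    """Among eligible options, prefer the one with the widest redundancy."""
    return max(eligible(options), key=lambda o: o.redundancy)


best = choose(OPTIONS)
print(best.label, "-", best.description)
```

Under these assumptions, A and B fall out on the hard requirements, and D beats C only on redundancy, which matches the question's emphasis on maximizing availability.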
asked 18/09/2024
Roberto Garavaglia
45 questions