Question 293 - Professional Data Engineer discussion

You have designed an Apache Beam processing pipeline that reads from a Pub/Sub topic, which has a message retention duration of one day, and writes to a Cloud Storage bucket. You need to select a bucket location and processing strategy to prevent data loss in case of a regional outage with an RPO of 15 minutes. What should you do?

A.
1. Use a regional Cloud Storage bucket. 2. Monitor Dataflow metrics with Cloud Monitoring to determine when an outage occurs. 3. Seek the subscription back in time by one day to recover the acknowledged messages. 4. Start the Dataflow job in a secondary region and write to a bucket in the same region.
B.
1. Use a multi-regional Cloud Storage bucket. 2. Monitor Dataflow metrics with Cloud Monitoring to determine when an outage occurs. 3. Seek the subscription back in time by 60 minutes to recover the acknowledged messages. 4. Start the Dataflow job in a secondary region.
C.
1. Use a dual-region Cloud Storage bucket. 2. Monitor Dataflow metrics with Cloud Monitoring to determine when an outage occurs. 3. Seek the subscription back in time by 15 minutes to recover the acknowledged messages. 4. Start the Dataflow job in a secondary region.
D.
1. Use a dual-region Cloud Storage bucket with turbo replication enabled. 2. Monitor Dataflow metrics with Cloud Monitoring to determine when an outage occurs. 3. Seek the subscription back in time by 60 minutes to recover the acknowledged messages. 4. Start the Dataflow job in a secondary region.
Suggested answer: C

Explanation:

A dual-region Cloud Storage bucket stores data redundantly across two regions within the same continent, which provides higher availability and durability than a regional bucket, where data lives in a single region. It also offers lower latency and higher throughput than a multi-regional bucket, which spreads data across several regions within a continent or across continents. A dual-region bucket with turbo replication enabled is a premium option that replicates data between the two regions even faster, but it costs more and is not necessary for this scenario.
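For illustration, here is a minimal sketch of creating such a bucket with the google-cloud-storage Python client. The project ID, bucket name, and the NAM4 predefined dual-region (us-central1 plus us-east1) are placeholder assumptions, not part of the question:

```python
from google.cloud import storage

client = storage.Client(project="my-project")  # hypothetical project ID

# NAM4 is a predefined dual-region pairing us-central1 and us-east1;
# objects are replicated across both regions automatically.
bucket = client.create_bucket("my-beam-output-bucket", location="NAM4")

# Default asynchronous replication suffices for this scenario. Turbo
# replication (option D) would instead set bucket.rpo to "ASYNC_TURBO"
# and patch the bucket, at extra cost a 15-minute RPO does not require.
print(f"Created {bucket.name} in {bucket.location}")
```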

By using a dual-region Cloud Storage bucket, you ensure that your data survives a regional outage and remains accessible from either region with low latency and high performance. You monitor Dataflow metrics with Cloud Monitoring to determine when an outage occurs, then seek the subscription back in time by 15 minutes to recover the acknowledged messages. Seeking a subscription replays messages published within the message retention duration, which is one day in this case, so a 15-minute seek is well within the window. Seeking back by 15 minutes meets the RPO of 15 minutes, the maximum amount of data loss that is acceptable for the business. You then start the Dataflow job in a secondary region, writing to the same dual-region bucket, which resumes processing and prevents data loss, as the sketch below shows.
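As a concrete sketch of the recovery procedure, assuming the google-cloud-pubsub and apache-beam[gcp] libraries and hypothetical project, subscription, and bucket names (the pipeline body here stands in for whatever transforms the original job actually runs):

```python
import datetime

import apache_beam as beam
from apache_beam import window
from apache_beam.io import fileio
from apache_beam.options.pipeline_options import PipelineOptions
from google.cloud import pubsub_v1

PROJECT = "my-project"            # hypothetical project ID
SUBSCRIPTION = "my-subscription"  # hypothetical subscription name
BUCKET = "my-beam-output-bucket"  # hypothetical dual-region bucket

# Step 1: seek the subscription back 15 minutes so acknowledged
# messages inside the RPO window are redelivered. Seek can only
# replay messages that are still retained; the topic's one-day
# retention easily covers the 15-minute window here.
subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path(PROJECT, SUBSCRIPTION)
seek_time = datetime.datetime.now(
    datetime.timezone.utc) - datetime.timedelta(minutes=15)
subscriber.seek(request={"subscription": sub_path, "time": seek_time})

# Step 2: relaunch the Beam pipeline on Dataflow in a secondary
# region (us-east1 as an example), writing to the same bucket.
options = PipelineOptions(
    runner="DataflowRunner",
    project=PROJECT,
    region="us-east1",
    temp_location=f"gs://{BUCKET}/temp",
    streaming=True,
)
with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "Read" >> beam.io.ReadFromPubSub(subscription=sub_path)
        | "Decode" >> beam.Map(lambda b: b.decode("utf-8"))
        | "Window" >> beam.WindowInto(window.FixedWindows(60))
        | "Write" >> fileio.WriteToFiles(path=f"gs://{BUCKET}/output")
    )
```

Because the bucket is dual-region, the relaunched job in us-east1 writes to the same bucket name it always did; no output path changes are needed after failover.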

Option A is not a good solution. A regional Cloud Storage bucket provides no redundancy against regional outages: if the bucket's region goes down, you can neither read your data nor write new data. Seeking the subscription back in time by one day is also unnecessary and inefficient, since it replays all messages from the past day when you only need to recover the last 15 minutes.

Option B is not a good solution. A multi-regional Cloud Storage bucket, which stores data across multiple regions within a continent or across continents, offers higher availability and durability than a dual-region bucket but also higher latency and lower throughput, making it better suited to serving a global audience than to processing data with Dataflow within a single continent. Seeking the subscription back in time by 60 minutes is also unnecessary and inefficient, replaying more messages than needed to meet the RPO of 15 minutes.

Option D is not a good solution. Turbo replication on a dual-region bucket adds no benefit for this scenario and only increases cost: it replicates data between the two regions faster, but that speed is not required to meet the RPO of 15 minutes. Seeking the subscription back in time by 60 minutes is likewise unnecessary and inefficient, replaying more messages than needed.

Reference: Storage locations | Cloud Storage | Google Cloud; Dataflow metrics | Cloud Dataflow | Google Cloud; Seeking a subscription | Cloud Pub/Sub | Google Cloud; Recovery point objective (RPO) | Acronis.

asked 18/09/2024 by Norm Scott