Google Professional Data Engineer Practice Test - Questions Answers, Page 29

Question 281

You are designing a Dataflow pipeline for a batch processing job. You want to mitigate multiple zonal failures at job submission time. What should you do?

A. Specify a worker region by using the --region flag.
B. Set the pipeline staging location as a regional Cloud Storage bucket.
C. Submit duplicate pipelines in two different zones by using the --zone flag.
D. Create an Eventarc trigger to resubmit the job in case of zonal failure when submitting the job.
Suggested answer: B

asked 18/09/2024
Monterio Weaver

Question 282

You are designing the architecture to process your data from Cloud Storage to BigQuery by using Dataflow. The network team provided you with the Shared VPC network and subnetwork to be used by your pipelines. You need to enable the deployment of the pipeline on the Shared VPC network. What should you do?

Question 283

You are building an ELT solution in BigQuery by using Dataform. You need to perform uniqueness and null value checks on your final tables. What should you do to efficiently integrate these checks into your pipeline?

Question 284

You are designing a real-time system for a ride-hailing app that identifies areas with high demand for rides to effectively reroute available drivers to meet the demand. The system ingests data from multiple sources to Pub/Sub, processes the data, and stores the results for visualization and analysis in real-time dashboards. The data sources include driver location updates every 5 seconds and app-based booking events from riders. The data processing involves real-time aggregation of supply and demand data for the last 30 seconds, every 2 seconds, and storing the results in a low-latency system for visualization. What should you do?
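
The aggregation described here ("the last 30 seconds, every 2 seconds") is an overlapping window pattern, which Apache Beam calls sliding windows. The following is a minimal plain-Python sketch of that pattern only, not a Dataflow pipeline and not the answer to the question. The function names, the `(timestamp, area, kind)` event shape, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 30  # window length, taken from the question text
PERIOD_SECONDS = 2   # a new window starts this often, taken from the question text


def window_starts(timestamp, size=WINDOW_SECONDS, period=PERIOD_SECONDS):
    """Return the start of every window [start, start + size) that contains timestamp."""
    # Latest window start at or before the timestamp, aligned to the period.
    start = timestamp - (timestamp % period)
    starts = []
    # Walk backwards while the window still covers the timestamp.
    while start > timestamp - size:
        starts.append(start)
        start -= period
    return starts


def aggregate(events):
    """Count events per (window_start, area, kind).

    `events` is an iterable of hypothetical (timestamp, area, kind) tuples,
    where kind is e.g. "driver" (supply) or "booking" (demand).
    """
    counts = defaultdict(int)
    for timestamp, area, kind in events:
        for start in window_starts(timestamp):
            counts[(start, area, kind)] += 1
    return dict(counts)


# Two bookings in the same (made-up) area, two seconds apart.
sample = [(31, "downtown", "booking"), (33, "downtown", "booking")]
result = aggregate(sample)
```

Because the window is 30 seconds long and a new one starts every 2 seconds, each event is counted in 15 overlapping windows. That overlap is what lets a dashboard refresh every 2 seconds while still showing a full 30 seconds of supply and demand.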

Question 285

You orchestrate ETL pipelines by using Cloud Composer. One of the tasks in the Apache Airflow directed acyclic graph (DAG) relies on a third-party service. You want to be notified when the task does not succeed. What should you do?

Question 286

You have a streaming pipeline that ingests data from Pub/Sub in production. You need to update this streaming pipeline with improved business logic. You need to ensure that the updated pipeline reprocesses the previous two days of delivered Pub/Sub messages. What should you do?

Choose 2 answers

Question 287

You stream order data by using a Dataflow pipeline, and write the aggregated result to Memorystore. You provisioned a Memorystore for Redis instance with Basic Tier, 4 GB capacity, which is used by 40 clients for read-only access. You are expecting the number of read-only clients to increase significantly to a few hundred and you need to be able to support the demand. You want to ensure that read and write access availability is not impacted, and any changes you make can be deployed quickly. What should you do?

Question 288

You are troubleshooting your Dataflow pipeline that processes data from Cloud Storage to BigQuery. You have discovered that the Dataflow worker nodes cannot communicate with one another. Your networking team relies on Google Cloud network tags to define firewall rules. You need to identify the issue while following Google-recommended networking security practices. What should you do?

Question 289

You are configuring networking for a Dataflow job. The data pipeline uses custom container images with the libraries that are required for the transformation logic preinstalled. The data pipeline reads the data from Cloud Storage and writes the data to BigQuery. You need to ensure cost-effective and secure communication between the pipeline and Google APIs and services. What should you do?

Question 290

You work for a large ecommerce company. You store your customers' order data in Bigtable. You have a garbage collection policy set to delete the data after 30 days and the number of versions is set to 1. When the data analysts run a query to report total customer spending, the analysts sometimes see customer data that is older than 30 days. You need to ensure that the analysts do not see customer data older than 30 days while minimizing cost and overhead. What should you do?

Total 377 questions