Google Professional Data Engineer Practice Test - Questions Answers, Page 14

What is the recommended action for switching between SSD and HDD storage for your Google Cloud Bigtable instance?

A. create a third instance and sync the data from the two storage types via batch jobs
B. export the data from the existing instance and import the data into a new instance
C. run parallel instances where one is HDD and the other is SSD
D. the selection is final and you must resume using the same storage type
Suggested answer: B

Explanation:

When you create a Cloud Bigtable instance and cluster, your choice of SSD or HDD storage for the cluster is permanent. You cannot use the Google Cloud Platform Console to change the type of storage that is used for the cluster.

If you need to convert an existing HDD cluster to SSD, or vice-versa, you can export the data from the existing instance and import the data into a new instance. Alternatively, you can write a Cloud Dataflow or Hadoop MapReduce job that copies the data from one instance to another.

Reference: https://cloud.google.com/bigtable/docs/choosing-ssd-hdd
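
For a small table, the copy can even be done directly with the google-cloud-bigtable Python client. The sketch below is illustrative only; project, instance, and table names are placeholders, and for production-sized tables the Dataflow export/import route described above is the practical choice.

# Hypothetical sketch: copy rows from an HDD-backed instance to a new SSD-backed one.
# Project, instance, and table names are placeholders.
from google.cloud import bigtable

client = bigtable.Client(project="my-project", admin=True)
source_table = client.instance("hdd-instance").table("events")
target_table = client.instance("ssd-instance").table("events")

batch = []
for row in source_table.read_rows():              # stream every row from the source
    new_row = target_table.direct_row(row.row_key)
    for family, columns in row.cells.items():
        for qualifier, cells in columns.items():
            for cell in cells:                    # keep original values and timestamps
                new_row.set_cell(family, qualifier, cell.value, timestamp=cell.timestamp)
    batch.append(new_row)
    if len(batch) == 500:                         # write in batches
        target_table.mutate_rows(batch)
        batch = []
if batch:
    target_table.mutate_rows(batch)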

Topic 6, Main Questions Set C

You are training a spam classifier. You notice that you are overfitting the training data. Which three actions can you take to resolve this problem? (Choose three.)

A. Get more training examples
B. Reduce the number of training examples
C. Use a smaller set of features
D. Use a larger set of features
E. Increase the regularization parameters
F. Decrease the regularization parameters
Suggested answer: A, C, E

Explanation:

Overfitting is reduced by adding training examples, shrinking the feature set, and strengthening regularization. Enlarging the feature set or weakening regularization makes overfitting worse.
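
As an illustrative sketch only (not part of the question), two of the three remedies expressed with scikit-learn: feature selection to shrink the feature set, and a smaller C (stronger regularization) for a logistic-regression spam classifier. X_train and y_train are assumed to exist and to hold non-negative term-count features.

# Illustrative only: fewer features and stronger regularization for a spam classifier.
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

model = make_pipeline(
    SelectKBest(chi2, k=500),                  # C: use a smaller set of features
    LogisticRegression(C=0.1, max_iter=1000),  # E: smaller C means stronger regularization
)
model.fit(X_train, y_train)                    # A: gathering more training examples also helps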

You are implementing security best practices on your data pipeline. Currently, you are manually executing jobs as the Project Owner. You want to automate these jobs by taking nightly batch files containing non-public information from Google Cloud Storage, processing them with a Spark Scala job on a Google Cloud Dataproc cluster, and depositing the results into Google BigQuery.

How should you securely run this workload?

A. Restrict the Google Cloud Storage bucket so only you can see the files
B. Grant the Project Owner role to a service account, and run the job with it
C. Use a service account with the ability to read the batch files and to write to BigQuery
D. Use a user account with the Project Viewer role on the Cloud Dataproc cluster to read the batch files and write to BigQuery
Suggested answer: C

Explanation:

Automated jobs should follow the principle of least privilege: run them as a dedicated service account that can only read the batch files in Cloud Storage and write to BigQuery, rather than granting the broad Project Owner role or relying on an individual user account.
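
A hedged sketch of the pattern, simplified to a direct load: the job authenticates with a narrowly-scoped service account instead of Owner credentials. In the actual scenario the Dataproc cluster would be created with this service account attached; the key file, project, bucket, and table names below are placeholders.

# Sketch: the nightly job runs as a dedicated service account, not as a human Owner.
from google.oauth2 import service_account
from google.cloud import bigquery

creds = service_account.Credentials.from_service_account_file("etl-job-sa.json")
bq = bigquery.Client(project="my-project", credentials=creds)

job = bq.load_table_from_uri(
    "gs://nightly-batches/2018-01-01/results.csv",     # bucket: read-only for this account
    "my-project.analytics.nightly_results",            # dataset: write access only here
    job_config=bigquery.LoadJobConfig(autodetect=True),
)
job.result()  # fails with a permission error if the account is missing a required role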

You are using Google BigQuery as your data warehouse. Your users report that the following simple query is running very slowly, no matter when they run the query:

SELECT country, state, city FROM [myproject:mydataset.mytable] GROUP BY country

You check the query plan for the query and see the following output in the Read section of Stage:1 (query plan screenshot not reproduced here):

What is the most likely cause of the delay for this query?

A. Users are running too many concurrent queries in the system
B. The [myproject:mydataset.mytable] table has too many partitions
C. Either the state or the city columns in the [myproject:mydataset.mytable] table have too many NULL values
D. Most rows in the [myproject:mydataset.mytable] table have the same value in the country column, causing data skew
Suggested answer: D

Explanation:

The query is slow every time it runs, so concurrency is not the cause. When most rows share the same value in the GROUP BY column, the work for that key lands on a small number of workers, producing data skew that dominates the stage's execution time.
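
Skew of this kind shows up in the query plan as per-stage maximum read/compute times far above the averages. A hedged sketch of inspecting that with the BigQuery Python client follows; it uses standard SQL and groups on all selected columns so the statement is valid, and the table name is taken from the question.

# Sketch: compare average vs. maximum stage ratios; a large gap suggests data skew.
from google.cloud import bigquery

client = bigquery.Client()
job = client.query(
    "SELECT country, state, city FROM `myproject.mydataset.mytable` "
    "GROUP BY country, state, city"
)
job.result()  # wait for completion so the plan is populated

for stage in job.query_plan:
    print(
        stage.name,
        "compute avg/max:", stage.compute_ratio_avg, stage.compute_ratio_max,
        "read avg/max:", stage.read_ratio_avg, stage.read_ratio_max,
    )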

Your globally distributed auction application allows users to bid on items. Occasionally, users place identical bids at nearly identical times, and different application servers process those bids. Each bid event contains the item, amount, user, and timestamp. You want to collate those bid events into a single location in real time to determine which user bid first. What should you do?

A. Create a file on a shared file server and have the application servers write all bid events to that file. Process the file with Apache Hadoop to identify which user bid first.
B. Have each application server write the bid events to Cloud Pub/Sub as they occur. Push the events from Cloud Pub/Sub to a custom endpoint that writes the bid event information into Cloud SQL.
C. Set up a MySQL database for each application server to write bid events into. Periodically query each of those distributed MySQL databases and update a master MySQL database with bid event information.
D. Have each application server write the bid events to Google Cloud Pub/Sub as they occur. Use a pull subscription to pull the bid events using Google Cloud Dataflow. Give the bid for each item to the user in the bid event that is processed first.
Suggested answer: D

Explanation:

Collating the bids in real time calls for a streaming pipeline: each application server publishes bid events to Cloud Pub/Sub, and a Cloud Dataflow job with a pull subscription processes them as they arrive. Periodically querying distributed MySQL databases is batch-oriented and cannot determine the first bid in real time.
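
A minimal Apache Beam (Python) sketch of option D, assuming each Pub/Sub message is a JSON bid event with item, amount, user, and timestamp fields; the subscription path, window size, and field names are placeholders.

# Sketch: pull bid events from Pub/Sub and keep the earliest bid per item.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read bids" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/bid-events")
        | "Parse" >> beam.Map(json.loads)
        | "Window" >> beam.WindowInto(beam.window.FixedWindows(60))
        | "Key by item" >> beam.Map(lambda bid: (bid["item"], bid))
        | "Earliest bid" >> beam.CombinePerKey(
            lambda bids: min(bids, key=lambda b: b["timestamp"]))
        | "Emit winner" >> beam.Map(print)
    )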

Your organization has been collecting and analyzing data in Google BigQuery for 6 months. The majority of the data analyzed is placed in a time-partitioned table named events_partitioned. To reduce the cost of queries, your organization created a view called events, which queries only the last 14 days of data. The view is described in legacy SQL. Next month, existing applications will be connecting to BigQuery to read the events data via an ODBC connection. You need to ensure the applications can connect. Which two actions should you take? (Choose two.)

A. Create a new view over events using standard SQL
B. Create a new partitioned table using a standard SQL query
C. Create a new view over events_partitioned using standard SQL
D. Create a service account for the ODBC connection to use for authentication
E. Create a Google Cloud Identity and Access Management (Cloud IAM) role for the ODBC connection and share "events"
Suggested answer: C, D

Explanation:

The ODBC driver issues standard SQL, and standard SQL queries cannot reference a view defined in legacy SQL. Create a new standard SQL view directly over events_partitioned, and create a service account that the ODBC connection can use for authentication.
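
A hedged sketch of option C with the BigQuery Python client: a new 14-day view defined in standard SQL directly over events_partitioned (the view name and dataset are placeholders). The ODBC connection, authenticating as the service account from option D, can then query this view.

# Sketch: a standard SQL view over the partitioned table, limited to the last 14 days.
from google.cloud import bigquery

client = bigquery.Client()

view = bigquery.Table("myproject.mydataset.events_std")
view.view_query = """
SELECT *
FROM `myproject.mydataset.events_partitioned`
WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 14 DAY)
"""  # view_query defaults to standard SQL
client.create_table(view)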

You have enabled the free integration between Firebase Analytics and Google BigQuery. Firebase now automatically creates a new table daily in BigQuery in the format app_events_YYYYMMDD. You want to query all of the tables for the past 30 days in legacy SQL. What should you do?

A. Use the TABLE_DATE_RANGE function
B. Use the WHERE_PARTITIONTIME pseudo column
C. Use WHERE date BETWEEN YYYY-MM-DD AND YYYY-MM-DD
D. Use SELECT IF.(date >= YYYY-MM-DD AND date <= YYYY-MM-DD)
Suggested answer: A

Explanation:

Reference: https://cloud.google.com/blog/products/gcp/using-bigquery-and-firebase-analytics-to-understand-your-mobile-app
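
A hedged sketch of the legacy SQL pattern, run through the BigQuery Python client with legacy SQL enabled; the dataset name and event_name column are assumptions for illustration.

# Sketch: query the last 30 daily Firebase tables with legacy SQL's TABLE_DATE_RANGE.
from google.cloud import bigquery

client = bigquery.Client()
legacy_sql = """
SELECT event_name, COUNT(*) AS events
FROM TABLE_DATE_RANGE([mydataset.app_events_],
                      DATE_ADD(CURRENT_TIMESTAMP(), -30, 'DAY'),
                      CURRENT_TIMESTAMP())
GROUP BY event_name
"""
job = client.query(legacy_sql, job_config=bigquery.QueryJobConfig(use_legacy_sql=True))
for row in job:
    print(row.event_name, row.events)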

Your company is currently setting up data pipelines for their campaign. For all the Google Cloud Pub/Sub streaming data, one of the important business requirements is to be able to periodically identify the inputs and their timings during their campaign. Engineers have decided to use windowing and transformation in Google Cloud Dataflow for this purpose. However, when testing this feature, they find that the Cloud Dataflow job fails for all the streaming inserts. What is the most likely cause of this problem?

A. They have not assigned the timestamp, which causes the job to fail
B. They have not set the triggers to accommodate the data coming in late, which causes the job to fail
C. They have not applied a global windowing function, which causes the job to fail when the pipeline is created
D. They have not applied a non-global windowing function, which causes the job to fail when the pipeline is created
Suggested answer: D

Explanation:

An unbounded PCollection that is grouped or aggregated while still in the default global window, with the default trigger, causes an error when the pipeline is constructed. A non-global windowing function (or a non-default trigger) must be applied before the aggregation.
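
A minimal Beam (Python) illustration of the failure mode: without the WindowInto step, the GroupByKey below is rejected at pipeline construction time because the unbounded Pub/Sub stream is still in the global window with the default trigger. Topic name and element structure are placeholders.

# Sketch: a non-global windowing function applied before grouping an unbounded stream.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
    (
        p
        | beam.io.ReadFromPubSub(topic="projects/my-project/topics/campaign-inputs")
        | beam.Map(lambda msg: ("campaign", msg))
        | beam.WindowInto(beam.window.FixedWindows(300))  # required: non-global window
        | beam.GroupByKey()
        | beam.Map(print)
    )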

You architect a system to analyze seismic data. Your extract, transform, and load (ETL) process runs as a series of MapReduce jobs on an Apache Hadoop cluster. The ETL process takes days to process a data set because some steps are computationally expensive. Then you discover that a sensor calibration step has been omitted. How should you change your ETL process to carry out sensor calibration systematically in the future?

A. Modify the transform MapReduce jobs to apply sensor calibration before they do anything else.
B. Introduce a new MapReduce job to apply sensor calibration to raw data, and ensure all other MapReduce jobs are chained after this.
C. Add sensor calibration data to the output of the ETL process, and document that all users need to apply sensor calibration themselves.
D. Develop an algorithm through simulation to predict variance of data output from the last MapReduce job based on calibration factors, and apply the correction to all data.
Suggested answer: B

Explanation:

A dedicated calibration job applied to the raw data, with all other MapReduce jobs chained after it, ensures calibration happens exactly once and that no downstream step can consume uncalibrated data.
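
An illustrative (non-MapReduce) sketch of the idea: calibration is a single first step over the raw readings, and every existing transform consumes its output. The calibration_factors structure and downstream_transforms helpers are hypothetical stand-ins for the real jobs.

# Illustrative only: a dedicated calibration step chained before all other transforms.
def calibrate(reading, calibration_factors):
    """Apply the per-sensor offset and gain before any other processing."""
    factors = calibration_factors[reading["sensor_id"]]
    reading = dict(reading)
    reading["value"] = (reading["value"] - factors["offset"]) * factors["gain"]
    return reading

def run_etl(raw_readings, calibration_factors, downstream_transforms):
    calibrated = [calibrate(r, calibration_factors) for r in raw_readings]  # new first job
    result = calibrated
    for transform in downstream_transforms:  # existing jobs, unchanged, chained after
        result = transform(result)
    return result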

An online retailer has built their current application on Google App Engine. A new initiative at the company mandates that they extend their application to allow their customers to transact directly via the application.

They need to manage their shopping transactions and analyze combined data from multiple datasets using a business intelligence (BI) tool. They want to use only a single database for this purpose.

Which Google Cloud database should they choose?

A. BigQuery
B. Cloud SQL
C. Cloud BigTable
D. Cloud Datastore
Suggested answer: B

Explanation:

Cloud SQL supports the ACID transactions needed for shopping transactions and exposes a standard SQL interface that BI tools can connect to, so a single database covers both requirements. Cloud Bigtable offers neither multi-row transactions nor a SQL interface for BI tools.

Reference: https://cloud.google.com/solutions/business-intelligence/
