
Google Professional Data Engineer Practice Test - Questions Answers, Page 24


Question 231


You plan to deploy Cloud SQL using MySQL. You need to ensure high availability in the event of a zone failure. What should you do?

A. Create a Cloud SQL instance in one zone, and create a failover replica in another zone within the same region.
B. Create a Cloud SQL instance in one zone, and create a read replica in another zone within the same region.
C. Create a Cloud SQL instance in one zone, and configure an external read replica in a zone in a different region.
D. Create a Cloud SQL instance in a region, and configure automatic backup to a Cloud Storage bucket in the same region.
Suggested answer: C
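
For reference, the commands below sketch how a Cloud SQL for MySQL replica is attached to a primary instance in a different zone of the same region. Instance names, zones, tier, and MySQL version are placeholders, and the exact replica flags have changed across Cloud SQL generations.

# Primary instance in one zone (placeholder names and sizes).
gcloud sql instances create primary-mysql \
    --database-version=MYSQL_5_7 \
    --tier=db-n1-standard-2 \
    --region=us-central1 \
    --zone=us-central1-a

# Replica of the primary in a different zone of the same region.
gcloud sql instances create mysql-replica \
    --master-instance-name=primary-mysql \
    --region=us-central1 \
    --zone=us-central1-b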

Question 232


Your company is selecting a system to centralize data ingestion and delivery. You are considering messaging and data integration systems to address the requirements. The key requirements are:

The ability to seek to a particular offset in a topic, possibly back to the start of all data ever captured

Support for publish/subscribe semantics on hundreds of topics

Retain per-key ordering

Which system should you choose?

A. Apache Kafka
B. Cloud Storage
C. Cloud Pub/Sub
D. Firebase Cloud Messaging
Suggested answer: A
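
As a rough illustration of the offset-seeking requirement that drives this choice, the Kafka console consumer can replay a topic from the earliest retained offset or from an arbitrary offset; the broker address, topic, partition, and offset below are placeholders.

# Replay everything retained for a topic, starting from the earliest offset (placeholder broker and topic).
kafka-console-consumer.sh \
    --bootstrap-server broker1:9092 \
    --topic clickstream-events \
    --from-beginning

# Or seek to a specific offset within a single partition.
kafka-console-consumer.sh \
    --bootstrap-server broker1:9092 \
    --topic clickstream-events \
    --partition 0 \
    --offset 42000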

Question 233


You are planning to migrate your current on-premises Apache Hadoop deployment to the cloud. You need to ensure that the deployment is as fault-tolerant and cost-effective as possible for long-running batch jobs. You want to use a managed service. What should you do?

A. Deploy a Cloud Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://
B. Deploy a Cloud Dataproc cluster. Use an SSD persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://
C. Install Hadoop and Spark on a 10-node Compute Engine instance group with standard instances. Install the Cloud Storage connector, and store the data in Cloud Storage. Change references in scripts from hdfs:// to gs://
D. Install Hadoop and Spark on a 10-node Compute Engine instance group with preemptible instances. Store data in HDFS. Change references in scripts from hdfs:// to gs://
Suggested answer: A
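
A hedged sketch of the setup described in option A; the cluster name, region, worker counts, and bucket are placeholders, and the preemptible-worker flag has been renamed to --num-secondary-workers in newer gcloud releases.

# Cluster with standard persistent disks and 50% preemptible workers (placeholder names and counts).
gcloud dataproc clusters create batch-cluster \
    --region=us-central1 \
    --num-workers=2 \
    --num-preemptible-workers=2 \
    --worker-boot-disk-type=pd-standard

# Stage input data in Cloud Storage and reference it as gs:// instead of hdfs:// in job scripts.
gsutil -m cp -r /data/input gs://example-batch-bucket/input/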

Question 234


Your team is working on a binary classification problem. You have trained a support vector machine (SVM) classifier with default parameters, and received an area under the curve (AUC) of 0.87 on the validation set. You want to increase the AUC of the model. What should you do?

A. Perform hyperparameter tuning
B. Train a classifier with deep neural networks, because neural networks would always beat SVMs
C. Deploy the model and measure the real-world AUC; it's always higher because of generalization
D. Scale predictions you get out of the model (tune a scaling factor as a hyperparameter) in order to get the highest AUC
Suggested answer: A

Explanation:

https://towardsdatascience.com/understanding-hyperparameters-and-its-optimisation-techniquesf0debba07568
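
If the tuning were run on Google Cloud, one option is an AI Platform Training job with a hyperparameter search spec roughly like the sketch below; the trainer package, bucket, runtime version, metric tag, and parameter ranges are all hypothetical and assume the trainer reports an "auc" metric.

# Hypothetical search space over common SVM parameters, maximizing a reported AUC metric.
cat > hptuning.yaml <<'EOF'
trainingInput:
  hyperparameters:
    goal: MAXIMIZE
    hyperparameterMetricTag: auc
    maxTrials: 20
    maxParallelTrials: 4
    params:
      - parameterName: C
        type: DOUBLE
        minValue: 0.01
        maxValue: 100.0
        scaleType: UNIT_LOG_SCALE
      - parameterName: kernel
        type: CATEGORICAL
        categoricalValues: ["linear", "rbf"]
EOF

# Submit the tuning job (placeholder module, package, bucket, and versions).
gcloud ai-platform jobs submit training svm_tuning_001 \
    --region=us-central1 \
    --module-name=trainer.task \
    --package-path=trainer/ \
    --job-dir=gs://example-ml-bucket/svm-tuning \
    --runtime-version=1.15 \
    --python-version=3.7 \
    --config=hptuning.yaml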


Question 235


You need to deploy additional dependencies to all nodes of a Cloud Dataproc cluster at startup using an existing initialization action. Company security policies require that Cloud Dataproc nodes do not have access to the Internet, so public initialization actions cannot fetch resources. What should you do?

A. Deploy the Cloud SQL Proxy on the Cloud Dataproc master
B. Use an SSH tunnel to give the Cloud Dataproc cluster access to the Internet
C. Copy all dependencies to a Cloud Storage bucket within your VPC security perimeter
D. Use Resource Manager to add the service account used by the Cloud Dataproc cluster to the Network User role
Suggested answer: C
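
In line with option C, the dependencies and the existing initialization action can be staged in a bucket covered by the same VPC Service Controls perimeter and referenced at cluster creation; the bucket, file, and cluster names below are placeholders.

# Stage the dependencies and the initialization action inside the perimeter (placeholder names).
gsutil -m cp dependencies/*.tar.gz gs://example-perimeter-bucket/deps/
gsutil cp install-deps.sh gs://example-perimeter-bucket/init/install-deps.sh

# Create the cluster with internal IPs only, pointing at the staged initialization action.
gcloud dataproc clusters create secure-cluster \
    --region=us-central1 \
    --no-address \
    --initialization-actions=gs://example-perimeter-bucket/init/install-deps.sh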




Question 236


You need to choose a database for a new project that has the following requirements:

Fully managed

Able to automatically scale up

Transactionally consistent

Able to scale up to 6 TB

Able to be queried using SQL

Which database do you choose?

A. Cloud SQL
B. Cloud Bigtable
C. Cloud Spanner
D. Cloud Datastore
Suggested answer: C
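
For reference, a Cloud Spanner instance and database meeting these requirements could be provisioned roughly as follows; the instance name, database name, configuration, and node count are placeholders.

# Placeholder instance configuration and node count.
gcloud spanner instances create example-instance \
    --config=regional-us-central1 \
    --nodes=3 \
    --description="Transactional SQL database"

gcloud spanner databases create example-db \
    --instance=example-instance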

Question 237


You work for a mid-sized enterprise that needs to move its operational system transaction data from an on-premises database to GCP. The database is about 20 TB in size. Which database should you choose?

A. Cloud SQL
B. Cloud Bigtable
C. Cloud Spanner
D. Cloud Datastore
Suggested answer: A
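
As a rough sketch of option A, a Cloud SQL instance can be provisioned with storage beyond 20 TB (Cloud SQL now supports tens of terabytes per instance); the engine version, tier, and sizes below are placeholders.

# Placeholder names and sizes; storage can also auto-grow as the dataset expands.
gcloud sql instances create ops-db \
    --database-version=MYSQL_8_0 \
    --tier=db-n1-highmem-16 \
    --region=us-central1 \
    --storage-size=25600GB \
    --storage-auto-increase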

Question 238


You need to choose a database to store time series CPU and memory usage for millions of computers. You need to store this data in one-second interval samples. Analysts will be performing real-time, ad hoc analytics against the database.

You want to avoid being charged for every query executed and ensure that the schema design will allow for future growth of the dataset. Which database and data model should you choose?

A. Create a table in BigQuery, and append the new samples for CPU and memory to the table
B. Create a wide table in BigQuery, create a column for the sample value at each second, and update the row with the interval for each second
C. Create a narrow table in Cloud Bigtable with a row key that combines the Compute Engine computer identifier with the sample time at each second
D. Create a wide table in Cloud Bigtable with a row key that combines the computer identifier with the sample time at each minute, and combine the values for each second as column data.
Suggested answer: C

Explanation:

A tall and narrow table has a small number of events per row (possibly just one event), whereas a short and wide table has a large number of events per row. For time-series data, you should generally use tall and narrow tables, for two reasons: storing one event per row makes it easier to run queries against your data, and storing many events per row makes it more likely that the total row size will exceed the recommended maximum (see "Rows can be big but are not infinite").

https://cloud.google.com/bigtable/docs/schema-design-time-series#patterns_for_row_key_design
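
A quick illustration of the tall-and-narrow pattern with the cbt CLI (assuming a project and Bigtable instance are already configured for cbt); the table, column family, machine ID, and timestamps are placeholders, and the row key combines the machine identifier with the one-second sample time.

# Placeholder table with one column family for the metric samples.
cbt createtable machine_metrics
cbt createfamily machine_metrics stats

# One row per machine per second; the row key is <machine_id>#<sample_time>.
cbt set machine_metrics machine-0042#20240918T120101 stats:cpu=0.73 stats:memory=0.41
cbt set machine_metrics machine-0042#20240918T120102 stats:cpu=0.75 stats:memory=0.41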


Question 239


You want to archive data in Cloud Storage. Because some data is very sensitive, you want to use the "Trust No One" (TNO) approach to encrypt your data to prevent the cloud provider staff from decrypting your data. What should you do?

A. Use gcloud kms keys create to create a symmetric key. Then use gcloud kms encrypt to encrypt each archival file with the key and unique additional authenticated data (AAD). Use gsutil cp to upload each encrypted file to the Cloud Storage bucket, and keep the AAD outside of Google Cloud.
B. Use gcloud kms keys create to create a symmetric key. Then use gcloud kms encrypt to encrypt each archival file with the key. Use gsutil cp to upload each encrypted file to the Cloud Storage bucket. Manually destroy the key previously used for encryption, and rotate the key once.
C. Specify customer-supplied encryption key (CSEK) in the .boto configuration file. Use gsutil cp to upload each archival file to the Cloud Storage bucket. Save the CSEK in Cloud Memorystore as permanent storage of the secret.
D. Specify customer-supplied encryption key (CSEK) in the .boto configuration file. Use gsutil cp to upload each archival file to the Cloud Storage bucket. Save the CSEK in a different project that only the security team can access.
Suggested answer: B
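
For reference, the workflow in the suggested answer looks roughly like the commands below; the key ring, key, file, and bucket names are placeholders.

# Placeholder key ring and key.
gcloud kms keyrings create archive-keyring --location=global
gcloud kms keys create archive-key \
    --keyring=archive-keyring \
    --location=global \
    --purpose=encryption

# Encrypt each archival file, then upload only the ciphertext.
gcloud kms encrypt \
    --key=archive-key \
    --keyring=archive-keyring \
    --location=global \
    --plaintext-file=archive-2024-09.tar.gz \
    --ciphertext-file=archive-2024-09.tar.gz.enc
gsutil cp archive-2024-09.tar.gz.enc gs://example-archive-bucket/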

Question 240


You have data pipelines running on BigQuery, Cloud Dataflow, and Cloud Dataproc. You need to perform health checks and monitor their behavior, and then notify the team managing the pipelines if they fail. You also need to be able to work across multiple projects. Your preference is to use managed products or features of the platform. What should you do?

A. Export the information to Cloud Stackdriver, and set up an Alerting policy
B. Run a Virtual Machine in Compute Engine with Airflow, and export the information to Stackdriver
C. Export the logs to BigQuery, and set up App Engine to read that information and send emails if you find a failure in the logs
D. Develop an App Engine application to consume logs using GCP API calls, and send emails if you find a failure in the logs
Suggested answer: B
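
As one hedged sketch of the Stackdriver (Cloud Monitoring) side, an alerting policy can be created from a file-based definition; the metric filter, threshold, and policy name below are placeholders and would need to match the metrics the pipelines actually emit, with notification channels attached separately.

# Placeholder policy: fire when a monitored Dataflow job reports as failed.
cat > pipeline-alert-policy.json <<'EOF'
{
  "displayName": "Pipeline failure alert",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "Dataflow job failed",
      "conditionThreshold": {
        "filter": "metric.type=\"dataflow.googleapis.com/job/is_failed\" AND resource.type=\"dataflow_job\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0,
        "duration": "60s"
      }
    }
  ]
}
EOF

gcloud alpha monitoring policies create --policy-from-file=pipeline-alert-policy.json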