Google Professional Data Engineer Practice Test - Questions Answers, Page 18

You have a data pipeline that writes data to Cloud Bigtable using well-designed row keys. You want to monitor your pipeline to determine when to increase the size of your Cloud Bigtable cluster. Which two actions can you take to accomplish this? Choose 2 answers.

A. Review Key Visualizer metrics. Increase the size of the Cloud Bigtable cluster when the Read pressure index is above 100.
B. Review Key Visualizer metrics. Increase the size of the Cloud Bigtable cluster when the Write pressure index is above 100.
C. Monitor the latency of write operations. Increase the size of the Cloud Bigtable cluster when there is a sustained increase in write latency.
D. Monitor storage utilization. Increase the size of the Cloud Bigtable cluster when utilization increases above 70% of max capacity.
E. Monitor latency of read operations. Increase the size of the Cloud Bigtable cluster if read operations take longer than 100 ms.
Suggested answer: A, C
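
For study purposes, a minimal Python sketch of the monitoring side of the latency-based options: reading Cloud Bigtable latency samples from Cloud Monitoring so a sustained increase can be spotted. The project ID and look-back window are placeholders.

```python
# A minimal sketch, assuming a hypothetical project ID and a one-hour window;
# "bigtable.googleapis.com/server/latencies" is the Bigtable latency metric.
import time
from google.cloud import monitoring_v3

PROJECT_ID = "my-project"  # hypothetical project

client = monitoring_v3.MetricServiceClient()
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"start_time": {"seconds": now - 3600}, "end_time": {"seconds": now}}
)

series = client.list_time_series(
    request={
        "name": f"projects/{PROJECT_ID}",
        "filter": 'metric.type = "bigtable.googleapis.com/server/latencies"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for ts in series:
    for point in ts.points:
        # Each point is a latency distribution; a sustained rise in the mean is
        # the signal to add nodes to the cluster.
        print(point.interval.end_time, point.value.distribution_value.mean)
```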

You want to analyze hundreds of thousands of social media posts daily at the lowest cost and with the fewest steps.

You have the following requirements:

You will batch-load the posts once per day and run them through the Cloud Natural Language API.

You will extract topics and sentiment from the posts.

You must store the raw posts for archiving and reprocessing.

You will create dashboards to be shared with people both inside and outside your organization.

You need to store both the data extracted from the API to perform analysis as well as the raw social media posts for historical archiving. What should you do?

A. Store the social media posts and the data extracted from the API in BigQuery.
B. Store the social media posts and the data extracted from the API in Cloud SQL.
C. Store the raw social media posts in Cloud Storage, and write the data extracted from the API into BigQuery.
D. Feed the social media posts into the API directly from the source, and write the extracted data from the API into BigQuery.
Suggested answer: D
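
A minimal Python sketch of the analysis step that all of the options share: run a post through the Cloud Natural Language API and write the extracted sentiment and entities (standing in for topics) into a hypothetical BigQuery table.

```python
# A minimal sketch, assuming posts are already available as strings and a
# hypothetical social.post_analysis table exists; entities stand in for topics.
from google.cloud import bigquery, language_v1

nl_client = language_v1.LanguageServiceClient()
bq_client = bigquery.Client()

def analyze_post(post_id, text):
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    sentiment = nl_client.analyze_sentiment(document=document).document_sentiment
    entities = nl_client.analyze_entities(document=document).entities
    return {
        "post_id": post_id,
        "sentiment_score": sentiment.score,
        "sentiment_magnitude": sentiment.magnitude,
        "topics": [entity.name for entity in entities],
    }

rows = [analyze_post("p1", "Loving the new release!")]             # example post
errors = bq_client.insert_rows_json("social.post_analysis", rows)  # hypothetical table
assert not errors, errors
```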

You store historic data in Cloud Storage. You need to perform analytics on the historic data. You want to use a solution to detect invalid data entries and perform data transformations that will not require programming or knowledge of SQL.

What should you do?

A. Use Cloud Dataflow with Beam to detect errors and perform transformations.
B. Use Cloud Dataprep with recipes to detect errors and perform transformations.
C. Use Cloud Dataproc with a Hadoop job to detect errors and perform transformations.
D. Use federated tables in BigQuery with queries to detect errors and perform transformations.
Suggested answer: B

Your company needs to upload their historic data to Cloud Storage. The security rules don't allow access from external IPs to their on-premises resources. After an initial upload, they will add new data from existing on-premises applications every day. What should they do?

A. Execute gsutil rsync from the on-premises servers.
B. Use Cloud Dataflow and write the data to Cloud Storage.
C. Write a job template in Cloud Dataproc to perform the data transfer.
D. Install an FTP server on a Compute Engine VM to receive the files and move them to Cloud Storage.
Suggested answer: B
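
A minimal sketch of option B, assuming the Apache Beam pipeline is launched from the on-premises servers (for example with the DirectRunner) so that every connection is outbound to Cloud Storage and no inbound access to on-premises IPs is required. Paths and bucket name are hypothetical.

```python
# A minimal sketch of option B, assuming the pipeline runs on the on-premises
# servers with the DirectRunner so every connection is outbound to Cloud Storage;
# the export path and bucket name are hypothetical.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

with beam.Pipeline(options=PipelineOptions(["--runner=DirectRunner"])) as p:
    (
        p
        | "ReadDailyExport" >> beam.io.ReadFromText("/exports/daily/*.csv")
        | "WriteToCloudStorage" >> beam.io.WriteToText(
            "gs://historic-data-bucket/daily/part"
        )
    )
```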

You have a query that filters a BigQuery table using a WHERE clause on timestamp and ID columns.

By using bq query --dry_run you learn that the query triggers a full scan of the table, even though the filters on timestamp and ID select a tiny fraction of the overall data. You want to reduce the amount of data scanned by BigQuery with minimal changes to existing SQL queries. What should you do?

A. Create a separate table for each ID.
B. Use the LIMIT keyword to reduce the number of rows returned.
C. Recreate the table with a partitioning column and clustering column.
D. Use the bq query --maximum_bytes_billed flag to restrict the number of bytes billed.
Suggested answer: C
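
A minimal sketch of the suggested answer using the BigQuery Python client: recreate the table partitioned on the timestamp column and clustered on the ID column, so the existing WHERE filters prune data instead of triggering full scans. Table and column names are hypothetical.

```python
# A minimal sketch, assuming hypothetical "ts" (TIMESTAMP) and "id" columns and a
# hypothetical destination table.
from google.cloud import bigquery

client = bigquery.Client()
table = bigquery.Table(
    "my-project.analytics.events_partitioned",  # hypothetical destination table
    schema=[
        bigquery.SchemaField("ts", "TIMESTAMP"),
        bigquery.SchemaField("id", "STRING"),
        bigquery.SchemaField("payload", "STRING"),
    ],
)
table.time_partitioning = bigquery.TimePartitioning(field="ts")  # partition on the timestamp
table.clustering_fields = ["id"]                                 # cluster on the ID
client.create_table(table)
# Existing queries keep their WHERE filters on ts and id; BigQuery can now prune
# partitions and clustered blocks instead of scanning the whole table.
```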

You have a requirement to insert minute-resolution data from 50,000 sensors into a BigQuery table.

You expect significant growth in data volume and need the data to be available within 1 minute of ingestion for real-time analysis of aggregated trends. What should you do?

A. Use bq load to load a batch of sensor data every 60 seconds.
B. Use a Cloud Dataflow pipeline to stream data into the BigQuery table.
C. Use the INSERT statement to insert a batch of data every 60 seconds.
D. Use the MERGE statement to apply updates in batch every 60 seconds.
Suggested answer: C
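
A minimal sketch of the suggested answer: buffer sensor readings and flush them with one multi-row INSERT statement every 60 seconds so the data is queryable within a minute. The table name and row format are hypothetical.

```python
# A minimal sketch, assuming a hypothetical telemetry.sensor_readings table and
# an in-memory buffer that the ingestion process fills with dicts.
import time
from google.cloud import bigquery

client = bigquery.Client()
TABLE = "telemetry.sensor_readings"  # hypothetical dataset.table

def flush(buffer):
    """Write all buffered readings with a single multi-row INSERT statement."""
    if not buffer:
        return
    values = ", ".join(
        "('{sensor_id}', TIMESTAMP '{ts}', {value})".format(**row) for row in buffer
    )
    client.query(
        f"INSERT INTO `{TABLE}` (sensor_id, ts, value) VALUES {values}"
    ).result()  # wait for the DML job to finish
    buffer.clear()

readings = []        # filled elsewhere by the ingestion process (not shown)
while True:
    flush(readings)  # one batched INSERT per minute keeps the data fresh
    time.sleep(60)
```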

You need to copy millions of sensitive patient records from a relational database to BigQuery. The total size of the database is 10 TB. You need to design a solution that is secure and time-efficient.

What should you do?

A. Export the records from the database as an Avro file. Upload the file to GCS using gsutil, and then load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
B. Export the records from the database as an Avro file. Copy the file onto a Transfer Appliance and send it to Google, and then load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
C. Export the records from the database into a CSV file. Create a public URL for the CSV file, and then use Storage Transfer Service to move the file to Cloud Storage. Load the CSV file into BigQuery using the BigQuery web UI in the GCP Console.
D. Export the records from the database as an Avro file. Create a public URL for the Avro file, and then use Storage Transfer Service to move the file to Cloud Storage. Load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
Suggested answer: A
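
A minimal sketch of the final step of the suggested answer, assuming the Avro export has already been uploaded to Cloud Storage with gsutil; the same load can be run from the BigQuery web UI, shown here with the Python client. Bucket and table names are hypothetical.

```python
# A minimal sketch, assuming the Avro files were uploaded with gsutil to a
# hypothetical bucket; the destination table name is also hypothetical.
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.AVRO)

load_job = client.load_table_from_uri(
    "gs://secure-exports/patients/*.avro",  # hypothetical Cloud Storage URI
    "clinical.patient_records",             # hypothetical destination table
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete
```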

You need to create a near real-time inventory dashboard that reads the main inventory tables in your BigQuery data warehouse. Historical inventory data is stored as inventory balances by item and location. You have several thousand updates to inventory every hour. You want to maximize performance of the dashboard and ensure that the data is accurate. What should you do?

A. Leverage BigQuery UPDATE statements to update the inventory balances as they are changing.
B. Partition the inventory balance table by item to reduce the amount of data scanned with each inventory update.
C. Use BigQuery streaming to stream changes into a daily inventory movement table. Calculate balances in a view that joins it to the historical inventory balance table. Update the inventory balance table nightly.
D. Use the BigQuery bulk loader to batch load inventory changes into a daily inventory movement table. Calculate balances in a view that joins it to the historical inventory balance table. Update the inventory balance table nightly.
Suggested answer: A
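
A minimal sketch of the suggested answer: apply an inventory change with a parameterized BigQuery UPDATE statement. Table and column names are hypothetical.

```python
# A minimal sketch, assuming a hypothetical inventory.balances table keyed by
# item and location; @delta is the quantity change for one update.
from google.cloud import bigquery

client = bigquery.Client()
job = client.query(
    """
    UPDATE `inventory.balances`
    SET quantity = quantity + @delta
    WHERE item_id = @item_id AND location_id = @location_id
    """,
    job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("delta", "INT64", 25),
            bigquery.ScalarQueryParameter("item_id", "STRING", "SKU-123"),
            bigquery.ScalarQueryParameter("location_id", "STRING", "WH-7"),
        ]
    ),
)
job.result()  # wait for the DML job so the dashboard reads the updated balance
```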

You have data stored in BigQuery. The data in the BigQuery dataset must be highly available. You need to define a storage, backup, and recovery strategy for this data that minimizes cost. How should you configure the BigQuery table?

A. Set the BigQuery dataset to be regional. In the event of an emergency, use a point-in-time snapshot to recover the data.
B. Set the BigQuery dataset to be regional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.
C. Set the BigQuery dataset to be multi-regional. In the event of an emergency, use a point-in-time snapshot to recover the data.
D. Set the BigQuery dataset to be multi-regional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.
Suggested answer: B
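
A minimal sketch of the backup-copy idea behind the suggested answer: copy the table to a time-suffixed backup table; in practice the same pattern would be driven by a scheduled query. Dataset and table names are hypothetical.

```python
# A minimal sketch, assuming hypothetical dataset and table names; the suffix
# records when the backup was taken.
from datetime import datetime, timezone
from google.cloud import bigquery

client = bigquery.Client()
suffix = datetime.now(timezone.utc).strftime("%Y%m%d")
copy_job = client.copy_table(
    "my-project.warehouse.orders",                   # source table (hypothetical)
    f"my-project.warehouse_backup.orders_{suffix}",  # time-suffixed backup copy
)
copy_job.result()  # wait for the copy job to complete
```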

You used Cloud Dataprep to create a recipe on a sample of data in a BigQuery table. You want to reuse this recipe on a daily upload of data with the same schema, after the load job with variable execution time completes. What should you do?

A. Create a cron schedule in Cloud Dataprep.
B. Create an App Engine cron job to schedule the execution of the Cloud Dataprep job.
C. Export the recipe as a Cloud Dataprep template, and create a job in Cloud Scheduler.
D. Export the Cloud Dataprep job as a Cloud Dataflow template, and incorporate it into a Cloud Composer job.
Suggested answer: D
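
A minimal sketch of the Cloud Composer side of the suggested answer, assuming the Dataprep recipe has already been exported as a Dataflow template to a hypothetical Cloud Storage path. The DAG would be triggered once the variable-length load job completes.

```python
# A minimal sketch, assuming the exported template lives at a hypothetical
# gs:// path and the DAG is triggered after the daily load job finishes.
from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowTemplatedJobStartOperator,
)

with DAG(
    dag_id="daily_dataprep_recipe",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # triggered externally once the variable-length load completes
) as dag:
    run_recipe = DataflowTemplatedJobStartOperator(
        task_id="run_exported_recipe",
        template="gs://my-bucket/templates/dataprep_recipe",  # hypothetical template path
        location="us-central1",
        parameters={"inputTable": "my-project:staging.daily_upload"},  # hypothetical
    )
```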