
Google Associate Data Practitioner Practice Test - Questions Answers, Page 4


Question 31


You are a data analyst at your organization. You have been given a BigQuery dataset that includes customer information. The dataset contains inconsistencies and errors, such as missing values, duplicates, and formatting issues. You need to effectively and quickly clean the data. What should you do?

A. Develop a Dataflow pipeline to read the data from BigQuery, perform data quality rules and transformations, and write the cleaned data back to BigQuery.

B. Use Cloud Data Fusion to create a data pipeline to read the data from BigQuery, perform data quality transformations, and write the clean data back to BigQuery.

C. Export the data from BigQuery to CSV files. Resolve the errors using a spreadsheet editor, and re-import the cleaned data into BigQuery.

D. Use BigQuery's built-in functions to perform data quality transformations.

Suggested answer: D
Explanation:

Using BigQuery's built-in functions is the most effective and efficient way to clean the dataset directly within BigQuery. BigQuery provides powerful SQL capabilities to handle missing values, remove duplicates, and resolve formatting issues without needing to export data or create complex pipelines. This approach minimizes overhead and leverages the scalability of BigQuery for large datasets, making it an ideal solution for quickly addressing data quality issues.
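
As a rough sketch of what this looks like in practice (dataset, table, and column names here are hypothetical), a single SQL statement using built-in functions such as COALESCE, TRIM, SAFE_CAST, and ROW_NUMBER can fill missing values, normalize formatting, and drop duplicates in one pass; it is shown below executed through the google-cloud-bigquery Python client:

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses Application Default Credentials

# Hypothetical dataset, table, and column names for illustration only.
query = """
CREATE OR REPLACE TABLE my_dataset.customers_clean AS
SELECT
  customer_id,
  COALESCE(TRIM(INITCAP(name)), 'Unknown') AS name,        -- fix casing/whitespace, fill missing names
  LOWER(TRIM(email)) AS email,                             -- normalize email formatting
  COALESCE(SAFE_CAST(signup_date AS DATE), DATE '1970-01-01') AS signup_date
FROM my_dataset.customers
-- keep only the most recent row per customer to remove duplicates
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY updated_at DESC) = 1
"""
client.query(query).result()
```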

Question 32


Your organization has several datasets in their data warehouse in BigQuery. Several analyst teams in different departments use the datasets to run queries. Your organization is concerned about the variability of their monthly BigQuery costs. You need to identify a solution that creates a fixed budget for costs associated with the queries run by each department. What should you do?

A. Create a custom quota for each analyst in BigQuery.

B. Create a single reservation by using BigQuery editions. Assign all analysts to the reservation.

C. Assign each analyst to a separate project associated with their department. Create a single reservation by using BigQuery editions. Assign all projects to the reservation.

D. Assign each analyst to a separate project associated with their department. Create a single reservation for each department by using BigQuery editions. Create assignments for each project in the appropriate reservation.

Suggested answer: D
Explanation:

Assigning each analyst to a separate project associated with their department and creating a single reservation for each department using BigQuery editions allows for precise cost management. By assigning each project to its department's reservation, you can allocate fixed compute resources and budgets for each department, ensuring that their query costs are predictable and controlled. This approach aligns with your organization's goal of creating a fixed budget for query costs while maintaining departmental separation and accountability.
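
A rough sketch of that setup with the BigQuery Reservation API Python client follows; the project IDs, slot size, and reservation names are hypothetical, and field names such as ignore_idle_slots may differ slightly between library versions:

```python
from google.cloud import bigquery_reservation_v1 as reservation_api

client = reservation_api.ReservationServiceClient()
admin_parent = "projects/admin-project/locations/US"  # hypothetical reservation admin project

# One reservation per department; the slot capacity caps that department's spend.
finance_reservation = client.create_reservation(
    parent=admin_parent,
    reservation_id="finance-reservation",
    reservation=reservation_api.Reservation(slot_capacity=100, ignore_idle_slots=True),
)

# Assign each department project to its department's reservation.
client.create_assignment(
    parent=finance_reservation.name,
    assignment=reservation_api.Assignment(
        job_type=reservation_api.Assignment.JobType.QUERY,
        assignee="projects/finance-analyst-project-1",  # hypothetical department project
    ),
)
```

Repeating the reservation and assignment for each department keeps every department's query spend bounded by its own slot capacity.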

Question 33


You manage a web application that stores data in a Cloud SQL database. You need to improve the read performance of the application by offloading read traffic from the primary database instance. You want to implement a solution that minimizes effort and cost. What should you do?

A. Use Cloud CDN to cache frequently accessed data.

B. Store frequently accessed data in a Memorystore instance.

C. Migrate the database to a larger Cloud SQL instance.

D. Enable automatic backups, and create a read replica of the Cloud SQL instance.

Suggested answer: D
Explanation:

Enabling automatic backups and creating a read replica of the Cloud SQL instance is the best solution to improve read performance. Read replicas allow you to offload read traffic from the primary database instance, reducing its load and improving overall performance. This approach is cost-effective and easy to implement within Cloud SQL. It ensures that the primary instance focuses on write operations while replicas handle read queries, providing a seamless performance boost with minimal effort.
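
Once the replica exists, offloading is an application-level routing decision: writes go to the primary endpoint and reads go to the replica endpoint. A minimal sketch with SQLAlchemy, using hypothetical connection details:

```python
from sqlalchemy import create_engine, text

# Hypothetical connection strings: the primary handles writes, the read replica handles reads.
primary = create_engine("postgresql+pg8000://app:secret@10.0.0.5/appdb")
replica = create_engine("postgresql+pg8000://app:secret@10.0.0.6/appdb")

def save_order(order_id: int, total: float) -> None:
    # Writes always go to the primary instance.
    with primary.begin() as conn:
        conn.execute(
            text("INSERT INTO orders (id, total) VALUES (:id, :total)"),
            {"id": order_id, "total": total},
        )

def list_orders():
    # Read-only queries are served by the replica, reducing load on the primary.
    with replica.connect() as conn:
        return conn.execute(text("SELECT id, total FROM orders")).fetchall()
```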

Question 34


Your organization plans to move their on-premises environment to Google Cloud. Your organization's network bandwidth is less than 1 Gbps. You need to move over 500 TB of data to Cloud Storage securely, and you only have a few days to move the data. What should you do?

A. Request multiple Transfer Appliances, copy the data to the appliances, and ship the appliances back to Google Cloud to upload the data to Cloud Storage.

B. Connect to Google Cloud using VPN. Use Storage Transfer Service to move the data to Cloud Storage.

C. Connect to Google Cloud using VPN. Use the gcloud storage command to move the data to Cloud Storage.

D. Connect to Google Cloud using Dedicated Interconnect. Use the gcloud storage command to move the data to Cloud Storage.

Suggested answer: A
Explanation:

Using Transfer Appliances is the best solution for securely and efficiently moving over 500 TB of data to Cloud Storage within a limited timeframe, especially with network bandwidth below 1 Gbps. Transfer Appliances are physical devices provided by Google Cloud to securely transfer large amounts of data. After copying the data to the appliances, they are shipped back to Google, where the data is uploaded to Cloud Storage. This approach bypasses bandwidth limitations and ensures the data is migrated quickly and securely.

Question 35


Your organization uses a BigQuery table that is partitioned by ingestion time. You need to remove data that is older than one year to reduce your organization's storage costs. You want to use the most efficient approach while minimizing cost. What should you do?

A. Create a scheduled query that periodically runs an update statement in SQL that sets the 'deleted' column to 'yes' for data that is more than one year old. Create a view that filters out rows that have been marked deleted.

B. Create a view that filters out rows that are older than one year.

C. Require users to specify a partition filter using the alter table statement in SQL.

D. Set the table partition expiration period to one year using the ALTER TABLE statement in SQL.

Suggested answer: D
Explanation:

Setting the table partition expiration period to one year using the ALTER TABLE statement is the most efficient and cost-effective approach. This automatically deletes data in partitions older than one year, reducing storage costs without requiring manual intervention or additional queries. It minimizes administrative overhead and ensures compliance with your data retention policy while optimizing storage usage in BigQuery.
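
For example, assuming a hypothetical my_dataset.events table, a single DDL statement (run here through the Python client) is enough; BigQuery then drops expired partitions automatically with no further maintenance:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table name; partitions older than 365 days are deleted automatically.
ddl = """
ALTER TABLE my_dataset.events
SET OPTIONS (partition_expiration_days = 365)
"""
client.query(ddl).result()
```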

Question 36


Your company is migrating their batch transformation pipelines to Google Cloud. You need to choose a solution that supports programmatic transformations using only SQL. You also want the technology to support Git integration for version control of your pipelines. What should you do?

A. Use Cloud Data Fusion pipelines.

B. Use Dataform workflows.

C. Use Dataflow pipelines.

D. Use Cloud Composer operators.

Suggested answer: B
Explanation:

Dataform workflows are the ideal solution for migrating batch transformation pipelines to Google Cloud when you want to perform programmatic transformations using only SQL. Dataform allows you to define SQL-based workflows for data transformations and supports Git integration for version control, enabling collaboration and version tracking of your pipelines. This approach is purpose-built for SQL-driven data pipeline management and aligns perfectly with your requirements.

Question 37


You manage a BigQuery table that is used for critical end-of-month reports. The table is updated weekly with new sales data. You want to prevent data loss and reporting issues if the table is accidentally deleted. What should you do?

A. Configure the time travel duration on the table to be exactly seven days. On deletion, re-create the deleted table solely from the time travel data.

B. Schedule the creation of a new snapshot of the table once a week. On deletion, re-create the deleted table using the snapshot and time travel data.

C. Create a clone of the table. On deletion, re-create the deleted table by copying the content of the clone.

D. Create a view of the table. On deletion, re-create the deleted table from the view and time travel data.

Suggested answer: B
Explanation:

Scheduling the creation of a snapshot of the table weekly ensures that you have a point-in-time backup of the table. In case of accidental deletion, you can re-create the table from the snapshot. Additionally, BigQuery's time travel feature allows you to recover data from up to seven days prior to deletion. Combining snapshots with time travel provides a robust solution for preventing data loss and ensuring reporting continuity for critical tables. This approach minimizes risks while offering flexibility for recovery.
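
A weekly job could create the snapshot with DDL along these lines (table and snapshot names are hypothetical); in practice the statement would typically run as a BigQuery scheduled query:

```python
from datetime import date
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table names; run once a week to capture a point-in-time backup.
snapshot_name = f"my_dataset.sales_snapshot_{date.today():%Y%m%d}"
ddl = f"""
CREATE SNAPSHOT TABLE `{snapshot_name}`
CLONE `my_dataset.sales`
"""
client.query(ddl).result()
```

If the table is deleted, the latest snapshot can be copied back to the original table name, and time travel can recover any rows written after that snapshot was taken.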

Question 38


Your organization sends IoT event data to a Pub/Sub topic. Subscriber applications read and perform transformations on the messages before storing them in the data warehouse. During particularly busy times when more data is being written to the topic, you notice that the subscriber applications are not acknowledging messages within the deadline. You need to modify your pipeline to handle these activity spikes and continue to process the messages. What should you do?

A. Retry messages until they are acknowledged.

B. Implement flow control on the subscribers.

C. Forward unacknowledged messages to a dead-letter topic.

D. Seek back to the last acknowledged message.

Suggested answer: B
Explanation:

Implementing flow control on the subscribers allows the subscriber applications to manage message processing during activity spikes by controlling the rate at which messages are pulled and processed. This prevents overwhelming the subscribers and ensures that messages are acknowledged within the deadline. Flow control helps maintain the stability of your pipeline during high-traffic periods without dropping or delaying messages unnecessarily.
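
With the Pub/Sub Python client library, flow control is a setting on the streaming pull subscriber that bounds how many messages are held in memory at once; the subscription name and limits below are hypothetical:

```python
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "iot-events-sub")  # hypothetical

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Placeholder for the transformation and data warehouse write.
    print(f"Processing {message.message_id}")
    message.ack()  # acknowledge only after processing succeeds

# Cap the outstanding messages and bytes pulled by this subscriber so it is not
# overwhelmed during traffic spikes; excess messages simply wait in Pub/Sub.
flow_control = pubsub_v1.types.FlowControl(max_messages=500, max_bytes=50 * 1024 * 1024)

streaming_pull_future = subscriber.subscribe(
    subscription_path, callback=callback, flow_control=flow_control
)

with subscriber:
    streaming_pull_future.result()  # block and process messages indefinitely
```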

Question 39


You have millions of customer feedback records stored in BigQuery. You want to summarize the data by using the large language model (LLM) Gemini. You need to plan and execute this analysis using the most efficient approach. What should you do?

A. Query the BigQuery table from within a Python notebook, use the Gemini API to summarize the data within the notebook, and store the summaries in BigQuery.

B. Use a BigQuery ML model to pre-process the text data, export the results to Cloud Storage, and use the Gemini API to summarize the pre-processed data.

C. Create a BigQuery Cloud resource connection to a remote model in Vertex AI, and use Gemini to summarize the data.

D. Export the raw BigQuery data to a CSV file, upload it to Cloud Storage, and use the Gemini API to summarize the data.

Suggested answer: C
Explanation:

Creating a BigQuery Cloud resource connection to a remote model in Vertex AI and using Gemini to summarize the data is the most efficient approach. This method allows you to seamlessly integrate BigQuery with the Gemini model via Vertex AI, avoiding the need to export data or perform manual steps. It ensures scalability for large datasets and minimizes data movement, leveraging Google Cloud's ecosystem for efficient data summarization and storage.
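
The overall pattern is: create a remote model over an existing BigQuery Cloud resource connection that points at a Gemini endpoint, then call ML.GENERATE_TEXT directly against the table. The connection, dataset, column names, and model endpoint below are hypothetical and may vary by model version:

```python
from google.cloud import bigquery

client = bigquery.Client()

# 1. Register a remote model backed by Gemini through a Cloud resource connection.
client.query("""
CREATE OR REPLACE MODEL my_dataset.gemini_model
REMOTE WITH CONNECTION `my-project.us.vertex-ai-connection`
OPTIONS (ENDPOINT = 'gemini-1.5-flash-001')
""").result()

# 2. Summarize the feedback in place and store the results in a new table.
client.query("""
CREATE OR REPLACE TABLE my_dataset.feedback_summaries AS
SELECT
  customer_id,
  ml_generate_text_llm_result AS summary
FROM ML.GENERATE_TEXT(
  MODEL my_dataset.gemini_model,
  (SELECT customer_id,
          CONCAT('Summarize this customer feedback: ', feedback_text) AS prompt
   FROM my_dataset.customer_feedback),
  STRUCT(TRUE AS flatten_json_output)
)
""").result()
```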

Question 40


You are working on a data pipeline that will validate and clean incoming data before loading it into BigQuery for real-time analysis. You want to ensure that the data validation and cleaning is performed efficiently and can handle high volumes of data. What should you do?

A. Write custom scripts in Python to validate and clean the data outside of Google Cloud. Load the cleaned data into BigQuery.

B. Use Cloud Run functions to trigger data validation and cleaning routines when new data arrives in Cloud Storage.

C. Use Dataflow to create a streaming pipeline that includes validation and transformation steps.

D. Load the raw data into BigQuery using Cloud Storage as a staging area, and use SQL queries in BigQuery to validate and clean the data.

Suggested answer: C
Explanation:

Using Dataflow to create a streaming pipeline that includes validation and transformation steps is the most efficient and scalable approach for real-time analysis. Dataflow is optimized for high-volume data processing and allows you to apply validation and cleaning logic as the data flows through the pipeline. This ensures that only clean, validated data is loaded into BigQuery, supporting real-time analysis while handling high data volumes effectively.
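
A minimal Apache Beam sketch of that shape (the topic, table, and schema are hypothetical): read from Pub/Sub, validate and clean each record in a transform, and stream only the valid rows into BigQuery:

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_and_validate(raw: bytes):
    """Yield only records that parse and pass basic checks; silently drop the rest."""
    try:
        record = json.loads(raw.decode("utf-8"))
    except (ValueError, UnicodeDecodeError):
        return  # drop malformed messages
    if record.get("user_id") and record.get("amount") is not None:
        yield {"user_id": str(record["user_id"]).strip(), "amount": float(record["amount"])}

options = PipelineOptions(streaming=True)  # plus project/region/runner flags in practice

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
        | "ValidateAndClean" >> beam.FlatMap(parse_and_validate)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:my_dataset.clean_events",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    )
```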
