Exam Associate Data Practitioner – Member Shared Questions, Page 2

Question 11

You work for an ecommerce company that has a BigQuery dataset that contains customer purchase history, demographics, and website interactions. You need to build a machine learning (ML) model to predict which customers are most likely to make a purchase in the next month. You have limited engineering resources and need to minimize the ML expertise required for the solution. What should you do?

A. Use BigQuery ML to create a logistic regression model for purchase prediction.

B. Use Vertex AI Workbench to develop a custom model for purchase prediction.

C. Use Colab Enterprise to develop a custom model for purchase prediction.

D. Export the data to Cloud Storage, and use AutoML Tables to build a classification model for purchase prediction.

Suggested answer: A

Explanation:

Using BigQuery ML is the best solution in this case because:

Ease of use: BigQuery ML allows users to build machine learning models using SQL, which requires minimal ML expertise.

Integrated platform: Since the data already exists in BigQuery, there's no need to move it to another service, saving time and engineering resources.

Logistic regression: This is an appropriate model for binary classification tasks like predicting the likelihood of a customer making a purchase in the next month.
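As a rough illustration of the "models using SQL" point, the sketch below assembles a BigQuery ML `CREATE MODEL` statement in Python. The model name, table name, and label column (`shop.purchase_model`, `shop.customer_features`, `will_purchase`) are hypothetical and not part of the question; the statement is only built and printed here, not run against BigQuery.

```python
def build_create_model_sql(model: str, source_table: str, label_col: str) -> str:
    """Return a BigQuery ML statement that trains a logistic regression model.

    All identifiers passed in are illustrative assumptions, not names
    taken from the exam question.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS (\n"
        "  model_type = 'LOGISTIC_REG',\n"
        f"  input_label_cols = ['{label_col}']\n"
        ") AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical names: a dataset `shop` with a prepared feature table.
sql = build_create_model_sql(
    model="shop.purchase_model",
    source_table="shop.customer_features",
    label_col="will_purchase",
)
print(sql)
```

The whole training step is a single SQL statement over data that already sits in BigQuery, which is why this option needs the least ML expertise and no data movement.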

asked 13/02/2025

Meena Utsaha

33 questions

Question 12

You are designing a pipeline to process data files that arrive in Cloud Storage by 3:00 am each day. Data processing is performed in stages, where the output of one stage becomes the input of the next. Each stage takes a long time to run. Occasionally a stage fails, and you have to address the problem. You need to ensure that the final output is generated as quickly as possible. What should you do?

A. Design a Spark program that runs under Dataproc. Code the program to wait for user input when an error is detected. Rerun the last action after correcting any stage output data errors.

B. Design the pipeline as a set of PTransforms in Dataflow. Restart the pipeline after correcting any stage output data errors.

C. Design the workflow as a Cloud Workflow instance. Code the workflow to jump to a given stage based on an input parameter. Rerun the workflow after correcting any stage output data errors.

D. Design the processing as a directed acyclic graph (DAG) in Cloud Composer. Clear the state of the failed task after correcting any stage output data errors.

Question 13

Another team in your organization is requesting access to a BigQuery dataset. You need to share the dataset with the team while minimizing the risk of unauthorized copying of data. You also want to create a reusable framework in case you need to share this data with other teams in the future. What should you do?

A. Create authorized views in the team's Google Cloud project that is only accessible by the team.

B. Create a private exchange using Analytics Hub with data egress restriction, and grant access to the team members.

C. Enable domain restricted sharing on the project. Grant the team members the BigQuery Data Viewer IAM role on the dataset.

D. Export the dataset to a Cloud Storage bucket in the team's Google Cloud project that is only accessible by the team.

Question 14

Your company has developed a website that allows users to upload and share video files. These files are most frequently accessed and shared when they are initially uploaded. Over time, the files are accessed and shared less frequently, although some old video files may remain very popular.

You need to design a storage system that is simple and cost-effective. What should you do?

A. Create a single-region bucket with Autoclass enabled.

B. Create a single-region bucket. Configure a Cloud Scheduler job that runs every 24 hours and changes the storage class based on upload date.

C. Create a single-region bucket with custom Object Lifecycle Management policies based on upload date.

D. Create a single-region bucket with Archive as the default storage class.

Question 15

You recently inherited a task for managing Dataflow streaming pipelines in your organization and noticed that proper access had not been provisioned to you. You need to request a Google-provided IAM role so you can restart the pipelines. You need to follow the principle of least privilege. What should you do?

A. Request the Dataflow Developer role.

B. Request the Dataflow Viewer role.

C. Request the Dataflow Worker role.

D. Request the Dataflow Admin role.

Suggested answer: A

Explanation:

The Dataflow Developer role provides the necessary permissions to manage Dataflow streaming pipelines, including the ability to restart pipelines. This role adheres to the principle of least privilege, as it grants only the permissions required to manage and operate Dataflow jobs without unnecessary administrative access. Other roles, such as Dataflow Admin, would grant broader permissions, which are not needed in this scenario.
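For reference, the role request amounts to one IAM policy binding. The sketch below builds that binding in the JSON shape IAM policies use (`role` plus `members`); the member address is a made-up placeholder, and how the binding is actually applied (console, CLI, or infrastructure tooling) is left open.

```python
import json


def dataflow_developer_binding(member: str) -> dict:
    """Build an IAM policy binding granting the predefined Dataflow Developer role.

    `member` uses the IAM member format, e.g. "user:someone@example.com".
    """
    return {"role": "roles/dataflow.developer", "members": [member]}


# Placeholder identity, assumed for illustration only.
binding = dataflow_developer_binding("user:pipeline-operator@example.com")
print(json.dumps(binding, indent=2))
```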

asked 13/02/2025

Freddy KUBIAK

57 questions

Question 16

You need to create a new data pipeline. You want a serverless solution that meets the following requirements:

* Data is streamed from Pub/Sub and is processed in real-time.

* Data is transformed before being stored.

* Data is stored in a location that will allow it to be analyzed with SQL using Looker.

Which Google Cloud services should you recommend for the pipeline?

A. (1) Dataproc Serverless, (2) Bigtable

B. (1) Cloud Composer, (2) Cloud SQL for MySQL

C. (1) BigQuery, (2) Analytics Hub

D. (1) Dataflow, (2) BigQuery

Question 17

Your team wants to create a monthly report to analyze inventory data that is updated daily. You need to aggregate the inventory counts by using only the most recent month of data, and save the results to be used in a Looker Studio dashboard. What should you do?

A. Create a materialized view in BigQuery that uses the SUM() function and the DATE_SUB() function.

B. Create a saved query in the BigQuery console that uses the SUM() function and the DATE_SUB() function. Re-run the saved query every month, and save the results to a BigQuery table.

C. Create a BigQuery table that uses the SUM() function and the _PARTITIONDATE filter.

D. Create a BigQuery table that uses the SUM() function and the DATE_DIFF() function.

Suggested answer: A

Explanation:

Creating a materialized view in BigQuery with the SUM() function and the DATE_SUB() function is the best approach. Materialized views allow you to pre-aggregate and cache query results, making them efficient for repeated access, such as monthly reporting. By using the DATE_SUB() function, you can filter the inventory data to include only the most recent month. This approach ensures that the aggregation is up-to-date with minimal latency and provides efficient integration with Looker Studio for dashboarding.
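The aggregation logic itself (sum the counts, restricted to the trailing month) can be sketched in plain Python. This mirrors only what a `SUM()` with a `DATE_SUB(..., INTERVAL 1 MONTH)` filter computes; it does not model how BigQuery stores or refreshes a materialized view. The row layout `(item, day, count)` and the sample values are assumptions made up for the example.

```python
import calendar
from datetime import date


def date_sub_one_month(d: date) -> date:
    """Step back one calendar month, clamping the day to the month's last day."""
    year, month = (d.year, d.month - 1) if d.month > 1 else (d.year - 1, 12)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def recent_month_totals(rows, today: date) -> dict:
    """Sum counts per item for rows dated on or after `today` minus one month."""
    cutoff = date_sub_one_month(today)
    totals = {}
    for item, day, count in rows:
        if day >= cutoff:
            totals[item] = totals.get(item, 0) + count
    return totals


# Made-up inventory rows: (item, day, count).
rows = [
    ("widget", date(2025, 1, 10), 5),   # older than one month, excluded
    ("widget", date(2025, 2, 20), 7),
    ("gadget", date(2025, 3, 1), 3),
]
print(recent_month_totals(rows, today=date(2025, 3, 15)))
```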

asked 13/02/2025

Dereque Datson

49 questions

Question 18

You have a BigQuery dataset containing sales data. This data is actively queried for the first 6 months. After that, the data is not queried but needs to be retained for 3 years for compliance reasons. You need to implement a data management strategy that meets access and compliance requirements, while keeping cost and administrative overhead to a minimum. What should you do?

A. Use BigQuery long-term storage for the entire dataset. Set up a Cloud Run function to delete the data from BigQuery after 3 years.

B. Partition a BigQuery table by month. After 6 months, export the data to Coldline storage. Implement a lifecycle policy to delete the data from Cloud Storage after 3 years.

C. Set up a scheduled query to export the data to Cloud Storage after 6 months. Write a stored procedure to delete the data from BigQuery after 3 years.

D. Store all data in a single BigQuery table without partitioning or lifecycle policies.

Question 19

You have created a LookML model and dashboard that shows daily sales metrics for five regional managers to use. You want to ensure that the regional managers can only see sales metrics specific to their region. You need an easy-to-implement solution. What should you do?

A. Create a sales_region user attribute, and assign each manager's region as the value of their user attribute. Add an access_filter Explore filter on the region_name dimension by using the sales_region user attribute.

B. Create five different Explores with the sql_always_filter Explore filter applied on the region_name dimension. Set each region_name value to the corresponding region for each manager.

C. Create separate Looker dashboards for each regional manager. Set the default dashboard filter to the corresponding region for each manager.

D. Create separate Looker instances for each regional manager. Copy the LookML model and dashboard to each instance. Provision viewer access to the corresponding manager.

Suggested answer: A

Explanation:

Using a sales_region user attribute is the best solution because it allows you to dynamically filter data based on each manager's assigned region. By adding an access_filter Explore filter on the region_name dimension that references the sales_region user attribute, each manager sees only the sales metrics specific to their region. This approach is easy to implement, scalable, and avoids duplicating dashboards or Explores, making it both efficient and maintainable.
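The effect of the filter can be imitated in a few lines of Python. This is a simulation of the behavior, not Looker's implementation: each user carries a `sales_region` attribute, and every query they run is restricted to rows whose `region_name` matches it. The sample rows and region values are invented for the example.

```python
def apply_access_filter(rows, user_attributes,
                        field="region_name", user_attribute="sales_region"):
    """Return only the rows whose `field` equals the user's attribute value."""
    allowed = user_attributes[user_attribute]
    return [row for row in rows if row[field] == allowed]


# Invented sales rows and one manager's attributes.
sales = [
    {"region_name": "West", "daily_sales": 120},
    {"region_name": "East", "daily_sales": 95},
    {"region_name": "West", "daily_sales": 80},
]
west_manager = {"sales_region": "West"}
print(apply_access_filter(sales, west_manager))
```

One Explore and one dashboard serve all five managers; only the attribute value differs per user, which is what makes this option easier to maintain than per-manager copies.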

asked 13/02/2025

Kenny McCue

38 questions

Question 20

You need to design a data pipeline that ingests data from CSV, Avro, and Parquet files into Cloud Storage. The data includes raw user input. You need to remove all malicious SQL injections before storing the data in BigQuery. Which data manipulation methodology should you choose?

A. EL

B. ELT

C. ETL

D. ETLT
