Question 360 - Professional Data Engineer discussion


The data analyst team at your company uses BigQuery for ad-hoc queries and scheduled SQL pipelines in a Google Cloud project with a slot reservation of 2000 slots. However, with the recent introduction of hundreds of new non-time-sensitive SQL pipelines, the team is encountering frequent quota errors. You examine the logs and notice that approximately 1500 queries are being triggered concurrently during peak time. You need to resolve the concurrency issue. What should you do?

A. Update SQL pipelines and ad-hoc queries to run as interactive query jobs.
B. Increase the slot capacity of the project with baseline as 0 and maximum reservation size as 3000.
C. Update SQL pipelines to run as batch queries, and run ad-hoc queries as interactive query jobs.
D. Increase the slot capacity of the project with baseline as 2000 and maximum reservation size as 3000.

Suggested answer: C

Explanation:

To resolve the concurrency issue in BigQuery caused by the introduction of hundreds of non-time-sensitive SQL pipelines, the best approach is to differentiate the types of queries based on their urgency and resource requirements. Here's why option C is the best choice:

SQL Pipelines as Batch Queries:

Batch queries in BigQuery are designed for non-time-sensitive operations. BigQuery queues each batch query and starts it once idle resources become available, so the pipelines no longer all compete with interactive work at the same moment during peak times.

By converting non-time-sensitive SQL pipelines to batch queries, you can significantly alleviate the pressure on slot reservations.

Ad-Hoc Queries as Interactive Queries:

Interactive queries are prioritized to run immediately and are suitable for ad-hoc analysis where users expect quick results.

Running ad-hoc queries as interactive jobs ensures that analysts can get their results without delay, improving productivity and user satisfaction.

Concurrency Management:

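This approach balances the workload by priority: batch jobs wait in a queue instead of all starting at once, which lowers the number of queries competing at peak time and reduces the likelihood of quota errors. Options B and D only change slot capacity, which does not address the number of queries being triggered concurrently.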

Steps to Implement:

Identify Non-Time-Sensitive Pipelines:

Review and identify SQL pipelines that are not time-critical and can be executed as batch jobs.

Update Pipelines to Batch Queries:

Modify these pipelines to run as batch queries. This can be done by setting the priority of the query job to BATCH.

Ensure Ad-Hoc Queries are Interactive:

Ensure that all ad-hoc queries are submitted as interactive jobs (the default priority), so that they run as soon as possible rather than waiting in the batch queue.
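
The steps above can be sketched in code. The following is a minimal illustration, not a definitive implementation: it builds the request body for the BigQuery jobs.insert REST method and sets the query priority field to BATCH or INTERACTIVE. The helper name build_query_job_body and its time_sensitive flag are hypothetical, chosen only for this example, and the sketch builds the body without sending it to BigQuery.

```python
from typing import Any, Dict


def build_query_job_body(sql: str, time_sensitive: bool) -> Dict[str, Any]:
    """Build a jobs.insert request body with a priority matching the workload.

    Hypothetical helper for illustration: time-sensitive (ad-hoc) queries get
    INTERACTIVE priority, everything else (scheduled pipelines) gets BATCH.
    """
    priority = "INTERACTIVE" if time_sensitive else "BATCH"
    return {
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": False,  # run the statement as GoogleSQL
                "priority": priority,
            }
        }
    }


# A scheduled pipeline that can wait is submitted as a batch query.
pipeline_job = build_query_job_body("SELECT 1", time_sensitive=False)
# An analyst's ad-hoc query stays interactive.
adhoc_job = build_query_job_body("SELECT 2", time_sensitive=True)

print(pipeline_job["configuration"]["query"]["priority"])  # BATCH
print(adhoc_job["configuration"]["query"]["priority"])     # INTERACTIVE
```

With the google-cloud-bigquery Python client, the equivalent setting is QueryJobConfig(priority=QueryPriority.BATCH) passed to the query call; the bq command-line tool exposes the same choice through its --batch flag.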

References:

BigQuery Batch Queries

BigQuery Slot Allocation and Management

asked 18/09/2024
Arvind Prasad S