ExamGecko
Question list
Search
Search

List of questions

Search

Related questions











Question 320 - Professional Data Engineer discussion

Report
Export

You have a table that contains millions of rows of sales data, partitioned by date Various applications and users query this data many times a minute. The query requires aggregating values by using avg. max. and sum, and does not require joining to other tables. The required aggregations are only computed over the past year of data, though you need to retain full historical data in the base tables You want to ensure that the query results always include the latest data from the tables, while also reducing computation cost, maintenance overhead, and duration. What should you do?

A.
Create a materialized view to aggregate the base table data Configure a partition expiration on the base table to retain only the last one year of partitions.
Answers
A.
Create a materialized view to aggregate the base table data Configure a partition expiration on the base table to retain only the last one year of partitions.
B.
Create a materialized view to aggregate the base table data include a filter clause to specify the last one year of partitions.
Answers
B.
Create a materialized view to aggregate the base table data include a filter clause to specify the last one year of partitions.
C.
Create a new table that aggregates the base table data include a filter clause to specify the last year of partitions. Set up a scheduled query to recreate the new table every hour.
Answers
C.
Create a new table that aggregates the base table data include a filter clause to specify the last year of partitions. Set up a scheduled query to recreate the new table every hour.
D.
Create a view to aggregate the base table data Include a filter clause to specify the last year of partitions.
Answers
D.
Create a view to aggregate the base table data Include a filter clause to specify the last year of partitions.
Suggested answer: C

Explanation:

A materialized view is a database object that contains the results of a query, which can be updated periodically. It can improve the performance and efficiency of queries that involve aggregations, joins, or filters. By creating a materialized view to aggregate the base table data and include a filter clause to specify the last one year of partitions, you can ensure that the query results always include the latest data from the tables, while also reducing computation cost, maintenance overhead, and duration. The materialized view will automatically refresh when the base table data changes, and will only use the partitions that match the filter clause. Option A is incorrect because it will delete the historical data from the base table, which is not desired. Option C is incorrect because it will create a redundant table that needs to be updated manually by a scheduled query, which is more complex and costly than using a materialized view. Option D is incorrect because a view does not store any data, but only references the base table data, which means it will not reduce the computation cost or duration of the query.Reference:

Materialized views, ML models in data warehouse - Google Cloud

Data Engineering with Google Cloud Platform - Packt Subscription

asked 18/09/2024
Prakhar Sengar
33 questions
User
Your answer:
0 comments
Sorted by

Leave a comment first