ExamGecko
Question 64 - DP-203 discussion


You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.

FactPurchase will have 1 million rows of data added daily and will contain three years of data.

Transact-SQL queries similar to the following query will be executed daily.

SELECT SupplierKey, StockItemKey, IsOrderFinalized, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101
AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey, IsOrderFinalized

Which table distribution will minimize query times?

A. replicated
B. hash-distributed on PurchaseKey
C. round-robin
D. hash-distributed on IsOrderFinalized

Suggested answer: B

Explanation:

Hash-distributed tables improve query performance on large fact tables. To balance the parallel processing, select a distribution column that:

Has many unique values. The column can have duplicate values, but all rows with the same value are assigned to the same distribution. Since there are 60 distributions, some distributions can hold more than one unique value while others may end up with none.

Does not have NULLs, or has only a few NULLs.

Is not a date column.

Incorrect Answers:

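A: Replicated tables suit small dimension tables. At 1 million rows per day for three years, FactPurchase holds roughly 1.1 billion rows, which is far too large to copy in full to every compute node.

C: Round-robin tables are useful for improving loading speed, but queries against them can require more data movement than queries against hash-distributed tables.

D: IsOrderFinalized appears to be a yes/no flag, so it would have very few unique values and most of the 60 distributions would hold no rows.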
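
As a rough illustration of the suggested answer, the table could be declared along the following lines. This is only a sketch: the question's column list is not reproduced here, so the DDL uses just the column names that appear in the question text, and every data type, the NOT NULL constraints, the dbo schema, and the clustered columnstore index are assumptions rather than part of the original question.

```sql
-- Hypothetical DDL for illustration only.
-- Column names come from the question; data types, nullability,
-- the dbo schema, and the index choice are assumptions.
CREATE TABLE dbo.FactPurchase
(
    PurchaseKey      BIGINT NOT NULL,
    DateKey          INT    NOT NULL,
    SupplierKey      INT    NOT NULL,
    StockItemKey     INT    NOT NULL,
    IsOrderFinalized BIT    NOT NULL
)
WITH
(
    -- Rows are assigned to one of the 60 distributions by hashing PurchaseKey.
    DISTRIBUTION = HASH(PurchaseKey),
    CLUSTERED COLUMNSTORE INDEX
);

-- After loading data, report rows and space used per distribution
-- to check whether the rows are spread evenly.
DBCC PDW_SHOWSPACEUSED('dbo.FactPurchase');
```

If the row counts reported per distribution differ widely, the chosen distribution column is skewed and a column with more unique values may be a better fit.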

Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute

asked 02/10/2024
shafinaaz hossenny