Question 43 - ARA-C01 discussion

Which steps are recommended best practices for prioritizing cluster keys in Snowflake? (Choose two.)

A. Choose columns that are frequently used in join predicates.
B. Choose lower cardinality columns to support clustering keys and cost effectiveness.
C. Choose TIMESTAMP columns with nanoseconds for the highest number of unique rows.
D. Choose cluster columns that are most actively used in selective filters.
E. Choose cluster columns that are actively used in the GROUP BY clauses.

Suggested answer: A, D

Explanation:

According to the Snowflake documentation, the best practices for choosing clustering keys are:

Choose columns that are most actively used in selective filters. This is the first priority: it improves scan efficiency by letting Snowflake skip micro-partitions that cannot match the filter predicates.

If there is room for additional keys, choose columns that are frequently used in join predicates. This can improve join performance by reducing the number of micro-partitions that need to be scanned and joined.

Avoid using very low cardinality columns, such as a two-valued gender flag, as clustering keys. With so few distinct values, clustering typically yields only minimal pruning, which is why option B is wrong.

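Avoid using raw TIMESTAMP columns with nanosecond precision. Their very high cardinality leaves too few rows sharing a value to be grouped effectively into micro-partitions, which results in poor clustering and high maintenance costs; this is why option C is wrong. If such a column is needed, cluster on an expression that lowers its cardinality, such as the date portion of the timestamp.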

Avoid using columns dominated by a single value or by NULLs, as they can cause skew in the clustering and reduce the benefits of pruning.

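Cluster on multiple columns if the queries use multiple filters or join predicates, keeping the key to three or four columns at most, since additional columns tend to add cost faster than benefit. This can increase the chances of pruning more micro-partitions and improve the compression ratio.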
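
The pruning effect behind these recommendations can be illustrated with a toy model. The sketch below is not how Snowflake stores data: it assumes fixed-size chunks of 100 rows standing in for micro-partitions, keeps only a min and max per chunk, and uses randomly generated `order_days` values that are purely hypothetical. It compares how many chunks a selective range filter has to scan when the rows are unordered versus sorted on the filter column.

```python
import random


def build_partitions(values, rows_per_partition):
    """Split values into fixed-size chunks and record each chunk's (min, max)."""
    chunks = (
        values[i:i + rows_per_partition]
        for i in range(0, len(values), rows_per_partition)
    )
    return [(min(chunk), max(chunk)) for chunk in chunks]


def partitions_scanned(partitions, low, high):
    """Count chunks whose [min, max] range overlaps the filter range [low, high]."""
    return sum(1 for p_min, p_max in partitions if p_min <= high and p_max >= low)


# Hypothetical data: 10,000 rows with a day-of-year value (assumption for illustration).
random.seed(0)
order_days = [random.randint(1, 365) for _ in range(10_000)]

# Same rows, two layouts: arrival order versus sorted on the filter column.
unclustered = build_partitions(order_days, 100)
clustered = build_partitions(sorted(order_days), 100)

# Selective filter: one week of days, i.e. WHERE order_day BETWEEN 100 AND 107.
scanned_unclustered = partitions_scanned(unclustered, 100, 107)
scanned_clustered = partitions_scanned(clustered, 100, 107)

print(f"unclustered: {scanned_unclustered} of {len(unclustered)} chunks scanned")
print(f"clustered:   {scanned_clustered} of {len(clustered)} chunks scanned")
```

In the unordered layout almost every chunk spans the whole value range, so the filter cannot rule any of them out; in the sorted layout only the few chunks covering the requested week need to be read. This is the reason selective filter columns come first when prioritizing keys.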

Clustering is not always useful, especially for small or medium-sized tables or for tables that are rarely queried. It incurs costs both for the initial clustering and for maintaining it over time, and the maintenance cost grows with how often the table's data changes.
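
Because clustering has a cost, it can help to check how well a table is already clustered on the candidate columns before defining a key. The sketch below only builds SQL text in Python; it does not connect to Snowflake. `SYSTEM$CLUSTERING_INFORMATION` and `ALTER TABLE ... CLUSTER BY` are real Snowflake features, but the table name `sales` and the columns `region` and `order_ts` are hypothetical, and the helper functions are illustrative names, not part of any library.

```python
def clustering_info_query(table, columns):
    """Build a query that reports clustering quality for candidate columns.

    The column list is passed as a quoted, parenthesized string, which also
    works for tables that do not yet have a clustering key.
    """
    column_list = ", ".join(columns)
    return f"SELECT SYSTEM$CLUSTERING_INFORMATION('{table}', '({column_list})');"


def cluster_by_statement(table, columns):
    """Build an ALTER TABLE statement that defines the clustering key."""
    column_list = ", ".join(columns)
    return f"ALTER TABLE {table} CLUSTER BY ({column_list});"


# Hypothetical candidates: a filter column plus the date part of a timestamp,
# which lowers the cardinality compared with the raw timestamp.
candidates = ["region", "TO_DATE(order_ts)"]

print(clustering_info_query("sales", candidates))
print(cluster_by_statement("sales", candidates))
```

Running the first statement before the second gives a baseline: if the reported depth and overlap are already low for the candidate columns, defining the key may not be worth its maintenance cost.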

Clustering Keys & Clustered Tables | Snowflake Documentation

Considerations for Choosing Clustering for a Table | Snowflake Documentation

asked 23/09/2024
Zaw Zaw