Snowflake ARA-C01 Practice Test - Questions Answers, Page 11

An Architect has designed a data pipeline that is receiving small CSV files from multiple sources. All of the files are landing in one location. Specific files are filtered for loading into Snowflake tables using the COPY command. The loading performance is poor.

What changes can be made to improve the data loading performance?

A. Increase the size of the virtual warehouse.
B. Create a multi-cluster warehouse and merge smaller files to create bigger files.
C. Create a specific storage landing bucket to avoid file scanning.
D. Change the file format from CSV to JSON.

Suggested answer: B

Explanation:

According to the Snowflake documentation, data loading performance can be improved by following the best practices for preparing and staging data files. One recommendation is to aim for compressed data files of roughly 100-250 MB (or larger), because this optimizes the number of parallel operations for a load: smaller files should be aggregated and very large files should be split to reach this size range. Another recommendation is to use a multi-cluster warehouse for loading, which allows the compute resources to scale up or out with the load demand; a single-cluster warehouse may not handle the load concurrency and throughput efficiently. Therefore, creating a multi-cluster warehouse and merging the smaller files into bigger files improves the data loading performance.

Reference:

Data Loading Considerations

Preparing Your Data Files

Planning a Data Load
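
For illustration, a minimal sketch of the approach in option B, assuming the files have already been merged to the recommended size before upload (the warehouse, table, and stage names are placeholders; multi-cluster warehouses require Enterprise Edition or higher):

-- Multi-cluster warehouse that can scale out under concurrent load
CREATE WAREHOUSE IF NOT EXISTS load_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  AUTO_SUSPEND = 60;

-- Load the pre-merged files (roughly 100-250 MB compressed each) from the stage
COPY INTO my_table
  FROM @my_stage/merged/
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);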

A table, EMP_TBL, has three records as shown:

The following variables are set for the session:

Which SELECT statements will retrieve all three records? (Select TWO).

A. Select * FROM Stbl_ref WHERE Scol_ref IN ('Name1','Name2','Name3');
B. SELECT * FROM EMP_TBL WHERE identifier($col_ref) IN ('Name1','Name2','Name3');
C. SELECT * FROM identifier<Stbl_ref> WHERE NAME IN ($var1, $var2, $var3);
D. SELECT * FROM identifier($tbl_ref) WHERE ID IN ('var1','var2','var3');
E. SELECT * FROM $tbl_ref WHERE $col_ref IN ($var1, $var2, $var3);

Suggested answer: B, E

Explanation:

The correct answers are B and E because they use the correct syntax and values for the identifier function and the session variables.

The identifier function allows you to use a variable or expression as an identifier (such as a table name or column name) in a SQL statement. It takes a single argument and returns it as an identifier. For example, identifier($tbl_ref) returns EMP_TBL as an identifier.

The session variables are set using the SET command and can be referenced using the $ sign. For example, $var1 returns Name1 as a value.

Option A is incorrect because it uses Stbl_ref and Scol_ref, which are not valid session variables or identifiers. They should be $tbl_ref and $col_ref instead.

Option C is incorrect because it uses identifier<Stbl_ref>, which is not a valid syntax for the identifier function. It should be identifier($tbl_ref) instead.

Option D is incorrect because it uses the string literals 'var1', 'var2', and 'var3', which are not session variable references. They should be $var1, $var2, and $var3 instead.

Reference:

Snowflake Documentation: Identifier Function

Snowflake Documentation: Session Variables

Snowflake Learning: SnowPro Advanced: Architect Exam Study Guide
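
A minimal sketch of how the session variables and the IDENTIFIER() keyword fit together (the variable values below are assumed from the question):

-- Session variables holding the table name, column name, and filter values
SET tbl_ref = 'EMP_TBL';
SET col_ref = 'NAME';
SET (var1, var2, var3) = ('Name1', 'Name2', 'Name3');

-- Variables used as identifiers and as ordinary values
SELECT *
FROM identifier($tbl_ref)
WHERE identifier($col_ref) IN ($var1, $var2, $var3);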

A data platform team creates two multi-cluster virtual warehouses with the AUTO_SUSPEND value set to NULL on one, and '0' on the other. What would be the execution behavior of these virtual warehouses?

A. Setting a '0' or NULL value means the warehouses will never suspend.
B. Setting a '0' or NULL value means the warehouses will suspend immediately.
C. Setting a '0' or NULL value means the warehouses will suspend after the default of 600 seconds.
D. Setting a '0' value means the warehouses will suspend immediately, and NULL means the warehouses will never suspend.

Suggested answer: D

Explanation:

The AUTO_SUSPEND parameter controls the amount of time, in seconds, of inactivity after which a warehouse is automatically suspended. If the parameter is set to NULL, the warehouse never suspends. If the parameter is set to '0', the warehouse suspends immediately after executing a query. Therefore, the execution behavior of the two virtual warehouses will differ depending on the AUTO_SUSPEND value: the warehouse with the NULL value will keep running until it is manually suspended or the resource monitor limits are reached, while the warehouse with the '0' value will suspend as soon as it finishes a query and release the compute resources.

Reference:

ALTER WAREHOUSE

Parameters
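
For reference, the parameter is set per warehouse; a minimal sketch (the warehouse name is a placeholder):

ALTER WAREHOUSE my_wh SET AUTO_SUSPEND = 0;     -- the '0' setting from the question
ALTER WAREHOUSE my_wh SET AUTO_SUSPEND = NULL;  -- the NULL setting from the question
ALTER WAREHOUSE my_wh SET AUTO_SUSPEND = 300;   -- a typical setting: suspend after 300 seconds of inactivity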

A user named USER_01 needs access to create a materialized view on a schema EDW.STG_SCHEMA. How can this access be provided?

A. GRANT CREATE MATERIALIZED VIEW ON SCHEMA EDW.STG_SCHEMA TO USER USER_01;
B. GRANT CREATE MATERIALIZED VIEW ON DATABASE EDW TO USER USERJD1;
C. GRANT ROLE NEW_ROLE TO USER USER_01; GRANT CREATE MATERIALIZED VIEW ON SCHEMA ECW.STG_SCHEKA TO NEW_ROLE;
D. GRANT ROLE NEW_ROLE TO USER_01; GRANT CREATE MATERIALIZED VIEW ON EDW.STG_SCHEMA TO NEW_ROLE;

Suggested answer: A

Explanation:

The correct answer is A because it grants the specific privilege to create a materialized view on the schema EDW.STG_SCHEMA to the user USER_01 directly.

Option B is incorrect because it grants the privilege to create a materialized view on the entire database EDW, which is too broad and unnecessary. Also, there is a typo in the user name (USERJD1 instead of USER_01).

Option C is incorrect because it grants the privilege to create a materialized view on a different schema (ECW.STG_SCHEKA instead of EDW.STG_SCHEMA). Also, there is no need to create a new role for this purpose.

Option D is incorrect because it omits required keywords: GRANT ROLE NEW_ROLE TO USER_01 is missing the USER keyword, and the privilege grant is missing the SCHEMA keyword (it should be ON SCHEMA EDW.STG_SCHEMA). Also, there is no need to create a new role for this purpose.

Reference:

Snowflake Documentation: CREATE MATERIALIZED VIEW

Snowflake Documentation: Working with Materialized Views

Snowflake Documentation: GRANT Privileges on a Schema
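
After the grant is in place, the resulting access can be checked with SHOW GRANTS (a sketch using the names from the question):

SHOW GRANTS TO USER USER_01;             -- roles and privileges granted to the user
SHOW GRANTS ON SCHEMA EDW.STG_SCHEMA;    -- privileges granted on the schema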

An Architect is designing a data lake with Snowflake. The company has structured, semi-structured, and unstructured data. The company wants to save the data inside the data lake within the Snowflake system. The company is planning on sharing data among its corporate branches using Snowflake data sharing.

What should be considered when sharing the unstructured data within Snowflake?

A. A pre-signed URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with no time limit for the URL.
B. A scoped URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with a 24-hour time limit for the URL.
C. A file URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with a 7-day time limit for the URL.
D. A file URL should be used to save the unstructured data into Snowflake in order to share data over secure views, with the 'expiration_time' argument defined for the URL time limit.

Suggested answer: B

Explanation:

According to the Snowflake documentation, unstructured data files can be shared by using a secure view and Secure Data Sharing. A secure view allows the result of a query to be accessed like a table, and a secure view is specifically designated for data privacy. A scoped URL is an encoded URL that permits temporary access to a staged file without granting privileges to the stage. The URL expires when the persisted query result period ends, which is currently 24 hours. A scoped URL is recommended for file administrators to give scoped access to data files to specific roles in the same account. Snowflake records information in the query history about who uses a scoped URL to access a file, and when. Therefore, a scoped URL is the best option to share unstructured data within Snowflake, as it provides security, accountability, and control over the data access.

Reference:

Sharing unstructured Data with a secure view

Introduction to Loading Unstructured Data
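
A minimal sketch of sharing unstructured files through a secure view over a directory table with scoped URLs (the stage and view names are placeholders):

-- Stage with a directory table tracking the unstructured files
CREATE STAGE IF NOT EXISTS unstructured_stage DIRECTORY = (ENABLE = TRUE);

-- Secure view exposing scoped URLs; each URL expires after 24 hours
CREATE OR REPLACE SECURE VIEW shared_files_v AS
SELECT relative_path,
       BUILD_SCOPED_FILE_URL(@unstructured_stage, relative_path) AS scoped_url
FROM DIRECTORY(@unstructured_stage);

The secure view can then be added to a share so that consumer accounts retrieve the files only through the scoped URLs.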

Consider the following scenario where a masking policy is applied on the CREDITCARDNO column of the CREDITCARDINFO table. The masking policy definition is as follows:

Sample data for the CREDITCARDINFO table is as follows:

NAME EXPIRYDATE CREDITCARDNO

JOHN DOE 2022-07-23 4321 5678 9012 1234

If the Snowflake system roles have not been granted any additional roles, what will be the result?

A. The sysadmin can see the CREDITCARDNO column data in clear text.
B. The owner of the table will see the CREDITCARDNO column data in clear text.
C. Anyone with the PI_ANALYTICS role will see the last 4 characters of the CREDITCARDNO column data in clear text.
D. Anyone with the PI_ANALYTICS role will see the CREDITCARDNO column as '***MASKED***'.

Suggested answer: D

Explanation:

The masking policy defined in the image indicates that if a user has the PI_ANALYTICS role, they will be able to see the last 4 characters of the CREDITCARDNO column data in clear text. Otherwise, they will see 'MASKED'. Since Snowflake system roles have not been granted any additional roles, they won't have the PI_ANALYTICS role and therefore cannot view the last 4 characters of credit card numbers.

To apply a masking policy on a column in Snowflake, you need to use the ALTER TABLE ... ALTER COLUMN command or the ALTER VIEW command and specify the policy name. For example, to apply the creditcardno_mask policy on the CREDITCARDNO column of the CREDITCARDINFO table, you can use the following command:

ALTER TABLE CREDITCARDINFO ALTER COLUMN CREDITCARDNO SET MASKING POLICY creditcardno_mask;
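
The policy body itself appears in an image that is not reproduced here; purely for illustration, a CASE-based masking policy generally takes the following shape (the role name and masking expression below are placeholders, not the policy from the question):

CREATE OR REPLACE MASKING POLICY creditcardno_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('SOME_AUTHORIZED_ROLE') THEN val   -- placeholder role that sees clear text
    ELSE '***MASKED***'                                        -- value returned to everyone else
  END;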

For more information on how to create and use masking policies in Snowflake, you can refer to the following resources:

CREATE MASKING POLICY: This document explains the syntax and usage of the CREATE MASKING POLICY command, which allows you to create a new masking policy or replace an existing one.

Using Dynamic Data Masking: This guide provides instructions on how to configure and use dynamic data masking in Snowflake, which is a feature that allows you to mask sensitive data based on the execution context of the user.

ALTER MASKING POLICY: This document explains the syntax and usage of the ALTER MASKING POLICY command, which allows you to modify the properties of an existing masking policy.

A company is trying to ingest 10 TB of CSV data into a Snowflake table using Snowpipe as part of its migration from a legacy database platform. The records need to be ingested in the MOST performant and cost-effective way.

How can these requirements be met?

A. Use ON_ERROR = CONTINUE in the COPY INTO command.
B. Use PURGE = TRUE in the COPY INTO command.
C. Use PURGE = FALSE in the COPY INTO command.
D. Use ON_ERROR = SKIP_FILE in the COPY INTO command.

Suggested answer: D

Explanation:

For ingesting a large volume of CSV data into Snowflake using Snowpipe, especially a substantial amount like 10 TB, the ON_ERROR = SKIP_FILE option in the COPY INTO command can be highly effective. This approach allows Snowpipe to skip over files that cause errors during the ingestion process, thereby not halting or significantly slowing down the overall data load. It helps maintain performance and cost-effectiveness by avoiding the reprocessing of problematic files and continuing with the ingestion of the other data.
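
A minimal sketch of a Snowpipe definition carrying this option (the pipe, table, and stage names are placeholders):

CREATE OR REPLACE PIPE sales_pipe AUTO_INGEST = TRUE AS
  COPY INTO sales_raw
  FROM @sales_stage
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
  ON_ERROR = 'SKIP_FILE';  -- skip problem files instead of failing the load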

An Architect is troubleshooting a query with poor performance using the QUERY function. The Architect observes that the COMPILATION_TIME Is greater than the EXECUTION_TIME.

What is the reason for this?

A. The query is processing a very large dataset.
B. The query has overly complex logic.
C. The query is queued for execution.
D. The query is reading from remote storage.

Suggested answer: B

Explanation:

The correct answer is B because the compilation time is the time it takes for the optimizer to create an optimal query plan for the efficient execution of the query. The compilation time depends on the complexity of the query, such as the number of tables, columns, joins, filters, aggregations, subqueries, etc. The more complex the query, the longer it takes to compile.

Option A is incorrect because the query processing time is not affected by the size of the dataset, but by the size of the virtual warehouse. Snowflake automatically scales the compute resources to match the data volume and parallelizes the query execution. The size of the dataset may affect the execution time, but not the compilation time.

Option C is incorrect because the query queue time is not part of the compilation time or the execution time. It is a separate metric that indicates how long the query waits for a warehouse slot before it starts running. The query queue time depends on the warehouse load, concurrency, and priority settings.

Option D is incorrect because the query remote IO time is not part of the compilation time or the execution time. It is a separate metric that indicates how long the query spends reading data from remote storage, such as S3 or Azure Blob Storage. The query remote IO time depends on the network latency, bandwidth, and caching efficiency.

Reference:

Understanding Why Compilation Time in Snowflake Can Be Higher than Execution Time: This article explains why the total duration (compilation + execution) time is an essential metric to measure query performance in Snowflake. It discusses the reasons for the long compilation time, including query complexity and the number of tables and columns.

Exploring Execution Times: This document explains how to examine the past performance of queries and tasks using Snowsight or by writing queries against views in the ACCOUNT_USAGE schema. It also describes the different metrics and dimensions that affect query performance, such as duration, compilation, execution, queue, and remote IO time.

What is the ''compilation time'' and how to optimize it?: This community post provides some tips and best practices on how to reduce the compilation time, such as simplifying the query logic, using views or common table expressions, and avoiding unnecessary columns or joins.
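
The two metrics can be compared directly from the query history; a sketch against the ACCOUNT_USAGE view (the one-day time filter is only an example):

SELECT query_id,
       compilation_time,      -- milliseconds spent compiling the query plan
       execution_time,        -- milliseconds spent executing the plan
       total_elapsed_time
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP())
  AND compilation_time > execution_time
ORDER BY compilation_time DESC
LIMIT 20;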

A user is executing the following commands sequentially within a timeframe of 10 minutes from start to finish:

What would be the output of this query?

A. Table T_SALES_CLONE successfully created.
B. Time Travel data is not available for table T_SALES.
C. The offset => is not a valid clause in the clone operation.
D. Syntax error line 1 at position 58 unexpected 'at'.

Suggested answer: A

Explanation:

The query is executing a clone operation on an existing table t_sales with an offset to account for the retention time. The syntax used is correct for cloning a table in Snowflake, and the use of the at(offset => -60*30) clause is valid. This specifies that the clone should be based on the state of the table 30 minutes prior (60 seconds * 30). Assuming the table t_sales exists and has been modified within the last 30 minutes, and considering the data_retention_time_in_days is set to 1 day (which enables time travel queries for the past 24 hours), the table t_sales_clone would be successfully created based on the state of t_sales 30 minutes before the clone command was issued.
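
Since the commands themselves are shown in an image, the clone statement described in the explanation would look roughly like this (a sketch; the table names are taken from the explanation):

CREATE TABLE t_sales_clone CLONE t_sales
  AT (OFFSET => -60*30);  -- clone the state of t_sales as of 30 minutes ago (Time Travel)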

A Snowflake Architect is working with Data Modelers and Table Designers to draft an ELT framework specifically for data loading using Snowpipe. The Table Designers will add a timestamp column that inserts the current timestamp as the default value as records are loaded into a table. The intent is to capture the time when each record gets loaded into the table; however, when tested, the timestamps are earlier than the LOAD_TIME column values returned by the COPY_HISTORY function or the COPY_HISTORY view (Account Usage).

Why is this occurring?

A. The timestamps are different because there are parameter setup mismatches. The parameters need to be realigned.
B. The Snowflake timezone parameter is different from the cloud provider's parameters, causing the mismatch.
C. The Table Designer team has not used the localtimestamp or systimestamp functions in the Snowflake COPY statement.
D. The CURRENT_TIME is evaluated when the load operation is compiled in cloud services rather than when the record is inserted into the table.

Suggested answer: D

Explanation:

The correct answer is D because the CURRENT_TIME function returns the current timestamp at the start of the statement execution, not at the time of the record insertion. Therefore, if the load operation takes some time to complete, the CURRENT_TIME value may be earlier than the actual load time.

Option A is incorrect because the parameter setup mismatches do not affect the timestamp values. The parameters are used to control the behavior and performance of the load operation, such as the file format, the error handling, the purge option, etc.

Option B is incorrect because the Snowflake timezone parameter and the cloud provider's parameters are independent of each other. The Snowflake timezone parameter determines the session timezone for displaying and converting timestamp values, while the cloud provider's parameters determine the physical location and configuration of the storage and compute resources.

Option C is incorrect because the localtimestamp and systimestamp functions are not relevant for the Snowpipe load operation. The localtimestamp function returns the current timestamp in the session timezone, while the systimestamp function returns the current timestamp in the system timezone. Neither of them reflects the actual load time of the records.

Reference:

Snowflake Documentation: Loading Data Using Snowpipe: This document explains how to use Snowpipe to continuously load data from external sources into Snowflake tables. It also describes the syntax and usage of the COPY INTO command, which supports various options and parameters to control the loading behavior.

Snowflake Documentation: Date and Time Data Types and Functions: This document explains the different data types and functions for working with date and time values in Snowflake. It also describes how to set and change the session timezone and the system timezone.

Snowflake Documentation: Querying Metadata: This document explains how to query the metadata of the objects and operations in Snowflake using various functions, views, and tables. It also describes how to access the copy history information using the COPY_HISTORY function or the COPY_HISTORY view.
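
A minimal sketch of the pattern being discussed (the table, column, and stage names are placeholders):

-- Per the explanation above, the default is evaluated when the load statement is compiled,
-- not separately for each record as it is inserted
CREATE OR REPLACE TABLE stg_orders (
  order_id NUMBER,
  payload  VARIANT,
  load_ts  TIMESTAMP_LTZ DEFAULT CURRENT_TIMESTAMP()
);

-- Compare the column values against the load history reported by COPY_HISTORY
SELECT file_name, last_load_time
FROM TABLE(information_schema.copy_history(
       TABLE_NAME => 'STG_ORDERS',
       START_TIME => DATEADD(hour, -1, CURRENT_TIMESTAMP())));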
