
Snowflake DEA-C01 Practice Test - Questions Answers, Page 8

For the most efficient and cost-effective data load experience, which of the following considerations should the Data Engineer NOT take into account?

A. Split larger files into a greater number of smaller files to maximize the processing overhead for each file. (Correct)
B. Enable the STRIP_OUTER_ARRAY file format option for the COPY INTO <table> command to remove the outer array structure and load the records into separate table rows.
C. Amazon Kinesis Firehose can be a convenient way to aggregate and batch data files; it also allows defining both the desired file size, called the buffer size, and the wait interval after which a new file is sent, called the buffer interval.
D. When preparing your delimited text (CSV) files for loading, the number of columns in each row should be consistent.
E. If the "null" values in your files indicate missing values and have no other special meaning, Snowflake recommends setting the file format option STRIP_NULL_VALUES to TRUE when loading semi-structured data files.
Suggested answer: A

Explanation:

Split larger files into a greater number of smaller files to distribute the load among the compute resources in an active warehouse. This would minimize the processing overhead rather than maximize it.

The rest are recommended data loading considerations.
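As an illustration of options B and E, a minimal COPY INTO sketch (the table, stage, and file names are hypothetical, and the target table is assumed to have a single VARIANT column):

copy into product_json_tbl
from @json_stage/products.json.gz
-- strip_outer_array removes the outer [ ... ] wrapper so each element loads as a separate row;
-- strip_null_values removes object fields or array elements containing null, per option E.
file_format = (type = json strip_outer_array = true strip_null_values = true);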

The COPY command supports several options for loading data files from a stage, i.e.:

I. By path.
II. Specifying a list of specific files to load.
III. Using pattern matching to identify specific files by pattern.
IV. Organizing files into logical paths that reflect a scheduling pattern.

Of the aforesaid options for identifying/specifying data files to load from a stage, which option is generally the fastest and most suitable?

A. I
B. II
C. III
D. IV
Suggested answer: B

Explanation:

Of the above options for identifying/specifying data files to load from a stage, providing a discrete list of files is generally the fastest; however, the FILES parameter supports a maximum of 1,000 files, meaning a COPY command executed with the FILES parameter can only load up to 1,000 files.

For example:

copy into load1 from @%load1/Snow1/ files=('mydata1.csv', 'mydata2.csv', 'mydata3.csv')

As a Data Engineer, you have a requirement to load a set of new product files containing product-relevant information into Snowflake internal tables. Later you analyzed that some of the source files were already loaded in a historical batch, so you prechecked the metadata column LAST_MODIFIED date for the staged data files and found that the LAST_MODIFIED date is older than 64 days for a few files, and that the initial set of data was loaded into the table more than 64 days earlier. Which is the best approach to load the source data files with expired load metadata, along with the set of files whose metadata might still be available, while avoiding data duplication?

A. Since the initial set of data for the table (i.e. the first batch after the table was created) was loaded, we can simply use the COPY INTO command to load all the product files with the known load status, irrespective of their LAST_MODIFIED date column values.
B. The COPY command cannot definitively determine whether a file has already been loaded if the LAST_MODIFIED date is older than 64 days and the initial set of data was loaded into the table more than 64 days earlier (and if the file was loaded into the table, that also occurred more than 64 days earlier). In this case, to prevent accidental reload, the command skips the product files by default.
C. Set the FORCE option to load all files, ignoring load metadata if it exists.
D. To load files whose metadata has expired, set the LOAD_UNCERTAIN_FILES copy option to true.
Suggested answer: D

Explanation:

To load files whose metadata has expired, set the LOAD_UNCERTAIN_FILES copy option to true. The copy option references load metadata, if available, to avoid data duplication, but also attempts to load files with expired load metadata.

Alternatively, set the FORCE option to load all files, ignoring load metadata if it exists. Note that this option reloads files, potentially duplicating data in a table.
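A minimal sketch of the two approaches (the table and stage names below are hypothetical):

-- Preferred: load files with expired load metadata while still using load metadata where it exists.
copy into product_tbl
from @product_stage
load_uncertain_files = true
file_format = (type = csv);

-- Alternative: FORCE reloads every file and may duplicate data already in the table.
copy into product_tbl
from @product_stage
force = true
file_format = (type = csv);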

Please refer to the example in the link below:

https://docs.snowflake.com/en/user-guide/data-load-considerations-load.html#loading-older-files

If external software, e.g. TIBCO, exports data fields enclosed in quotes but inserts a leading space before the opening quotation character for each field, how does Snowflake handle it? [Select 2]

A. Snowflake automatically handles leading spaces by trimming them implicitly and removes the quotation marks enclosing each field.
B. The field_optionally_enclosed_by option, along with the TRIM_IF function, can be used in the COPY INTO statement to handle this scenario successfully.
C. Snowflake reads the leading space rather than the opening quotation character as the beginning of the field, and the quotation characters are interpreted as string data. (Correct)
D. The COPY command trims the leading space and removes the quotation marks enclosing each field, e.g.:
copy into SFtable
from @%SFtable
file_format = (type = csv trim_space=true field_optionally_enclosed_by = '0x22');
Suggested answer: C, D

Explanation:

If your external software exports fields enclosed in quotes but inserts a leading space before the opening quotation character for each field, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field. The quotation characters are interpreted as string data.

Use the TRIM_SPACE file format option to remove undesirable spaces during the data load.
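The same pattern as the code in option D, with comments on what each file format option does (the table and stage names are only illustrative):

-- trim_space = true removes the leading space that precedes the opening quotation character;
-- field_optionally_enclosed_by = '0x22' then treats the double quote (") as the enclosing character.
copy into SFtable
from @%SFtable
file_format = (type = csv trim_space = true field_optionally_enclosed_by = '0x22');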

A Data Engineer is loading a file named snowdata.tsv from the /datadir directory on his local machine to a Snowflake stage, prefixing the file with a folder named tablestage. Please mark the correct command that helps him load the file's data into a Snowflake internal table stage.

A. put file://c:\datadir\snowdata.tsv @~/tablestage;
B. put file://c:\datadir\snowdata.tsv @%tablestage;
C. put file://c:\datadir\snowdata.tsv @tablestage;
D. put file:///datadir/snowdata.tsv @%tablestage;
Suggested answer: B

Explanation:

Execute PUT to upload (stage) local data files into an internal stage.

The @% character combination identifies a table stage.
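As a hedged end-to-end sketch (the follow-on COPY is not part of the question, and the tab delimiter is an assumption based on the .tsv extension):

-- Stage the local file into the table stage of the table named tablestage (per option B).
put file://c:\datadir\snowdata.tsv @%tablestage;

-- Load from the table stage; when loading from a table's own stage, the FROM clause can be omitted.
copy into tablestage
file_format = (type = csv field_delimiter = '\t');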

Mark the correct statements for the VALIDATION_MODE option used by a Data Engineer for data loading operations in his/her COPY INTO <table> command:

A. VALIDATION_MODE instructs the COPY command to validate the data files instead of loading them into the specified table; i.e., the COPY command tests the files for errors but does not load them.
B. The VALIDATION_MODE option supports these values: RETURN_n_ROWS, RETURN_ERRORS, RETURN_ALL_ERRORS.
C. VALIDATION_MODE does not support COPY statements that transform data during a load. If the parameter is specified, the COPY statement returns an error.
D. VALIDATION_MODE only supports data loading operations, i.e., it does not work while unloading data.
Suggested answer: A, B, C

Explanation:

All the statements are correct except the one saying that VALIDATION_MODE only supports data loading operations.

VALIDATION_MODE can be used with the COPY INTO <location> command as well, i.e. for data unloading operations.

VALIDATION_MODE = RETURN_ROWS can be used at the time of Data unloading.

This option instructs the COPY command to return the results of the query in the SQL statement instead of unloading the results to the specified cloud storage location. The only supported validation option is RETURN_ROWS. This option returns all rows produced by the query.

When you have validated the query, you can remove the VALIDATION_MODE to perform the unload operation.
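A brief sketch of both directions (the stage and table names are hypothetical):

-- Data loading: test the staged files for errors without loading them.
copy into customer_tbl
from @customer_stage
validation_mode = return_errors;

-- Data unloading: return the query result instead of writing files to the stage.
copy into @customer_stage/unload/
from (select * from customer_tbl)
validation_mode = return_rows;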

To troubleshoot a data load failure in one of your COPY statements, the Data Engineer executed a COPY statement with the VALIDATION_MODE copy option set to RETURN_ALL_ERRORS, referencing the set of files he had attempted to load. Which functions below can facilitate analysis of the problematic records on top of the results produced? [Select 2]

A. RESULT_SCAN
B. LAST_QUERY_ID
C. Rejected_record
D. LOAD_ERROR
Suggested answer: A, B

Explanation:

LAST_QUERY_ID() Function

Returns the ID of a specified query in the current session. If no query is specified, the ID of the most recently executed query is returned.

RESULT_SCAN() Function

Returns the result set of a previous command (within 24 hours of when you executed the query) as if the result was a table.

The following example validates a set of files (SFfile.csv.gz) that contain errors. To facilitate analysis of the errors, a COPY INTO <location> statement then unloads the problematic records into a text file so they can be analyzed and fixed in the original data files. The statement queries the RESULT_SCAN table function.

copy into Snowtable
from @SFstage/SFfile.csv.gz
validation_mode = return_all_errors;

set qid = last_query_id();

copy into @SFstage/errors/load_errors.txt
from (select rejected_record from table(result_scan($qid)));

Note: The other options are not valid functions.

As part of table design, a Data Engineer added a timestamp column that inserts the current timestamp as the default value as records are loaded into a table. The intent is to capture the time when each record was loaded into the table; however, the timestamps are earlier than the LOAD_TIME column values returned by the COPY_HISTORY view (Account Usage). What could be the reason for this issue?

A. The LOAD_TIME column values returned by the COPY_HISTORY view (Account Usage) give the same time as returned by CURRENT_TIMESTAMP.
B. CURRENT_TIMESTAMP values might be different because the query gets executed in a warehouse located in a different region.
C. It might be possible that the cloud provider hosting Snowflake belongs to a region whose server time zone lags the cluster time zone of the warehouse where queries get processed and committed.
D. The reason is that CURRENT_TIMESTAMP is evaluated when the load operation is compiled in cloud services rather than when the record is inserted into the table (i.e. when the transaction for the load operation is committed).
Suggested answer: D

Explanation:

The reason the timestamps are earlier than the LOAD_TIME column values returned by the COPY_HISTORY view (Account Usage) is that CURRENT_TIMESTAMP is evaluated when the load operation is compiled in cloud services rather than when the record is inserted into the table (i.e. when the transaction for the load operation is committed).
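A hedged sketch of the table design described in the question (all object names are illustrative):

-- The default expression is evaluated when the load operation is compiled in cloud services,
-- not when each record's transaction is committed, so load_ts can be earlier than LOAD_TIME.
create or replace table product_tbl (
  product_id integer,
  product_name string,
  load_ts timestamp_ltz default current_timestamp()
);

-- Load times as recorded by Snowflake can then be compared via the Account Usage view.
select *
from snowflake.account_usage.copy_history
where table_name = 'PRODUCT_TBL';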

Snowpipe loads data from files as soon as they are available in a stage. Automated data loads leverage event notifications for cloud storage to inform Snowpipe of the arrival of new data files to load.

Which cloud-hosted platform provides cross-cloud support for automated data loading via Snowpipe?

A. GCP
B. AWS
C. AZURE
D. None of the above currently provides cross-cloud support for Snowpipe.
Suggested answer: B

Explanation:

Cross-cloud support is currently only available to accounts hosted on Amazon Web Services.
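For context, an illustrative auto-ingest pipe definition (the object names are hypothetical, and the cloud-side event notification, e.g. an S3 event to SQS, must be configured separately):

create pipe product_pipe
auto_ingest = true
as
copy into product_tbl
from @product_ext_stage
file_format = (type = csv);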

Find the odd one out:

A. Bulk Data Load: Loads are always performed in a single transaction.
SnowPipe: Loads are combined or split into a single or multiple transactions based on the number and size of the rows in each data file.

B. Bulk Data Load: Requires a user-specified warehouse to execute COPY statements.
SnowPipe: Uses Snowflake-supplied compute resources.

C. Bulk Data Load: Billed for the amount of time each virtual warehouse is active.
SnowPipe: Billed according to the compute resources used in the Snowpipe warehouse while loading the files.

D. Bulk Data Load: Load history stored in the metadata of the target table for 365 days.
SnowPipe: Load history stored in the metadata of the pipe for 64 days.
Suggested answer: D

Explanation:

Bulk data load

Load History Stored in the metadata of the target table for 64 days. Available upon completion of the COPY statement as the statement output.

Snowpipe

Load History Stored in the metadata of the pipe for 14 days. Must be requested from Snowflake via a REST endpoint, SQL table function, or ACCOUNT_USAGE view.

The rest are correct statements.
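As an illustration, load history for both bulk COPY and Snowpipe loads can be retrieved with the COPY_HISTORY table function (the table name below is hypothetical):

-- Returns load activity for the past 24 hours for the specified table.
select *
from table(information_schema.copy_history(
  table_name => 'PRODUCT_TBL',
  start_time => dateadd(hours, -24, current_timestamp())
));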
