Snowflake DSA-C02 Practice Test - Questions Answers, Page 5
Question 41

Which of the following is a Python-based web application framework for visualizing data and analyzing results in a more efficient and flexible way?
Explanation:
Streamlit is a Python-based web application framework for visualizing data and analyzing results in a more efficient and flexible way. It is an open-source library that helps data scientists and academics develop machine learning (ML) visualization dashboards in a short period of time. Powerful data applications can be built and deployed with just a few lines of code.
Why Streamlit?
Real-world applications are in high demand, and developers keep creating new libraries and frameworks to make on-the-go dashboards easier to build and deploy. Streamlit is a library that reduces dashboard development time from days to hours. The following are some reasons to choose Streamlit:
It is a free and open-source library.
Installing Streamlit is as simple as installing any other Python package.
It is easy to learn: no web development experience is needed, and a basic understanding of Python is enough to build a data application.
It is compatible with almost all machine learning frameworks, including TensorFlow, PyTorch, and scikit-learn, and with visualization libraries such as Seaborn, Altair, Plotly, and many others.
Question 42

Which is the visual depiction of data through the use of graphs, plots, and informational graphics?
Explanation:
Data visualization is the visual depiction of data through the use of graphs, plots, and informational graphics. Its practitioners use statistics and data science to convey the meaning behind data in ethical and accurate ways.
Question 43

Which method is used for detecting data outliers in Machine learning?
Explanation:
What are outliers?
Outliers are values that look different from the other values in the data. When the data is plotted, outliers can be seen at both extremes.
Reasons for outliers in data
Errors during data entry or a faulty measuring device (a faulty sensor may result in extreme readings).
Natural occurrence (salaries of junior level employees vs C-level employees)
Problems caused by outliers
Outliers in the data may cause problems during model fitting (especially for linear models).
Outliers may inflate error metrics that give higher weight to large errors (for example, mean squared error and RMSE).
The Z-score method is one of the methods for detecting outliers. It is generally used when a variable's distribution looks close to Gaussian. The Z-score is the number of standard deviations by which a value of a variable lies away from the variable's mean.
Z-Score = (X-mean) / Standard deviation
The IQR method and box plots are further examples of methods used to detect outliers in data science.
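The two detection rules mentioned above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation. The function names, the sample readings, the Z-score threshold of 3, and the IQR multiplier of 1.5 are assumptions chosen for the example (both cut-offs are common conventions, not values given in the explanation).

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds `threshold`.

    Z-score = (X - mean) / standard deviation, as in the formula above.
    The population standard deviation is used here (an assumption).
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, so nothing can be an outlier
    return [x for x in values if abs((x - mean) / std) > threshold]


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR].

    These are the same fences a box plot draws its whiskers from.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]


# Hypothetical sensor readings with one faulty extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 95]
print(zscore_outliers(readings))
print(iqr_outliers(readings))
```

Note that in small samples a single extreme value also inflates the standard deviation, which is one reason the IQR rule is often preferred when the data is not close to Gaussian.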
Question 44

Mark the correct steps for saving the contents of a DataFrame to a Snowflake table as part of Moving Data from Spark to Snowflake?
Explanation:
Moving Data from Spark to Snowflake
The steps for saving the contents of a DataFrame to a Snowflake table are similar to writing from Snowflake to Spark:
1. Use the write() method of the DataFrame to construct a DataFrameWriter.
2. Specify SNOWFLAKE_SOURCE_NAME using the format() method.
3. Specify the connector options using either the option() or options() method.
4. Use the dbtable option to specify the table to which data is written.
5. Use the mode() method to specify the save mode for the content.
Examples
df.write
  .format(SNOWFLAKE_SOURCE_NAME)
  .options(sfOptions)
  .option("dbtable", "t2")
  .mode(SaveMode.Overwrite)
  .save()
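For readers working in PySpark rather than Scala, the same five steps might look like the sketch below. The helper name `save_to_snowflake` and its parameters are hypothetical. The sketch assumes the Snowflake Spark connector is available on the cluster and that `df` is an existing Spark DataFrame. In PySpark, `write` is a property rather than a method, and the save mode is passed as a string.

```python
# Fully qualified class name of the Snowflake Spark data source.
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"


def save_to_snowflake(df, sf_options, table, save_mode="overwrite"):
    """Write the contents of a Spark DataFrame to a Snowflake table.

    `sf_options` is a dict of connector options (for example sfURL,
    sfUser, sfDatabase, sfSchema, sfWarehouse); its values are supplied
    by the caller and are not shown here.
    """
    (
        df.write                        # 1. obtain a DataFrameWriter
        .format(SNOWFLAKE_SOURCE_NAME)  # 2. select the Snowflake source
        .options(**sf_options)          # 3. pass the connector options
        .option("dbtable", table)       # 4. name the target table
        .mode(save_mode)                # 5. choose the save mode
        .save()                         # run the write
    )
```

A call such as `save_to_snowflake(df, sf_options, "t2")` would then correspond to the example above.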
Question 45

Select the Data Science Tools which are known to provide native connectivity to Snowflake?
Explanation:
Hex: collaborative data science and analytics platform
Denodo: data virtualization and federation platform
DvSum: data catalog and data intelligence platform
Diyotta: data integration and migration
Question 46

Which one of the following is not a key component when designing external functions within Snowflake?
Explanation:
What is an External Function?
An external function calls code that is executed outside Snowflake.
The remotely executed code is known as a remote service.
Information sent to a remote service is usually relayed through a proxy service.
Snowflake stores security-related external function information in an API integration.
External Function:
An external function is a type of UDF. Unlike other UDFs, an external function does not contain its own code; instead, the external function calls code that is stored and executed outside Snowflake.
Inside Snowflake, the external function is stored as a database object that contains information that Snowflake uses to call the remote service. This stored information includes the URL of the proxy service that relays information to and from the remote service.
Remote Service:
The remotely executed code is known as a remote service.
The remote service must act like a function. For example, it must return a value.
Snowflake supports scalar external functions; the remote service must return exactly one row for each row received.
Proxy Service:
Snowflake does not call a remote service directly. Instead, Snowflake calls a proxy service, which relays the data to the remote service.
The proxy service can increase security by authenticating requests to the remote service.
The proxy service can support subscription-based billing for a remote service. For example, the proxy service can verify that a caller to the remote service is a paid subscriber.
The proxy service also relays the response from the remote service back to Snowflake.
Examples of proxy services include:
Amazon API Gateway.
Microsoft Azure API Management service.
API Integration:
An integration is a Snowflake object that provides an interface between Snowflake and third-party services. An API integration stores information, such as security information, that is needed to work with a proxy service or remote service.
An API integration is created with the CREATE API INTEGRATION command.
Users can write and call their own remote services, or call remote services written by third parties. These remote services can be written using any HTTP server stack, including cloud serverless compute services such as AWS Lambda.
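To make the remote service contract concrete, here is a minimal sketch of a remote service written as an AWS Lambda handler behind Amazon API Gateway. It assumes the proxy passes the request through with a JSON string in `event["body"]`, and that Snowflake batches rows as `{"data": [[row_number, argument, ...], ...]}` and expects the same shape back. The upper-casing logic and the 400 error response are illustrative choices, not part of the explanation above.

```python
import json


def lambda_handler(event, context):
    """Hypothetical remote service: upper-cases one string argument per row."""
    try:
        payload = json.loads(event["body"])
        rows = payload["data"]

        # Return exactly one output row per input row, keeping the row
        # number so that each result is matched to the row that produced it.
        results = [[row[0], str(row[1]).upper()] for row in rows]

        return {"statusCode": 200, "body": json.dumps({"data": results})}
    except Exception as err:
        # Any malformed request is reported back with a non-200 status.
        return {"statusCode": 400, "body": json.dumps({"error": str(err)})}
```

Inside Snowflake, an external function pointing at the proxy URL would then behave like a scalar UDF that calls this handler for each batch of rows.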
Question 47

Which of the following are known limitations of using external functions?
Question 48

What is the risk with tuning hyper-parameters using a test dataset?
Question 49

Select the correct mappings:
I) W, the weights or coefficients of the independent variables in the linear regression model --> Model Parameter
II) K in the K-Nearest Neighbour algorithm --> Model Hyperparameter
III) Learning rate for training a neural network --> Model Hyperparameter
IV) Batch Size --> Model Parameter
Question 50

Performance metrics are a part of every machine learning pipeline. Which of the following are not performance metrics used in machine learning?
Question