Microsoft DP-500 Practice Test - Questions Answers, Page 12

List of questions
Question 111

DRAG DROP
You have an Azure Synapse Analytics serverless SQL pool.
You need to return a list of files and the number of rows in each file.
How should you complete the Transact-SQL statement? To answer, drag the appropriate values to the targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
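For orientation, here is a minimal sketch of the kind of statement this question is about, assembled as a string in Python. It assumes Parquet files; the storage path, the data source name `MyDataSource`, and the alias `r` are hypothetical placeholders, not values from the question. `filepath()` and `COUNT_BIG(*)` are existing serverless SQL pool constructs; the rest of the statement is one possible arrangement, not necessarily the exam's answer layout.

```python
def build_rows_per_file_query(bulk_path: str, data_source: str) -> str:
    """Return T-SQL text that lists each file and the number of rows it holds.

    The arguments are interpolated directly into the statement, so they must
    come from trusted configuration, never from user input.
    """
    return (
        "SELECT\n"
        "    r.filepath() AS [file],\n"   # full path of the file each row came from
        "    COUNT_BIG(*) AS [rows]\n"    # number of rows read from that file
        "FROM OPENROWSET(\n"
        f"    BULK '{bulk_path}',\n"
        f"    DATA_SOURCE = '{data_source}',\n"
        "    FORMAT = 'PARQUET'\n"        # assumption: the files are Parquet
        ") AS r\n"
        "GROUP BY r.filepath();"          # one output row per file
    )


# Hypothetical path and data source, for illustration only.
query = build_rows_per_file_query("parquet/taxi/*.parquet", "MyDataSource")
print(query)
```

Grouping by `r.filepath()` is what turns the row-level scan into one result row per file; `r.filename()` could be used instead when only the file name, not the full path, is wanted.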
Question 112

HOTSPOT
You have an Azure Synapse Analytics serverless SQL pool and an Azure Data Lake Storage Gen2 account.
You need to query all the files in the 'csv/taxi/' folder and all its subfolders. All the files are in CSV format and have a header row.
How should you complete the query? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
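As a rough illustration of the scenario, the sketch below builds a query over a folder and all of its subfolders, for CSV files that have a header row. The data source name `MyDataSource` and the alias `r` are hypothetical. The `/**` suffix, `PARSER_VERSION = '2.0'`, and `HEADER_ROW = TRUE` are existing `OPENROWSET` options in serverless SQL pools; treat the overall statement as one plausible form, not the exam's exact answer area.

```python
def build_recursive_csv_query(folder: str, data_source: str) -> str:
    """Return T-SQL text that reads every CSV file under `folder`, recursively.

    The arguments are interpolated directly into the statement, so they must
    come from trusted configuration, never from user input.
    """
    folder = folder.strip("/")            # normalize 'csv/taxi/' to 'csv/taxi'
    return (
        "SELECT *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{folder}/**',\n"      # '/**' also traverses subfolders
        f"    DATA_SOURCE = '{data_source}',\n"
        "    FORMAT = 'CSV',\n"
        "    PARSER_VERSION = '2.0',\n"   # HEADER_ROW requires parser version 2.0
        "    HEADER_ROW = TRUE\n"         # first row supplies the column names
        ") AS r;"
    )


# 'csv/taxi/' is the folder named in the question; the data source is made up.
query = build_recursive_csv_query("csv/taxi/", "MyDataSource")
print(query)
```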
Question 113

You have a group of data scientists who must create machine learning models and run periodic experiments on a large dataset.
You need to recommend an Azure Synapse Analytics pool for the data scientists. The solution must minimize costs.
Which type of pool should you recommend?
Question 114

HOTSPOT
You manage a dataset that contains the two data sources as shown in the following table.
When you attempt to refresh the dataset in powerbi.com, you receive the following error message:
"[Unable to combine data] Add Columns is accessing data sources that have privacy levels which cannot be used together. Please rebuild this data combination."
You discover that the dataset contains queries that fold data from the SharePoint folder to the Azure SQL database.
You need to resolve the error. The solution must provide the highest privacy possible.
Which privacy level should you select for each data source? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Question 115

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are using an Azure Synapse Analytics serverless SQL pool to query a collection of Apache Parquet files by using automatic schema inference. The files contain more than 40 million rows of UTF-8-encoded business names, survey names, and participant counts. The database is configured to use the default collation.
The queries use OPENROWSET and infer the schema shown in the following table.
You need to recommend changes to the queries to reduce I/O reads and tempdb usage.
Solution: You recommend defining an external table for the Parquet files and updating the query to use the table.
Does this meet the goal?
Question 116

You have a deployment pipeline for a Power BI workspace. The workspace contains two datasets that use import storage mode.
A database administrator reports a drastic increase in the number of queries sent from the Power BI service to an Azure SQL database since the creation of the deployment pipeline.
An investigation into the issue identifies the following:
One of the datasets is larger than 1 GB and has a fact table that contains more than 500 million rows.
When publishing dataset changes to development, test, or production pipelines, a refresh is triggered against the entire dataset.
You need to recommend a solution to reduce the size of the queries sent to the database when the dataset changes are published to development, test, or production.
What should you recommend?
Question 117

You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
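To make "data distribution statistics in a tabular view" concrete, the plain-Python sketch below computes the five figures that PySpark's `DataFrame.describe()` reports for a numeric column (count, mean, stddev, min, max) and prints them as a small table. It is a standard-library stand-in, not PySpark itself; the helper name `describe_column` and the sample values are made up for illustration. In a Spark notebook, `describe()` or the more detailed `summary()` returns such statistics as a DataFrame.

```python
import statistics


def describe_column(values):
    """Return count, mean, sample standard deviation, min, and max."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stddev": statistics.stdev(values),  # sample (n - 1) standard deviation
        "min": min(values),
        "max": max(values),
    }


# Made-up participant counts, for illustration only.
participant_counts = [10, 20, 30, 40]
stats = describe_column(participant_counts)

# Print one statistic per row, similar in spirit to a describe() table.
print(f"{'summary':<10}{'participant_count':>20}")
for name, value in stats.items():
    print(f"{name:<10}{value:>20.4f}")
```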
Question 118

You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
Question 119

You have a deployment pipeline for a Power BI workspace. The workspace contains two datasets that use import storage mode.
A database administrator reports a drastic increase in the number of queries sent from the Power BI service to an Azure SQL database since the creation of the deployment pipeline.
An investigation into the issue identifies the following:
One of the datasets is larger than 1 GB and has a fact table that contains more than 500 million rows.
When publishing dataset changes to development, test, or production pipelines, a refresh is triggered against the entire dataset.
You need to recommend a solution to reduce the size of the queries sent to the database when the dataset changes are published to development, test, or production.
What should you recommend?
Question 120

You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?