
Microsoft DP-203 Practice Test - Questions Answers, Page 19


Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes an Azure Databricks notebook, and then inserts the data into the data warehouse.

Does this meet the goal?

A. Yes
B. No
Suggested answer: B

Explanation:

An Azure Databricks notebook is not the right transformation step here. If you need to transform data in a way that is not supported by Data Factory, create a custom activity with your own data-processing logic and use that activity in the pipeline. For example, you can create a custom activity that runs R scripts on an HDInsight cluster with R installed.

Reference:

https://docs.microsoft.com/en-US/azure/data-factory/transform-data
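
A custom activity only wraps your own executable logic; Data Factory just schedules and launches it. As a minimal sketch of the kind of wrapper such an activity could run (the transform.R script and all file paths are hypothetical placeholders), a Python entry point that shells out to Rscript might look like this:

```python
import subprocess
import sys

def run_r_transform(input_path: str, output_path: str) -> None:
    """Invoke an R script (hypothetical transform.R) against the staged data."""
    result = subprocess.run(
        ["Rscript", "transform.R", input_path, output_path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface the R error output so the activity run shows why it failed.
        print(result.stderr, file=sys.stderr)
        raise RuntimeError(f"Rscript failed with exit code {result.returncode}")

if __name__ == "__main__":
    # Paths are illustrative placeholders; a real activity would receive them
    # from command-line arguments or the activity's extended properties.
    run_r_transform("staging/incremental.csv", "output/transformed.csv")
```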

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes a mapping data flow, and then inserts the data into the data warehouse.

Does this meet the goal?

A. Yes
B. No
Suggested answer: A

Explanation:


Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You schedule an Azure Databricks job that executes an R notebook, and then inserts the data into the data warehouse.

Does this meet the goal?

A. Yes
B. No
Suggested answer: A

Explanation:


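A hedged sketch of creating such a scheduled job programmatically, assuming the Databricks Jobs API 2.1 (/api/2.1/jobs/create); the workspace URL, access token, notebook path, cluster ID, and cron expression are all placeholders:

```python
import requests

# Placeholders: substitute your workspace URL and a personal access token.
DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<personal-access-token>"

job_spec = {
    "name": "daily-staging-r-transform",
    # Run every day at 02:00 UTC (Quartz cron syntax).
    "schedule": {"quartz_cron_expression": "0 0 2 * * ?", "timezone_id": "UTC"},
    "tasks": [
        {
            "task_key": "transform_and_load",
            "notebook_task": {"notebook_path": "/Shared/transform_staging_r"},
            "existing_cluster_id": "<cluster-id>",
        }
    ],
}

response = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
    timeout=30,
)
response.raise_for_status()
print("Created job:", response.json()["job_id"])
```

The R notebook itself would read the incremental files from the staging zone, run the transformation, and write the result to the Synapse data warehouse.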
You plan to create an Azure Data Factory pipeline that will include a mapping data flow. You have JSON data containing objects that have nested arrays. You need to transform the JSON-formatted data into a tabular dataset. The dataset must have one row for each item in the arrays. Which transformation method should you use in the mapping data flow?

A. new branch
B. unpivot
C. alter row
D. flatten
Suggested answer: D

Explanation:

Use the flatten transformation to take array values inside hierarchical structures such as JSON and unroll them into individual rows. This process is known as denormalization.

Reference:

https://docs.microsoft.com/en-us/azure/data-factory/data-flow-flatten
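
The flatten transformation itself is configured in the mapping data flow designer rather than in code. Purely to illustrate the same denormalization (the sample JSON and column names are invented), a PySpark explode produces one output row per nested array item:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, col

spark = SparkSession.builder.appName("flatten-demo").getOrCreate()

# Hypothetical JSON: one object with a nested "items" array.
sample = ['{"orderId": 1, "customer": "Contoso", "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]}']
df = spark.read.json(spark.sparkContext.parallelize(sample))

# explode() unrolls the nested array so each array element becomes its own row,
# which is the same effect the flatten transformation has in a mapping data flow.
flat = df.select(
    col("orderId"),
    col("customer"),
    explode(col("items")).alias("item"),
).select("orderId", "customer", "item.sku", "item.qty")

flat.show()
# +-------+--------+---+---+
# |orderId|customer|sku|qty|
# +-------+--------+---+---+
# |      1| Contoso| A1|  2|
# |      1| Contoso| B2|  1|
# +-------+--------+---+---+
```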

You use Azure Stream Analytics to receive Twitter data from Azure Event Hubs and to output the data to an Azure Blob storage account. You need to output the count of tweets during the last five minutes every five minutes. Each tweet must only be counted once. Which windowing function should you use?

A. a five-minute Sliding window
B. a five-minute Session window
C. a five-minute Hopping window that has a one-minute hop
D. a five-minute Tumbling window
Suggested answer: D

Explanation:

Tumbling window functions are used to segment a data stream into distinct, fixed-size time segments and perform a function against them. The key differentiators of a tumbling window are that windows repeat, do not overlap, and an event cannot belong to more than one tumbling window.

Reference:

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions
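
The window is declared in the Stream Analytics query language; as a language-neutral sketch of why a five-minute tumbling window counts each tweet exactly once (the event timestamps are invented), note that every event maps to exactly one non-overlapping bucket:

```python
from collections import Counter
from datetime import datetime

def tumbling_window_start(ts: datetime) -> datetime:
    # Truncate to the most recent five-minute boundary; each event
    # therefore belongs to exactly one window and is counted once.
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)

# Hypothetical tweet event times.
events = [
    datetime(2024, 1, 1, 10, 1, 30),
    datetime(2024, 1, 1, 10, 4, 59),
    datetime(2024, 1, 1, 10, 5, 0),
]

counts = Counter(tumbling_window_start(ts) for ts in events)
for window_start, count in sorted(counts.items()):
    print(window_start, count)
# 2024-01-01 10:00:00 2
# 2024-01-01 10:05:00 1
```

A sliding or hopping window would count the same tweet in multiple overlapping windows, which is what the question rules out.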

You are planning a streaming data solution that will use Azure Databricks. The solution will stream sales transaction data from an online store. The solution has the following specifications:

The output data will contain items purchased, quantity, line total sales amount, and line total tax amount.

Line total sales amount and line total tax amount will be aggregated in Databricks.

Sales transactions will never be updated. Instead, new rows will be added to adjust a sale.

You need to recommend an output mode for the dataset that will be processed by using Structured Streaming. The solution must minimize duplicate data.

What should you recommend?

A. Update
B. Complete
C. Append
Suggested answer: B

Explanation:

By default, streams run in append mode, which adds new records to the table.

Reference:

https://docs.databricks.com/delta/delta-streaming.html
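
The output mode is set on the stream writer. A minimal PySpark sketch, assuming a Delta source and placeholder paths and column names, showing where the mode is chosen and what each mode emits:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import sum as sum_

spark = SparkSession.builder.appName("sales-stream").getOrCreate()

# Placeholder source: a Delta table of raw sales transaction rows.
sales = spark.readStream.format("delta").load("/mnt/raw/sales")

totals = sales.groupBy("item").agg(
    sum_("line_total_sales_amount").alias("total_sales"),
    sum_("line_total_tax_amount").alias("total_tax"),
)

# outputMode controls what is emitted on each trigger:
#   "append"   - only rows added since the last trigger
#   "complete" - the full recomputed aggregate result
#   "update"   - only rows that changed since the last trigger
query = (
    totals.writeStream.outputMode("complete")
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/sales_totals")
    .start("/mnt/curated/sales_totals")
)
```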

You have an enterprise data warehouse in Azure Synapse Analytics named DW1 on a server named Server1. You need to determine the size of the transaction log file for each distribution of DW1. What should you do?

A. On DW1, execute a query against the sys.database_files dynamic management view.
B. From Azure Monitor in the Azure portal, execute a query against the logs of DW1.
C. Execute a query against the logs of DW1 by using the Get-AzOperationalInsightsSearchResult PowerShell cmdlet.
D. On the master database, execute a query against the sys.dm_pdw_nodes_os_performance_counters dynamic management view.
Suggested answer: A

Explanation:

For information about the current log file size, its maximum size, and the autogrow option for the file, use the size, max_size, and growth columns for that log file in sys.database_files.

Reference:

https://docs.microsoft.com/en-us/sql/relational-databases/logs/manage-the-size-of-the-transaction-log-file
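
A hedged sketch of running that query from Python with pyodbc (the server name, database, credentials, and driver version are placeholders); size in sys.database_files is reported in 8 KB pages, hence the conversion to megabytes:

```python
import pyodbc

# Placeholder connection details for the dedicated SQL pool (DW1).
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:server1.sql.azuresynapse.net,1433;"
    "Database=DW1;Uid=<user>;Pwd=<password>;Encrypt=yes;"
)

query = """
SELECT name,
       size * 8 / 1024.0 AS size_mb,   -- size is reported in 8 KB pages
       max_size,
       growth
FROM sys.database_files
WHERE type_desc = 'LOG';
"""

for name, size_mb, max_size, growth in conn.cursor().execute(query):
    print(name, size_mb, max_size, growth)
```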

You are designing an anomaly detection solution for streaming data from an Azure IoT hub. The solution must meet the following requirements:

Send the output to Azure Synapse.

Identify spikes and dips in time series data.

Minimize development and configuration effort.

What should you include in the solution?

A. Azure Databricks
B. Azure Stream Analytics
C. Azure SQL Database
Suggested answer: B

Explanation:

You can identify anomalies by routing data via IoT Hub to a built-in ML model in Azure Stream Analytics.

Reference:

https://docs.microsoft.com/en-us/learn/modules/data-anomaly-detection-using-azure-iot-hub/
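
In Stream Analytics the detection is declarative, using the built-in AnomalyDetection_SpikeAndDip function in the query, so no custom code is needed. Purely to illustrate the underlying idea of spike-and-dip detection on a time series (the sample readings and threshold are invented, and this is not the Stream Analytics model), a rolling z-score check looks like this:

```python
from statistics import mean, stdev

def detect_spikes_and_dips(values, window=10, threshold=3.0):
    """Flag points whose z-score against the trailing window exceeds the threshold."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window : i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            anomalies.append((i, values[i]))
    return anomalies

# Hypothetical sensor readings containing one spike and one dip.
readings = [20.0, 20.1, 19.9, 20.2, 20.0, 19.8, 20.1, 20.0, 19.9, 20.1, 35.0, 20.0, 5.0, 20.1]
print(detect_spikes_and_dips(readings))  # [(10, 35.0), (12, 5.0)]
```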

A company uses Azure Stream Analytics to monitor devices.

The company plans to double the number of devices that are monitored. You need to monitor a Stream Analytics job to ensure that there are enough processing resources to handle the additional load. Which metric should you monitor?

A. Early Input Events
B. Late Input Events
C. Watermark delay
D. Input Deserialization Errors
Suggested answer: C

Explanation:

Watermark delay is the best indicator of whether a Stream Analytics job has enough processing resources: if the watermark delay steadily increases, the job cannot keep up with the incoming load and needs to be scaled.

Reference:

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-monitoring

You have an Azure Stream Analytics job.

You need to ensure that the job has enough streaming units provisioned. You configure monitoring of the SU % Utilization metric.

Which two additional metrics should you monitor? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. Backlogged Input Events
B. Watermark Delay
C. Function Events
D. Out of order Events
E. Late Input Events
Suggested answer: A, B

Explanation:

To react to increased workloads and increase streaming units, consider setting an alert of 80% on the SU % Utilization metric. You can also use the Watermark Delay and Backlogged Input Events metrics to see whether there is an impact.

Note: Backlogged Input Events is the number of input events that are backlogged. A non-zero value for this metric implies that your job isn't able to keep up with the number of incoming events. If this value is slowly increasing or consistently non-zero, you should scale out your job by increasing the SUs.

Reference:

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-monitoring
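
A hedged sketch of reading these metrics programmatically with the azure-monitor-query and azure-identity packages; the resource ID is a placeholder, and the metric names ResourceUtilization, InputEventsSourcesBacklogged, and OutputWatermarkDelaySeconds are assumed to correspond to SU % utilization, backlogged input events, and watermark delay:

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient, MetricAggregationType

# Placeholder resource ID of the Stream Analytics job.
resource_id = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>"
    "/providers/Microsoft.StreamAnalytics/streamingjobs/<job-name>"
)

client = MetricsQueryClient(DefaultAzureCredential())

# Query the last hour of data at five-minute granularity.
response = client.query_resource(
    resource_id,
    metric_names=[
        "ResourceUtilization",            # assumed name for SU % utilization
        "InputEventsSourcesBacklogged",   # assumed name for backlogged input events
        "OutputWatermarkDelaySeconds",    # assumed name for watermark delay
    ],
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.AVERAGE],
)

for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(metric.name, point.timestamp, point.average)
```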
