
Microsoft DP-203 Practice Test - Questions Answers, Page 26


Question 251


You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1. Table1 contains the following:

One billion rows

A clustered columnstore index

A hash-distributed column named Product Key

A column named Sales Date that is of the date data type and cannot be null

Thirty million rows will be added to Table1 each month. You need to partition Table1 based on the Sales Date column. The solution must optimize query performance and data loading. How often should you create a partition?

A. once per month
B. once per year
C. once per day
D. once per week
Suggested answer: B

Explanation:

For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed. Before partitions are created, dedicated SQL pool already divides each table into 60 distributions; any partitioning added to a table is in addition to those distributions. Thirty million rows are added each month, so a monthly partition would hold only 500,000 rows per distribution; at least two months of data are required to reach the 1 million-row minimum, and of the choices offered, only yearly partitions meet it. As an example, if a sales fact table contained 36 monthly partitions, then given that a dedicated SQL pool has 60 distributions, the table would need to contain 60 million rows per month, or 2.1 billion rows when all months are populated, to meet the minimum. If a table contains fewer than the recommended minimum number of rows per partition, consider using fewer partitions to increase the number of rows per partition.
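A minimal sketch of how yearly partitioning on the Sales Date column could be declared in a dedicated SQL pool (the boundary dates are hypothetical, and other columns are omitted):

CREATE TABLE dbo.Table1
(
    [Product Key] INT NOT NULL,
    [Sales Date] DATE NOT NULL
    -- other columns omitted
)
WITH
(
    CLUSTERED COLUMNSTORE INDEX,
    DISTRIBUTION = HASH([Product Key]),
    PARTITION
    (
        [Sales Date] RANGE RIGHT FOR VALUES
        ('2023-01-01', '2024-01-01', '2025-01-01') -- one boundary per year
    )
);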

Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition


Question 252


You are creating an Apache Spark job in Azure Databricks that will ingest JSON-formatted data. You need to convert a nested JSON string into a DataFrame that will contain multiple rows. Which Spark SQL function should you use?

A. explode
B. filter
C. coalesce
D. extract
Suggested answer: A

Explanation:

Convert nested JSON to a flattened DataFrame

You can flatten nested JSON using only the $"column.*" selector and the explode method.

Note: Extract and flatten

Use $"column.*" and explode to flatten the struct and array types before displaying the flattened DataFrame. Scala:

display(
  DF.select($"id" as "main_id", $"name", $"batters", $"ppu", explode($"topping")) // explode the topping column, which is an array type
    .withColumn("topping_id", $"col.id")      // extract topping_id from col using dot notation
    .withColumn("topping_type", $"col.type")  // extract topping_type from col using dot notation
    .drop($"col")
    .select($"*", $"batters.*")               // flatten the struct type batters, exposing its array column batter
    .drop($"batters")
    .select($"*", explode($"batter"))
    .drop($"batter")
    .withColumn("batter_id", $"col.id")       // extract batter_id from col using dot notation
    .withColumn("batter_type", $"col.type")   // extract batter_type from col using dot notation
    .drop($"col")
)

Reference: https://learn.microsoft.com/en-us/azure/databricks/kb/scala/flatten-nested-columns-dynamically


Question 253


You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 receives new data once every 24 hours. You have the following function.

[Image: definition of the function]

You have the following query.

[Image: the query that calls the function]

The query is executed once every 15 minutes and the @parameter value is set to the current date. You need to minimize the time it takes for the query to return results. Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. Create an index on the avg_f column.
B. Convert the avg_c column into a calculated column.
C. Create an index on the sensorid column.
D. Enable result set caching.
E. Change the table distribution to replicate.
Suggested answer: B, D
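For context, result set caching can serve repeated identical queries from cache when the underlying data has not changed, which fits a query that runs every 15 minutes against a pool loaded only once every 24 hours. A minimal sketch of enabling it (the setting is database-level, and the statement must be run from the master database):

-- Run from master: enable result set caching for the dedicated SQL pool
ALTER DATABASE [Pool1] SET RESULT_SET_CACHING ON;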

Question 254


You have an Azure Data Factory pipeline named pipeline1 that is invoked by a tumbling window trigger named Trigger1. Trigger1 has a recurrence of 60 minutes. You need to ensure that pipeline1 will execute only if the previous execution completes successfully. How should you configure the self-dependency for Trigger1?

offset: "-00:01:00" size: "00:01:00"
offset: "-00:01:00" size: "00:01:00"
offset: "01:00:00" size: "-01:00:00"
offset: "01:00:00" size: "-01:00:00"
offset: "01:00:00" size: "01:00:00"
offset: "01:00:00" size: "01:00:00"
offset: "-01:00:00" size: "01:00:00"
offset: "-01:00:00" size: "01:00:00"
Suggested answer: D

Explanation:


Tumbling window self-dependency properties

In scenarios where the trigger shouldn't proceed to the next window until the preceding window completes successfully, build a self-dependency. A self-dependency trigger that depends on the success of earlier runs of itself within the preceding hour has the properties shown in the following code: the offset of -01:00:00 points the dependency one window back, and the size of 01:00:00 spans that entire preceding window.

Example code:

"name": "DemoSelfDependency",

"properties": {

"runtimeState": "Started",

"pipeline": {

"pipelineReference": {

"referenceName": "Demo",

"type": "PipelineReference"

}

},

"type": "TumblingWindowTrigger",

"typeProperties": {

"frequency": "Hour",

"interval": 1,

"startTime": "2018-10-04T00:00:00Z",

"delay": "00:01:00",

"maxConcurrency": 50,

"retryPolicy": {

"intervalInSeconds": 30

},

"dependsOn": [

{

"type": "SelfDependencyTumblingWindowTriggerReference",

"size": "01:00:00",

"offset": "-01:00:00"

}

]

}

}

}

Reference:

https://docs.microsoft.com/en-us/azure/data-factory/tumbling-window-trigger-dependency


Question 255


You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named SQLPool1. SQLPool1 is currently paused.

You need to restore the current state of SQLPool1 to a new SQL pool. What should you do first?


Question 256


HOTSPOT

You have an Azure Synapse Analytics dedicated SQL pool named Pool1 that contains an external table named Sales. Sales contains sales data. Each row in Sales contains data on a single sale, including the name of the salesperson. You need to implement row-level security (RLS). The solution must ensure that the salespeople can access only their respective sales.

What should you do? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.


Question 257


You have an Azure SQL database named DB1 and an Azure Data Factory data pipeline named pipeline. From Data Factory, you configure a linked service to DB1.

In DB1, you create a stored procedure named SP1. SP1 returns a single row of data that has four columns. You need to add an activity to pipeline to execute SP1. The solution must ensure that the values in the columns are stored as pipeline variables. Which two types of activities can you use to execute SP1?


Question 258


You have a Microsoft Purview account. The Lineage view of a CSV file is shown in the following exhibit.

[Image: Lineage view of the CSV file]

How is the data for the lineage populated?


Question 259


HOTSPOT

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a fact table named Table1. Table1 contains sales data. Sixty-five million rows of data are added to Table1 monthly. At the end of each month, you need to remove data that is older than 36 months. The solution must minimize how long it takes to remove the data. How should you partition Table1, and how should you remove the old data? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
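For context, removing a full partition of data with a metadata-only partition switch is far faster than a row-by-row DELETE, which is why it is the usual technique when removal time must be minimized. A minimal sketch, assuming monthly partitions with the oldest month in partition 1 and an empty staging table dbo.Table1_old with an identical structure and partition scheme (both names are hypothetical):

-- Switch the oldest partition out of Table1 (metadata-only operation)
ALTER TABLE dbo.Table1 SWITCH PARTITION 1 TO dbo.Table1_old PARTITION 1;

-- Discard the old rows by dropping the staging table
DROP TABLE dbo.Table1_old;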


Question 260


HOTSPOT

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Sales.Orders. Sales.Orders contains a column named SalesRep.

You plan to implement row-level security (RLS) for Sales.Orders. You need to create the security policy that will be used to implement RLS. The solution must ensure that sales representatives only see rows for which the value of the SalesRep column matches their username. How should you complete the code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
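For context, RLS is implemented with an inline table-valued predicate function and a security policy that applies it as a filter. A minimal sketch of the general pattern (the Security schema, function name, and policy name are hypothetical):

-- Predicate function: returns a row only when the input value matches the caller's username
CREATE FUNCTION Security.fn_securitypredicate(@SalesRep AS sysname)
    RETURNS TABLE
    WITH SCHEMABINDING
AS
    RETURN SELECT 1 AS fn_securitypredicate_result
    WHERE @SalesRep = USER_NAME();
GO

-- Security policy: filter rows of Sales.Orders through the predicate
CREATE SECURITY POLICY SalesFilter
    ADD FILTER PREDICATE Security.fn_securitypredicate(SalesRep)
    ON Sales.Orders
    WITH (STATE = ON);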
