Microsoft DP-203 Practice Test - Questions Answers, Page 34

List of questions
Question 331

You have an Azure subscription that contains an Azure Synapse Analytics account and a Microsoft Purview account.
You create a pipeline named Pipeline1 for data ingestion to a dedicated SQL pool.
You need to generate data lineage from Pipeline1 to Microsoft Purview.
Which two activities generate data lineage? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.
Question 332

You use Azure Data Factory to create data pipelines.
You are evaluating whether to integrate Data Factory and GitHub for source and version control. What are two advantages of the integration? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.
Question 333

HOTSPOT
You have an Azure Data Lake Storage account that contains one CSV file per hour for January 1, 2020, through January 31, 2023. The files are partitioned by using the following folder structure.
csv/system1/{year}/{month}/{filename}.csv
You need to query the files by using an Azure Synapse Analytics serverless SQL pool. The solution must return the row count of each file created during the last three months of 2022.
How should you complete the query? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
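For context, a minimal serverless SQL pool sketch of this pattern, assuming an external data source named MyDataLake (a hypothetical name) over the csv container: the filepath() function filters on the {year} and {month} wildcards, and filename() groups the row counts per file.

```sql
-- Sketch only: MyDataLake is a hypothetical external data source over the csv container.
SELECT
    r.filename() AS [FileName],
    COUNT(*)     AS [RowCount]
FROM OPENROWSET(
        BULK 'csv/system1/*/*/*.csv',
        DATA_SOURCE = 'MyDataLake',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0'
     ) AS r
WHERE r.filepath(1) = '2022'               -- first wildcard: {year}
  AND r.filepath(2) IN ('10', '11', '12')  -- second wildcard: {month}
GROUP BY r.filename();
```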
Question 334

HOTSPOT
You have an Azure subscription that contains the Azure Synapse Analytics workspaces shown in the following table.
Each workspace must read and write data to datalake1.
Each workspace contains an unused Apache Spark pool.
You plan to configure each Spark pool to share catalog objects that reference datalake1.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
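As background for the shared-catalog scenario, a minimal Spark SQL sketch of a database and Parquet-backed table whose storage sits on datalake1; the container name files and the object names are hypothetical, and the sketch assumes datalake1 is an Azure Data Lake Storage Gen2 account.

```sql
-- Spark SQL sketch; 'files', 'shareddb', and 'sales' are hypothetical names.
CREATE DATABASE IF NOT EXISTS shareddb
LOCATION 'abfss://files@datalake1.dfs.core.windows.net/shareddb';

CREATE TABLE IF NOT EXISTS shareddb.sales (id INT, amount DECIMAL(10,2))
USING PARQUET
LOCATION 'abfss://files@datalake1.dfs.core.windows.net/shareddb/sales';
```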
Question 335

You have an Azure Blob storage account named storage1 and an Azure Synapse Analytics serverless SQL pool named Pool1. From Pool1, you plan to run ad-hoc queries that target storage1.
You need to ensure that you can use shared access signature (SAS) authorization without defining a data source. What should you create first?
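For context, one way to enable SAS authorization for ad-hoc OPENROWSET queries without an explicit data source is a server-scoped credential whose name matches the storage URL being queried. A minimal sketch, with a hypothetical container name and a placeholder SAS token:

```sql
-- Sketch: the credential name must match the URL used in the queries.
-- The 'data' container and the SECRET value are placeholders.
CREATE CREDENTIAL [https://storage1.blob.core.windows.net/data]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET   = '<SAS token>';
```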
Question 336

DRAG DROP
You have a data warehouse.
You need to implement a slowly changing dimension (SCD) named Product that will include three columns named ProductName, ProductColor, and ProductSize. The solution must meet the following requirements:
* Prevent changes to the values stored in ProductName.
* Retain all the current and previous values in ProductColor.
* Retain only the current and the last values in ProductSize.
Which type of SCD should you implement for each column? To answer, drag the appropriate types to the correct columns.
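As a worked illustration of how the three retention requirements can map onto dimension columns, here is a hedged T-SQL sketch of a possible dbo.DimProduct layout; every column other than ProductName, ProductColor, and ProductSize is hypothetical.

```sql
-- Sketch of one possible DimProduct design; auxiliary columns are illustrative only.
CREATE TABLE dbo.DimProduct
(
    ProductKey          INT IDENTITY(1,1) NOT NULL,  -- surrogate key
    ProductBusinessKey  INT            NOT NULL,     -- natural/business key
    ProductName         NVARCHAR(100)  NOT NULL,     -- fixed attribute: changes are not applied
    ProductColor        NVARCHAR(50)   NOT NULL,     -- historical attribute: each change adds a new row
    ProductSize         NVARCHAR(20)   NOT NULL,     -- current value
    ProductSizePrev     NVARCHAR(20)   NULL,         -- previous value only
    RowStartDate        DATETIME2      NOT NULL,     -- validity window for row-versioned history
    RowEndDate          DATETIME2      NULL,
    IsCurrent           BIT            NOT NULL
);
```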
Question 337

You have an Azure subscription that contains an Azure SQL database named SQLDB1 and an Azure Synapse Analytics dedicated SQL pool named Pool1.
You need to replicate data from SQLDB1 to Pool1. The solution must meet the following requirements:
* Minimize performance impact on SQLDB1.
* Support near-real-time (NRT) analytics.
* Minimize administrative effort.
What should you use?
Question 338

You have an Azure subscription that contains an Azure Synapse Analytics workspace named Workspace1, a Log Analytics workspace named Workspace2, and an Azure Data Lake Storage Gen2 container named Container1.
Workspace1 contains an Apache Spark job named Job1 that writes data to Container1. Workspace1 sends diagnostics to Workspace2.
From Synapse Studio, you submit Job1.
What should you use to review the LogQuery output of the job?
Question 339

You have an Azure subscription that contains the resources shown in the following table.
You need to read the files in storage1 by using ad-hoc queries and the OPENROWSET function. The solution must ensure that each rowset contains a single JSON record.
To what should you set the format option of the OPENROWSET function?
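As a hedged sketch of the pattern this question points at (the storage path and the JSON property below are hypothetical), serverless SQL can read whole JSON documents by loading each one into a single text column and then parsing it with JSON functions:

```sql
-- Sketch: 0x0b (vertical tab) terminators keep each JSON document in one NVARCHAR(MAX) value.
SELECT JSON_VALUE(jsonDoc, '$.id') AS id      -- '$.id' is a hypothetical property
FROM OPENROWSET(
        BULK 'https://storage1.blob.core.windows.net/data/*.json',  -- hypothetical path
        FORMAT = 'CSV',
        FIELDTERMINATOR = '0x0b',
        FIELDQUOTE = '0x0b',
        ROWTERMINATOR = '0x0b'
     ) WITH (jsonDoc NVARCHAR(MAX)) AS rows;
```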
Question 340

HOTSPOT
You have two Azure SQL databases named DB1 and DB2.
DB1 contains a table named Table1. Table1 contains a timestamp column named LastModifiedOn. LastModifiedOn contains the timestamp of the most recent update for each individual row.
DB2 contains a table named Watermark. Watermark contains a single timestamp column named WatermarkValue.
You plan to create an Azure Data Factory pipeline that will incrementally upload into Azure Blob Storage all the rows in Table1 for which the LastModifiedOn column contains a timestamp newer than the most recent value of the WatermarkValue column in Watermark.
You need to identify which activities to include in the pipeline. The solution must meet the following requirements:
* Minimize the effort to author the pipeline.
* Ensure that the number of data integration units allocated to the upload operation can be controlled.
What should you identify? To answer, select the appropriate options in the answer area.
NOTE: Each correct answer is worth one point.
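For background on the watermark pattern behind this scenario, here is a hedged T-SQL sketch of the two queries such a pipeline typically runs; LookupWatermark is a hypothetical activity name, and the upload itself would run in an activity whose settings control the number of data integration units.

```sql
-- Query a lookup step could run against DB2 to fetch the current watermark:
SELECT MAX(WatermarkValue) AS WatermarkValue
FROM dbo.Watermark;

-- Source query for the upload step, using Data Factory expression interpolation
-- (LookupWatermark is a hypothetical activity name):
SELECT *
FROM dbo.Table1
WHERE LastModifiedOn > '@{activity('LookupWatermark').output.firstRow.WatermarkValue}';
```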