Microsoft DP-700 Practice Test - Questions and Answers
Question 1
You need to ensure that the data analysts can access the gold layer lakehouse.
What should you do?
Add the DataAnalyst group to the Viewer role for WorkspaceA.
Share the lakehouse with the DataAnalysts group and grant the Build reports on the default semantic model permission.
Share the lakehouse with the DataAnalysts group and grant the Read all SQL Endpoint data permission.
Share the lakehouse with the DataAnalysts group and grant the Read all Apache Spark permission.
Explanation:
According to the access requirements, data analysts must have read-only access to the Delta tables in the gold layer and must not have access to the bronze and silver layers.
The gold layer data is typically queried via SQL Endpoints. Granting the Read all SQL Endpoint data permission allows data analysts to query the data using familiar SQL-based tools while restricting access to the underlying files.
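As an illustration only, the sketch below shows what this permission enables: a member of the DataAnalysts group connecting to the lakehouse SQL analytics endpoint and running read-only T-SQL against a gold-layer table. The server, database, and table names are placeholders, not values from the scenario.

```python
# Sketch only: a DataAnalysts member querying the gold layer through the
# SQL analytics endpoint. Server, database, and table names are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<sql-endpoint>.datawarehouse.fabric.microsoft.com;"  # placeholder endpoint
    "Database=GoldLakehouse;"                                    # placeholder lakehouse
    "Authentication=ActiveDirectoryInteractive;"                 # Microsoft Entra sign-in
)

# Read all SQL Endpoint data allows SELECT queries, but not Spark or file-level access.
for row in conn.execute("SELECT TOP 10 * FROM dbo.DimProduct;"):
    print(row)
```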
Question 2
HOTSPOT
You need to recommend a method to populate the POS1 data to the lakehouse medallion layers.
What should you recommend for each layer? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Question 3
You need to ensure that usage of the data in the Amazon S3 bucket meets the technical requirements.
What should you do?
Create a workspace identity and enable high concurrency for the notebooks.
Create a shortcut and ensure that caching is disabled for the workspace.
Create a workspace identity and use the identity in a data pipeline.
Create a shortcut and ensure that caching is enabled for the workspace.
Explanation:
To ensure that the usage of the data in the Amazon S3 bucket meets the technical requirements, we must address two key points:
Question 4
HOTSPOT
You need to create the product dimension.
How should you complete the Apache Spark SQL code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Question 5
You need to populate the MAR1 data in the bronze layer.
Which two types of activities should you include in the pipeline? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
ForEach
Copy data
WebHook
Stored procedure
Explanation:
MAR1 has seven entities, each accessible via a different API endpoint. A ForEach activity is required to iterate over these endpoints to fetch data from each one. It enables dynamic execution of API calls for each entity.
The Copy data activity is the primary mechanism to extract data from REST APIs and load it into the bronze layer in Delta format. It supports native connectors for REST APIs and Delta, minimizing development effort.
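As an illustration only (the endpoint URLs and entity names are hypothetical), the notebook sketch below mirrors the logic that the pipeline expresses declaratively: a ForEach iteration over the MAR1 endpoints wrapping a Copy-style extract that lands each entity in the bronze layer as a Delta table.

```python
# Notebook sketch of the ForEach + Copy data pattern; the graded solution uses
# pipeline activities, not code. Endpoint URLs and entity names are hypothetical.
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# One endpoint per MAR1 entity (the scenario has seven).
endpoints = {
    "customers": "https://api.example.com/mar1/customers",
    "orders": "https://api.example.com/mar1/orders",
    # ... remaining entities
}

for entity, url in endpoints.items():            # ForEach: iterate the endpoints
    records = requests.get(url, timeout=30).json()
    df = spark.createDataFrame(records)          # Copy data: REST source
    (df.write.format("delta")
       .mode("overwrite")
       .save(f"Tables/bronze_{entity}"))         # Copy data: Delta sink in the bronze layer
```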
Question 6
You need to implement the solution for the book reviews.
What should you do?
Create a Dataflow Gen2 dataflow.
Create a shortcut.
Enable external data sharing.
Create a data pipeline.
Explanation:
The requirement specifies that Litware plans to make the book reviews available in the lakehouse without making a copy of the data. In this case, creating a shortcut in Fabric is the most appropriate solution. A shortcut is a reference to the external data, and it allows Litware to access the book reviews stored in Amazon S3 without duplicating the data into the lakehouse.
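A minimal read sketch follows, assuming a OneLake shortcut named book_reviews has already been created in the lakehouse and that the reviews are stored in Delta format (a Files shortcut would instead be read from a path).

```python
# Sketch only: the shortcut is a reference to the S3 data, so Spark reads the
# reviews in place without copying them into the lakehouse. The shortcut name
# book_reviews is an assumption.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

reviews = spark.read.table("book_reviews")   # resolves through the shortcut to S3
reviews.show(5)
```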
Question 7
You need to resolve the sales data issue. The solution must minimize the amount of data transferred.
What should you do?
Split the dataflow into two dataflows.
Configure scheduled refresh for the dataflow.
Configure incremental refresh for the dataflow. Set Store rows from the past to 1 Month.
Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Year.
Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Month.
Explanation:
The sales data issue can be resolved by configuring incremental refresh for the dataflow. Incremental refresh processes only new or changed data, minimizing the amount of data transferred and improving performance.
The scenario specifies that data older than one month never changes, so setting Refresh rows from the past to 1 Month is appropriate. This ensures that only the most recent month of data is refreshed, avoiding unnecessary data transfers.
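The setting itself lives in the Dataflow Gen2 incremental refresh options, but the window it implies can be sketched as a date filter (table and column names below are placeholders):

```python
# Illustration of the window that "Refresh rows from the past = 1 Month" implies:
# only the most recent month of sales rows is re-extracted on each run, because
# older rows never change. Table and column names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

sales = spark.read.table("sales")
recent = sales.filter(F.col("OrderDate") >= F.add_months(F.current_date(), -1))
recent.show(5)
```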
Question 8
HOTSPOT
You need to troubleshoot the ad-hoc query issue.
How should you complete the statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Question 9
You have a Fabric workspace.
You have semi-structured data.
You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be written by using Spark.
What should you use to store the data?
a lakehouse
an eventhouse
a datamart
a warehouse
Explanation:
A lakehouse is the best option for storing semi-structured data that must be read by using T-SQL, KQL, and Apache Spark. A lakehouse combines the flexibility of a data lake (which can handle semi-structured and unstructured data) with the performance features of a data warehouse. Data can be written by using Apache Spark and queried through multiple engines: T-SQL for SQL-based querying, KQL (Kusto Query Language), and Apache Spark for distributed processing. This makes a lakehouse the ideal choice when semi-structured data requires a versatile querying approach.
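A minimal sketch of the write side, using hypothetical nested JSON records: Spark lands the semi-structured data as a Delta table in the lakehouse, which is then also readable with T-SQL through the SQL analytics endpoint and with KQL (for example, via a OneLake shortcut from an eventhouse).

```python
# Sketch only: Spark is the sole writer of hypothetical semi-structured events.
import json
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

events = [
    {"deviceId": "d1", "reading": {"temp": 21.5, "unit": "C"}},
    {"deviceId": "d2", "reading": {"temp": 19.0, "unit": "C"}},
]

# Infer a nested schema from the JSON records.
df = spark.read.json(spark.sparkContext.parallelize([json.dumps(e) for e in events]))

# Land the data as a Delta table in the lakehouse Tables area.
df.write.format("delta").mode("append").saveAsTable("device_readings")

# Read back with Spark; the same table is exposed to T-SQL by the SQL analytics endpoint.
spark.sql("SELECT deviceId, reading.temp AS temp FROM device_readings").show()
```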
Question 10
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?
a Dataflow Gen1 dataflow
a data pipeline
a KQL queryset
a notebook
Explanation:
To copy data from an on-premises Microsoft SQL Server database (Database1) to a warehouse (Warehouse1) in Microsoft Fabric, the best option is a data pipeline. A data pipeline in Fabric orchestrates data movement from source to destination by using connectors, transformations, and scheduled workflows. Because the data comes from an on-premises database and must flow through an on-premises data gateway, a data pipeline provides the appropriate framework to move it efficiently and reliably.