Microsoft DP-203 Practice Test - Questions Answers, Page 7

You have an enterprise data warehouse in Azure Synapse Analytics. Using PolyBase, you create an external table named [Ext].[Items] to query Parquet files stored in Azure Data Lake Storage Gen2 without importing the data to the data warehouse. The external table has three columns.

You discover that the Parquet files have a fourth column named ItemID. Which command should you run to add the ItemID column to the external table?

A. Option A
B. Option B
C. Option C
D. Option D
Suggested answer: C

Explanation:

Because ALTER TABLE is not supported on external tables, the only way to add the ItemID column is to drop the external table and re-create it with all four columns.

Incorrect Answers:

A, D: Only these Data Definition Language (DDL) statements are allowed on external tables:

CREATE TABLE and DROP TABLE

CREATE STATISTICS and DROP STATISTICS

CREATE VIEW and DROP VIEW

Reference: https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-table-transact-sql
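
A minimal T-SQL sketch of the drop-and-recreate approach. The column definitions, location, data source, and file format names (MyDataSource, MyParquetFormat) are assumptions for illustration; the question's actual definitions are not reproduced in this dump:

DROP EXTERNAL TABLE [Ext].[Items];

CREATE EXTERNAL TABLE [Ext].[Items]
(
    [ItemName] NVARCHAR(50),         -- illustrative original three columns
    [ItemType] NVARCHAR(20),
    [ItemDescription] NVARCHAR(250),
    [ItemID] INT                     -- the fourth column discovered in the Parquet files
)
WITH
(
    LOCATION = '/Items/',            -- assumed folder path in the data lake
    DATA_SOURCE = MyDataSource,      -- hypothetical external data source
    FILE_FORMAT = MyParquetFormat    -- hypothetical Parquet file format
);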

You have an Azure Data Lake Storage Gen2 container that contains 100 TB of data. You need to ensure that the data in the container is available for read workloads in a secondary region if an outage occurs in the primary region. The solution must minimize costs. Which type of data redundancy should you use?

A. geo-redundant storage (GRS)
B. read-access geo-redundant storage (RA-GRS)
C. zone-redundant storage (ZRS)
D. locally-redundant storage (LRS)
Suggested answer: B

Explanation:

Geo-redundant storage (with GRS or GZRS) replicates your data to another physical location in the secondary region to protect against regional outages. However, that data is available to be read only if the customer or Microsoft initiates a failover from the primary to the secondary region. When you enable read access to the secondary region, your data is available to be read at all times, including in a situation where the primary region becomes unavailable.

Incorrect Answers:

A: While geo-redundant storage (GRS) is cheaper than read-access geo-redundant storage (RA-GRS), the secondary copy under GRS cannot be read unless a failover is initiated.

C, D: Locally redundant storage (LRS) and zone-redundant storage (ZRS) provide redundancy only within a single region.

Reference: https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy

You plan to implement an Azure Data Lake Storage Gen2 account. You need to ensure that the data lake will remain available if a data center fails in the primary Azure region. The solution must minimize costs. Which type of replication should you use for the storage account?

A. geo-redundant storage (GRS)
B. geo-zone-redundant storage (GZRS)
C. locally-redundant storage (LRS)
D. zone-redundant storage (ZRS)
Suggested answer: D

Explanation:

Zone-redundant storage (ZRS) copies your data synchronously across three Azure availability zones in the primary region, so the data remains available if a single data center fails. ZRS costs less than the geo-redundant options (GRS, GZRS), which also replicate to a secondary region.

Reference: https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy

You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.

FactPurchase will have 1 million rows of data added daily and will contain three years of data.

Transact-SQL queries similar to the following query will be executed daily.

SELECT SupplierKey, StockItemKey, IsOrderFinalized, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101
  AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey, IsOrderFinalized

Which table distribution will minimize query times?

A. replicated
B. hash-distributed on PurchaseKey
C. round-robin
D. hash-distributed on IsOrderFinalized
Suggested answer: B

Explanation:

Hash-distributed tables improve query performance on large fact tables. To balance the parallel processing, select a distribution column that:

Has many unique values. The column can have duplicate values; all rows with the same value are assigned to the same distribution. Since there are 60 distributions, some distributions can have more than one unique value while others may end up with zero values.

Does not have NULLs, or has only a few NULLs.

Is not a date column.

Incorrect Answers:

C: Round-robin tables are useful for improving loading speed.

D: IsOrderFinalized has only two distinct values, so hashing on it would concentrate all rows in at most two of the 60 distributions.

Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute
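
A sketch of how FactPurchase might be declared with the suggested distribution. The question's column list is not reproduced in this dump, so the column types below are assumptions:

CREATE TABLE dbo.FactPurchase
(
    PurchaseKey      BIGINT NOT NULL,  -- many unique values, no NULLs: a good hash key
    DateKey          INT    NOT NULL,
    SupplierKey      INT    NOT NULL,
    StockItemKey     INT    NOT NULL,
    IsOrderFinalized BIT    NOT NULL,
    Quantity         INT    NOT NULL   -- illustrative measure column
)
WITH
(
    DISTRIBUTION = HASH(PurchaseKey),
    CLUSTERED COLUMNSTORE INDEX
);

Because PurchaseKey is not referenced in the WHERE clause, the monthly date filter does not concentrate the work on a few distributions.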

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Storage account that contains 100 GB of files. The files contain rows of text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.

You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics.

You need to prepare the files to ensure that the data copies quickly.

Solution: You convert the files to compressed delimited text files.

Does this meet the goal?

A. Yes
B. No
Suggested answer: A

Explanation:

All file formats have different performance characteristics. For the fastest load, use compressed delimited text files.

Reference:

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data
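
As an illustration, compressed delimited files can be loaded into the dedicated SQL pool with the COPY statement; the storage path and target table below are hypothetical:

COPY INTO dbo.StagingData           -- hypothetical target table
FROM 'https://myaccount.blob.core.windows.net/mycontainer/files/*.csv.gz'  -- hypothetical path
WITH
(
    FILE_TYPE = 'CSV',
    COMPRESSION = 'GZIP',           -- compressed delimited text
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0A'
);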

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Storage account that contains 100 GB of files. The files contain rows of text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.

You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics.

You need to prepare the files to ensure that the data copies quickly.

Solution: You copy the files to a table that has a columnstore index.

Does this meet the goal?

A. Yes
B. No
Suggested answer: B

Explanation:

Instead, convert the files to compressed delimited text files, which give the fastest load.

Reference:

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Storage account that contains 100 GB of files. The files contain rows of text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.

You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics.

You need to prepare the files to ensure that the data copies quickly.

Solution: You modify the files to ensure that each row is more than 1 MB.

Does this meet the goal?

A. Yes
B. No
Suggested answer: B

Explanation:

PolyBase cannot load rows that are larger than 1 MB, so modifying the files to make each row more than 1 MB prevents the load from succeeding. Instead, convert the files to compressed delimited text files.

Reference:

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data
You build a data warehouse in an Azure Synapse Analytics dedicated SQL pool. Analysts write a complex SELECT query that contains multiple JOIN and CASE statements to transform data for use in inventory reports. The inventory reports will use the data and additional WHERE parameters depending on the report. The reports will be produced once daily.

You need to implement a solution to make the dataset available for the reports. The solution must minimize query times. What should you implement?

A. an ordered clustered columnstore index
B. a materialized view
C. result set caching
D. a replicated table
Suggested answer: B

Explanation:

Materialized views for dedicated SQL pools in Azure Synapse provide a low-maintenance method for complex analytical queries to get fast performance without any query change.

Incorrect Answers:

C: A single daily execution does not benefit from result set caching. Note: When result set caching is enabled, dedicated SQL pool automatically caches query results in the user database for repetitive use. This allows subsequent query executions to get results directly from the persisted cache, so recomputation is not needed. Result set caching improves query performance and reduces compute resource usage. In addition, queries using cached result sets do not use any concurrency slots and thus do not count against existing concurrency limits.

Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-materialized-views https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-result-set-caching
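
A minimal sketch of a materialized view for this scenario. The table, column, and distribution names are assumptions, since the analysts' actual query is not shown:

CREATE MATERIALIZED VIEW dbo.InventoryReportData
WITH (DISTRIBUTION = HASH(StockItemKey))   -- hypothetical distribution column
AS
SELECT i.StockItemKey,
       w.WarehouseKey,
       COUNT_BIG(*)    AS RowCnt,
       SUM(i.Quantity) AS TotalQuantity
FROM dbo.FactInventory AS i
JOIN dbo.DimWarehouse  AS w ON i.WarehouseKey = w.WarehouseKey
GROUP BY i.StockItemKey, w.WarehouseKey;

The daily reports can then apply their own WHERE parameters against dbo.InventoryReportData, and the optimizer can also rewrite matching queries against the base tables to use the view.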

You have an Azure Synapse Analytics workspace named WS1 that contains an Apache Spark pool named Pool1. You plan to create a database named DB1 in Pool1.

You need to ensure that when tables are created in DB1, the tables are available automatically as external tables to the built-in serverless SQL pool. Which format should you use for the tables in DB1?

A. CSV
B. ORC
C. JSON
D. Parquet
Suggested answer: D

Explanation:

Serverless SQL pool can automatically synchronize metadata from Apache Spark. A serverless SQL pool database will be created for each database existing in serverless Apache Spark pools. For each Spark external table based on Parquet or CSV and located in Azure Storage, an external table is created in a serverless SQL pool database.

Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-spark-tables
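
For example, a Parquet table created from the Spark pool in DB1 is the kind of table that gets synchronized. A Spark SQL sketch, with hypothetical table, column, and storage path names:

CREATE DATABASE IF NOT EXISTS DB1;

CREATE TABLE DB1.Items
(
    ItemID   INT,
    ItemName STRING
)
USING PARQUET                                                    -- Parquet (or CSV) tables are synchronized
LOCATION 'abfss://files@myaccount.dfs.core.windows.net/items/';  -- hypothetical ADLS Gen2 path

After the metadata is synchronized, the same table appears as an external table in the serverless SQL pool's DB1 database.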

You are planning a solution to aggregate streaming data that originates in Apache Kafka and is output to Azure Data Lake Storage Gen2. The developers who will implement the stream processing solution use Java. Which service should you recommend using to process the streaming data?

A. Azure Event Hubs
B. Azure Data Factory
C. Azure Stream Analytics
D. Azure Databricks
Suggested answer: D

Explanation:

Azure Databricks runs Apache Spark, which has first-class Java support and built-in connectors for reading from Apache Kafka and writing to Azure Data Lake Storage Gen2.

Incorrect Answers:

C: Azure Stream Analytics jobs are written in a SQL-based query language, not Java.