Microsoft DP-203 Practice Exams
Last updated on Apr 01,2025- Exam Code: DP-203
- Exam Name: Data Engineering on Microsoft Azure
- Certification Provider: Microsoft
- Latest update: Apr 01,2025
HOTSPOT
From a website analytics system, you receive data extracts about user interactions such as downloads, link clicks, form submissions, and video plays.
The data contains the following columns.
You need to design a star schema to support analytical queries of the data. The star schema will contain four tables including a date dimension.
To which table should you add each column? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
You are planning a streaming data solution that will use Azure Databricks. The solution will stream sales transaction data from an online store.
The solution has the following specifications:
* The output data will contain items purchased, quantity, line total sales amount, and line total tax amount.
* Line total sales amount and line total tax amount will be aggregated in Databricks.
* Sales transactions will never be updated. Instead, new rows will be added to adjust a sale.
You need to recommend an output mode for the dataset that will be processed by using Structured Streaming. The solution must minimize duplicate data.
What should you recommend?
- A . Append
- B . Update
- C . Complete
You have an enterprise data warehouse in Azure Synapse Analytics.
You need to monitor the data warehouse to identify whether you must scale up to a higher service
level to accommodate the current workloads
Which is the best metric to monitor?
More than one answer choice may achieve the goal. Select the BEST answer.
- A . Data 10 percentage
- B . CPU percentage
- C . DWU used
- D . DWU percentage
HOTSPOT
You have an Azure subscription that contains the following resources:
An Azure Active Directory (Azure AD) tenant that contains a security group named Group1 An Azure Synapse Analytics SQL pool named Pool1
You need to control the access of Group1 to specific columns and rows in a table in Pool1.
Which Transact-SQL commands should you use? To answer, select the appropriate options in the answer area.
HOTSPOT
You have an Azure Synapse Analytics dedicated SQL pool named Pool1 and an Azure Data Lake Storage Gen2 account named Account1.
You plan to access the files in Account1 by using an external table.
You need to create a data source in Pool1 that you can reference when you create the external table.
How should you complete the Transact-SQL statement? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
DRAG DROP
You have an Azure Data Lake Storage Gen2 account that contains a JSON file for customers. The file contains two attributes named FirstName and LastName.
You need to copy the data from the JSON file to an Azure Synapse Analytics table by using Azure Databricks. A new column must be created that concatenates the FirstName and LastName values.
You create the following components:
✑ A destination table in Azure Synapse
✑ An Azure Blob storage container
✑ A service principal
In which order should you perform the actions? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day.
You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times.
What should you include in the solution?
- A . Partition by DateTime fields.
- B . Sink to Azure Queue storage.
- C . Include a watermark column.
- D . Use a JSON format for physical data storage.
You haw an Azure data factory named ADF1.
You currently publish all pipeline authoring changes directly to ADF1.
You need to implement version control for the changes made to pipeline artifacts. The solution must ensure that you can apply version control to the resources currently defined m the UX Authoring canvas for ADF1.
Which two actions should you perform? Each correct answer presents part of the solution NOTE: Each correct selection is worth one point.
- A . Create an Azure Data Factory trigger
- B . From the UX Authoring canvas, select Set up code repository
- C . Create a GitHub action
- D . From the Azure Data Factor Studio, run Publish All.
- E . Create a Git repository
- F . From the UX Authoring canvas, select Publish
HOTSPOT
You have an Azure Data Lake Storage Gen2 account named account1 that contains a container named Container"1. Container1 contains two folders named FolderA and FolderB.
You need to configure access control lists (ACLs) to meet the following requirements:
• Group1 must be able to list and read the contents and subfolders of FolderA.
• Group2 must be able to list and read the contents of FolderA and FolderB.
• Group2 must be prevented from reading any other folders at the root of Container1.
How should you configure the ACL permissions for each group? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
When designing partitions for the product sales transactions, which column should be used to partition the data to ensure efficient monthly loads?
- A . ProductID
- B . SalesDate
- C . RetailStoreID
- D . PromotionID