Microsoft DP-203 Practice Exams
Last updated on Apr 09, 2025
- Exam Code: DP-203
- Exam Name: Data Engineering on Microsoft Azure
- Certification Provider: Microsoft
- Latest update: Apr 09, 2025
You are designing a streaming data solution that will ingest variable volumes of data.
You need to ensure that you can change the partition count after creation.
Which service should you use to ingest the data?
- A . Azure Event Hubs Dedicated
- B . Azure Stream Analytics
- C . Azure Data Factory
- D . Azure Synapse Analytics
You are designing an Azure Synapse Analytics dedicated SQL pool.
You need to ensure that you can audit access to Personally Identifiable Information (PII).
What should you include in the solution?
- A . dynamic data masking
- B . row-level security (RLS)
- C . sensitivity classifications
- D . column-level security
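For context, sensitivity classifications can be applied in T-SQL and then surface in the audit log when classified columns are queried. A minimal sketch, assuming a hypothetical dbo.Customers table with an Email column:

```sql
-- Label a column as containing PII (hypothetical table and column names).
-- Once classified, SQL audit records for queries that touch this column
-- carry the sensitivity information, enabling auditing of PII access.
ADD SENSITIVITY CLASSIFICATION TO dbo.Customers.Email
WITH (
    LABEL = 'Confidential',
    INFORMATION_TYPE = 'Contact Info',
    RANK = MEDIUM
);
```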
You have an Azure Synapse Analytics dedicated SQL pool.
You need to create a fact table named Table1 that will store sales data from the last three years. The solution must be optimized for the following query operations:
• Show order counts by week.
• Calculate sales totals by region.
• Calculate sales totals by product.
• Find all the orders from a given month.
Which data should you use to partition Table1?
- A . region
- B . product
- C . week
- D . month
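For context, a minimal T-SQL sketch of a dedicated SQL pool fact table range-partitioned on a month-grain date key, so that queries filtering on a given month can eliminate partitions. The schema, distribution key, and boundary values are hypothetical:

```sql
-- Hypothetical schema: integer date key in yyyymmdd form (e.g., 20250415),
-- partitioned with one boundary per month across the three-year window.
CREATE TABLE dbo.Table1
(
    OrderDateKey INT           NOT NULL,
    RegionKey    INT           NOT NULL,
    ProductKey   INT           NOT NULL,
    SalesAmount  DECIMAL(18,2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(ProductKey),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION (OrderDateKey RANGE RIGHT FOR VALUES
        (20230101, 20230201, 20230301))  -- continue with one boundary per month
);
```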
HOTSPOT
You have an Azure subscription that contains the Azure Synapse Analytics workspaces shown in the following table.
Each workspace must read and write data to datalake1.
Each workspace contains an unused Apache Spark pool.
You plan to configure each Spark pool to share catalog objects that reference datalake1. For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.
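For context, Synapse shared metadata synchronizes Spark-created databases and Parquet-backed tables to the serverless SQL pool within the same workspace, not across workspaces. A minimal Spark SQL sketch, assuming hypothetical database, container, and path names:

```sql
-- Run in a Spark pool notebook. The table is backed by Parquet files in
-- datalake1, so its metadata can be synchronized to the same workspace's
-- serverless SQL pool (hypothetical database/table/path names).
CREATE DATABASE IF NOT EXISTS salesdb;

CREATE TABLE IF NOT EXISTS salesdb.orders
USING PARQUET
LOCATION 'abfss://container1@datalake1.dfs.core.windows.net/orders/';
```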
You are designing a date dimension table in an Azure Synapse Analytics dedicated SQL pool. The date dimension table will be used by all the fact tables.
Which distribution type should you recommend to minimize data movement?
- A . HASH
- B . REPLICATE
- C . ROUND ROBIN
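For context, a minimal T-SQL sketch (hypothetical columns) of a replicated date dimension: a full copy of the table is cached on every compute node, so joins from any distributed fact table avoid data movement:

```sql
CREATE TABLE dbo.DimDate
(
    DateKey      INT  NOT NULL,  -- e.g., 20250409
    CalendarDate DATE NOT NULL,
    WeekOfYear   INT  NOT NULL,
    MonthOfYear  INT  NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED COLUMNSTORE INDEX
);
```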
You have an Azure subscription that contains the resources shown in the following table.
You need to read the TSV files by using ad-hoc queries and the OPENROWSET function. The solution must assign a name and override the inferred data type of each column.
What should you include in the OPENROWSET function?
- A . the WITH clause
- B . the ROWSET_OPTIONS bulk option
- C . the DATAFILETYPE bulk option
- D . the DATA_SOURCE parameter
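For context, a minimal serverless SQL pool sketch (hypothetical storage path and column names) showing a WITH clause that names each column and overrides its inferred type when reading TSV files:

```sql
SELECT *
FROM OPENROWSET(
    BULK 'https://account1.dfs.core.windows.net/container1/sales/*.tsv',
    FORMAT = 'CSV',             -- TSV is read as CSV with a tab field terminator
    FIELDTERMINATOR = '\t',
    PARSER_VERSION = '2.0'
)
WITH (
    OrderId     INT,
    Region      VARCHAR(20),
    SalesAmount DECIMAL(18,2)
) AS rows;
```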
You have an Azure subscription that contains an Azure Data Factory data pipeline named Pipeline1, a Log Analytics workspace named LA1, and a storage account named account1.
You need to retain pipeline-run data for 90 days.
The solution must meet the following requirements:
• The pipeline-run data must be removed automatically after 90 days.
• Ongoing costs must be minimized.
Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
- A . Configure Pipeline1 to send logs to LA1.
- B . From the Diagnostic settings (classic) settings of account1, set the retention period to 90 days.
- C . Configure Pipeline1 to send logs to account1.
- D . From the Data Retention settings of LA1, set the data retention period to 90 days.
What should you do to improve high availability of the real-time data processing solution?
- A . Deploy identical Azure Stream Analytics jobs to paired regions in Azure.
- B . Deploy a High Concurrency Databricks cluster.
- C . Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.
- D . Set Data Lake Storage to use geo-redundant storage (GRS).
HOTSPOT
You are building an Azure Stream Analytics query that will receive input data from Azure IoT Hub and write the results to Azure Blob storage.
You need to calculate the difference in readings per sensor per hour.
How should you complete the query? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
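For context, a minimal Stream Analytics query sketch (hypothetical input, output, and field names) that computes the difference between each sensor reading and that sensor's previous reading within a one-hour window, using LAG:

```sql
-- LAG looks back to the prior event for the same sensor within one hour;
-- PARTITION BY keeps each sensor's readings independent.
SELECT
    sensorId,
    reading - LAG(reading) OVER (PARTITION BY sensorId LIMIT DURATION(hour, 1)) AS readingDelta
INTO [blob-output]
FROM [iothub-input]
```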