Question 1
You have an Azure Databricks workspace and an Azure Data Lake Storage Gen2 account named storage1.
New files are uploaded daily to storage1.
You need to recommend a solution that uses storage1 as a structured streaming source. The solution must meet the following requirements:
* Incrementally process new files as they are uploaded to storage1.
* Minimize implementation and maintenance effort.
* Minimize the cost of processing millions of files.
* Support schema inference and schema drift.
Which should you include in the recommendation?
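One pattern commonly recommended for this set of requirements is Databricks Auto Loader (the cloudFiles structured streaming source), since it discovers new files incrementally, handles schema inference and drift through a schema location, and scales to millions of files without repeated directory listings. A minimal PySpark sketch follows; the container path, file format, and target table name are hypothetical placeholders.

```python
# Minimal Auto Loader sketch (hypothetical container, paths, and table name).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

source_path = "abfss://data@storage1.dfs.core.windows.net/landing/"   # assumed upload location
schema_path = "abfss://data@storage1.dfs.core.windows.net/_schemas/"  # tracks the inferred schema

stream = (
    spark.readStream
    .format("cloudFiles")                          # Auto Loader source
    .option("cloudFiles.format", "json")           # assumed file format
    .option("cloudFiles.schemaLocation", schema_path)
    .load(source_path)
)

query = (
    stream.writeStream
    .option("checkpointLocation", "abfss://data@storage1.dfs.core.windows.net/_checkpoints/")
    .trigger(availableNow=True)                    # process files that arrived since the last run, then stop
    .toTable("bronze_events")                      # hypothetical target Delta table
)
```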
Question 2
You have an Azure Data Factory version 2 (V2) resource named Df1. Df1 contains a linked service.
You have an Azure Key Vault named vault1 that contains an encryption key named key1.
You need to encrypt Df1 by using key1.
What should you do first?
Question 3
You are designing an Azure Synapse solution that will provide a query interface for the data stored in an Azure Storage account. The storage account is only accessible from a virtual network.
You need to recommend an authentication mechanism to ensure that the solution can access the source data.
What should you recommend?
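For context on the mechanism usually considered here, a managed identity lets an Azure service authenticate to a network-restricted storage account without storing keys or secrets in code. The sketch below is illustrative only: it assumes the code runs on an Azure resource whose managed identity has a data-plane role (for example, Storage Blob Data Reader) on the account, and the account URL is a placeholder.

```python
# Illustrative only: token-based access to a storage account via a managed identity.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()  # resolves to the managed identity when running on Azure

service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",  # placeholder account URL
    credential=credential,
)

# List containers to confirm that the identity can reach the account.
for container in service.list_containers():
    print(container.name)
```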
Question 4
You have an Azure Data Factory that contains 10 pipelines.
You need to label each pipeline with its main purpose of either ingest, transform, or load. The labels must be available for grouping and filtering when using the monitoring experience in Data Factory.
What should you add to each pipeline?
Question 5
You are designing a date dimension table in an Azure Synapse Analytics dedicated SQL pool. The date dimension table will be used by all the fact tables.
Which distribution type should you recommend to minimize data movement?
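For reference, a date dimension that every fact table joins to is small and read-heavy, which is the typical case for a replicated table in a dedicated SQL pool: each compute node keeps a full copy, so joins against it need no data movement. A minimal sketch follows, assuming a pyodbc connection string and hypothetical column names.

```python
# Hypothetical DDL for a replicated date dimension in a dedicated SQL pool.
import pyodbc

ddl = """
CREATE TABLE dbo.DimDate
(
    DateKey      int      NOT NULL,
    CalendarDate date     NOT NULL,
    CalendarYear smallint NOT NULL,
    MonthOfYear  tinyint  NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,   -- full copy on every compute node, so joins avoid data movement
    CLUSTERED COLUMNSTORE INDEX
);
"""

conn = pyodbc.connect("<dedicated-sql-pool-connection-string>")  # placeholder connection string
conn.autocommit = True  # run the DDL outside an explicit transaction
cursor = conn.cursor()
cursor.execute(ddl)
cursor.close()
conn.close()
```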