Question 91

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
* A workload for data engineers who will use Python and SQL.
* A workload for jobs that will run notebooks that use Python, Scala, and SQL.
* A workload that data scientists will use to perform ad hoc analysis in Scala and R.
The enterprise architecture team at your company identifies the following standards for Databricks environments:
* The data engineers must share a cluster.
* The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
* All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?
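For context on the auto-termination requirement above, here is a minimal sketch of creating one data-scientist cluster with a 120-minute inactivity timeout through the Databricks Clusters REST API. The workspace URL, access token, runtime version, and VM size are placeholder assumptions, and the snippet only illustrates the setting; it is not the graded answer to the question.

```python
# Minimal sketch: create one per-data-scientist cluster that auto-terminates
# after 120 minutes of inactivity. Host, token, runtime, and node type are
# placeholder assumptions for illustration only.
import requests

DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi-placeholder-token"                                         # placeholder PAT

cluster_spec = {
    "cluster_name": "ds-cluster-01",
    "spark_version": "13.3.x-scala2.12",   # example Databricks runtime
    "node_type_id": "Standard_DS3_v2",     # example VM size
    "num_workers": 2,
    "autotermination_minutes": 120,        # terminate after 120 min of inactivity
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
resp.raise_for_status()
print(resp.json()["cluster_id"])
```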
Question 92

You have an Azure SQL database named Database1 and two Azure event hubs named HubA and HubB. The data consumed from each source is shown in the following table.

You need to implement Azure Stream Analytics to calculate the average fare per mile by driver.
How should you configure the Stream Analytics input for each source? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Question 93

You have an Azure Storage account that generates 200,000 new files daily. The file names have a format of {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv.
You need to design an Azure Data Factory solution that will load new data from the storage account to an Azure Data Lake once hourly. The solution must minimize load times and costs.
How should you configure the solution? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
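The hourly folder naming is what keeps an incremental load cheap: each hourly run only needs to read the one folder for the hour that just completed. Below is a minimal sketch, assuming the {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv layout stated above, of deriving that hour's path prefix.

```python
# Sketch: compute the storage folder prefix for the hour that just finished,
# assuming files are laid out as {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv.
from datetime import datetime, timedelta, timezone

def hourly_prefix(window_end: datetime) -> str:
    """Return the folder prefix covering the hour ending at window_end."""
    window_start = window_end - timedelta(hours=1)
    return window_start.strftime("%Y/%m/%d/%H/")

# Example: a run triggered at 09:00 UTC reads only the 08:00 folder.
print(hourly_prefix(datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc)))  # 2024/01/15/08/
```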

Question 94

You use Azure Data Factory to prepare data to be queried by Azure Synapse Analytics serverless SQL pools.
Files are initially ingested into an Azure Data Lake Storage Gen2 account as 10 small JSON files. Each file contains the same data attributes and data from a subsidiary of your company.
You need to move the files to a different folder and transform the data to meet the following requirements:
* Provide the fastest possible query times.
* Automatically infer the schema from the underlying files.
How should you configure the Data Factory copy activity? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
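As background for the folder-move and format choices, the sketch below shows, outside of Data Factory and purely as an illustration with hypothetical paths, what consolidating the ten small JSON files into a single Parquet file looks like; fewer, larger Parquet files generally give serverless SQL pools faster query times, and Parquet stores the schema alongside the data.

```python
# Illustrative sketch (not a Data Factory configuration): merge several small
# JSON files that share the same attributes into one Parquet file.
from pathlib import Path
import pandas as pd

source_dir = Path("landing/subsidiaries")      # hypothetical input folder
frames = [pd.read_json(p) for p in sorted(source_dir.glob("*.json"))]
merged = pd.concat(frames, ignore_index=True)  # every file has the same columns
merged.to_parquet("curated/subsidiaries.parquet", index=False)  # needs pyarrow installed
```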

Question 95

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.