Question 16

A data engineer wants to create a cluster using the Databricks CLI for a big ETL pipeline. The cluster should have five workers, one driver of type i3.xlarge, and should use the '14.3.x-scala2.12' runtime.
Which command should the data engineer use?
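For reference, the requirements in this question map onto fields of a Databricks cluster specification. The sketch below builds such a specification as JSON in Python. The field names (`cluster_name`, `spark_version`, `node_type_id`, `num_workers`) follow the Clusters API; the cluster name `etl-pipeline-cluster` and the helper `build_cluster_spec` are hypothetical and not part of the question. How the specification is passed to the CLI depends on the CLI version, so no command line is assumed here.

```python
import json


def build_cluster_spec(name, spark_version, node_type, num_workers):
    """Return a cluster specification as a dict (hypothetical helper)."""
    return {
        "cluster_name": name,  # hypothetical name, not given in the question
        "spark_version": spark_version,
        # The driver uses the same node type unless driver_node_type_id is set.
        "node_type_id": node_type,
        "num_workers": num_workers,
    }


spec = build_cluster_spec("etl-pipeline-cluster", "14.3.x-scala2.12", "i3.xlarge", 5)
print(json.dumps(spec, indent=2))
```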
  • Question 17

    A table named user_ltv is being used to create a view that will be used by data analysts on various teams. Users in the workspace are configured into groups, which are used for setting up data access using ACLs.
    The user_ltv table has the following schema:
    email STRING, age INT, ltv INT
    The following view definition is executed:

    An analyst who is not a member of the marketing group executes the following query:
    SELECT * FROM email_ltv
    Which statement describes the results returned by this query?
  • Question 18

    A data engineer is tasked with ensuring that a Delta table in Databricks continuously retains deleted files for 15 days (instead of the default 7 days), in order to permanently comply with the organization's data retention policy.
    Which code snippet correctly sets this retention period for deleted files?
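    The retention window for files removed from a Delta table is governed by the table property `delta.deletedFileRetentionDuration`. As a minimal sketch, the Python helper below only builds the SQL text. The table name `my_table` and the function `retention_sql` are hypothetical; in a notebook the resulting string would typically be passed to `spark.sql`.

```python
def retention_sql(table_name: str, days: int) -> str:
    """Build an ALTER TABLE statement that sets the deleted-file retention (hypothetical helper)."""
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval {days} days')"
    )


# my_table is a placeholder table name, not taken from the question.
stmt = retention_sql("my_table", 15)
print(stmt)
```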
  • Question 19

    You want to process the data based on two conditions: the department is supply chain, or the process flag is set to True.
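    One way to read this scenario is as a logical OR over the two variables. The sketch below assumes Python and uses hypothetical names (`department`, `process_flag`, `should_process`) that do not appear in the question.

```python
def should_process(department: str, process_flag: bool) -> bool:
    """Return True when either condition holds (hypothetical helper)."""
    return department == "supply chain" or process_flag


# Either condition alone is enough to trigger processing.
if should_process("supply chain", False):
    print("processing data")
```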
  • Question 20

    Which of the following data workloads will utilize a Silver table as its source?