  • Question 1

    Which of the following benefits is provided by the array functions from Spark SQL?
  • Question 2

    A data engineer has joined an existing project and they see the following query in the project repository:
    CREATE STREAMING LIVE TABLE loyal_customers AS
    SELECT customer_id
    FROM STREAM(LIVE.customers)
    WHERE loyalty_level = 'high';
    Which of the following describes why the STREAM function is included in the query?
  • Question 3

    A data analyst has developed a query that runs against a Delta table. They want help from the data engineering team to implement a series of tests to ensure the data returned by the query is clean. However, the data engineering team uses Python for its tests rather than SQL.
    Which of the following operations could the data engineering team use to run the query and operate with the results in PySpark?
  • Question 4

    A data analysis team has noticed that their Databricks SQL queries are running too slowly when connected to their always-on SQL endpoint. They claim that this issue is present when many members of the team are running small queries simultaneously. They ask the data engineering team for help. The data engineering team notices that each of the team's queries uses the same SQL endpoint.
    Which of the following approaches can the data engineering team use to improve the latency of the team's queries?
  • Question 5

    A data engineer and data analyst are working together on a data pipeline. The data engineer is working on the raw, bronze, and silver layers of the pipeline using Python, and the data analyst is working on the gold layer of the pipeline using SQL. The raw source of the pipeline is a streaming input. They now want to migrate their pipeline to use Delta Live Tables.
    Which of the following changes will need to be made to the pipeline when migrating to Delta Live Tables?