Free Access Databricks.Databricks-Certified-Professional-Data-Engineer.v2025-11-20.q139 Practice Test (Page 27)

Question 126

An analytics team wants to run a short-term experiment in Databricks SQL on the customer transactions Delta table (about 20 billion records) created by the data engineering team. Which strategy should the data engineering team use to ensure minimal downtime and no impact on the ongoing ETL processes?

A.Create a new table for the analytics team using a CTAS statement.

B.Deep clone the table for the analytics team.

C.Give the analytics team direct access to the production table.

D.Shallow clone the table for the analytics team.

Question 127

A new data engineer notices that a critical field was omitted from an application that writes its Kafka source to Delta Lake. This happened even though the critical field was in the Kafka source. That field was further missing from data written to dependent, long-term storage. The retention threshold on the Kafka service is seven days. The pipeline has been in production for three months.
Which describes how Delta Lake can help to avoid data loss of this nature in the future?

A.The Delta log and Structured Streaming checkpoints record the full history of the Kafka producer.

B.Delta Lake schema evolution can retroactively calculate the correct value for newly added fields, as long as the data was in the original source.

C.Delta Lake automatically checks that all fields present in the source data are included in the ingestion layer.

D.Data can never be permanently dropped or deleted from Delta Lake, so data loss is not possible under any circumstance.

E.Ingestine all raw data and metadata from Kafka to a bronze Delta table creates a permanent, replayable history of the data state.

Question 128

The research team has put together a funnel analysis query to monitor the customer traffic on the e-commerce platform, the query takes about 30 mins to run on a small SQL endpoint cluster with max scaling set to 1 cluster. What steps can be taken to improve the performance of the query?

A.They can turn on the Serverless feature for the SQL endpoint.

B.They can increase the maximum bound of the SQL endpoint's scaling range anywhere from between 1 to 100 to review the performance and select the size that meets the re-quired SLA.

C.They can increase the cluster size anywhere from X small to 3XL to review the per-formance and select the size that meets the required SLA.

D.They can turn off the Auto Stop feature for the SQL endpoint to more than 30 mins.

E.They can turn on the Serverless feature for the SQL endpoint and change the Spot In-stance Policy from
"Cost optimized" to "Reliability Optimized."

Correct Answer: C

Explanation
The answer is, They can increase the cluster size anywhere from 2X-Small to 4XL(Scale Up) to review the performance and select the size that meets your SLA. If you are trying to improve the performance of a single query at a time having additional memory, additional worker nodes mean that more tasks can run in a cluster which will improve the performance of that query.
The question is looking to test your ability to know how to scale a SQL Endpoint(SQL Warehouse) and you have to look for cue words or need to understand if the queries are running sequentially or concurrently. if the queries are running sequentially then scale up(Size of the cluster from 2X-Small to 4X-Large) if the queries are running concurrently or with more users then scale out(add more clusters).
SQL Endpoint(SQL Warehouse) Overview: (Please read all of the below points and the below diagram to understand )
1.A SQL Warehouse should have at least one cluster
2.A cluster comprises one driver node and one or many worker nodes
3.No of worker nodes in a cluster is determined by the size of the cluster (2X -Small ->1 worker, X-Small ->2 workers.... up to 4X-Large -> 128 workers) this is called Scale Up
4.A single cluster irrespective of cluster size(2X-Smal.. to ...4XLarge) can only run 10 queries at any given time if a user submits 20 queries all at once to a warehouse with 3X-Large cluster size and cluster scaling (min
1, max1) while 10 queries will start running the remaining 10 queries wait in a queue for these 10 to finish.
5.Increasing the Warehouse cluster size can improve the performance of a query, example if a query runs for 1 minute in a 2X-Small warehouse size, it may run in 30 Seconds if we change the warehouse size to X-Small.
this is due to 2X-Small has 1 worker node and X-Small has 2 worker nodes so the query has more tasks and runs faster (note: this is an ideal case example, the scalability of a query performance depends on many factors, it can not always be linear)
6.A warehouse can have more than one cluster this is called Scale Out. If a warehouse is configured with X-Small cluster size with cluster scaling(Min1, Max 2) Databricks spins up an additional cluster if it detects queries are waiting in the queue, If a warehouse is configured to run 2 clusters(Min1, Max 2), and let's say a user submits 20 queries, 10 queriers will start running and holds the remaining in the queue and databricks will automatically start the second cluster and starts redirecting the 10 queries waiting in the queue to the second cluster.
7.A single query will not span more than one cluster, once a query is submitted to a cluster it will remain in that cluster until the query execution finishes irrespective of how many clusters are available to scale.
Please review the below diagram to understand the above concepts:

Scale-up-> Increase the size of the SQL endpoint, change cluster size from 2X-Small to up to 4X-Large If you are trying to improve the performance of a single query having additional memory, additional worker nodes and cores will result in more tasks running in the cluster will ultimately improve the performance.
During the warehouse creation or after, you have the ability to change the warehouse size (2X-Small....to
...4XLarge) to improve query performance and the maximize scaling range to add more clusters on a SQL Endpoint(SQL Warehouse) scale-out if you are changing an existing warehouse you may have to restart the warehouse to make the changes effective.

Question 129

You noticed that colleague is manually copying the notebook with _bkp to store the previous ver-sions, which of the following feature would you recommend instead.

A.Databricks notebooks support change tracking and versioning

B.Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks

C.Databricks notebooks can be exported into dbc archive files and stored in data lake

D.Databricks notebook can be exported as HTML and imported at a later time

Question 130

A data engineer is configuring a Lakeflow Declarative Pipeline to process CDC (Change Data Capture) data from a source. The source events sometimes arrive out of order, and multiple updates may occur with the same update_timestamp but with different update_sequence_id.
What should the data engineer do to ensure events are sequenced correctly?

A.O Set track_history_column_list to [event_timestamp, event_id] in AUTO CDC APIs.

B.O Use dropDuplicates() to remove out-of-order and duplicate records in LDP.

C.O Use SEQUENCE BY STRUCT(event_timestamp, update_sequence_id) in AUTO CDC APIs.

D.O Use a window function to sort update_sequence_id within the same partition, i.e., update_timestamp in the LDP pipeline.

Other Version: 543Databricks.Databricks-Certified-Professional-Data-Engineer.v2025-10-30.q132; 1571Databricks.Databricks-Certified-Professional-Data-Engineer.v2024-03-27.q76; 1296Databricks.Databricks-Certified-Professional-Data-Engineer.v2023-12-11.q54

Latest Upload: 104SAP.C_BCBAI_2502.v2026-01-08.q38; 104Oracle.1Z0-1056-24.v2026-01-08.q53; 138Huawei.H13-831_V2.0.v2026-01-07.q101; 145Salesforce.Salesforce-Slack-Administrator.v2026-01-06.q103; 122CIPS.L5M15.v2026-01-06.q31; 113Oracle.1Z0-1072-25.v2026-01-06.q18; 123Oracle.1Z0-1042-25.v2026-01-05.q55; 131EMC.D-PCR-DY-01.v2026-01-05.q77; 125DSCI.DCPLA.v2026-01-05.q64; 160TheOpenGroup.OGA-031.v2026-01-05.q42

Question 126

Question 127

Question 128

Question 129

Question 130

Download PDF File