Free Access Databricks.Databricks-Certified-Professional-Data-Engineer.v2025-11-20.q139 Practice Test (Page 15)

Question 66

A Structured Streaming job deployed to production has been resulting in higher than expected cloud storage costs. At present, during normal execution, each micro-batch of data is processed in less than 3 seconds; at least 12 times per minute, a micro-batch is processed that contains 0 records. The streaming write was configured using the default trigger settings. The production job is currently scheduled alongside many other Databricks jobs in a workspace with instance pools provisioned to reduce start-up time for jobs with batch execution. Holding all other variables constant and assuming records need to be processed in less than 10 minutes, which adjustment will meet the requirement?

A. Set the trigger interval to 500 milliseconds; setting a small but non-zero trigger interval ensures that the source is not queried too frequently.

B. Set the trigger interval to 3 seconds; the default trigger interval is consuming too many records per batch, resulting in spill to disk that can increase volume costs.

C. Set the trigger interval to 10 minutes; each batch calls APIs in the source storage account, so decreasing trigger frequency to the maximum allowable threshold should minimize this cost.

D. Use the trigger once option and configure a Databricks job to execute the query every 10 minutes; this approach minimizes costs for both compute and storage.
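
To put the candidate trigger intervals in perspective, the arithmetic below counts how often each interval would poll the source per day. This is a rough sketch rather than a cost model: it assumes one poll per trigger interval whether or not new data arrived, and the helper name `polls_per_day` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def polls_per_day(interval_seconds: float) -> int:
    """Upper bound on source polls per day for a fixed trigger interval (assumed: one poll per interval)."""
    return int(SECONDS_PER_DAY / interval_seconds)

# Intervals named in the answer options, expressed in seconds.
intervals = {"500 milliseconds": 0.5, "3 seconds": 3, "10 minutes": 600}

for label, seconds in intervals.items():
    print(f"{label:>16}: {polls_per_day(seconds):>7} polls/day")
```

Under that assumption the counts differ by roughly three orders of magnitude between the shortest and longest interval, which is the scale to keep in mind when an option ties cost to how often the source storage account is called.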

Question 67

Given the following error traceback (from display(df.select(3*"heartrate"))) which shows AnalysisException: cannot resolve 'heartrateheartrateheartrate', which statement describes the error being raised?

A. There is a type error because a DataFrame object cannot be multiplied.

B. There is a syntax error because the heartrate column is not correctly identified as a column.

C. There is no column in the table named heartrateheartrateheartrate.

D. There is a type error because a column object cannot be multiplied.
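
The name in the error message can be reproduced without Spark. The snippet below is a minimal sketch showing what plain Python does with `3*"heartrate"` before the value ever reaches `df.select`; the variable name `select_argument` is only illustrative.

```python
# Multiplying a str by an int repeats the string; no Spark column is involved yet.
select_argument = 3 * "heartrate"

print(type(select_argument).__name__)
print(select_argument)
```

The value handed to `select` is therefore an ordinary string, which Spark then tries to resolve as a column name. For comparison, multiplying column values would typically go through a column expression such as `3 * col("heartrate")` with `col` imported from `pyspark.sql.functions`.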

Question 68

To reduce storage and compute costs, the data engineering team has been tasked with curating a series of aggregate tables leveraged by business intelligence dashboards, customer-facing applications, production machine learning models, and ad hoc analytical queries.
The data engineering team has been made aware of new requirements from a customer-facing application, which is the only downstream workload they manage entirely. As a result, an aggregate table used by numerous teams across the organization will need to have a number of fields renamed, and additional fields will also be added.
Which of the solutions addresses the situation while minimally interrupting other teams in the organization without increasing the number of tables that need to be managed?

A. Send all users notice that the schema for the table will be changing; include in the communication the logic necessary to revert the new table schema to match historic queries.

B. Configure a new table with all the requisite fields and new names and use this as the source for the customer-facing application; create a view that maintains the original data schema and table name by aliasing select fields from the new table.

C. Create a new table with the required schema and new fields and use Delta Lake's deep clone functionality to sync up changes committed to one table to the corresponding table.

D. Replace the current table definition with a logical view defined with the query logic currently writing the aggregate table; create a new table to power the customer-facing application.

E. Add a table comment warning all users that the table schema and field names will be changing on a given date; overwrite the table in place to the specifications of the customer-facing application.
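
The aliasing-view pattern mentioned in the options can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for a Delta Lake table, and every table and column name (`sales_agg`, `sales_agg_v2`, `region_name`, and so on) is hypothetical rather than taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# New table carrying the renamed fields plus an added field (hypothetical schema).
conn.execute(
    "CREATE TABLE sales_agg_v2 (region_name TEXT, total_revenue REAL, order_count INTEGER)"
)
conn.execute("INSERT INTO sales_agg_v2 VALUES ('EMEA', 1200.0, 30)")

# View that takes over the original name and exposes the original column names
# by aliasing fields selected from the new table.
conn.execute(
    """
    CREATE VIEW sales_agg AS
    SELECT region_name AS region, total_revenue AS revenue
    FROM sales_agg_v2
    """
)

# A query written against the old schema keeps working unchanged.
legacy_rows = conn.execute("SELECT region, revenue FROM sales_agg").fetchall()
print(legacy_rows)
```

In this sketch existing consumers keep querying `sales_agg` with the old field names, while a new consumer reads `sales_agg_v2` directly; only one physical table holds the data.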

Question 69

You are working on a process that queries a table based on batch date. The batch date is an input parameter and is expected to change every time the program runs. What is the best way to parameterize the query so that it runs without manually changing the batch date?

A. Create a notebook parameter for batch date, assign the value to a Python variable, and use a Spark DataFrame to filter the data based on the Python variable.

B. Create a dynamic view that can calculate the batch date automatically and use the view to query the data.

C. There is no way to combine a Python variable and Spark code.

D. Manually edit the code every time to change the batch date.

E. Store the batch date in the Spark configuration and use a Spark DataFrame to filter the data based on the Spark configuration.

Question 70

A junior member of the data engineering team is exploring the language interoperability of Databricks notebooks. The intended outcome of the below code is to register a view of all sales that occurred in countries on the continent of Africa that appear in the geo_lookup table.
Before executing the code, running SHOW TABLES on the current database indicates the database contains only two tables: geo_lookup and sales.

Which statement correctly describes the outcome of executing these command cells in order in an interactive notebook?

A. Both commands will succeed. Executing SHOW TABLES will show that countries_af and sales_af have been registered as views.

B. Cmd 1 will succeed. Cmd 2 will search all accessible databases for a table or view named countries_af; if this entity exists, Cmd 2 will succeed.

C. Cmd 1 will succeed and Cmd 2 will fail. countries_af will be a Python variable representing a PySpark DataFrame.

D. Both commands will fail. No new variables, tables, or views will be created.

E. Cmd 1 will succeed and Cmd 2 will fail. countries_af will be a Python variable containing a list of strings.
