You are investigating a production job failure caused by a data issue; the job is configured to run on a job cluster. Which type of cluster do you need to start in order to investigate and analyze the data?
Correct Answer: B
Explanation: An all-purpose (interactive) cluster is the recommended way to run commands and view the data. A job cluster cannot provide a way for a user to interact with a notebook once the job is submitted, but an interactive cluster allows you to display data, view visualizations, and write or edit queries, which makes it a perfect fit for investigating and analyzing the data.
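For illustration, a minimal sketch of the kind of interactive exploration an all-purpose cluster enables; the table path and the null-count check are hypothetical, and `spark` and `display()` are the built-in notebook globals:

```python
from pyspark.sql import functions as F

# Load the dataset the failed job was processing (path is an assumption).
df = spark.read.format("delta").load("/mnt/raw/orders")

# Interactively profile the data to look for the issue, e.g. nulls or malformed values.
df.printSchema()
df.select([F.sum(F.col(c).isNull().cast("int")).alias(c) for c in df.columns]).show()

# display() renders results and visualizations in a Databricks notebook.
display(df.limit(100))
```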
Question 97
A production cluster has 3 executor nodes and uses the same virtual machine type for the driver and executor. When evaluating the Ganglia Metrics for this cluster, which indicator would signal a bottleneck caused by code executing on the driver?
Correct Answer: E
Explanation: This is the correct answer because it indicates a bottleneck caused by code executing on the driver. A bottleneck is a situation where the performance or capacity of a system is limited by a single component or resource, causing slow execution, high latency, or low throughput. Because the cluster has 3 executor nodes and uses the same virtual machine type for the driver and executors, the Ganglia Metrics can be read in terms of how evenly CPU, memory, disk, and network are utilized across the four nodes. If overall cluster CPU utilization hovers around 25%, only one of the four nodes (the driver plus 3 executors) is using its full CPU capacity while the other three are idle or underutilized. This suggests that code executing on the driver is taking too long or consuming too much CPU, preventing the executors from receiving tasks or data to process. This can happen when the code contains driver-side operations that are not parallelized or distributed, such as collecting large amounts of data to the driver, performing complex calculations on the driver, or using non-Spark libraries on the driver. Verified References: [Databricks Certified Data Engineer Professional], under "Spark Core" section; Databricks Documentation, under "View cluster status and event logs - Ganglia metrics" section; Databricks Documentation, under "Avoid collecting large RDDs" section.
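As an illustrative sketch (large_df and its amount column are hypothetical, not part of the exam scenario), the first pattern below produces exactly this signature, and the second keeps the work distributed on the executors:

```python
from pyspark.sql import functions as F

# Anti-pattern: pull every row to the driver and process it in plain Python.
# Only the driver's CPU is busy while the three executors sit idle, which on a
# 4-node cluster of identical VMs shows up as roughly 25% overall CPU utilization.
rows = large_df.collect()                   # materializes the dataset on the driver
total = sum(r["amount"] for r in rows)      # single-threaded driver-side loop

# Distributed alternative: express the same aggregation in Spark so it runs on the executors.
total_df = large_df.agg(F.sum("amount").alias("total"))
total_df.show()
```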
Question 98
A data ingestion task requires a one-TB JSON dataset to be written out to Parquet with a target part-file size of 512 MB. Because Parquet is being used instead of Delta Lake, built-in file-sizing features such as Auto-Optimize & Auto-Compaction cannot be used. Which strategy will yield the best performance without shuffling data?
Correct Answer: B
The key to efficiently converting a large JSON dataset to Parquet files of a specific size without shuffling data lies in controlling the size of each partition as the data is read.
* Setting spark.sql.files.maxPartitionBytes to 512 MB configures Spark to read the source data in chunks of roughly 512 MB. This setting directly influences the size of the part-files in the output, aligning with the target file size.
* Narrow transformations (which do not move data across partitions) can then be applied without disturbing this partitioning.
* Writing the data out to Parquet produces part-files approximately the size specified by spark.sql.files.maxPartitionBytes, in this case 512 MB.
* The other options either introduce unnecessary shuffles or repartitions or rely on a setting that does not control partition size for this requirement.
References:
* Apache Spark Documentation: Configuration - spark.sql.files.maxPartitionBytes
* Databricks Documentation on Data Sources: Databricks Data Sources Guide
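A minimal sketch of this approach; the input/output paths and the filter column are assumptions, and the 512 MB target is expressed in bytes:

```python
# Configure the maximum bytes Spark packs into each input partition when reading files.
spark.conf.set("spark.sql.files.maxPartitionBytes", str(512 * 1024 * 1024))

# The 1 TB JSON source is read in ~512 MB input partitions (path is hypothetical).
df = spark.read.json("/mnt/landing/events_json/")

# Narrow transformations only, so no shuffle occurs and the partitioning is preserved.
cleaned = df.filter("event_type IS NOT NULL").withColumnRenamed("ts", "event_ts")

# Each task writes one part-file, so output files track the ~512 MB target
# (Parquet compression and encoding can make them somewhat smaller on disk).
cleaned.write.mode("overwrite").parquet("/mnt/curated/events_parquet/")
```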
Question 99
All records from an Apache Kafka producer are being ingested into a single Delta Lake table with the following schema: key BINARY, value BINARY, topic STRING, partition LONG, offset LONG, timestamp LONG. There are 5 unique topics being ingested. Only the "registration" topic contains Personally Identifiable Information (PII). The company wishes to restrict access to PII. The company also wishes to retain records containing PII in this table for only 14 days after initial ingestion. However, for non-PII information, it would like to retain these records indefinitely. Which of the following solutions meets the requirements?
Correct Answer: B
Partitioning the data by the topic field allows the company to apply different access control and retention policies per topic. For example, the company can use the Table Access Control feature to grant or revoke permissions on views that expose or exclude the registration topic, based on user roles or groups. The company can also use the DELETE command to remove records from the registration topic that are older than 14 days, while keeping the records from other topics indefinitely. Partitioning by the topic field also improves the performance of queries that filter by topic, as they can skip reading irrelevant partitions. References:
* Table Access Control: https://docs.databricks.com/security/access-control/table-acls/index.html
* DELETE: https://docs.databricks.com/delta/delta-update.html#delete-from-a-table
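A sketch of this pattern under the scenario above; the table, view, and principal names are assumptions, and the Kafka timestamp column is assumed to be epoch milliseconds:

```python
# 1) Write the Kafka feed (kafka_df, hypothetical) partitioned by topic so
#    retention and access can differ per topic.
(kafka_df.write
    .format("delta")
    .partitionBy("topic")
    .mode("append")
    .saveAsTable("kafka_events"))

# 2) Expose only non-PII topics through a view and grant access on that view.
spark.sql("""
    CREATE OR REPLACE VIEW kafka_events_no_pii AS
    SELECT * FROM kafka_events WHERE topic != 'registration'
""")
spark.sql("GRANT SELECT ON VIEW kafka_events_no_pii TO `analysts`")

# 3) Enforce the 14-day retention for the registration topic only; partition pruning
#    on `topic` keeps the DELETE from touching the other topics' files.
spark.sql("""
    DELETE FROM kafka_events
    WHERE topic = 'registration'
      AND timestamp < unix_timestamp(current_timestamp() - INTERVAL 14 DAYS) * 1000
""")
```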
Question 100
The data architect has mandated that all tables in the Lakehouse should be configured as external Delta Lake tables. Which approach will ensure that this requirement is met?
Correct Answer: C
This is the correct answer because it ensures that the requirement is met: all tables in the Lakehouse are configured as external Delta Lake tables. An external table is a table whose data is stored outside the default warehouse directory; Databricks tracks the table's metadata in the metastore but does not manage the underlying data files, so dropping the table does not delete the data. An external table can be created by using the LOCATION keyword to specify the path to an existing directory in a cloud storage system, such as DBFS or S3. By creating external tables, the data engineering team can avoid losing data if they drop or overwrite the table, and can leverage existing data without moving or copying it. Verified Reference: [Databricks Certified Data Engineer Professional], under "Delta Lake" section; Databricks Documentation, under "Create an external table" section.
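For reference, a minimal sketch of two equivalent ways to create an external Delta table with an explicit location; the table name, the abfss:// path, and df are placeholders:

```python
# Option 1: SQL DDL with an explicit LOCATION makes the table external.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_external
    USING DELTA
    LOCATION 'abfss://lake@myaccount.dfs.core.windows.net/tables/sales'
""")

# Option 2 (alternative to the DDL above): supplying an explicit path to the
# DataFrame writer also registers an external table backed by that location.
(df.write
    .format("delta")
    .option("path", "abfss://lake@myaccount.dfs.core.windows.net/tables/sales")
    .saveAsTable("sales_external"))
```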