Which of the following code blocks concatenates rows of DataFrames transactionsDf and transactionsNewDf, omitting any duplicates?
Correct Answer: B
Explanation
To concatenate the rows of two DataFrames while omitting duplicates, chain DataFrame.union() with distinct(): union() appends the rows of transactionsNewDf to transactionsDf, and distinct() then removes the duplicate rows, since union() by itself keeps duplicates. The other options fail because DataFrame.unique() and DataFrame.concat() do not exist, union() is not a method of the SparkSession, and DataFrame.join() does not accept a "union" join type. More info: pyspark.sql.DataFrame.union - PySpark 3.1.2 documentation
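To illustrate, here is a minimal runnable sketch of this approach; the sample data and column names are made up for demonstration only:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Illustrative stand-ins for transactionsDf and transactionsNewDf; both must share the same schema
transactionsDf = spark.createDataFrame([(1, "a"), (2, "b")], ["transactionId", "value"])
transactionsNewDf = spark.createDataFrame([(2, "b"), (3, "c")], ["transactionId", "value"])

# union() appends the rows of transactionsNewDf; distinct() then removes duplicate rows
combinedDf = transactionsDf.union(transactionsNewDf).distinct()
combinedDf.show()  # the duplicate row (2, "b") appears only once

Note that union() matches columns by position, not by name; if the two DataFrames' columns could be ordered differently, unionByName() is the safer choice.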
Question 28
Which of the following describes a shuffle?
Correct Answer: C
Explanation
A shuffle is the mechanism by which Spark redistributes data across partitions, and potentially across executors, which is why it typically occurs during wide transformations such as joins, groupBy(), or repartition(). With that in mind, the incorrect options can be ruled out:
A shuffle is a Spark operation that results from DataFrame.coalesce(). No. DataFrame.coalesce() does not result in a shuffle.
A shuffle is a process that allocates partitions to executors. This is incorrect.
A shuffle is a process that is executed during a broadcast hash join. No, broadcast hash joins avoid shuffles and yield performance benefits if at least one of the two tables is small (<= 10 MB by default). Broadcast hash joins can avoid shuffles because, instead of exchanging partitions between executors, they broadcast the small table to all executors, which then perform the rest of the join operation locally.
A shuffle is a process that compares data across executors. No, in a shuffle, data is compared across partitions, not across executors.
More info: Spark Repartition & Coalesce - Explained (https://bit.ly/32KF7zS)
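For intuition, the following sketch (with made-up DataFrame sizes) contrasts an operation that triggers a shuffle with two that avoid one:

from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()
df = spark.range(1000000)
small = spark.range(100)

# repartition() redistributes rows across partitions - a full shuffle
shuffled = df.repartition(10)

# coalesce() merges existing partitions without a full shuffle
narrowed = df.coalesce(2)

# broadcast() hints Spark to ship the small table to every executor,
# so the join can run locally instead of shuffling df
joined = df.join(broadcast(small), "id")
joined.explain()  # the physical plan shows a BroadcastHashJoin rather than a SortMergeJoin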
Question 29
The code block shown below should show information about the data type that column storeId of DataFrame transactionsDf contains. Choose the answer that correctly fills the blanks in the code block to accomplish this. Code block: transactionsDf.__1__(__2__).__3__
Correct Answer: B
Explanation
Correct code block: transactionsDf.select("storeId").printSchema()
The difficulty of this question is that it is hard to solve with the stepwise first-to-last-gap approach that has worked well for similar questions, since the answer options are so different from one another. Instead, you might want to eliminate answers by looking for patterns of frequently wrong answers.
A first pattern that you may recognize by now is that column names are not expressed in quotes in wrong answers. For this reason, the answer that includes storeId without quotes should be eliminated. By now, you may also have understood that DataFrame.limit() is useful for returning a specified number of rows; it has nothing to do with specific columns. For this reason, the answer that resolves to limit("storeId") can be eliminated.
Given that we are interested in information about the data type, you should question whether the answer that resolves to limit(1).columns provides this information. While DataFrame.columns is a valid call, it only reports column names, not column types, so you can eliminate this option as well.
The two remaining options use either the printSchema() or the print_schema() command. You may remember that DataFrame.printSchema() is the only valid command of the two. The select("storeId") part simply returns the storeId column of transactionsDf, which works here since we are only interested in that column's type anyway. More info: pyspark.sql.DataFrame.printSchema - PySpark 3.1.2 documentation
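A minimal sketch of the correct call, with made-up data so it can be run as is; the real storeId type may of course differ:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Illustrative stand-in for transactionsDf
transactionsDf = spark.createDataFrame([(1, 25), (2, 3)], ["transactionId", "storeId"])

# printSchema() reports the data type of the selected column
transactionsDf.select("storeId").printSchema()
# root
#  |-- storeId: long (nullable = true)

# columns only lists names, not types, so it cannot answer the question
print(transactionsDf.limit(1).columns)  # ['transactionId', 'storeId']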
Question 30
The code block shown below should return a DataFrame with only columns from DataFrame transactionsDf for which there is a corresponding transactionId in DataFrame itemsDf. DataFrame itemsDf is very small and much smaller than DataFrame transactionsDf. The query should be executed in an optimized way. Choose the answer that correctly fills the blanks in the code block to accomplish this. Code block: __1__.__2__(__3__, __4__, __5__)
Correct Answer: C
Explanation
Correct code block: transactionsDf.join(broadcast(itemsDf), "transactionId", "left_semi")
This question is extremely difficult and exceeds the difficulty of questions in the exam by far.
A first indication of what is asked of you here is the remark that "the query should be executed in an optimized way". You also have qualitative information about the sizes of itemsDf and transactionsDf. Given that itemsDf is "very small" and that the execution should be optimized, you should consider instructing Spark to perform a broadcast join, broadcasting the "very small" DataFrame itemsDf to all executors. You can explicitly suggest this to Spark by wrapping itemsDf in a broadcast() operator. One answer option does not include this operator, so you can disregard it. Another answer option wraps the broadcast() operator around transactionsDf, the bigger of the two DataFrames. This answer option does not make sense in the optimization context and can likewise be disregarded.
When thinking about the broadcast() operator, you may also remember that it is a method of pyspark.sql.functions. One answer option, however, resolves to itemsDf.broadcast([...]). The DataFrame class has no broadcast() method, so this answer option can be eliminated as well.
Both remaining answer options resolve to transactionsDf.join([...]) in the first two gaps, so you will have to figure out the details of the join now. You can pick between an outer and a left semi join. An outer join would include columns from both DataFrames, whereas a left semi join only includes columns from the "left" table, here transactionsDf, just as asked for by the question. So, the correct answer is the one that uses the left_semi join.
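Here is a minimal runnable sketch of the correct code block, using made-up sample data; the real DataFrames would have more columns and rows:

from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

# Illustrative stand-ins for transactionsDf and itemsDf
transactionsDf = spark.createDataFrame(
    [(1, 100.0), (2, 50.0), (3, 75.0)], ["transactionId", "amount"])
itemsDf = spark.createDataFrame([(1,), (3,)], ["transactionId"])

# left_semi keeps only transactionsDf's columns and only those rows whose
# transactionId also appears in itemsDf; broadcast() suggests shipping the
# small itemsDf to all executors so transactionsDf need not be shuffled
result = transactionsDf.join(broadcast(itemsDf), "transactionId", "left_semi")
result.show()  # transactions 1 and 3 remain, with only transactionsDf's columns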