Online Access Free Associate-Developer-Apache-Spark-3.5 Practice Test
| Exam Code: | Associate-Developer-Apache-Spark-3.5 |
| Exam Name: | Databricks Certified Associate Developer for Apache Spark 3.5 - Python |
| Certification Provider: | Databricks |
| Free Question Number: | 135 |
| Posted: | May 30, 2026 |
A Spark application developer wants to identify which operations cause shuffling, leading to a new stage in the Spark execution plan.
Which operation results in a shuffle and a new stage?
A data engineer is reviewing a Spark application that applies several transformations to a DataFrame but notices that the job does not start executing immediately.
Which two characteristics of Apache Spark's execution model explain this behavior?
Choose 2 answers:
A data engineer is working on the DataFrame:
(Referring to the table image: it has columns Id, Name, count, and timestamp.) Which code fragment should the engineer use to extract the unique values in the Name column into an alphabetically ordered list?
18 of 55.
An engineer has two DataFrames - df1 (small) and df2 (large). To optimize the join, the engineer uses a broadcast join:
from pyspark.sql.functions import broadcast
df_result = df2.join(broadcast(df1), on="id", how="inner")
What is the purpose of using broadcast() in this scenario?
33 of 55.
The data engineering team created a pipeline that extracts data from a transaction system.
The transaction system stores timestamps in UTC, and the data engineers must now transform the transaction_datetime field to the "America/New_York" timezone for reporting.
Which code should be used to convert the timestamp to the target timezone?