Online Access Free Associate-Developer-Apache-Spark Practice Test

Exam Code:	Associate-Developer-Apache-Spark
Exam Name:	Databricks Certified Associate Developer for Apache Spark 3.0 Exam
Certification Provider:	Databricks
Free Question Number:	179
Posted:	Jan 04, 2026

Question 1

The code block displayed below contains an error. The code block should produce a DataFrame with color as the only column and three rows with color values of red, blue, and green, respectively.
Find the error.
Code block:
spark.createDataFrame([("red",), ("blue",), ("green",)], "color")

A. Instead of calling spark.createDataFrame, just DataFrame should be called.
B. The commas in the tuples with the colors should be eliminated.
C. The colors red, blue, and green should be expressed as a simple Python list, and not a list of tuples.
D. The "color" expression needs to be wrapped in brackets, so it reads ["color"].
E. Instead of color, a data type should be specified.

Question 2

The code block shown below should return a DataFrame with only columns from DataFrame transactionsDf for which there is a corresponding transactionId in DataFrame itemsDf. DataFrame itemsDf is very small and much smaller than DataFrame transactionsDf. The query should be executed in an optimized way. Choose the answer that correctly fills the blanks in the code block to accomplish this.
__1__.__2__(__3__, __4__, __5__)

A. 1. itemsDf
   2. broadcast
   3. transactionsDf
   4. "transactionId"
   5. "left_semi"
B. 1. transactionsDf
   2. join
   3. itemsDf
   4. transactionsDf.transactionId==itemsDf.transactionId
   5. "anti"
C. 1. transactionsDf
   2. join
   3. broadcast(itemsDf)
   4. "transactionId"
   5. "left_semi"
D. 1. itemsDf
   2. join
   3. broadcast(transactionsDf)
   4. "transactionId"
   5. "left_semi"
E. 1. transactionsDf
   2. join
   3. broadcast(itemsDf)
   4. transactionsDf.transactionId==itemsDf.transactionId
   5. "outer"
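
As a rough illustration of what a left semi join against a small, broadcast table computes, here is a plain-Python sketch (not Spark code). The small side is turned into a lookup set, standing in for the copy that a broadcast would make available to every executor, and only rows of the large side with a matching key are kept, with only the large side's columns. The sample rows and the helper name left_semi_join are made up for illustration.

```python
def left_semi_join(left_rows, right_rows, key):
    """Keep rows of left_rows whose key value also appears in right_rows.

    Only columns from the left side are returned, and each left row
    appears at most once, however many matches the right side has.
    """
    # The small side becomes an in-memory lookup set (the "broadcast" copy).
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] in right_keys]


# Hypothetical sample data: a larger transactions table and a tiny items table.
transactions = [
    {"transactionId": 1, "value": 10},
    {"transactionId": 2, "value": 25},
    {"transactionId": 3, "value": 7},
]
items = [
    {"transactionId": 1, "itemName": "hat"},
    {"transactionId": 3, "itemName": "scarf"},
    {"transactionId": 3, "itemName": "gloves"},
]

matched = left_semi_join(transactions, items, "transactionId")
print(matched)
```

Note that transaction 3 appears once in the result even though two item rows match it, and that no item columns are carried over. Both properties distinguish a semi join from an inner join.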

Question 3

Which of the following is a viable way to improve Spark's performance when dealing with large amounts of data, given that there is only a single application running on the cluster?

A. Increase values for the properties spark.dynamicAllocation.maxExecutors, spark.default.parallelism, and spark.sql.shuffle.partitions
B. Increase values for the properties spark.sql.parallelism and spark.sql.shuffle.partitions
C. Increase values for the properties spark.sql.parallelism and spark.sql.partitions
D. Decrease values for the properties spark.default.parallelism and spark.sql.partitions
E. Increase values for the properties spark.default.parallelism and spark.sql.shuffle.partitions

Question 4

In which order should the code blocks shown below be run in order to assign articlesDf a DataFrame that lists all items in column attributes ordered by the number of times these items occur, from most to least often?
Sample of DataFrame articlesDf:
+------+-----------------------------+-------------------+
|itemId|attributes                   |supplier           |
+------+-----------------------------+-------------------+
|1     |[blue, winter, cozy]         |Sports Company Inc.|
|2     |[red, summer, fresh, cooling]|YetiX              |
|3     |[green, summer, travel]      |Sports Company Inc.|
+------+-----------------------------+-------------------+

Code blocks:
1. articlesDf = articlesDf.groupby("col")
2. articlesDf = articlesDf.select(explode(col("attributes")))
3. articlesDf = articlesDf.orderBy("count").select("col")
4. articlesDf = articlesDf.sort("count",ascending=False).select("col")
5. articlesDf = articlesDf.groupby("col").count()

A. 2, 5, 4
B. 2, 5, 3
C. 4, 5
D. 5, 2
E. 2, 3, 4
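
To make the intended result concrete, the following plain-Python sketch (not Spark code) mimics the three logical steps on the sample data: flatten the attributes arrays into one entry per element, count how often each element occurs, and order the elements from most to least frequent. The supplier column is left out because it plays no role here, and the variable names are chosen for illustration only.

```python
from collections import Counter

# Sample rows copied from the articlesDf excerpt (supplier column omitted).
rows = [
    {"itemId": 1, "attributes": ["blue", "winter", "cozy"]},
    {"itemId": 2, "attributes": ["red", "summer", "fresh", "cooling"]},
    {"itemId": 3, "attributes": ["green", "summer", "travel"]},
]

# Step 1: "explode" the arrays so every attribute becomes its own entry.
exploded = [attr for row in rows for attr in row["attributes"]]

# Step 2: group identical attributes and count them.
counts = Counter(exploded)

# Step 3: order by count, highest first, and keep only the attribute names.
ordered = [
    attr
    for attr, _ in sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
]

print(ordered[0], counts[ordered[0]])
```

In this sample only summer occurs twice, so it comes first. The relative order of the remaining attributes, which all occur once, is an artifact of this sketch; Spark does not guarantee any particular order among ties.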

Question 5

Which of the following describes a valid concern about partitioning?

A. A shuffle operation returns 200 partitions if not explicitly set.
B. Decreasing the number of partitions reduces the overall runtime of narrow transformations if there are more executors available than partitions.
C. Short partition processing times are indicative of low skew.
D. No data is exchanged between executors when coalesce() is run.
E. The coalesce() method should be used to increase the number of partitions.
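
For background on the coalesce() options, the toy model below (plain Python, not Spark code) captures one documented property of DataFrame.coalesce(): it reduces the partition count by merging whole existing partitions rather than redistributing individual rows, and requesting more partitions than currently exist leaves the count unchanged. The round-robin assignment used here is an assumption made for simplicity; Spark chooses its own grouping, and merged partitions may still sit on different executors.

```python
def coalesce(partitions, num_partitions):
    """Toy model: merge whole partitions into at most num_partitions groups."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be positive")
    # Asking for more partitions than exist leaves the count unchanged.
    target = min(num_partitions, len(partitions))
    merged = [[] for _ in range(target)]
    for index, part in enumerate(partitions):
        # Each existing partition is appended whole to one target partition
        # (round-robin assignment is an assumption of this sketch).
        merged[index % target].extend(part)
    return merged


# Hypothetical layout: four partitions of uneven size.
parts = [[1, 2], [3], [4, 5, 6], [7]]

fewer = coalesce(parts, 2)  # merges four partitions into two
more = coalesce(parts, 8)   # cannot grow: still four partitions

print(len(fewer), len(more))
```

In Spark itself, increasing the number of partitions is done with repartition(), which performs a full shuffle.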
