Question 1
Which of the following statements about Spark's DataFrames is incorrect?
Question 2
Which of the following code blocks returns only rows from DataFrame transactionsDf in which values in column productId are unique?
Question 3
The code block displayed below contains an error. The code block should produce a DataFrame with color as the only column and three rows with color values of red, blue, and green, respectively.
Find the error.
Code block:
spark.createDataFrame([("red",), ("blue",), ("green",)], "color")
Instead of calling spark.createDataFrame, just DataFrame should be called.
Question 4
Which of the following code blocks reduces a DataFrame from 12 to 6 partitions and performs a full shuffle?
Question 5
Which of the following describes the role of tasks in the Spark execution hierarchy?