  • Question 1

    Which of the following statements about Spark's DataFrames is incorrect?
  • Question 2

    Which of the following code blocks returns only rows from DataFrame transactionsDf in which values in column productId are unique?
  • Question 3

    The code block displayed below contains an error. The code block should produce a DataFrame with color as the only column and three rows with color values of red, blue, and green, respectively.
    Find the error.
    Code block:
    spark.createDataFrame([("red",), ("blue",), ("green",)], "color")
    Instead of calling spark.createDataFrame, just DataFrame should be called.
  • Question 4

    Which of the following code blocks reduces a DataFrame from 12 to 6 partitions and performs a full shuffle?
  • Question 5

    Which of the following describes the role of tasks in the Spark execution hierarchy?