Question 16

Which of the following statements about garbage collection in Spark is incorrect?
  • Question 17

    Which of the following code blocks returns a new DataFrame with the same columns as DataFrame transactionsDf, except for columns predError and value which should be removed?
  • Question 18

    The code block displayed below contains an error. The code block should display the schema of DataFrame transactionsDf. Find the error.
    Code block:
    transactionsDf.rdd.printSchema
  • Question 19

    The code block shown below should return a DataFrame with all columns of DataFrame transactionsDf, but with at most 2 rows, each having a value of at least 2 in column productId. Choose the answer that correctly fills the blanks in the code block to accomplish this.
    transactionsDf.__1__(__2__).__3__
  • Question 20

    The code block displayed below contains multiple errors. The code block should remove column transactionDate from DataFrame transactionsDf and add a column transactionTimestamp in which dates that are expressed as strings in column transactionDate of DataFrame transactionsDf are converted into unix timestamps. Find the errors.
    Sample of DataFrame transactionsDf:
    +-------------+---------+-----+-------+---------+----+----------------+
    |transactionId|predError|value|storeId|productId|   f| transactionDate|
    +-------------+---------+-----+-------+---------+----+----------------+
    |            1|        3|    4|     25|        1|null|2020-04-26 15:35|
    |            2|        6|    7|      2|        2|null|2020-04-13 22:01|
    |            3|        3| null|     25|        3|null|2020-04-02 10:53|
    +-------------+---------+-----+-------+---------+----+----------------+
    Code block:
    transactionsDf = transactionsDf.drop("transactionDate")
    transactionsDf["transactionTimestamp"] = unix_timestamp("transactionDate", "yyyy-MM-dd")
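
As background for Question 20, a unix timestamp is the number of seconds elapsed since 1970-01-01 00:00:00 UTC. The sketch below shows that conversion for the date strings in the sample, using only the Python standard library rather than the Spark API. The helper name `to_unix_timestamp` is hypothetical, the strings are assumed to be in UTC (Spark would use the session time zone), and the `%Y-%m-%d %H:%M` format follows Python's `strptime` syntax, which differs from Spark's date pattern syntax.

```python
from datetime import datetime, timezone


def to_unix_timestamp(text: str, fmt: str = "%Y-%m-%d %H:%M") -> int:
    """Parse a date string and return seconds since the unix epoch.

    Assumption: the string has no time zone information, so it is
    interpreted as UTC here.
    """
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


# Date strings copied from the transactionDate column of the sample.
sample_dates = ["2020-04-26 15:35", "2020-04-13 22:01", "2020-04-02 10:53"]

for date_string in sample_dates:
    print(date_string, to_unix_timestamp(date_string))
```

The format string has to describe the whole input, including hours and minutes; `strptime` raises a `ValueError` when the input contains text the format does not account for.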