Question 21

You work for a manufacturing plant that batches application log files together into a single log file once a day at
2:00 AM. You have written a Google Cloud Dataflow job to process that log file. You need to make sure the log file is processed once per day as inexpensively as possible. What should you do?
  • Question 22

    MJTelco needs you to create a schema in Google Bigtable that will allow for the historical analysis of the last
    2 years of records. Records arrive every 15 minutes, and each contains a unique identifier of the device and a data record. The most common query is for all the data for a given device for a given day. Which schema should you use?
  • Question 23

    You are developing an application that uses a recommendation engine on Google Cloud. Your solution
    should display new videos to customers based on past views. Your solution needs to generate labels for
    the entities in videos that the customer has viewed. Your design must be able to provide very fast filtering
    suggestions based on data from other customer preferences on several TB of data. What should you do?
  • Question 24

    To run a TensorFlow training job on your own computer using Cloud Machine Learning Engine, what would your command start with?
  • Question 25

    You are designing storage for 20 TB of text files as part of deploying a data pipeline on Google Cloud. Your input data is in CSV format. You want to minimize the cost of querying aggregate values for multiple users who will query the data in Cloud Storage with multiple engines. Which storage service and schema design should you use?