Question 81

Which of these is NOT a way to customize the software on Dataproc cluster instances?
  • Question 82

    You need to choose a database to store time series CPU and memory usage for millions of computers. You need to store this data in one-second interval samples. Analysts will be performing real-time, ad hoc analytics against the database. You want to avoid being charged for every query executed and ensure that the schema design will allow for future growth of the dataset. Which database and data model should you choose?
  • Question 83

    You are working on a sensitive project involving private user dat
    a. You have set up a project on Google Cloud Platform to house your work internally. An external consultant is going to assist with coding a complex transformation in a Google Cloud Dataflow pipeline for your project. How should you maintain users' privacy?
  • Question 84

    You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules:
    * No interaction by the user on the site for 1 hour
    * Has added more than $30 worth of products to the basket
    * Has not completed a transaction
    You use Google Cloud Dataflow to process the data and decide if a message should be sent. How should you design the pipeline?
  • Question 85

    Your team is working on a binary classification problem. You have trained a support vector machine (SVM) classifier with default parameters, and received an area under the Curve (AUC) of 0.87 on the validation set.
    You want to increase the AUC of the model. What should you do?