  • Question 136

    When you design a Google Cloud Bigtable schema, it is recommended that you _________.
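For context on the row-key design this question points at, below is a minimal, illustrative sketch of a common Bigtable keying pattern: leading the key with a high-cardinality field instead of a raw timestamp, so sequential writes spread across tablets rather than hotspotting one node. The `device_id` field and key format here are hypothetical, not part of the question.

```python
def row_key(device_id: str, event_ts: float) -> bytes:
    """Build an illustrative Bigtable row key.

    Leads with device_id (high cardinality) rather than a timestamp,
    and appends a reversed, zero-padded timestamp so the newest rows
    for a device sort first within that device's key range.
    """
    reversed_ts = 2**63 - int(event_ts * 1000)
    return f"{device_id}#{reversed_ts:020d}".encode()

print(row_key("sensor-7", 1.0))
```

Because the reversed timestamp is zero-padded to a fixed width, byte-wise ordering matches numeric ordering, which is what makes the "newest first" scan work.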
  • Question 137

    You want to rebuild your batch pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over twelve hours to run. To expedite development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting speed and processing requirements?
  • Question 138

    Your company is currently setting up data pipelines for its campaign. For all the Google Cloud Pub/Sub
    streaming data, one of the important business requirements is to be able to periodically identify the inputs
    and their timings during the campaign. The engineers have decided to use windowing and transformation in
    Google Cloud Dataflow for this purpose. However, when testing this feature, they find that the Cloud
    Dataflow job fails for all streaming inserts. What is the most likely cause of this problem?
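As background on the windowing this question refers to, here is a minimal pure-Python sketch of how fixed (tumbling) windows bucket timestamped inputs, mimicking the behavior of Dataflow's fixed windows without depending on Apache Beam. The event names and timestamps are made up for illustration.

```python
from datetime import datetime, timezone

WINDOW_SECONDS = 60  # fixed one-minute windows


def window_start(event_time: datetime, size: int = WINDOW_SECONDS) -> datetime:
    """Return the start of the fixed window containing event_time."""
    ts = event_time.timestamp()
    return datetime.fromtimestamp(ts - (ts % size), tz=timezone.utc)


# Three timestamped inputs, e.g. Pub/Sub messages with publish times.
events = [
    ("click", datetime(2024, 1, 1, 12, 0, 15, tzinfo=timezone.utc)),
    ("view", datetime(2024, 1, 1, 12, 0, 45, tzinfo=timezone.utc)),
    ("click", datetime(2024, 1, 1, 12, 1, 5, tzinfo=timezone.utc)),
]

# Group events by window start, mimicking a windowed GroupByKey.
windows: dict = {}
for name, when in events:
    windows.setdefault(window_start(when), []).append(name)

for start, names in sorted(windows.items()):
    print(start.isoformat(), names)
```

The first two events land in the 12:00 window and the third in the 12:01 window, which is the "inputs and their timings" grouping the scenario describes.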
  • Question 139

    Which SQL keyword can be used to reduce the number of columns processed by BigQuery?
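For background on why this matters, BigQuery stores data column by column, so a query only scans the columns it references. The toy model below (table and column names are invented) shows how an explicit column list touches fewer cells than selecting everything:

```python
# A toy columnar table: each column is stored (and scanned) independently,
# loosely modeling why naming columns explicitly in a query is cheaper
# than selecting every column in a columnar engine like BigQuery.
table = {
    "user_id": [1, 2, 3, 4],
    "event": ["view", "click", "view", "click"],
    "payload": ["x"] * 4,  # a wide column most queries don't need
}


def cells_scanned(columns):
    """Cells a columnar engine reads for the given projection."""
    return sum(len(table[c]) for c in columns)


print(cells_scanned(["user_id", "event"]))  # explicit column list: 8 cells
print(cells_scanned(list(table)))           # every column: 12 cells
```

The gap widens with real tables, where unneeded columns are often the widest ones.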
  • Question 140

    You are selecting services to write and transform JSON messages from Cloud Pub/Sub to BigQuery for a data pipeline on Google Cloud. You want to minimize service costs. You also want to monitor and accommodate input data volume that will vary in size with minimal manual intervention. What should you do?