  • Question 36

    You designed a database for patient records as a pilot project to cover a few hundred patients in three clinics. Your design used a single database table to represent all patients and their visits, and you used self-joins to generate reports. The server resource utilization was at 50%. Since then, the scope of the project has expanded. The database must now store 100 times more patient records. You can no longer run the reports, because they either take too long or they fail due to insufficient compute resources. How should you adjust the database design?
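
    For context, the sketch below shows the kind of single-table, self-join report the pilot design implies. The table name, columns, and JDBC connection details are hypothetical; the question does not specify them.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PilotSelfJoinReport {
        public static void main(String[] args) throws Exception {
            // Hypothetical connection string; the question does not name the database engine.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/clinic_pilot", "report_user", "secret");
                 Statement stmt = conn.createStatement()) {
                // One wide table holds both patient rows and visit rows, so the report
                // joins the table to itself to pair each patient with their visits.
                String reportSql =
                    "SELECT p.value AS patient_name, v.value AS visit_date "
                  + "FROM records p "
                  + "JOIN records v ON v.patient_id = p.patient_id "
                  + "WHERE p.record_type = 'PATIENT' AND v.record_type = 'VISIT'";
                try (ResultSet rs = stmt.executeQuery(reportSql)) {
                    while (rs.next()) {
                        System.out.println(rs.getString("patient_name")
                            + " visited on " + rs.getString("visit_date"));
                    }
                }
            }
        }
    }
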
  • Question 37

    Your company's customer and order databases are often under heavy load. This makes performing analytics against them difficult without harming operations. The databases are in a MySQL cluster, with nightly backups taken using mysqldump. You want to perform analytics with minimal impact on operations. What should you do?
  • Question 38

    Your company is performing data preprocessing for a learning algorithm in Google Cloud Dataflow. Numerous data logs are being generated during this step, and the team wants to analyze them. Due to the dynamic nature of the campaign, the data is growing exponentially every hour. The data scientists have written the following code to read the data for new key features in the logs.
    BigQueryIO.Read
    .named("ReadLogData")
    .from("clouddataflow-readonly:samples.log_data")
    You want to improve the performance of this data read. What should you do?
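
    For reference, here is a minimal sketch of how the quoted read sits inside a pipeline written against the pre-Beam Dataflow Java SDK. The pipeline options and the placement of downstream transforms are assumptions, not part of the question.

    import com.google.api.services.bigquery.model.TableRow;
    import com.google.cloud.dataflow.sdk.Pipeline;
    import com.google.cloud.dataflow.sdk.io.BigQueryIO;
    import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
    import com.google.cloud.dataflow.sdk.values.PCollection;

    public class ReadLogDataPipeline {
        public static void main(String[] args) {
            Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

            // The read exactly as quoted above: a full scan of the sample log table,
            // returning every column of every row as a TableRow.
            PCollection<TableRow> logs = p.apply(
                BigQueryIO.Read
                    .named("ReadLogData")
                    .from("clouddataflow-readonly:samples.log_data"));

            // ...the team's analysis transforms would be applied to `logs` here...

            p.run();
        }
    }
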
  • Question 39

    You are choosing a NoSQL database to handle telemetry data submitted from millions of Internet-of-Things (IoT) devices. The volume of data is growing at 100 TB per year, and each data entry has about 100 attributes. The data processing pipeline does not require atomicity, consistency, isolation, and durability (ACID). However, high availability and low latency are required. You need to analyze the data by querying against individual fields. Which three databases meet your requirements? (Choose three.)
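
    As an illustration of the data shape only, the sketch below shows a hypothetical telemetry entry and the kind of per-field filter the analysis needs; every attribute name is invented, and in practice the filtering would be pushed down to the chosen database rather than done in application code.

    import java.util.List;
    import java.util.Map;

    public class TelemetryQuerySketch {
        public static void main(String[] args) {
            // Hypothetical telemetry entry: a wide, flat record with roughly 100 attributes
            // (only a few shown here).
            Map<String, Object> entry = Map.of(
                "device_id", "sensor-0042",
                "timestamp", "2019-07-01T12:00:00Z",
                "temperature_c", 21.7,
                "battery_pct", 83
                // ...and dozens more attributes in the real payload
            );
            List<Map<String, Object>> entries = List.of(entry);

            // "Querying against individual fields" means filters on single attributes like this.
            entries.stream()
                   .filter(e -> ((Double) e.get("temperature_c")) > 20.0)
                   .forEach(e -> System.out.println(e.get("device_id")));
        }
    }
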
  • Question 40

    You are building a new data pipeline to share data between two different types of applications: job generators and job runners. Your solution must scale to accommodate increases in usage and must accommodate the addition of new applications without negatively affecting the performance of existing ones. What should you do?
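
    To make the two roles concrete, here is a minimal in-process sketch, with a shared queue standing in for whatever managed pipeline you would actually choose; the class and job names are hypothetical.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class JobPipelineSketch {
        public static void main(String[] args) throws InterruptedException {
            // Stand-in for the shared pipeline between the two application types.
            BlockingQueue<String> jobs = new LinkedBlockingQueue<>();

            // Job generator: publishes work without knowing which runner will pick it up.
            Thread generator = new Thread(() -> {
                for (int i = 0; i < 3; i++) {
                    jobs.add("job-" + i);
                }
            });

            // Job runner: consumes work at its own pace; more runners (or new kinds of
            // consumers) can be added without changing the generator.
            Thread runner = new Thread(() -> {
                try {
                    for (int i = 0; i < 3; i++) {
                        System.out.println("Running " + jobs.take());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            generator.start();
            runner.start();
            generator.join();
            runner.join();
        }
    }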