Free Access Google.Professional-Machine-Learning-Engineer.v2025-05-29.q260 Practice Test (Page 40)

Question 191

Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?

A.Create alerts to monitor for skew, and retrain the model.

B.Perform feature selection on the model, and retrain the model with fewer features.

C.Retrain the model, and select an L2 regularization parameter with a hyperparameter tuning service.

D.Perform feature selection on the model, and retrain the model on a monthly basis with fewer features.

Correct Answer: A

The performance of a DNN regression model can degrade over time when the distribution of the input data changes. This phenomenon is known as data drift. It is distinct from concept drift, in which the relationship between the inputs and the target changes. Data drift can be caused by factors such as seasonal changes, population shifts, market trends, or external events [1].

To address the input differences in production, create alerts to monitor for skew, and retrain the model. Skew measures how much the input data in production differs from the input data used to train the model. It can be detected by comparing statistics and distributions of the input features in the training and production data, such as the mean, standard deviation, histogram, or quantiles. Alerts can notify the model developers or operators when the skew exceeds a chosen threshold, which indicates a significant change in the input data [2].

When an alert is triggered, the model should be retrained on recent data that reflects the current distribution of the input features, so that it adapts to the new data and recovers its performance. Retraining can be done manually or automatically, depending on the frequency and severity of the drift, and can also involve updating the model architecture, hyperparameters, or optimization algorithm if necessary [3].

The other options are less effective. Option B (feature selection and retraining with fewer features) may reduce the expressiveness of the model and discard features that affect the output, and it does nothing to detect future drift. Option C (tuning an L2 regularization parameter) is not relevant, because L2 regularization is a technique to prevent overfitting, not to handle data drift. Option D (monthly retraining with fewer features) is not optimal, because a fixed schedule may not track when the input data actually changes, and dropping features may compromise model performance.
References: 1: Data drift detection for machine learning models 2: Skew and drift detection 3: Retraining machine learning models
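
The skew check described in the explanation above can be sketched in a few lines. This is a minimal illustration in plain Python, not the Vertex AI Model Monitoring API: the function names, the 10-bin histogram, the choice of Jensen-Shannon divergence as the distance, and the 0.1 alert threshold are all assumptions made for the example.

```python
import math
import random


def histogram(values, edges):
    """Normalised bin frequencies; out-of-range values are clamped to the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def skew_score(train_values, serving_values, bins=10):
    """Compare one numeric feature's training and serving distributions."""
    lo, hi = min(train_values), max(train_values)
    # Bin edges come from the training data, which acts as the baseline.
    edges = [lo + i * (hi - lo) / bins for i in range(bins + 1)]
    return js_divergence(histogram(train_values, edges),
                         histogram(serving_values, edges))


def should_alert(train_values, serving_values, threshold=0.1):
    """Hypothetical alert rule: fire when the skew score exceeds the threshold."""
    return skew_score(train_values, serving_values) > threshold


# Synthetic data: one serving sample matches training, the other has shifted.
random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(5000)]
serving_same = [random.gauss(0.0, 1.0) for _ in range(5000)]
serving_shifted = [random.gauss(2.0, 1.0) for _ in range(5000)]

print("unchanged input alerts:", should_alert(train, serving_same))
print("shifted input alerts:  ", should_alert(train, serving_shifted))
```

In practice the alert would trigger a retraining run on recent data rather than a print statement, and the threshold would be tuned per feature.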

Question 192

You have created multiple versions of an ML model and have imported them to Vertex AI Model Registry. You want to perform A/B testing to identify the best-performing model using the simplest approach. What should you do?

A.Split incoming traffic among separate Cloud Run instances of deployed models. Monitor the performance of each version using Cloud Monitoring.

B.Split incoming traffic to distribute prediction requests among the versions. Monitor the performance of each version using Looker Studio dashboards that compare logged data for each version.

C.Split incoming traffic among Google Kubernetes Engine (GKE) clusters and use Traffic Director to distribute prediction requests to different versions. Monitor the performance of each version using Cloud Monitoring.

D.Split incoming traffic to distribute prediction requests among the versions. Monitor the performance of each version using Vertex AI's built-in monitoring tools.

Question 193

You need to train a natural language model to perform text classification on product descriptions that contain millions of examples and 100,000 unique words. You want to preprocess the words individually so that they can be fed into a recurrent neural network. What should you do?

A.Create a one-hot encoding of words, and feed the encodings into your model.

B.Identify word embeddings from a pre-trained model, and use the embeddings in your model.

C.Sort the words by frequency of occurrence, and use the frequencies as the encodings in your model.

D.Assign a numerical value to each word from 1 to 100,000 and feed the values as inputs in your model.

Correct Answer: B

* Option A is incorrect because one-hot encoding the words and feeding the encodings into your model is not an efficient way to preprocess words individually for a natural language model. One-hot encoding represents categorical variables as binary vectors, where each element corresponds to a category, exactly one element is 1, and the rest are 0 [1]. This method scales poorly to a large vocabulary: with 100,000 unique words, every word becomes a 100,000-dimensional sparse vector, which requires a lot of memory and computation, and the encoding does not capture the semantic similarity or relationship between words [2].
* Option B is correct because identifying word embeddings from a pre-trained model and using them in your model is a good way to preprocess words individually for a natural language model. Word embeddings are low-dimensional, dense vectors that represent the meaning and usage of words in a continuous space [3]. They can be learned from a large corpus of text using neural networks, such as word2vec, GloVe, or BERT [4]. Using pre-trained word embeddings can save time and resources and can improve the performance of the model, especially when the training data is limited or noisy [5].
* Option C is incorrect because sorting the words by frequency of occurrence and using the frequencies as the encodings is not a meaningful way to preprocess words individually for a natural language model. This method implies that the frequency of a word is a good indicator of its importance or relevance, which may not be true. For example, the word "the" is very frequent but not very informative, while the word "unicorn" is rare but more distinctive. Moreover, this method does not capture the semantic similarity or relationship between words, and may introduce noise or bias into the model.
* Option D is incorrect because assigning a numerical value from 1 to 100,000 to each word and feeding the values as inputs is not a valid way to preprocess words individually for a natural language model. This method implies an ordinal relationship between the words, which may not be true. For example, assigning the values 1, 2, and 3 to the words "apple", "banana", and "orange" does not make sense, as there is no inherent order among these fruits. Moreover, this method does not capture the semantic similarity or relationship between words, and may confuse the model with irrelevant or misleading information.
References:
* One-hot encoding
* Word embeddings
* Word embedding
* Pre-trained word embeddings
* Using pre-trained word embeddings in a Keras model
* Term frequency
* Term frequency-inverse document frequency
* Ordinal variable
* Encoding categorical features
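
The difference between the encodings discussed above can be made concrete with a toy example. This is a sketch under stated assumptions: the four-word vocabulary and the three-dimensional vectors in `PRETRAINED` are made-up stand-ins for a real pre-trained embedding table (which would typically be loaded from a model such as word2vec or GloVe), and all names are hypothetical.

```python
import math

VOCAB = ["apple", "banana", "orange", "unicorn"]
WORD_TO_ID = {word: i for i, word in enumerate(VOCAB)}

# Made-up vectors standing in for a pre-trained embedding table (assumption).
PRETRAINED = {
    "apple": [0.90, 0.10, 0.00],
    "banana": [0.80, 0.20, 0.10],
    "orange": [0.85, 0.15, 0.05],
    "unicorn": [0.00, 0.10, 0.90],
}


def one_hot(word):
    """Option A: a sparse vector as long as the vocabulary."""
    vec = [0] * len(VOCAB)
    vec[WORD_TO_ID[word]] = 1
    return vec


def integer_id(word):
    """Option D: an arbitrary integer that falsely implies an order."""
    return WORD_TO_ID[word] + 1


def embed(word):
    """Option B: a dense, low-dimensional vector looked up from the table."""
    return PRETRAINED[word]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# One-hot vectors of different words are always orthogonal: no similarity signal.
print("one-hot apple~banana:  ", cosine(one_hot("apple"), one_hot("banana")))
# Embeddings place related words closer together than unrelated ones.
print("embed   apple~banana:  ", round(cosine(embed("apple"), embed("banana")), 3))
print("embed   apple~unicorn: ", round(cosine(embed("apple"), embed("unicorn")), 3))
# Vector length grows with vocabulary size for one-hot, but not for embeddings.
print("one-hot length:", len(one_hot("apple")), "| embedding length:", len(embed("apple")))
```

With the 100,000-word vocabulary from the question, each one-hot vector would have 100,000 elements, while the embedding length stays fixed at whatever dimension the pre-trained model uses.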

Question 194

You have a custom job that runs on Vertex AI on a weekly basis. The job is implemented using a proprietary ML workflow that produces the datasets, models, and custom artifacts, and sends them to a Cloud Storage bucket. Many different versions of the datasets and models were created. Due to compliance requirements, your company needs to track which model was used for making a particular prediction, and needs access to the artifacts for each model. How should you configure your workflows to meet these requirements?

A.Use the Vertex AI Metadata API inside the custom job to create context, execution, and artifacts for each model, and use events to link them together.

B.Create a Vertex AI experiment, and enable autologging inside the custom job.

C.Register each model in Vertex AI Model Registry, and use model labels to store the related dataset and model information.

D.Configure a TensorFlow Extended (TFX) ML Metadata database, and use the ML Metadata API.

Question 195

You are building a predictive maintenance model to preemptively detect part defects in bridges. You plan to use high definition images of the bridges as model inputs. You need to explain the output of the model to the relevant stakeholders so they can take appropriate action. How should you build the model?

A.Use scikit-learn to build a tree-based model, and use partial dependence plots (PDP) to explain the model output.

B.Use scikit-learn to build a tree-based model, and use SHAP values to explain the model output.

C.Use TensorFlow to create a deep learning-based model and use Integrated Gradients to explain the model output.

D.Use TensorFlow to create a deep learning-based model and use the sampled Shapley method to explain the model output.
