Free Access Snowflake.DSA-C03.v2025-10-01.q105 Practice Test (Page 14)

Question 61

You are tasked with fine-tuning a Snowflake Cortex LLM using your own labeled dataset to improve its performance on a specific sentiment analysis task related to customer reviews. You have already created a Snowflake stage 'my_stage' and uploaded your labeled data in CSV format to this stage. The labeled data contains two columns: 'review_text' and 'sentiment' (values: 'positive', 'negative', 'neutral'). Which of the following SQL commands, or sequences of commands, is MOST appropriate to initiate the fine-tuning process using the 'SNOWFLAKE.ML.FINETUNE_LLM' function? Assume you have already set the necessary permissions for your role to access the model and stage.

A.Option A

B.Option B

C.Option C

D.Option D

E.Option E

Question 62

A data scientist is building a linear regression model in Snowflake to predict customer churn based on structured data stored in a table named 'CUSTOMER_DATA'. The table includes features like 'CUSTOMER_ID', 'AGE', 'TENURE_MONTHS', 'NUM_PRODUCTS', and 'AVG_MONTHLY_SPEND'. The target variable is 'CHURNED' (1 for churned, 0 for active). After building the model, the data scientist wants to evaluate its performance using Mean Squared Error (MSE) on a held-out test set. Which of the following SQL queries, executed within Snowflake's stored procedure framework, is the MOST efficient and accurate way to calculate the MSE for the linear regression model predictions against the actual 'CHURNED' values in the 'CUSTOMER_DATA_TEST' table, assuming the linear regression model is named 'churn_model' and the predicted values are generated by the MODEL_APPLY() function?
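For reference, MSE is the average of the squared differences between actual and predicted values. The sketch below shows that arithmetic in plain Python; it is not the SQL the question asks for, and the sample values and the function name `mean_squared_error` are made up for illustration.

```python
def mean_squared_error(actual, predicted):
    """Return the mean of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out rows: actual CHURNED labels and model predictions.
actual = [1, 0, 0, 1]
predicted = [0.5, 0.0, 0.5, 1.0]

# Squared errors are 0.25, 0, 0.25, 0, so the mean is 0.125.
mse = mean_squared_error(actual, predicted)
print(mse)
```

In SQL the same quantity would typically be an average over a squared difference of two columns, computed in a single aggregate pass over the test table.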

Question 63

A data scientist is tasked with predicting house prices using Snowflake. They have a dataset stored in a Snowflake table called 'HOUSE_PRICES' with columns such as 'SQUARE_FOOTAGE', 'NUM_BEDROOMS', 'LOCATION_ID', and 'PRICE'. They choose a Random Forest Regressor model. Which of the following steps is MOST important to prevent overfitting and ensure good generalization performance on unseen data, and how can this be effectively implemented within a Snowflake-centric workflow?

A.Increase the number of estimators (trees) in the Random Forest to the maximum possible value to capture all potential patterns, without cross validation.

B.Tune the hyperparameters of the Random Forest model (e.g., 'max_depth', 'n_estimators') using cross-validation. You can achieve this by splitting the 'HOUSE_PRICES' table into training and validation sets using Snowflake's 'QUALIFY' clause or temporary tables, then training and evaluating the model within a loop or stored procedure.

C.Train the Random Forest model on the entire 'HOUSE_PRICES' table without splitting into training and validation sets, as this will provide the model with the most data.

D.Randomly select a small subset of the features (e.g., only use 'SQUARE_FOOTAGE' and 'NUM_BEDROOMS') to simplify the model and prevent overfitting.

E.Eliminate outliers without understanding the data properly to reduce noise.
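The cross-validation mentioned in the options can be sketched as follows. This is a minimal, library-free illustration of k-fold splitting over row indices, not Snowflake or scikit-learn code; `k_fold_indices`, `cross_validate`, and the `fit_and_score` callback are hypothetical names, and a real workflow would train and score an actual model inside the callback.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds covering all rows
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


def cross_validate(n_rows, k, fit_and_score):
    """Average a caller-supplied validation score over the k folds."""
    scores = [fit_and_score(train, valid) for train, valid in k_fold_indices(n_rows, k)]
    return sum(scores) / len(scores)


# Placeholder scorer: a real one would fit on `train` rows and return
# a validation metric computed on `valid` rows.
average = cross_validate(10, 5, lambda train, valid: len(valid))
print(average)
```

Each row lands in exactly one validation fold, so every hyperparameter setting is scored on data the model did not see during fitting.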

Question 64

You've deployed a fraud detection model in Snowflake using Snowpark. You are monitoring its performance and notice a significant decrease in recall, while precision remains high. This means the model is missing many fraudulent transactions. The training data was initially balanced, but you suspect that recent changes in user behavior have skewed the distribution of fraudulent vs. non-fraudulent transactions in production. Which of the following actions are MOST appropriate to address this issue and improve the model's performance, considering best practices for model retraining within the Snowflake ecosystem?

A.Retrain the model using the original training data. Since the precision is high, the model's fundamental logic is still sound. A larger training dataset isn't necessary.

B.Retrain the model using a dataset that includes recent production data, being sure to re-balance the dataset to maintain a roughly equal number of fraudulent and non-fraudulent transactions. Prioritize transactions from the last month.

C.Adjust the model's classification threshold to be more sensitive, even if it means accepting a slightly lower precision. This can be done directly within Snowflake using a SQL UDF that transforms the model's output probabilities.

D.Implement a data drift monitoring system in Snowflake to automatically detect changes in the input features of the model. Trigger an automated retraining pipeline when significant drift is detected. This retraining should include recent production data with updated labels, but only if label data collection can be automated.

E.Immediately shut down the model to prevent further inaccurate classifications. Investigate why the recall is low before any retraining is performed.
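The trade-off between precision and recall that the scenario describes can be seen with a small sketch. The labels and scores below are invented for illustration, and `precision_recall` is a hypothetical helper, not part of Snowflake or Snowpark.

```python
def precision_recall(labels, scores, threshold):
    """Compute (precision, recall) when scores >= threshold are flagged as fraud."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up example: 1 = fraudulent, 0 = legitimate.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]

strict = precision_recall(labels, scores, 0.7)   # high precision, low recall
lenient = precision_recall(labels, scores, 0.4)  # lower precision, full recall
print(strict, lenient)
```

With the stricter threshold only one of three fraudulent rows is caught; lowering the threshold catches all three at the cost of one false positive.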

Question 65

You have deployed a regression model in Snowflake as an external function using AWS Lambda. The external function takes several numerical features as input and returns a predicted value. You want to continuously monitor the model's performance in production and automatically retrain it when the performance degrades below a predefined threshold. Which of the following methods represent VALID approaches for calculating and monitoring model performance within the Snowflake environment and triggering the retraining process?

A.Create a Snowflake Task that periodically executes a SQL query to calculate performance metrics (e.g., RMSE) by comparing predicted values from the external function with actual values stored in a separate table. Trigger a Python UDF, deployed as a Snowflake stored procedure, to initiate retraining if the RMSE exceeds the threshold.

B.Implement custom logging within the AWS Lambda function to capture prediction results and actual values. Configure AWS CloudWatch to monitor these logs and trigger an AWS Step Function that initiates a new training job and updates the Snowflake external function with the new model endpoint upon completion.

C.Utilize Snowflake's Alerting feature, setting an alert rule based on the output of a SQL query that calculates performance metrics. Configure the alert action to invoke a webhook that triggers a retraining pipeline.

D.Build a Snowpark Python application deployed on Snowflake which periodically polls the external function's performance by querying the function with a sample data set and comparing results to ground truth stored in Snowflake. Initiate retraining directly from the Snowpark application if performance degrades.

E.Create a view that joins the input features with the predicted output and the actual result. Configure model monitoring within AWS SageMaker to perform continuous validation of the model.
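Whichever mechanism schedules the check, the core logic is a metric compared against a threshold. A minimal sketch, assuming RMSE as the metric; `rmse`, `needs_retraining`, and the sample values are hypothetical and stand in for whatever a task, alert, or external pipeline would actually run.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def needs_retraining(actual, predicted, threshold):
    """Return True when the error exceeds the allowed threshold."""
    return rmse(actual, predicted) > threshold


# Made-up ground truth and predictions; every error is 3, so RMSE is 3.0.
actual = [10.0, 20.0, 30.0]
predicted = [13.0, 17.0, 33.0]

print(needs_retraining(actual, predicted, threshold=2.5))
print(needs_retraining(actual, predicted, threshold=3.5))
```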
