  • Question 56

    You work for an online travel agency that also sells advertising placements on its website to other companies.
    You have been asked to predict the most relevant web banner that a user should see next. Security is important to your company. The model latency requirement is 300ms@p99, the inventory contains thousands of web banners, and your exploratory analysis has shown that navigation context is a good predictor.
    You want to implement the simplest solution. How should you configure the prediction pipeline?
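
    The question gives no answer, but a minimal sketch of one plausible "simplest" setup is shown below: the banner-relevance model is served behind a managed online-prediction endpoint, and the navigation context is passed as features at request time. The project name, endpoint ID, and feature schema are hypothetical placeholders.

    ```python
    # Minimal sketch: call a deployed online-prediction endpoint with the
    # user's navigation context. Project, endpoint ID, and feature names
    # are hypothetical placeholders.
    from google.cloud import aiplatform

    aiplatform.init(project="travel-agency-prod", location="us-central1")

    endpoint = aiplatform.Endpoint("1234567890")  # hypothetical endpoint ID

    # Navigation-context features for the current session (hypothetical schema).
    instance = {
        "last_page": "/hotels/paris",
        "session_depth": 7,
        "referrer": "search",
    }

    response = endpoint.predict(instances=[instance])
    print(response.predictions[0])  # e.g. relevance scores over the banner inventory
    ```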
  • Question 57

    Your team is building an application for a global bank that will be used by millions of customers. You built a forecasting model that predicts customers' account balances 3 days in the future. Your team will use the results in a new feature that will notify users when their account balance is likely to drop below $25. How should you serve your predictions?
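
    The serving pattern is left open by the question; one common approach for a feature like this is to score all accounts in a scheduled batch job and flag the customers whose predicted 3-day balance falls below the $25 threshold. The sketch below assumes hypothetical column names and a downstream notification step.

    ```python
    # Minimal sketch: filter a batch-prediction output down to the accounts
    # that should trigger a notification. Column names are hypothetical.
    import pandas as pd

    BALANCE_THRESHOLD = 25.00  # notify when the predicted balance drops below $25


    def flag_at_risk_accounts(predictions: pd.DataFrame) -> pd.DataFrame:
        """Return the rows whose predicted 3-day balance is below the threshold."""
        return predictions.loc[predictions["predicted_balance_3d"] < BALANCE_THRESHOLD]


    # Example: a daily batch job would load the latest scored table, then hand
    # the flagged rows to the notification service.
    scored = pd.DataFrame(
        {
            "customer_id": [101, 102, 103],
            "predicted_balance_3d": [512.40, 18.75, 24.99],
        }
    )
    print(flag_at_risk_accounts(scored))  # customers 102 and 103 get notified
    ```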
  • Question 58

    A Data Scientist needs to migrate an existing on-premises ETL process to the cloud. The current process runs at regular time intervals and uses PySpark to combine and format multiple large data sources into a single consolidated output for downstream processing.
    The Data Scientist has been given the following requirements for the cloud solution:
    * Combine multiple data sources.
    * Reuse existing PySpark logic.
    * Run the solution on the existing schedule.
    * Minimize the number of servers that will need to be managed.
    Which architecture should the Data Scientist use to build this solution?
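
    A serverless Spark service such as AWS Glue fits these requirements (PySpark reuse, scheduled triggers, no servers to manage). Below is a minimal sketch of what the existing PySpark logic could look like as a Glue job script; the database, table, and S3 path names are hypothetical.

    ```python
    # Minimal sketch of a PySpark script running as an AWS Glue job, which
    # executes Spark serverlessly and can be triggered on a schedule.
    # Database, table, and S3 path names are hypothetical.
    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the (hypothetical) source tables from the Glue Data Catalog.
    orders = glue_context.create_dynamic_frame.from_catalog(
        database="etl_db", table_name="orders"
    ).toDF()
    customers = glue_context.create_dynamic_frame.from_catalog(
        database="etl_db", table_name="customers"
    ).toDF()

    # Existing PySpark logic can be reused as-is on the resulting DataFrames.
    consolidated = orders.join(customers, on="customer_id", how="inner")

    # Write a single consolidated output for downstream processing.
    consolidated.write.mode("overwrite").parquet("s3://my-bucket/consolidated/")

    job.commit()
    ```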
  • Question 59

    A Machine Learning Specialist trained a logistic regression model using scikit-learn on a local machine and now wants to deploy it to production for inference only.
    What steps should be taken to ensure Amazon SageMaker can host a model that was trained locally?
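
    A minimal sketch of the usual path follows: serialize the estimator, package it as model.tar.gz (the archive layout SageMaker expects), upload it to S3, and host it with the prebuilt scikit-learn serving container. The bucket, IAM role, and entry-point names are hypothetical.

    ```python
    # Minimal sketch: package a locally trained scikit-learn model so
    # Amazon SageMaker can host it. Bucket, role, and entry-point names
    # are hypothetical.
    import tarfile

    import joblib
    import numpy as np
    from sagemaker.sklearn.model import SKLearnModel
    from sklearn.linear_model import LogisticRegression

    # Stand-in for the model that was trained locally.
    trained_model = LogisticRegression().fit(np.array([[0.0], [1.0]]), [0, 1])

    # 1. Serialize the estimator and package it as model.tar.gz.
    joblib.dump(trained_model, "model.joblib")
    with tarfile.open("model.tar.gz", "w:gz") as tar:
        tar.add("model.joblib")

    # 2. After uploading model.tar.gz to S3, register it with the prebuilt
    #    scikit-learn serving container and deploy an inference endpoint.
    sk_model = SKLearnModel(
        model_data="s3://my-bucket/models/model.tar.gz",      # hypothetical S3 URI
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # hypothetical role
        entry_point="inference.py",  # provides model_fn() to load model.joblib
        framework_version="1.2-1",
    )
    predictor = sk_model.deploy(initial_instance_count=1, instance_type="ml.m5.large")
    ```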
  • Question 60

    You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?
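
    One lever that requires no infrastructure change, sketched below, is enabling request batching in TensorFlow Serving inside the existing CPU-only pods; batching generally improves throughput on CPUs under high query rates. The flags are standard tensorflow_model_server options, while the model name, paths, and batching parameter values are hypothetical and would need tuning against the latency target.

    ```python
    # Minimal sketch: launch TensorFlow Serving with request batching enabled
    # inside the existing CPU-only pods. Model name, paths, and batching
    # values are hypothetical.
    import subprocess

    # Batching parameters in TensorFlow Serving's text protobuf format
    # (hypothetical values; tune max_batch_size and batch_timeout_micros
    # against the observed p99 latency).
    BATCHING_CONFIG = """\
    max_batch_size { value: 32 }
    batch_timeout_micros { value: 5000 }
    num_batch_threads { value: 8 }
    """

    with open("/tmp/batching.conf", "w") as f:
        f.write(BATCHING_CONFIG)

    subprocess.run([
        "tensorflow_model_server",
        "--rest_api_port=8501",
        "--model_name=my_model",               # hypothetical model name
        "--model_base_path=/models/my_model",  # hypothetical path
        "--enable_batching=true",
        "--batching_parameters_file=/tmp/batching.conf",
    ])
    ```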