Question 47 - Professional Machine Learning Engineer discussion

You work for a public transportation company and need to build a model to estimate delay times for multiple transportation routes. Predictions are served directly to users in an app in real time. Because different seasons and population increases impact the data relevance, you will retrain the model every month. You want to follow Google-recommended best practices. How should you configure the end-to-end architecture of the predictive model?

A. Configure Kubeflow Pipelines to schedule your multi-step workflow from training to deploying your model.
B. Use a model trained and deployed on BigQuery ML and trigger retraining with the scheduled query feature in BigQuery.
C. Write a Cloud Functions script that launches a training and deployment job on AI Platform and is triggered by Cloud Scheduler.
D. Use Cloud Composer to programmatically schedule a Dataflow job that executes the workflow from training to deploying your model.
Suggested answer: A

Explanation:

The end-to-end architecture of the predictive model for estimating delay times for multiple transportation routes should be configured using Kubeflow Pipelines. Kubeflow Pipelines is a platform for building and deploying scalable, portable, and reusable machine learning pipelines on Kubernetes. It lets you orchestrate a multi-step workflow covering data preparation, model training, model evaluation, model deployment, and model serving, and it provides a user interface for managing and tracking your pipeline runs, experiments, and artifacts.

Using Kubeflow Pipelines has several advantages for this use case:

Full automation: You can define your pipeline as a Python script that specifies the steps and dependencies of your workflow, and use the Kubeflow Pipelines SDK to compile and upload your pipeline to the Kubeflow Pipelines service. You can also use the Kubeflow Pipelines UI to create, run, and monitor your pipeline.
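As a rough sketch (the component names, container image, bucket paths, and pipeline name below are hypothetical, and the decorators assume the Kubeflow Pipelines v2 SDK), a monthly retraining pipeline could be defined and compiled like this:

```python
from kfp import dsl, compiler


@dsl.component(base_image="python:3.10")
def train_delay_model(training_data: str, model_dir: str) -> str:
    # Placeholder training step: a real component would read the latest
    # route/delay data from training_data and write a model to model_dir.
    return model_dir


@dsl.component(base_image="python:3.10")
def deploy_delay_model(model_dir: str) -> str:
    # Placeholder deployment step: a real component would push the trained
    # model to an online prediction endpoint used by the app.
    return "delay-endpoint"


@dsl.pipeline(name="transit-delay-monthly-retrain")
def delay_pipeline(training_data: str = "gs://example-bucket/delays/latest"):
    # Deployment only runs after training succeeds, because it consumes
    # the training step's output.
    train_task = train_delay_model(
        training_data=training_data,
        model_dir="gs://example-bucket/models/delay",
    )
    deploy_delay_model(model_dir=train_task.output)


# Compile the workflow into a package that can be uploaded to the
# Kubeflow Pipelines service (via the UI or the SDK client).
compiler.Compiler().compile(delay_pipeline, "delay_pipeline.yaml")
```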

Scalability: You can leverage the power of Kubernetes to scale your pipeline components horizontally and vertically, and use distributed training frameworks such as TensorFlow or PyTorch to train your model on multiple nodes or GPUs.
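For example (a hedged sketch reusing the hypothetical train_delay_model component above; the method names assume the Kubeflow Pipelines v2 SDK and the resource values are illustrative), an individual training step can request more CPU, memory, or an accelerator from the cluster:

```python
@dsl.pipeline(name="transit-delay-monthly-retrain-gpu")
def delay_pipeline_gpu(training_data: str):
    train_task = train_delay_model(
        training_data=training_data,
        model_dir="gs://example-bucket/models/delay",
    )
    # Request a larger node and one GPU for the training step only; other
    # steps in the pipeline keep their default resources.
    train_task.set_cpu_limit("8")
    train_task.set_memory_limit("32G")
    train_task.set_accelerator_type("NVIDIA_TESLA_T4")
    train_task.set_accelerator_limit(1)
```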

Portability: You can package your pipeline components as Docker containers that can run on any Kubernetes cluster, and use the Kubeflow Pipelines SDK to export and import your pipeline packages across different environments.

Reusability: You can reuse your pipeline components across different pipelines, and share your components with other users through the Kubeflow Pipelines Component Store. You can also use pre-built components from the Kubeflow Pipelines library or other sources.

Schedulability: You can use the Kubeflow Pipelines UI or the Kubeflow Pipelines SDK to schedule recurring pipeline runs based on cron expressions or intervals. For example, you can schedule your pipeline to run every month to retrain your model on the latest data.
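A hedged sketch of that scheduling step using the SDK client (the host URL and experiment ID are placeholders, and note that Kubeflow Pipelines cron expressions include a leading seconds field):

```python
import kfp

# Connect to the Kubeflow Pipelines API server (placeholder endpoint).
client = kfp.Client(host="https://<your-kfp-endpoint>")

# Re-run the compiled pipeline at 00:00 on the first day of every month.
client.create_recurring_run(
    experiment_id="<your-experiment-id>",
    job_name="monthly-delay-model-retrain",
    cron_expression="0 0 0 1 * *",
    pipeline_package_path="delay_pipeline.yaml",
)
```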

The other options are not as suitable for this use case. Using a model trained and deployed on BigQuery ML is not recommended, as BigQuery ML is mainly designed for simple, SQL-based machine learning on large-scale data; it does not support complex models or custom code, and serving low-latency predictions to an app would require exporting the model to a separate serving stack. Writing a Cloud Functions script that launches a training and deployment job on AI Platform is not ideal, as Cloud Functions has limits on memory, CPU, and execution time, and a script provides no interface for managing and tracking the multi-step workflow. Using Cloud Composer to programmatically schedule a Dataflow job that executes the workflow from training to deploying your model is not optimal, as Dataflow is designed for data processing and streaming analytics, not for model training, serving, or monitoring.
