Question 126 - Professional Machine Learning Engineer discussion


You have been asked to productionize a proof-of-concept ML model built using Keras. The model was trained in a Jupyter notebook on a data scientist's local machine. The notebook contains a cell that performs data validation and a cell that performs model analysis. You need to orchestrate the steps contained in the notebook and automate the execution of these steps for weekly retraining. You expect much more training data in the future. You want your solution to take advantage of managed services while minimizing cost. What should you do?

A. Move the Jupyter notebook to a Notebooks instance on the largest N2 machine type, and schedule the execution of the steps in the Notebooks instance using Cloud Scheduler.

B. Write the code as a TensorFlow Extended (TFX) pipeline orchestrated with Vertex AI Pipelines. Use standard TFX components for data validation and model analysis, and use Vertex AI Pipelines for model retraining.

C. Rewrite the steps in the Jupyter notebook as an Apache Spark job, and schedule the execution of the job on ephemeral Dataproc clusters using Cloud Scheduler.

D. Extract the steps contained in the Jupyter notebook as Python scripts, wrap each script in an Apache Airflow BashOperator, and run the resulting directed acyclic graph (DAG) in Cloud Composer.
Suggested answer: B

Explanation:

The best option for productionizing a Keras model is TensorFlow Extended (TFX), a framework for building end-to-end machine learning pipelines that can handle large-scale data and complex workflows. TFX provides standard components for data ingestion, transformation, validation, analysis, training, tuning, and serving. TFX pipelines can be orchestrated with Vertex AI Pipelines, a managed, serverless service on Google Cloud that executes pipelines compiled to the Kubeflow Pipelines spec, so there is no cluster to provision or maintain. Vertex AI Pipelines lets you automate the execution of the pipeline steps, schedule weekly retraining runs, and scale resources up or down as the training data grows. By using TFX and Vertex AI Pipelines, you can take advantage of the following benefits:

You can reuse the existing code in your Jupyter notebook, because TFX supports Keras as a first-class citizen: the notebook's model code moves largely unchanged into the Trainer component's module file. You can also add the Tuner component, backed by KerasTuner, to optimize your model's hyperparameters.
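
For illustration, a minimal sketch of a Trainer module file, assuming a hypothetical build_keras_model() that stands in for the notebook's existing model code and an illustrative feature spec (the real spec would come from the notebook's data):

```python
# trainer.py -- a minimal sketch of a TFX Trainer module file.
# build_keras_model() and the 'features'/'label' keys are hypothetical
# stand-ins for the code already in the notebook.
import tensorflow as tf
from tfx import v1 as tfx


def build_keras_model() -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['binary_accuracy'])
    return model


def _input_fn(file_patterns, batch_size=64):
    # ExampleGen emits gzipped TFRecord files of tf.train.Example.
    feature_spec = {
        'features': tf.io.FixedLenFeature([4], tf.float32),
        'label': tf.io.FixedLenFeature([], tf.int64),
    }
    ds = tf.data.TFRecordDataset(
        tf.data.Dataset.list_files(file_patterns), compression_type='GZIP')
    ds = ds.map(lambda e: tf.io.parse_single_example(e, feature_spec))
    return ds.map(lambda d: (d['features'], d['label'])).batch(batch_size).repeat()


def run_fn(fn_args: tfx.components.FnArgs):
    # TFX calls run_fn with file locations and training parameters.
    model = build_keras_model()
    model.fit(
        _input_fn(fn_args.train_files),
        steps_per_epoch=fn_args.train_steps,
        validation_data=_input_fn(fn_args.eval_files),
        validation_steps=fn_args.eval_steps,
    )
    # Evaluator and Pusher pick up the SavedModel from this directory.
    model.save(fn_args.serving_model_dir, save_format='tf')
```

In the pipeline definition, the Trainer component simply points at this module file (example_gen is the ingestion component from the data-validation sketch below; step counts are illustrative):

```python
trainer = tfx.components.Trainer(
    module_file='trainer.py',
    examples=example_gen.outputs['examples'],
    train_args=tfx.proto.TrainArgs(num_steps=1000),
    eval_args=tfx.proto.EvalArgs(num_steps=200))
```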

You can ensure data quality and consistency with the standard data-validation components, StatisticsGen, SchemaGen, and ExampleValidator (backed by the TensorFlow Data Validation library): SchemaGen infers a schema that is enforced throughout the pipeline, and ExampleValidator detects anomalies, drift, and skew in incoming data.
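
A minimal sketch of that wiring, assuming the training data arrives as CSV files in a Cloud Storage directory (the path is a placeholder):

```python
from tfx import v1 as tfx

# Ingest the training data (the Cloud Storage path is a placeholder).
example_gen = tfx.components.CsvExampleGen(
    input_base='gs://my-bucket/training-data')

# Compute statistics over each new dataset, infer a schema once, and
# validate every future dataset against that schema.
statistics_gen = tfx.components.StatisticsGen(
    examples=example_gen.outputs['examples'])
schema_gen = tfx.components.SchemaGen(
    statistics=statistics_gen.outputs['statistics'])
example_validator = tfx.components.ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])
```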

You can analyze your model's performance and fairness with the Evaluator component (backed by TensorFlow Model Analysis), which produces a wide range of metrics and visualizations, optionally sliced by feature. The Evaluator also performs model validation: it compares the new model with a baseline, such as the last blessed model, and blesses the new model for deployment only if it meets the thresholds you set.
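
Continuing the sketch, an Evaluator that blesses the model only if it clears an absolute accuracy floor and does not regress against the previously blessed baseline (the label key, metric, and thresholds are illustrative):

```python
import tensorflow_model_analysis as tfma
from tfx import v1 as tfx

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='label')],
    slicing_specs=[tfma.SlicingSpec()],  # add per-feature slices as needed
    metrics_specs=[tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(
            class_name='BinaryAccuracy',
            threshold=tfma.MetricThreshold(
                # Absolute floor the new model must clear...
                value_threshold=tfma.GenericValueThreshold(
                    lower_bound={'value': 0.7}),
                # ...and no meaningful regression versus the baseline.
                change_threshold=tfma.GenericChangeThreshold(
                    direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                    absolute={'value': -1e-3}))),
    ])])

# A Resolver fetches the latest blessed model to act as the baseline.
model_resolver = tfx.dsl.Resolver(
    strategy_class=tfx.dsl.experimental.LatestBlessedModelStrategy,
    model=tfx.dsl.Channel(type=tfx.types.standard_artifacts.Model),
    model_blessing=tfx.dsl.Channel(
        type=tfx.types.standard_artifacts.ModelBlessing),
).with_id('latest_blessed_model_resolver')

evaluator = tfx.components.Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    baseline_model=model_resolver.outputs['model'],
    eval_config=eval_config)
```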

You can deploy your model to various serving platforms with the Pusher component, which pushes a blessed model to a destination such as a Cloud Storage serving directory, TensorFlow Serving, Vertex AI, or TensorFlow Lite. You can also use the Vertex AI Model Registry to manage the versions and metadata of your models.
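
A sketch of a Pusher that deploys only blessed models to a serving directory (the Cloud Storage path is a placeholder):

```python
from tfx import v1 as tfx

# Push only models the Evaluator has blessed.
pusher = tfx.components.Pusher(
    model=trainer.outputs['model'],
    model_blessing=evaluator.outputs['blessing'],
    push_destination=tfx.proto.PushDestination(
        filesystem=tfx.proto.PushDestination.Filesystem(
            base_directory='gs://my-bucket/serving-model')))
```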

You can monitor the model's health in production with Vertex AI Model Monitoring, which detects training-serving skew and prediction drift for deployed models, while the Evaluator component guards each weekly retraining run by validating the new model against the baseline, overall or on slices of the data.

You can reduce the cost and complexity of managing your own infrastructure, because Vertex AI Pipelines provides a serverless environment that provisions resources per run, so you pay only while pipeline steps execute. You can also use Vertex AI Experiments and Vertex AI TensorBoard to track and visualize your pipeline runs.
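
Finally, a sketch of assembling the components above, compiling the pipeline for Vertex AI Pipelines, and scheduling it weekly. The project, region, and paths are placeholders, and the create_schedule call assumes a recent google-cloud-aiplatform SDK; an alternative is triggering runs from Cloud Scheduler through a small Cloud Function:

```python
from tfx import v1 as tfx
from google.cloud import aiplatform

PIPELINE_NAME = 'keras-weekly-retraining'       # illustrative
PIPELINE_ROOT = 'gs://my-bucket/pipeline-root'  # illustrative

pipeline = tfx.dsl.Pipeline(
    pipeline_name=PIPELINE_NAME,
    pipeline_root=PIPELINE_ROOT,
    components=[example_gen, statistics_gen, schema_gen, example_validator,
                trainer, model_resolver, evaluator, pusher])

# Compile to the Kubeflow Pipelines v2 spec that Vertex AI Pipelines runs.
tfx.orchestration.experimental.KubeflowV2DagRunner(
    config=tfx.orchestration.experimental.KubeflowV2DagRunnerConfig(),
    output_filename='pipeline.json').run(pipeline)

aiplatform.init(project='my-project', location='us-central1')
job = aiplatform.PipelineJob(
    display_name=PIPELINE_NAME,
    template_path='pipeline.json',
    pipeline_root=PIPELINE_ROOT)

# job.submit() would run the pipeline once; create_schedule() retrains
# weekly, here every Monday at 03:00.
job.create_schedule(
    display_name=f'{PIPELINE_NAME}-schedule',
    cron='0 3 * * 1')
```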

[TensorFlow Extended (TFX)]

[Vertex AI Pipelines]

[TFX User Guide]

asked 18/09/2024
Nora Daniels