ExamGecko
Question 180 - Professional Machine Learning Engineer discussion

You have trained a model by using data that was preprocessed in a batch Dataflow pipeline. Your use case requires real-time inference. You want to ensure that the data preprocessing logic is applied consistently between training and serving. What should you do?

A. Perform data validation to ensure that the input data to the pipeline is the same format as the input data to the endpoint.
B. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Use the same code in the endpoint.
C. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Share this code with the end users of the endpoint.
D. Batch the real-time requests by using a time window and then use the Dataflow pipeline to preprocess the batched requests. Send the preprocessed requests to the endpoint.
Suggested answer: B

Explanation:

According to the official exam guide [1], one of the skills assessed in the exam is to "design, build, and productionalize ML models to solve business challenges using Google Cloud technologies". Dataflow [2] is a fully managed Google Cloud service for running Apache Beam pipelines, and it supports both batch and streaming data processing.

Because the use case requires real-time inference, the preprocessing applied to each online request must match the preprocessing that was applied to the training data. One way to achieve this is to refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline, and then use the same code in the endpoint. With a single implementation shared by both paths, you avoid the training-serving skew that can arise when training and serving use different preprocessing logic. Therefore, option B is the best way to ensure that the preprocessing logic is applied consistently.

The other options fall short. Option A validates only the input format and does not apply the transformations themselves. Option C leaves preprocessing to the end users, so consistency cannot be enforced at the endpoint. Option D waits for a time window to fill before preprocessing, which adds latency and conflicts with the real-time requirement.

Reference:

1. Professional ML Engineer Exam Guide

2. Dataflow

3. Google Professional Machine Learning Certification Exam 2023

4. Latest Google Professional Machine Learning Engineer Actual Free Exam Questions
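
The approach in option B can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the feature names, the scaling constants, and the helpers `preprocess`, `batch_preprocess`, and `handle_request` are hypothetical and do not come from the question. The point is only that the transformation lives in one plain function that does not depend on the pipeline or the server, so both paths call the same code.

```python
import math
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical constants and feature names, chosen for illustration only.
AMOUNT_MEAN = 50.0
AMOUNT_STD = 25.0


def preprocess(record: Dict[str, Any]) -> Dict[str, float]:
    """Pure transformation with no pipeline or server dependencies."""
    amount = float(record["amount"])
    return {
        "amount_scaled": (amount - AMOUNT_MEAN) / AMOUNT_STD,
        "log_amount": math.log1p(max(amount, 0.0)),
        "is_weekend": 1.0 if record["day_of_week"] in ("Sat", "Sun") else 0.0,
    }


def batch_preprocess(records: Iterable[Dict[str, Any]]) -> List[Dict[str, float]]:
    """Stand-in for the training pipeline step.

    In an Apache Beam pipeline, the same function could be applied with
    beam.Map(preprocess) instead of this list comprehension.
    """
    return [preprocess(r) for r in records]


def handle_request(
    instance: Dict[str, Any],
    model: Callable[[Dict[str, float]], float],
) -> float:
    """Stand-in for the online endpoint: same preprocess, then predict."""
    return model(preprocess(instance))
```

Because `preprocess` is the only place where features are computed, a change to the transformation reaches training and serving together, provided both are redeployed from the same version of the code.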

asked 18/09/2024
Panayiotis Markatos