
Google Professional Machine Learning Engineer Practice Test - Questions Answers, Page 24

You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?

A. Create a Vertex AI Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables.

B. Create a Vertex AI Workbench managed notebook to browse and query the tables directly from the JupyterLab interface.

C. Create a Vertex AI Workbench user-managed notebook on a Dataproc Hub, and use the %%bigquery magic commands in Jupyter to query the tables.

D. Create a Vertex AI Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables.
Suggested answer: A

Explanation:

Cost-effectiveness: User-managed notebooks in Vertex AI Workbench allow you to leverage pre-configured virtual machines with reasonable resource allocation, keeping costs lower compared to options involving managed notebooks or Dataproc clusters.

Development flexibility: User-managed notebooks offer full control over the environment, allowing you to install additional libraries or dependencies needed for your specific EDA, preprocessing, and model training tasks. This flexibility is crucial while experimenting with different algorithms.
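To illustrate why this option keeps development effort low: the BigQuery cell magic ships with the google-cloud-bigquery library available in Workbench notebooks (load it with %load_ext google.cloud.bigquery if it is not already active), so EDA can start with a single cell. A minimal sketch follows; the project, table, and column names are hypothetical:

```python
%%bigquery transactions_df
-- Runs in a Workbench Jupyter cell; the result lands in the pandas
-- DataFrame `transactions_df`. Project, dataset, and column names
-- below are placeholders for this sketch.
SELECT customer_id, item_id, price, purchase_date
FROM `my-project.retail.transactions`
WHERE purchase_date >= '2024-01-01'
```

Because the result arrives as an in-memory pandas DataFrame, the preprocessing and model-experimentation loop stays inside the same notebook kernel with no extra infrastructure to provision or pay for.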

You recently deployed a model to a Vertex AI endpoint and set up online serving in Vertex AI Feature Store. You have configured a daily batch ingestion job to update your featurestore. During the batch ingestion jobs, you discover that CPU utilization is high in your featurestore's online serving nodes and that feature retrieval latency is high. You need to improve online serving performance during the daily batch ingestion. What should you do?

A. Schedule an increase in the number of online serving nodes in your featurestore prior to the batch ingestion jobs.

B. Enable autoscaling of the online serving nodes in your featurestore.

C. Enable autoscaling for the prediction nodes of your DeployedModel in the Vertex AI endpoint.

D. Increase the worker count in the importFeatureValues request of your batch ingestion job.
Suggested answer: B

Explanation:

Vertex AI Feature Store provides two options for online serving: Bigtable and optimized online serving. Both options support autoscaling, which means that the number of online serving nodes can automatically adjust to the traffic demand. By enabling autoscaling, you can improve the online serving performance and reduce the feature retrieval latency during the daily batch ingestion. Autoscaling also helps you optimize the cost and resource utilization of your featurestore.

Reference:

Online serving | Vertex AI | Google Cloud

New Vertex AI Feature Store: BigQuery-Powered, GenAI-Ready | Google Cloud Blog
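As a concrete illustration of what enabling autoscaling involves, the sketch below swaps a fixed node count for an autoscaling range using the low-level aiplatform_v1 client. This is a minimal sketch under stated assumptions: the project, region, and featurestore names are hypothetical, and the node-count bounds should reflect your ingestion-time load.

```python
from google.cloud import aiplatform_v1

client = aiplatform_v1.FeaturestoreServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)

# Define an autoscaling range so serving capacity grows during batch
# ingestion and shrinks back afterwards.
featurestore = aiplatform_v1.Featurestore(
    name="projects/my-project/locations/us-central1/featurestores/my_store",
    online_serving_config=aiplatform_v1.Featurestore.OnlineServingConfig(
        scaling=aiplatform_v1.Featurestore.OnlineServingConfig.Scaling(
            min_node_count=2,
            max_node_count=10,
        )
    ),
)

# Only update the scaling settings, leaving the rest of the featurestore as-is.
operation = client.update_featurestore(
    featurestore=featurestore,
    update_mask={"paths": ["online_serving_config.scaling"]},
)
operation.result()  # wait for the long-running operation to finish
```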

You are developing a custom TensorFlow classification model based on tabular data. Your raw data is stored in BigQuery, contains hundreds of millions of rows, and includes both categorical and numerical features. You need to use a MaxMin scaler on some numerical features, and apply a one-hot encoding to some categorical features such as SKU names. Your model will be trained over multiple epochs. You want to minimize the effort and cost of your solution. What should you do?

A. 1. Write a SQL query to create a separate lookup table to scale the numerical features. 2. Deploy a TensorFlow-based model from Hugging Face to BigQuery to encode the text features. 3. Feed the resulting BigQuery view into Vertex AI Training.

B. 1. Use BigQuery to scale the numerical features. 2. Feed the features into Vertex AI Training. 3. Allow TensorFlow to perform the one-hot text encoding.

C. 1. Use TFX components with Dataflow to encode the text features and scale the numerical features. 2. Export the results to Cloud Storage as TFRecords. 3. Feed the data into Vertex AI Training.

D. 1. Write a SQL query to create a separate lookup table to scale the numerical features. 2. Perform the one-hot text encoding in BigQuery. 3. Feed the resulting BigQuery view into Vertex AI Training.
Suggested answer: C

Explanation:

TFX (TensorFlow Extended) is a platform for end-to-end machine learning pipelines. It provides components for data ingestion, preprocessing, validation, model training, serving, and monitoring. Dataflow is a fully managed service for scalable data processing. By using TFX components with Dataflow, you can perform feature engineering on large-scale tabular data in a distributed and efficient way. You can use the Transform component to apply the MaxMin scaler and the one-hot encoding to the numerical and categorical features, respectively. You can also use the ExampleGen component to read data from BigQuery and the Trainer component to train your TensorFlow model. The output of the Transform component is a TFRecord file, which is a binary format for storing TensorFlow data. You can export the TFRecord file to Cloud Storage and feed it into Vertex AI Training, which is a managed service for training custom machine learning models on Google Cloud.

Reference:

TFX | TensorFlow

Dataflow | Google Cloud

Vertex AI Training | Google Cloud
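To make the Transform step concrete, here is a minimal preprocessing_fn sketch of the kind the Transform component executes on Dataflow. The feature names and vocabulary size are hypothetical:

```python
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    """Transform-component callback; runs as a full-pass Dataflow job."""
    return {
        # Min-max scaling: tf.Transform computes the global min and max over
        # all rows, then maps the feature into [0, 1].
        "price_scaled": tft.scale_by_min_max(inputs["price"]),
        # Map SKU names to integer ids; the one-hot expansion can then be
        # done cheaply in the model's input layer.
        "sku_id": tft.compute_and_apply_vocabulary(
            inputs["sku_name"], top_k=20000
        ),
        "label": inputs["label"],
    }
```

The full-pass statistics (min, max, vocabulary) are computed once over the entire dataset, which is what makes this approach practical at hundreds of millions of rows.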

You are developing a custom image classification model in Python. You plan to run your training application on Vertex AI. Your input dataset contains several hundred thousand small images. You need to determine how to store and access the images for training. You want to maximize data throughput and minimize training time while reducing the amount of additional code. What should you do?

A. Store image files in Cloud Storage and access them directly.

B. Store image files in Cloud Storage and access them by using serialized records.

C. Store image files in Cloud Filestore, and access them by using serialized records.

D. Store image files in Cloud Filestore, and access them directly by using an NFS mount point.
Suggested answer: B

Explanation:

Cloud Storage is a scalable and cost-effective storage service for any type of data. By storing image files in Cloud Storage, you can access them from anywhere and avoid the overhead of managing your own storage infrastructure. However, accessing image files directly from Cloud Storage can be slow and inefficient, especially for large-scale training. A better option is to use serialized records, such as TFRecord or Apache Avro, which are binary formats that store multiple images and their labels in a single file. Serialized records can improve the data throughput and reduce the network latency, as well as enable data compression and sharding. You can use TensorFlow or Apache Beam APIs to create and read serialized records from Cloud Storage. This solution requires minimal code changes and can speed up your training time significantly.

Reference:

Cloud Storage | Google Cloud

TFRecord and tf.Example | TensorFlow Core

Apache Avro 1.10.2 Specification

Using Apache Beam with Cloud Storage | Cloud Storage
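A minimal sketch of the TFRecord approach is below: pack images into a shard on Cloud Storage, then read shards back with a parallel tf.data pipeline. The bucket, file paths, and labels are hypothetical:

```python
import tensorflow as tf

def to_example(image_bytes, label):
    """Wrap one image and its label in a tf.train.Example."""
    return tf.train.Example(features=tf.train.Features(feature={
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))

# Write many small images into one shard (placeholder file list).
with tf.io.TFRecordWriter("gs://my-bucket/train/shard-00000.tfrecord") as writer:
    for path, label in [("cat.jpg", 0), ("dog.jpg", 1)]:
        writer.write(to_example(tf.io.read_file(path).numpy(), label).SerializeToString())

feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

# Training input pipeline: read shards in parallel and prefetch to keep GPUs fed.
dataset = (
    tf.data.TFRecordDataset(
        tf.io.gfile.glob("gs://my-bucket/train/*.tfrecord"),
        num_parallel_reads=tf.data.AUTOTUNE)
    .map(lambda record: tf.io.parse_single_example(record, feature_spec),
         num_parallel_calls=tf.data.AUTOTUNE)
    .prefetch(tf.data.AUTOTUNE)
)
```

Reading a few large sequential shards instead of hundreds of thousands of small objects is what recovers the throughput lost to per-file request overhead.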

You are developing an image recognition model using PyTorch based on the ResNet50 architecture. Your code is working fine on your local laptop on a small subsample. Your full dataset has 200k labeled images. You want to quickly scale your training workload while minimizing cost. You plan to use 4 V100 GPUs. What should you do?

A. Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.

B. Configure a Compute Engine VM with all the dependencies that launches the training. Train your model with Vertex AI using a custom tier that contains the required GPUs.

C. Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.

D. Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.
Suggested answer: D

Explanation:

Vertex AI is a unified platform for building and managing machine learning solutions on Google Cloud. It provides a managed service for training custom models with various frameworks, such as TensorFlow, PyTorch, scikit-learn, and XGBoost. To train your PyTorch model with Vertex AI, you need to package your code with Setuptools, which is a Python tool for creating and distributing packages. You also need to use a pre-built container, which is a Docker image that contains the dependencies and libraries for your framework. You can choose from a list of pre-built containers provided by Google, or create your own custom container. By using a pre-built container, you can avoid the hassle of installing and configuring the environment for your model. You can also specify a custom tier for your training job, which allows you to select the number and type of GPUs you want to use. You can choose from various GPU options, such as V100, P100, K80, and T4. By using 4 V100 GPUs, you can leverage the high performance and memory capacity of these accelerators to train your model faster and cheaper than using CPUs. This solution requires minimal changes to your code and can scale your training workload efficiently.

Reference:

Vertex AI | Google Cloud

Custom training with pre-built containers | Vertex AI

Using GPUs | Vertex AI
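A minimal sketch of launching such a job with the Vertex AI SDK is below. It assumes the trainer has already been packaged with Setuptools into a source distribution and uploaded to Cloud Storage; the project, bucket, package, and container values are hypothetical, and the exact pre-built PyTorch GPU image URI should be taken from the current containers list:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1",
                staging_bucket="gs://my-bucket")

job = aiplatform.CustomPythonPackageTrainingJob(
    display_name="resnet50-pytorch",
    # Source distribution produced by `python setup.py sdist`.
    python_package_gcs_uri="gs://my-bucket/dist/trainer-0.1.tar.gz",
    python_module_name="trainer.task",
    # Pre-built PyTorch GPU training container maintained by Google
    # (illustrative tag; verify against the published list).
    container_uri="us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-13:latest",
)

# The "custom tier" is expressed as an explicit machine spec with 4 V100s.
job.run(
    machine_type="n1-standard-16",
    accelerator_type="NVIDIA_TESLA_V100",
    accelerator_count=4,
)
```

The job is billed only while it runs, which is what keeps this cheaper than keeping a GPU notebook or GKE node pool alive between experiments.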

You work for a retail company. You have been tasked with building a model to determine the probability of churn for each customer. You need the predictions to be interpretable so the results can be used to develop marketing campaigns that target at-risk customers. What should you do?

A. Build a random forest regression model in a Vertex AI Workbench notebook instance. Configure the model to generate feature importances after the model is trained.

B. Build an AutoML tabular regression model. Configure the model to generate explanations when it makes predictions.

C. Build a custom TensorFlow neural network by using Vertex AI custom training. Configure the model to generate explanations when it makes predictions.

D. Build a random forest classification model in a Vertex AI Workbench notebook instance. Configure the model to generate feature importances after the model is trained.
Suggested answer: D

Explanation:

A random forest is an ensemble learning method that consists of many decision trees. It can be used for both regression and classification tasks. A random forest classification model can predict the probability of churn for each customer by assigning them to different classes, such as high-risk, medium-risk, or low-risk. A random forest model can also generate feature importances, which measure how much each feature contributes to the prediction. Feature importances can help interpret the model and understand what factors influence customer churn. Vertex AI Workbench is an integrated development environment (IDE) that allows you to create and run Jupyter notebooks on Google Cloud. You can use Vertex AI Workbench to build a random forest classification model in Python, using libraries such as scikit-learn or TensorFlow. You can also configure the model to generate feature importances after the model is trained, and visualize them using plots or tables. This solution can help you build an interpretable model for customer churn prediction, and use the results to design marketing campaigns that target at-risk customers.

Reference:

Random Forests | scikit-learn

Vertex AI Workbench | Google Cloud

Interpreting Random Forests | Towards Data Science
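A minimal scikit-learn sketch of this approach is below; the churn DataFrame, its columns, and the CSV path are hypothetical:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder dataset: one row per customer, with a binary "churned" label.
df = pd.read_csv("customers.csv")
X, y = df.drop(columns=["churned"]), df["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Per-customer churn probabilities, usable for targeting at-risk customers.
churn_probability = model.predict_proba(X_test)[:, 1]

# Impurity-based feature importances for interpreting the model.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances.head(10))
```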

You work for a company that is developing an application to help users with meal planning. You want to use machine learning to scan a corpus of recipes and extract each ingredient (e.g., carrot, rice, pasta) and each kitchen cookware item (e.g., bowl, pot, spoon) mentioned. Each recipe is saved in an unstructured text file. What should you do?

A. Create a text dataset on Vertex AI for entity extraction. Create two entities called "ingredient" and "cookware", and label at least 200 examples of each entity. Train an AutoML entity extraction model to extract occurrences of these entity types. Evaluate performance on a holdout dataset.

B. Create a multi-label text classification dataset on Vertex AI. Create a test dataset and label each recipe that corresponds to its ingredients and cookware. Train a multi-class classification model. Evaluate the model's performance on a holdout dataset.

C. Use the Entity Analysis method of the Natural Language API to extract the ingredients and cookware from each recipe. Evaluate the model's performance on a prelabeled dataset.

D. Create a text dataset on Vertex AI for entity extraction. Create as many entities as there are different ingredients and cookware. Train an AutoML entity extraction model to extract those entities. Evaluate the model's performance on a holdout dataset.
Suggested answer: A

Explanation:

Entity extraction is a natural language processing (NLP) task that involves identifying and extracting specific types of information from text, such as names, dates, and locations. Entity extraction can help you analyze a corpus of recipes and extract each ingredient and cookware item mentioned in them. Vertex AI is a unified platform for building and managing machine learning solutions on Google Cloud. It provides a service for AutoML entity extraction, which allows you to create and train custom entity extraction models without writing any code. You can use Vertex AI to create a text dataset for entity extraction, and label your data with two entities: "ingredient" and "cookware". You need to label at least 200 examples of each entity type to train an AutoML entity extraction model. You can also use a holdout dataset to evaluate the performance of your model with metrics such as precision, recall, and F1-score. This solution can help you build a machine learning model that scans a corpus of recipes, extracts each ingredient and cookware item mentioned, and uses the results to help users with meal planning.

Reference:

AutoML Entity Extraction | Vertex AI

Preparing data for AutoML Entity Extraction | Vertex AI
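A minimal SDK sketch of this workflow is below. It assumes the recipes have already been annotated with "ingredient" and "cookware" spans (at least roughly 200 examples per label) in a JSONL file; the project, bucket, and display names are hypothetical:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Import the annotated recipes as an entity-extraction text dataset.
dataset = aiplatform.TextDataset.create(
    display_name="recipes",
    gcs_source="gs://my-bucket/recipes/annotations.jsonl",
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.extraction,
)

# Train an AutoML entity extraction model; the test split serves as the
# holdout set for precision/recall/F1 evaluation.
job = aiplatform.AutoMLTextTrainingJob(
    display_name="recipe-entity-extraction",
    prediction_type="extraction",
)
model = job.run(
    dataset=dataset,
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
)
```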

You work for an organization that operates a streaming music service. You have a custom production model that is serving a "next song" recommendation based on a user's recent listening history. Your model is deployed on a Vertex AI endpoint. You recently retrained the same model by using fresh data. The model received positive test results offline. You now want to test the new model in production while minimizing complexity. What should you do?

A. Create a new Vertex AI endpoint for the new model, and deploy the new model to that new endpoint. Build a service to randomly send 5% of production traffic to the new endpoint. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new endpoint.

B. Capture incoming prediction requests in BigQuery. Create an experiment in Vertex AI Experiments. Run batch predictions for both models using the captured data. Use the user's selected song to compare the models' performance side by side. If the new model's performance metrics are better than the previous model's, deploy the new model to production.

C. Deploy the new model to the existing Vertex AI endpoint. Use traffic splitting to send 5% of production traffic to the new model. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new model.

D. Configure a model monitoring job for the existing Vertex AI endpoint. Configure the monitoring job to detect prediction drift, and set a threshold for alerts. Update the model on the endpoint from the previous model to the new model. If you receive an alert of prediction drift, revert to the previous model.
Suggested answer: C

Explanation:

Traffic splitting is a feature of Vertex AI that allows you to distribute the prediction requests among multiple models or model versions within the same endpoint. You can specify the percentage of traffic that each model or model version receives, and change it at any time. Traffic splitting can help you test the new model in production without creating a new endpoint or a separate service. You can deploy the new model to the existing Vertex AI endpoint, and use traffic splitting to send 5% of production traffic to the new model. You can monitor the end-user metrics, such as listening time, to compare the performance of the new model and the previous model. If the end-user metrics improve between models over time, you can gradually increase the percentage of production traffic sent to the new model. This solution can help you test the new model in production while minimizing complexity and cost.

Reference:

Traffic splitting | Vertex AI

Deploying models to endpoints | Vertex AI
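A minimal sketch of the canary deployment with the Vertex AI SDK is below; the endpoint and model resource IDs are hypothetical:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)
new_model = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/9876543210"
)

# Deploy the retrained model alongside the current one; Vertex AI rebalances
# the split so the new model receives 5% and the existing deployment 95%.
endpoint.deploy(
    model=new_model,
    traffic_percentage=5,
    machine_type="n1-standard-4",
)

# Later, shift more traffic by updating the split directly (the keys are the
# deployed model IDs, which vary per deployment):
# endpoint.update(traffic_split={"old-deployed-id": 50, "new-deployed-id": 50})
```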

You created a model that uses BigQuery ML to perform linear regression. You need to retrain the model on the cumulative data collected every week. You want to minimize the development effort and the scheduling cost. What should you do?

A. Use BigQuery's scheduling service to run the model retraining query periodically.

B. Create a pipeline in Vertex AI Pipelines that executes the retraining query, and use the Cloud Scheduler API to run the query weekly.

C. Use Cloud Scheduler to trigger a Cloud Function every week that runs the query for retraining the model.

D. Use the BigQuery API Connector and Cloud Scheduler to trigger Workflows every week to retrain the model.
Suggested answer: A

Explanation:

BigQuery is a serverless data warehouse that allows you to perform SQL queries on large-scale data. BigQuery ML is a feature of BigQuery that enables you to create and execute machine learning models using standard SQL queries. You can use BigQuery ML to perform linear regression on your data and create a model. BigQuery also provides a scheduling service that allows you to create and manage recurring SQL queries. You can use BigQuery's scheduling service to run the model retraining query periodically, such as every week. You can specify the destination table for the query results, and the schedule options, such as start date, end date, frequency, and time zone. You can also monitor the status and history of your scheduled queries. This solution can help you retrain the model on the cumulative data collected every week, while minimizing the development effort and the scheduling cost.

Reference:

BigQuery ML | Google Cloud

Scheduling queries | BigQuery
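A minimal sketch of creating such a scheduled query with the BigQuery Data Transfer Service client (which backs BigQuery's scheduling feature) is below; the project, dataset, and schedule values are hypothetical:

```python
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

# A scheduled query is a transfer config with the "scheduled_query" source.
transfer_config = bigquery_datatransfer.TransferConfig(
    display_name="weekly-model-retrain",
    data_source_id="scheduled_query",
    schedule="every monday 06:00",
    params={
        "query": """
            CREATE OR REPLACE MODEL `my_dataset.sales_model`
            OPTIONS (model_type = 'linear_reg',
                     input_label_cols = ['label']) AS
            SELECT * FROM `my_dataset.training_data`
        """
    },
)

client.create_transfer_config(
    parent=client.common_project_path("my-project"),
    transfer_config=transfer_config,
)
```

Because CREATE OR REPLACE MODEL retrains on whatever the SELECT returns, each weekly run automatically picks up the cumulative data with no extra orchestration service.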

You want to migrate a scikit-learn classifier model to TensorFlow. You plan to train the TensorFlow classifier model using the same training set that was used to train the scikit-learn model, and then compare the performances using a common test set. You want to use the Vertex AI Python SDK to manually log the evaluation metrics of each model and compare them based on their F1 scores and confusion matrices. How should you log the metrics?

A.

B.

C.

D.
Suggested answer: D

Explanation:

To log the metrics of a machine learning model in TensorFlow using the Vertex AI Python SDK, you should use the aiplatform.log_metrics function to log the F1 score and the aiplatform.log_classification_metrics function to log the confusion matrix. These functions allow users to manually record and store evaluation metrics for each model, facilitating an efficient comparison based on specific performance indicators like F1 scores and confusion matrices.

Reference: The answer can be verified from official Google Cloud documentation and resources related to Vertex AI and TensorFlow.

Vertex AI Python SDK reference | Google Cloud

Logging custom metrics | Vertex AI

Migrating from scikit-learn to TensorFlow | TensorFlow
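A minimal sketch of logging both models' metrics into the same Vertex AI experiment is below; the experiment and run names are hypothetical, and the short label lists stand in for real test-set predictions from each model:

```python
from google.cloud import aiplatform
from sklearn.metrics import confusion_matrix, f1_score

# Placeholder ground truth and per-model predictions on the common test set.
y_test = [0, 1, 1, 0, 1]
predictions = {
    "sklearn-run": [0, 1, 0, 0, 1],
    "tensorflow-run": [0, 1, 1, 0, 1],
}

aiplatform.init(project="my-project", location="us-central1",
                experiment="classifier-comparison")

for run_name, y_pred in predictions.items():
    aiplatform.start_run(run_name)
    # Scalar metrics such as the F1 score go through log_metrics ...
    aiplatform.log_metrics({"f1_score": f1_score(y_test, y_pred)})
    # ... while the confusion matrix has its own structured logging call.
    aiplatform.log_classification_metrics(
        labels=["negative", "positive"],
        matrix=confusion_matrix(y_test, y_pred).tolist(),
        display_name=f"{run_name}-confusion-matrix",
    )
    aiplatform.end_run()
```

Both runs then appear side by side in the Vertex AI Experiments UI, which is where the F1 scores and confusion matrices can be compared.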
