Question 207 - Professional Machine Learning Engineer discussion
You have deployed a scikit-learn model to a Vertex AI endpoint using a custom model server. You enabled autoscaling; however, the deployed model fails to scale beyond one replica, which has led to dropped requests. You notice that CPU utilization remains low even during periods of high load. What should you do?
A. Attach a GPU to the prediction nodes.
B. Increase the number of workers in your model server.
C. Schedule scaling of the nodes to match expected demand.
D. Increase the minReplicaCount in your DeployedModel configuration.
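
The symptom in the question (low CPU utilization under high load, no scale-out) can be illustrated with a little arithmetic. The sketch below is not part of the original question and is not Vertex AI code. It assumes, for illustration only, that each model-server worker is a single process that can saturate at most one vCPU, and that the autoscaler adds replicas when average CPU utilization exceeds a target (0.60 is used here as an assumed value). The function names are hypothetical.

```python
def estimated_cpu_utilization(workers: int, vcpus: int) -> float:
    """Upper bound on node CPU utilization.

    Assumption: each worker process can keep at most one vCPU busy,
    so at most min(workers, vcpus) cores are ever in use.
    """
    if workers < 1 or vcpus < 1:
        raise ValueError("workers and vcpus must be positive")
    return min(workers, vcpus) / vcpus


def would_trigger_scale_out(workers: int, vcpus: int, target: float = 0.60) -> bool:
    """True if fully loaded workers could push utilization past the target.

    The 0.60 target is an assumed value for this sketch.
    """
    return estimated_cpu_utilization(workers, vcpus) > target


if __name__ == "__main__":
    # One worker on a 4-vCPU node: requests queue up, yet CPU stays low.
    print(estimated_cpu_utilization(1, 4), would_trigger_scale_out(1, 4))
    # Four workers on the same node can use all cores under load.
    print(estimated_cpu_utilization(4, 4), would_trigger_scale_out(4, 4))
```

Under these assumptions, a single-worker server on a multi-core node can drop requests while never reaching a CPU-based scaling threshold, which is the situation option B addresses.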