Question 77 - Professional Machine Learning Engineer discussion
You work for a biotech startup that is experimenting with deep learning ML models based on properties of biological organisms. Your team frequently works on early-stage experiments with new architectures of ML models, and writes custom TensorFlow ops in C++. You train your models on large datasets and large batch sizes. Your typical batch size has 1024 examples, and each example is about 1 MB in size. The average size of a network with all weights and embeddings is 20 GB. What hardware should you choose for your models?
A. A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB GPU memory in total), and an n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM
B. A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM
C. A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM
D. A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM
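A quick back-of-envelope memory estimate helps weigh the GPU memory totals listed in the options. The sketch below is a minimal calculation in Python: the batch size, example size, and model size come from the question, while the 3x overhead factor for gradients and optimizer state is a hypothetical ballpark, not a figure from the question.

```python
# Rough memory estimate for the scenario described in the question.
# Batch, example, and model sizes are taken from the question text;
# the 3x training-state multiplier is an illustrative assumption only.

batch_examples = 1024          # examples per batch (from the question)
example_size_gb = 1 / 1024     # ~1 MB per example, expressed in GB
model_size_gb = 20             # weights + embeddings (from the question)

batch_gb = batch_examples * example_size_gb   # ~1 GB of input data per batch
# Training also holds gradients and optimizer state, typically a few
# multiples of the model size; 3x is an assumed ballpark, not a rule.
training_state_gb = 3 * model_size_gb

total_gb = model_size_gb + batch_gb + training_state_gb
print(f"Batch: {batch_gb:.1f} GB, model: {model_size_gb} GB, "
      f"rough training footprint: {total_gb:.0f} GB")
```

Under these assumptions the training footprint lands on the order of tens of GB per replica, and the question also notes that the team writes custom TensorFlow ops in C++, which Cloud TPUs do not support; both points are worth keeping in mind when comparing the accelerator memory listed in each option.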