Question 202 - Professional Machine Learning Engineer discussion


You need to use TensorFlow to train an image classification model. Your dataset is located in a Cloud Storage directory and contains millions of labeled images. Before training the model, you need to prepare the data. You want the data preprocessing and model training workflow to be as efficient, scalable, and low-maintenance as possible. What should you do?

A.
1. Create a Dataflow job that creates sharded TFRecord files in a Cloud Storage directory. 2. Reference tf.data.TFRecordDataset in the training script. 3. Train the model by using Vertex AI Training with a V100 GPU.
B.
1. Create a Dataflow job that moves the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label. 2. Reference tfds.folder_dataset.ImageFolder in the training script. 3. Train the model by using Vertex AI Training with a V100 GPU.
C.
1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance. 2. Write a Python script that creates sharded TFRecord files in a directory inside the instance. 3. Reference tf.data.TFRecordDataset in the training script. 4. Train the model by using the Workbench instance.
D.
1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance. 2. Write a Python script that copies the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label. 3. Reference tfds.folder_dataset.ImageFolder in the training script. 4. Train the model by using the Workbench instance.
Suggested answer: A

Explanation:

TFRecord is a binary file format that stores your data as a sequence of binary strings [1]. TFRecord files are efficient, scalable, and easy to process [1]. Sharding is a technique that splits a large file into smaller files, which improves parallelism and read performance [2]. Dataflow is a managed service for creating and running data processing pipelines on Google Cloud [3]. A Dataflow job can create sharded TFRecord files from the images in your Cloud Storage directory [4].

tf.data.TFRecordDataset is a class that reads and parses TFRecord files in TensorFlow [5]. You can use it to create a tf.data.Dataset object that represents your input data for training. tf.data.Dataset is a high-level API that provides methods to transform, batch, shuffle, and prefetch your data [6].
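The training-input side of option A can be sketched as follows. The feature names and the tiny file written up front are assumptions for the sake of a self-contained example; in the real workflow the dataset would be built from the sharded TFRecord paths in Cloud Storage.

```python
# Minimal sketch: read TFRecord files with tf.data.TFRecordDataset and parse
# each serialized tf.train.Example back into tensors.
import tensorflow as tf

# Write one tiny TFRecord file so the reader below has something to parse.
path = "/tmp/examgecko_sample.tfrecord"
with tf.io.TFRecordWriter(path) as w:
    ex = tf.train.Example(features=tf.train.Features(feature={
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"raw"])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    }))
    w.write(ex.SerializeToString())

# Schema for parsing; must match what the preprocessing job wrote.
feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

dataset = (
    tf.data.TFRecordDataset([path])  # accepts a list of shard paths
    .map(lambda rec: tf.io.parse_single_example(rec, feature_spec))
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)      # overlap input I/O with training
)
```

`prefetch(tf.data.AUTOTUNE)` is what makes this pipeline efficient at scale: the input pipeline keeps reading shards while the accelerator trains on the current batch.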

Vertex AI Training is a service that trains your custom models on Google Cloud using various hardware accelerators, such as GPUs [7]. Vertex AI Training supports TensorFlow models and can read data directly from Cloud Storage. You can use it to train your image classification model on a V100, a GPU well suited to deep learning workloads [8].
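The shape of such a training job can be sketched as a worker-pool spec, the structure Vertex AI Training custom jobs are configured with. The project, container image tag, module name, and bucket path below are placeholders, and the spec is only constructed here, not submitted (submitting it would use the google-cloud-aiplatform SDK).

```python
# Minimal sketch of a single-node V100 worker-pool spec for a Vertex AI
# custom training job. All names and paths are illustrative placeholders.
worker_pool_specs = [{
    "machine_spec": {
        "machine_type": "n1-standard-8",
        "accelerator_type": "NVIDIA_TESLA_V100",
        "accelerator_count": 1,
    },
    "replica_count": 1,
    "container_spec": {
        # A prebuilt TensorFlow GPU training container; the tag is illustrative.
        "image_uri": "us-docker.pkg.dev/vertex-ai/training/tf-gpu.2-12.py310:latest",
        "command": ["python", "-m", "trainer.task"],          # hypothetical module
        "args": ["--data_dir=gs://my-bucket/tfrecords"],      # hypothetical bucket
    },
}]
```

Because the training code only reads TFRecord shards from Cloud Storage, scaling up is a matter of changing this spec (more replicas, bigger accelerators) rather than changing the pipeline, which is what makes option A low-maintenance.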

[1] TFRecord and tf.Example | TensorFlow Core

[2] Sharding | TensorFlow Core

[3] Dataflow | Google Cloud

[4] Creating sharded TFRecord files | Google Cloud

[5] tf.data.TFRecordDataset | TensorFlow Core v2.6.0

[6] tf.data: Build TensorFlow input pipelines | TensorFlow Core

[7] Vertex AI Training | Google Cloud

[8] NVIDIA Tesla V100 GPU | NVIDIA

asked 18/09/2024
Martin Schwarz