ExamGecko

Question 234 - Professional Machine Learning Engineer discussion

You are developing a custom image classification model in Python. You plan to run your training application on Vertex AI. Your input dataset contains several hundred thousand small images. You need to determine how to store and access the images for training. You want to maximize data throughput and minimize training time while reducing the amount of additional code. What should you do?

Answers

A. Store image files in Cloud Storage and access them directly.
B. Store image files in Cloud Storage and access them by using serialized records.
C. Store image files in Cloud Filestore and access them by using serialized records.
D. Store image files in Cloud Filestore and access them directly by using an NFS mount point.

Suggested answer: B

Explanation:

Cloud Storage is a scalable, cost-effective object store, and keeping the images there avoids managing your own storage infrastructure. However, reading several hundred thousand small files directly from Cloud Storage is slow: each image requires its own request, so per-object overhead dominates input time during training. A better option is to pack the images into serialized records, such as TFRecord or Apache Avro, which are binary formats that store many images and their labels in a single file. Fewer, larger files allow sequential reads that raise data throughput and amortize network latency, and the record files can also be compressed and sharded for parallel reading. You can create and read serialized records in Cloud Storage with TensorFlow or Apache Beam APIs, so this solution requires minimal additional code and can significantly reduce training time.

Reference:

Cloud Storage | Google Cloud

TFRecord and tf.Example | TensorFlow Core

Apache Avro 1.10.2 Specification

Using Apache Beam with Cloud Storage | Cloud Storage
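
To illustrate why serialized records help, here is a minimal sketch in plain Python that packs many small (image bytes, label) pairs into a few sharded record files and streams them back with sequential reads. It is a simplified stand-in, not the TFRecord wire format (which also carries checksums); in a TensorFlow pipeline you would normally use `tf.io.TFRecordWriter` and `tf.data.TFRecordDataset` instead. The names `write_shards` and `read_shard`, the `.rec` extension, and the length-plus-label header layout are assumptions made up for this example.

```python
import struct
from pathlib import Path
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[bytes, int]  # (encoded image bytes, integer label)

# Hypothetical header layout: 8-byte payload length + 4-byte label, little-endian.
HEADER = "<QI"


def write_shards(records: Iterable[Record], out_dir: Path, num_shards: int) -> List[Path]:
    """Pack (image_bytes, label) pairs round-robin into num_shards record files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"train-{i:05d}-of-{num_shards:05d}.rec" for i in range(num_shards)]
    files = [p.open("wb") for p in paths]
    try:
        for i, (image_bytes, label) in enumerate(records):
            header = struct.pack(HEADER, len(image_bytes), label)
            files[i % num_shards].write(header + image_bytes)
    finally:
        for f in files:
            f.close()
    return paths


def read_shard(path: Path) -> Iterator[Record]:
    """Stream records back from one shard using sequential reads."""
    header_size = struct.calcsize(HEADER)
    with path.open("rb") as f:
        while True:
            header = f.read(header_size)
            if not header:  # end of file
                break
            length, label = struct.unpack(HEADER, header)
            yield f.read(length), label
```

With this layout, a training job opens a handful of large shard files instead of issuing one request per image, and separate workers can each read a different shard in parallel.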

asked 18/09/2024
Tunde Ogunkoya