Question 99 - Professional Machine Learning Engineer discussion

You are profiling the training time of your TensorFlow model and notice a performance issue caused by inefficiencies in the input data pipeline for a single 5-terabyte CSV file dataset on Cloud Storage. You need to optimize the input pipeline performance. Which action should you try first to increase the efficiency of your pipeline?

A. Preprocess the input CSV file into a TFRecord file.
B. Randomly select a 10 gigabyte subset of the data to train your model.
C. Split into multiple CSV files and use a parallel interleave transformation.
D. Set the reshuffle_each_iteration parameter to true in the tf.data.Dataset.shuffle method.
Suggested answer: A

Explanation:

TFRecord is the recommended format for storing large amounts of training data efficiently and is the first optimization to try for this pipeline. It is a binary format that can be serialized and compressed, which reduces the I/O overhead and memory footprint compared with parsing CSV text at training time, and the tf.data API provides tools for creating and reading TFRecord files.
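
For illustration, here is a minimal sketch of option A. The Cloud Storage paths and the column schema (two float features plus an integer label) are hypothetical, and a real 5 TB conversion would normally be done with a distributed tool such as Dataflow rather than a single-process loop; the sketch only shows the mechanics of writing TFRecords and reading them back with tf.data:

```python
import tensorflow as tf

# Hypothetical paths and schema: two float feature columns and an int label.
CSV_PATH = "gs://my-bucket/data.csv"
TFRECORD_PATH = "gs://my-bucket/data.tfrecord"

def to_example(features, label):
    """Serialize one row into a tf.train.Example proto."""
    return tf.train.Example(features=tf.train.Features(feature={
        "features": tf.train.Feature(
            float_list=tf.train.FloatList(value=features)),
        "label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
    })).SerializeToString()

# One-time preprocessing step: stream the CSV and write TFRecords.
csv_ds = tf.data.experimental.CsvDataset(
    CSV_PATH, record_defaults=[tf.float32, tf.float32, tf.int64], header=True)
with tf.io.TFRecordWriter(TFRECORD_PATH) as writer:
    for f1, f2, label in csv_ds:
        writer.write(to_example([float(f1), float(f2)], int(label)))

# Training-time pipeline: binary reads, parallel parsing, prefetching.
feature_spec = {
    "features": tf.io.FixedLenFeature([2], tf.float32),
    "label": tf.io.FixedLenFeature([1], tf.int64),
}
ds = (tf.data.TFRecordDataset(TFRECORD_PATH)
      .map(lambda rec: tf.io.parse_single_example(rec, feature_spec),
           num_parallel_calls=tf.data.AUTOTUNE)
      .batch(256)
      .prefetch(tf.data.AUTOTUNE))
```

In practice the output would also be sharded into many TFRecord files so that the reads themselves can be parallelized.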

The other options are less effective than option A. Option B would reduce the amount of data available for training and could hurt model accuracy. Option C would parallelize reads across file shards, but each record would still be stored as text CSV, which is slower to parse than a binary format. Option D only controls whether the shuffle order is regenerated on each epoch; it does not speed up reading the data.
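
For contrast, a minimal sketch of option C, assuming the large CSV has already been split into shards matching a hypothetical filename pattern:

```python
import tensorflow as tf

# Hypothetical shard pattern for the pre-split CSV files.
files = tf.data.Dataset.list_files("gs://my-bucket/shards/part-*.csv")

# interleave() with num_parallel_calls reads several shards concurrently,
# overlapping I/O across files instead of streaming one file serially.
ds = files.interleave(
    lambda path: tf.data.TextLineDataset(path).skip(1),  # skip header row
    cycle_length=8,                       # number of files read at once
    num_parallel_calls=tf.data.AUTOTUNE,
).prefetch(tf.data.AUTOTUNE)
```

Even with parallel reads, every record still has to be parsed from text at training time, which is why converting to TFRecord is the better first step.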

asked 18/09/2024
Rahul Chugh