Question 111 - MLS-C01 discussion


A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The Specialist has transformed the data into a numpy.array, which appears to be negatively affecting the speed of the training.

What should the Specialist do to optimize the data for training on SageMaker?

A. Use the SageMaker batch transform feature to transform the training data into a DataFrame
B. Use AWS Glue to compress the data into the Apache Parquet format
C. Transform the dataset into the RecordIO protobuf format
D. Use the SageMaker hyperparameter optimization feature to automatically optimize the data
Suggested answer: C

Explanation:

RecordIO-protobuf is a binary data format that works well for training with SageMaker built-in algorithms. It generally allows faster data loading and lower memory usage than formats such as CSV or in-memory numpy arrays, in part because records can be streamed to the training container in Pipe mode instead of being loaded all at once. The format also supports sparse input and variable-length input, and each record carries its label alongside its features. To use it, the data must be serialized into protobuf records, and deserialized when read back, with an appropriate library such as the SageMaker Python SDK. Many of the built-in algorithms in SageMaker accept RecordIO-protobuf as a content type for training and inference.

References:

Common Data Formats for Training

Using RecordIO Format

Content Types Supported by Built-in Algorithms
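
The explanation says the data has to be serialized and deserialized with appropriate libraries. In practice the SageMaker Python SDK helper sagemaker.amazon.common.write_numpy_to_dense_tensor is the usual way to turn a numpy array into RecordIO-protobuf. The sketch below is only an illustration of the outer RecordIO framing, written with the standard library so it runs anywhere. It assumes the layout used by that SDK helper (a 4-byte magic number, a 4-byte payload length, the payload, then zero padding to a multiple of 4 bytes) and assumes little-endian byte order. The payloads here are placeholder bytes; in a real file each payload would be a serialized protobuf Record. The function names are hypothetical.

```python
import io
import struct

# Assumption: magic number that marks the start of each RecordIO record.
RECORDIO_MAGIC = 0xCED7230A


def write_record(stream, payload: bytes) -> None:
    """Write one payload as a RecordIO-framed record."""
    length = len(payload)
    pad = (-length) % 4  # records are padded to a 4-byte boundary
    stream.write(struct.pack("<I", RECORDIO_MAGIC))
    stream.write(struct.pack("<I", length))
    stream.write(payload)
    stream.write(b"\x00" * pad)


def read_records(stream):
    """Yield each payload from a stream of RecordIO-framed records."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream
        magic, length = struct.unpack("<II", header)
        if magic != RECORDIO_MAGIC:
            raise ValueError("stream is not RecordIO-framed")
        payload = stream.read(length)
        stream.read((-length) % 4)  # skip the padding bytes
        yield payload


def roundtrip(payloads):
    """Frame the payloads in memory, read them back, and report the size."""
    buffer = io.BytesIO()
    for payload in payloads:
        write_record(buffer, payload)
    size = len(buffer.getvalue())
    buffer.seek(0)
    return list(read_records(buffer)), size


if __name__ == "__main__":
    records, size = roundtrip([b"abc", b"hello", b""])
    print(records, size)
```

Because every record starts with a fixed-size header that states its length, a reader can consume the file sequentially without parsing text, which is what makes streaming the records to a training job practical.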

asked 16/09/2024
Nicola Grossi