Question 234 - MLS-C01 discussion

A social media company wants to develop a machine learning (ML) model to detect inappropriate or offensive content in images. The company has collected a large dataset of labeled images and plans to use the built-in Amazon SageMaker image classification algorithm to train the model. The company also intends to use SageMaker pipe mode to speed up the training.

...company splits the dataset into training, validation, and testing datasets. The company stores the training and validation images in folders that are named Training and Validation, respectively. The folder ...ain subfolders that correspond to the names of the dataset classes. The company resizes the images to the same size and generates two input manifest files named training.lst and validation.lst, for the ..ing dataset and the validation dataset, respectively. Finally, the company creates two separate Amazon S3 buckets for uploads of the training dataset and the validation dataset.

...h additional data preparation steps should the company take before uploading the files to Amazon S3?

A.
Generate two Apache Parquet files, training.parquet and validation.parquet, by reading the images into a pandas DataFrame and storing the DataFrame as a Parquet file. Upload the Parquet files to the training S3 bucket.
B.
Compress the training and validation directories by using the Snappy compression library. Upload the manifest and compressed files to the training S3 bucket.
C.
Compress the training and validation directories by using the gzip compression library. Upload the manifest and compressed files to the training S3 bucket.
D.
Generate two RecordIO files, training.rec and validation.rec, from the manifest files by using the im2rec Apache MXNet utility tool. Upload the RecordIO files to the training S3 bucket.
Suggested answer: D
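
Option D starts from the .lst manifest files, so it may help to see what such a file could look like. The following is a minimal sketch, assuming the tab-separated layout that im2rec reads (image index, numeric class label, relative image path) and a folder with one subfolder per class, as the question describes. The function name build_lst_lines and the set of accepted file suffixes are hypothetical choices for illustration, not part of any SageMaker or MXNet API.

```python
from pathlib import Path

# Assumption: only these suffixes count as images in this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_lst_lines(root):
    """Return .lst-style lines (index, label, relative path) for a class-per-subfolder tree."""
    root = Path(root)
    # Sort class folders so the label assigned to each class is deterministic.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    lines = []
    index = 0
    for label, class_name in enumerate(class_names):
        for image in sorted((root / class_name).iterdir()):
            if image.suffix.lower() in IMAGE_SUFFIXES:
                relative = image.relative_to(root).as_posix()
                lines.append(f"{index}\t{label}\t{relative}")
                index += 1
    return lines


def write_lst(root, lst_path):
    """Write the generated lines to a manifest file such as training.lst."""
    Path(lst_path).write_text("\n".join(build_lst_lines(root)) + "\n")
```

Calling write_lst("Training", "training.lst") would produce the kind of manifest that im2rec can then turn into a RecordIO file.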

Explanation:

The SageMaker image classification algorithm supports both RecordIO and image content types for training in file mode, and supports the RecordIO content type for training in pipe mode [1]. However, the algorithm also supports training in pipe mode using the image files without creating RecordIO files, by using the augmented manifest format [2]. In this case, the company should generate
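
For readers unfamiliar with the augmented manifest format mentioned above, here is a minimal sketch of how such a file could be assembled. It assumes the JSON Lines layout in which each line holds a "source-ref" key pointing at an image in S3 plus one label attribute. The attribute name "class", the helper name augmented_manifest_lines, and the bucket and prefix values in the usage note are assumptions for illustration only.

```python
import json


def augmented_manifest_lines(records, bucket, prefix, label_attribute="class"):
    """Build one JSON line per image: an S3 reference plus its class label.

    records is an iterable of (relative_path, label) pairs.
    label_attribute is an assumed attribute name; the training job must be
    told to read the same name.
    """
    lines = []
    for relative_path, label in records:
        entry = {
            "source-ref": f"s3://{bucket}/{prefix}/{relative_path}",
            label_attribute: label,
        }
        lines.append(json.dumps(entry))
    return lines
```

For example, augmented_manifest_lines([("safe/a.jpg", 1)], "example-bucket", "Training") yields a single line whose "source-ref" is s3://example-bucket/Training/safe/a.jpg and whose "class" is 1.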

asked 16/09/2024
Eduardo Collado
29 questions