ExamGecko

Question 245 - MLS-C01 discussion

A company's machine learning (ML) specialist is building a computer vision model to classify 10 different traffic signs. The company has stored 100 images of each class in Amazon S3, and the company has another 10,000 unlabeled images. All the images come from dash cameras and are 224 x 224 pixels in size. After several training runs, the model is overfitting on the training data.

Which actions should the ML specialist take to address this problem? (Select TWO.)

A. Use Amazon SageMaker Ground Truth to label the unlabeled images.
B. Use image preprocessing to transform the images into grayscale images.
C. Use data augmentation to rotate and translate the labeled images.
D. Replace the activation of the last layer with a sigmoid.
E. Use the Amazon SageMaker k-nearest neighbors (k-NN) algorithm to label the unlabeled images.

Suggested answer: C, E

Explanation:

Data augmentation increases the size and diversity of the training data by applying random transformations such as rotation, translation, scaling, and flipping to the labeled images. Because the model sees a slightly different version of each image in every epoch, it is less able to memorize the 100 examples per class, which can reduce overfitting and improve generalization. The Amazon SageMaker image classification algorithm can apply augmentation during training through its augmentation_type hyperparameter, which accepts crop, crop_color, or crop_color_transform; the last of these adds random rotation, shear, and aspect ratio variations [1].
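To make option C concrete, here is a minimal NumPy-only sketch of random rotation and translation for a 224 x 224 image. It is an illustration, not the SageMaker implementation: the function names `rotate`, `translate`, and `augment`, the default ranges (15 degrees, 10 pixels), and the zero-fill border handling are all assumptions chosen for this example.

```python
import numpy as np

def translate(image, dx, dy):
    """Shift an H x W x C image by (dx, dy) pixels; exposed borders are zero-filled."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    # Matching source and destination slices for the region that stays in frame.
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out

def rotate(image, angle_deg):
    """Rotate about the image center using nearest-neighbor sampling."""
    h, w = image.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, locate the source pixel.
    x0 = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    y0 = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    xi, yi = np.rint(x0).astype(int), np.rint(y0).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(image)
    out[valid] = image[yi[valid], xi[valid]]
    return out

def augment(image, rng, max_angle=15.0, max_shift=10):
    """Apply one random rotation followed by one random translation."""
    angle = rng.uniform(-max_angle, max_angle)
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    return translate(rotate(image, angle), int(dx), int(dy))

# Example: produce five augmented variants of one (synthetic) labeled image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
variants = [augment(image, rng) for _ in range(5)]
```

The ranges are kept small on purpose: a traffic sign rotated by a large angle or flipped may no longer represent the same class, so the transformations should stay within what a dash camera could plausibly capture.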

The Amazon SageMaker k-nearest neighbors (k-NN) algorithm is a supervised learning algorithm that can label unlabeled data based on its similarity to the labeled data. It assigns a label to an unlabeled instance by finding the k closest labeled instances in the feature space and taking a majority vote among their labels. Labeling the 10,000 unlabeled images this way increases the size and diversity of the training data, which can reduce overfitting. For images, the usual approach is to extract feature vectors with a pre-trained model and then run k-NN on those feature vectors rather than on raw pixels [2].
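The majority-vote step described above can be sketched in plain NumPy. This is not the SageMaker k-NN API; it assumes the feature vectors have already been extracted by some pre-trained model, and the function name `knn_pseudo_label`, the default `k=5`, and the returned vote fraction are hypothetical choices made for illustration.

```python
import numpy as np

def knn_pseudo_label(labeled_feats, labels, unlabeled_feats, k=5, n_classes=10):
    """Label each unlabeled feature vector by majority vote of its k nearest labeled neighbors.

    Returns (predicted_labels, vote_fraction), where vote_fraction is the share
    of the k neighbors that agreed with the winning label.
    """
    # Squared Euclidean distances via ||u||^2 + ||l||^2 - 2 u.l,
    # which avoids building an (n_unlabeled, n_labeled, dim) array.
    u2 = (unlabeled_feats ** 2).sum(axis=1)[:, None]
    l2 = (labeled_feats ** 2).sum(axis=1)[None, :]
    d2 = u2 + l2 - 2.0 * unlabeled_feats @ labeled_feats.T

    nearest = np.argsort(d2, axis=1)[:, :k]   # indices of the k closest labeled points
    neighbor_labels = labels[nearest]         # shape (n_unlabeled, k)

    # Count votes per class for every unlabeled point.
    votes = np.zeros((len(unlabeled_feats), n_classes), dtype=int)
    rows = np.arange(len(unlabeled_feats))[:, None]
    np.add.at(votes, (rows, neighbor_labels), 1)

    return votes.argmax(axis=1), votes.max(axis=1) / k

# Toy example with two well-separated classes in a 2-D feature space.
labeled = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
unlabeled = np.array([[0.5, 0.5], [10.5, 10.5]])
predicted, vote_fraction = knn_pseudo_label(labeled, labels, unlabeled, k=3, n_classes=2)
```

One possible use of the vote fraction is to keep only pseudo-labels on which the neighbors largely agree, since labels assigned this way can be wrong and noisy labels would be fed back into training.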

Using Amazon SageMaker Ground Truth to label the unlabeled images is not a good option because it is a manual and costly process that requires human annotators. Moreover, it does not address the issue of overfitting on the existing labeled data.

Using image preprocessing to transform the images into grayscale images is not a good option because it reduces the amount of information and variation in the images, which can degrade the performance of the model. Moreover, it does not address the issue of overfitting on the existing labeled data.

Replacing the activation of the last layer with a sigmoid is not a good option because it is not suitable for a multi-class classification problem. A sigmoid activation function outputs a value between 0 and 1, which can be interpreted as a probability of belonging to a single class. However, for a multi-class classification problem, the output should be a vector of probabilities that sum up to 1, which can be achieved by using a softmax activation function.
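The difference between the two activations is easy to check numerically. The sketch below applies both to the same vector of 10 logits (the values are made up for this example): the sigmoid outputs are independent per-class scores that do not sum to 1, while the softmax outputs form a single probability distribution over the 10 classes.

```python
import numpy as np

def sigmoid(z):
    """Element-wise sigmoid: each output is in (0, 1), independent of the others."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax over a 1-D logit vector: outputs are positive and sum to 1."""
    shifted = z - z.max()  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()

# Hypothetical logits for the 10 traffic sign classes.
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5, 0.0, -0.5, 1.5, -2.0, 0.3])

sigmoid_out = sigmoid(logits)
softmax_out = softmax(logits)

print("sigmoid sum:", sigmoid_out.sum())  # not constrained to 1
print("softmax sum:", softmax_out.sum())  # 1 up to floating-point error
```

Both activations rank the classes identically, because each is monotonic in the logits, so swapping one for the other changes how the outputs are interpreted rather than how much the model overfits.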

References:

1: Image classification algorithm - Amazon SageMaker

2: k-nearest neighbors (k-NN) algorithm - Amazon SageMaker

asked 16/09/2024
karl hickey