Amazon MLS-C01 Practice Test - Questions Answers, Page 18

A Machine Learning Specialist is attempting to build a linear regression model.

Given the displayed residual plot only, what is the MOST likely problem with the model?

A. Linear regression is inappropriate. The residuals do not have constant variance.
B. Linear regression is inappropriate. The underlying data has outliers.
C. Linear regression is appropriate. The residuals have a zero mean.
D. Linear regression is appropriate. The residuals have constant variance.
Suggested answer: A

Explanation:

A residual plot displays the model's fitted values (or a predictor variable) along the x-axis and the residuals along the y-axis. It is used to assess whether the residuals of a regression model are roughly normally distributed and whether they exhibit heteroscedasticity, meaning that the variance of the residuals is not constant across the range of fitted values. Heteroscedasticity violates one of the assumptions of linear regression and can lead to biased estimates and unreliable predictions. The displayed residual plot shows a clear pattern of heteroscedasticity: the residuals fan out as the fitted values increase. This indicates that linear regression, as specified, is inappropriate for this data and a different model (or a variance-stabilizing transformation) should be used.

References:

Regression - Amazon Machine Learning

How to Create a Residual Plot by Hand

How to Create a Residual Plot in Python
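To make the check concrete, here is a minimal Python sketch (not part of the original exam material) that fits a linear regression on synthetic data with noise that grows with the predictor and then plots residuals against fitted values; the fan shape that appears is the heteroscedasticity pattern described above. It assumes NumPy, scikit-learn, and matplotlib are available.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(1, 10, size=(500, 1))
# Noise that grows with X produces fan-shaped (heteroscedastic) residuals.
y = 3.0 * X[:, 0] + rng.normal(0, 0.5 * X[:, 0])

model = LinearRegression().fit(X, y)
fitted = model.predict(X)
residuals = y - fitted

plt.scatter(fitted, residuals, alpha=0.4)
plt.axhline(0, color="red", linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.title("Residuals vs. fitted values")
plt.show()
```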

A machine learning specialist works for a fruit processing company and needs to build a system that categorizes apples into three types. The specialist has collected a dataset that contains 150 images for each type of apple and applied transfer learning on a neural network that was pretrained on ImageNet with this dataset.

The company requires at least 85% accuracy to make use of the model.

After an exhaustive grid search, the optimal hyperparameters produced the following:

68% accuracy on the training set

67% accuracy on the validation set

What can the machine learning specialist do to improve the system's accuracy?

A. Upload the model to an Amazon SageMaker notebook instance and use the Amazon SageMaker HPO feature to optimize the model's hyperparameters.
B. Add more data to the training set and retrain the model using transfer learning to reduce the bias.
C. Use a neural network model with more layers that are pretrained on ImageNet and apply transfer learning to increase the variance.
D. Train a new model using the current neural network architecture.
Suggested answer: B

Explanation:

The problem described in the question is a case of underfitting: the model performs poorly on both the training set (68%) and the validation set (67%), which means it has not learned the features of the data well enough and has high bias. To address this, the machine learning specialist should make the following change:

Add more data to the training set and retrain the model using transfer learning to reduce the bias. With only 150 images per apple type, the model has limited signal to learn from; adding more training data exposes it to more patterns and variations, and transfer learning lets it reuse features from the ImageNet-pretrained network and adapt them to the new data. Together this reduces the bias and improves the model's accuracy.

References:

Transfer learning for TensorFlow image classification models in Amazon SageMaker

Transfer learning for custom labels using a TensorFlow container and "bring your own algorithm" in Amazon SageMaker

Machine Learning Concepts - AWS Training and Certification
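As an illustration of the retraining step, the following is a hedged sketch of transfer learning in TensorFlow/Keras with an ImageNet-pretrained ResNet50 backbone. The framework choice, backbone, and dataset directory are assumptions for demonstration, not details given in the question.

```python
import tensorflow as tf

# Hypothetical dataset directory with one sub-folder per apple type.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "apples/train", image_size=(224, 224), batch_size=32)

# ImageNet-pretrained backbone with its classification head removed.
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # freeze pretrained weights; train only the new head

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.resnet50.preprocess_input(inputs)
x = base(x, training=False)
outputs = tf.keras.layers.Dense(3, activation="softmax")(x)  # 3 apple types
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```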

A company uses camera images of the tops of items displayed on store shelves to determine which items were removed and which ones still remain. After several hours of data labeling, the company has a total of 1,000 hand-labeled images covering 10 distinct items. The training results were poor.

Which machine learning approach fulfills the company's long-term needs?

A. Convert the images to grayscale and retrain the model
B. Reduce the number of distinct items from 10 to 2, build the model, and iterate
C. Attach different colored labels to each item, take the images again, and build the model
D. Augment training data for each item using image variants like inversions and translations, build the model, and iterate.
Suggested answer: D

Explanation:

Data augmentation is a technique that can increase the size and diversity of the training data by applying various transformations to the original images, such as inversions, translations, rotations, scaling, cropping, flipping, and color variations. Data augmentation can help improve the performance and generalization of image classification models by reducing overfitting and introducing more variability to the data. Data augmentation is especially useful when the original data is limited or imbalanced, as in the case of the company's problem. By augmenting the training data for each item using image variants, the company can build a more robust and accurate model that can recognize the items on the store shelves from different angles, positions, and lighting conditions. The company can also iterate on the model by adding more data or fine-tuning the hyperparameters to achieve better results.

References:

Build high performing image classification models using Amazon SageMaker JumpStart

The Effectiveness of Data Augmentation in Image Classification using Deep Learning

Data augmentation for improving deep learning in image classification problem

Class-Adaptive Data Augmentation for Image Classification
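The augmentation idea can be sketched as follows, assuming TensorFlow/Keras preprocessing layers; the directory path is a hypothetical placeholder and the exact set of transformations is illustrative.

```python
import tensorflow as tf

# Random flips, shifts, rotations, and zooms emulate new viewpoints of each item.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),  # inversions
    tf.keras.layers.RandomTranslation(0.1, 0.1),            # translations
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# Hypothetical directory of shelf-top images, one sub-folder per item.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "shelf_images/train", image_size=(224, 224), batch_size=32)

# Apply the augmentation on the fly, during training only.
train_ds = train_ds.map(
    lambda images, labels: (augment(images, training=True), labels))
```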

A Data Scientist is developing a binary classifier to predict whether a patient has a particular disease based on a series of test results. The Data Scientist has data on 400 patients randomly selected from the population. The disease is seen in 3% of the population.

Which cross-validation strategy should the Data Scientist adopt?

A. A k-fold cross-validation strategy with k=5
B. A stratified k-fold cross-validation strategy with k=5
C. A k-fold cross-validation strategy with k=5 and 3 repeats
D. An 80/20 stratified split between training and validation
Suggested answer: B

Explanation:

A stratified k-fold cross-validation strategy preserves the class distribution in each fold. This is important for imbalanced datasets such as this one, where the disease is seen in only 3% of the population (roughly 12 positive cases among the 400 patients). With an ordinary random k-fold split, some folds may contain very few or even no positive cases, which leads to unreliable estimates of model performance. Stratified k-fold ensures that each fold has the same proportion of positive and negative cases as the whole dataset, making the evaluation more reliable and robust. A k-fold strategy with k=5 and 3 repeats is more computationally expensive and, because it is still unstratified, does not solve the imbalance problem. An 80/20 stratified split between training and validation uses the data only once, so the performance estimate has higher variance than 5-fold cross-validation.

References:

AWS Machine Learning Specialty Certification Exam Guide

AWS Machine Learning Training: Model Evaluation

How to Fix k-Fold Cross-Validation for Imbalanced Classification
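A minimal scikit-learn sketch (synthetic data matching the 3% prevalence) shows how stratified 5-fold cross-validation keeps a proportional number of positive cases in every fold:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))   # 400 patients, 10 test-result features
y = np.zeros(400, dtype=int)
y[:12] = 1                       # 3% of 400 = 12 positive cases
rng.shuffle(y)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores = clf.predict_proba(X[val_idx])[:, 1]
    # Each validation fold keeps roughly 2-3 positives (the same 3% proportion).
    print(f"fold {fold}: positives in fold = {y[val_idx].sum()}, "
          f"AUC = {roc_auc_score(y[val_idx], scores):.3f}")
```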

A technology startup is using complex deep neural networks and GPU compute to recommend the company's products to its existing customers based upon each customer's habits and interactions. The solution currently pulls each dataset from an Amazon S3 bucket before loading the data into a TensorFlow model pulled from the company's Git repository that runs locally. This job then runs for several hours while continually outputting its progress to the same S3 bucket. The job can be paused, restarted, and continued at any time in the event of a failure, and is run from a central queue.

Senior managers are concerned about the complexity of the solution's resource management and the costs involved in repeating the process regularly. They ask for the workload to be automated so it runs once a week, starting Monday and completing by the close of business Friday.

Which architecture should be used to scale the solution at the lowest cost?

A. Implement the solution using AWS Deep Learning Containers and run the container as a job using AWS Batch on a GPU-compatible Spot Instance
B. Implement the solution using a low-cost GPU-compatible Amazon EC2 instance and use the AWS Instance Scheduler to schedule the task
C. Implement the solution using AWS Deep Learning Containers, run the workload using AWS Fargate running on Spot Instances, and then schedule the task using the built-in task scheduler
D. Implement the solution using Amazon ECS running on Spot Instances and schedule the task using the ECS service scheduler
Suggested answer: A

Explanation:

The best architecture to scale the solution at the lowest cost is to implement the solution using AWS Deep Learning Containers and run the container as a job using AWS Batch on a GPU-compatible Spot Instance. This option has the following advantages:

AWS Deep Learning Containers: These are Docker images that are pre-installed and optimized with popular deep learning frameworks such as TensorFlow, PyTorch, and MXNet. They can be easily deployed on Amazon EC2, Amazon ECS, Amazon EKS, and AWS Fargate. They can also be integrated with AWS Batch to run containerized batch jobs. Using AWS Deep Learning Containers can simplify the setup and configuration of the deep learning environment and reduce the complexity of the resource management.

AWS Batch: This is a fully managed service that enables you to run batch computing workloads on AWS. You can define compute environments, job queues, and job definitions to run your batch jobs. You can also use AWS Batch to automatically provision compute resources based on the requirements of the batch jobs. You can specify the type and quantity of the compute resources, such as GPU instances, and the maximum price you are willing to pay for them. You can also use AWS Batch to monitor the status and progress of your batch jobs and handle any failures or interruptions.

GPU-compatible Spot Instance: This is an Amazon EC2 instance that runs on spare compute capacity offered at a significant discount compared with the On-Demand price. Spot Instances can run the deep learning training jobs at a lower cost, provided the workload is flexible about when instances run and can tolerate interruptions, which this job can because it can be paused, restarted, and continued at any time from the central queue. AWS Batch can automatically launch and terminate Spot Instances based on the availability and price of Spot capacity, and Amazon EBS volumes can store the datasets, checkpoints, and logs so that training resumes after an interruption.

References:

AWS Deep Learning Containers

AWS Batch

Amazon EC2 Spot Instances

Using Amazon EBS Volumes with Amazon EC2 Spot Instances
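As a rough illustration of how the workload could be kicked off, here is a hedged boto3 sketch that submits a containerized job to AWS Batch. The queue, job definition, command, and S3 path are hypothetical placeholders; the GPU Spot compute environment and the Deep Learning Containers image are assumed to have been registered separately, and the weekly trigger could come from an EventBridge schedule.

```python
import boto3

batch = boto3.client("batch")

# Submit the containerized training job. The queue is assumed to be backed by a
# GPU-compatible Spot compute environment, and the job definition is assumed to
# reference an AWS Deep Learning Containers image (both set up separately).
response = batch.submit_job(
    jobName="weekly-recommender-training",        # hypothetical name
    jobQueue="gpu-spot-queue",                    # hypothetical job queue
    jobDefinition="tf-recommender-training:1",    # hypothetical job definition
    containerOverrides={
        "command": ["python", "train.py",
                    "--data", "s3://example-bucket/datasets/"],  # hypothetical
    },
)
print("Submitted AWS Batch job:", response["jobId"])
```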

A media company with a very large archive of unlabeled images, text, audio, and video footage wishes to index its assets to allow rapid identification of relevant content by the Research team. The company wants to use machine learning to accelerate the efforts of its in-house researchers who have limited machine learning expertise.

Which is the FASTEST route to index the assets?

A. Use Amazon Rekognition, Amazon Comprehend, and Amazon Transcribe to tag data into distinct categories/classes.
B. Create a set of Amazon Mechanical Turk Human Intelligence Tasks to label all footage.
C. Use Amazon Transcribe to convert speech to text. Use the Amazon SageMaker Neural Topic Model (NTM) and Object Detection algorithms to tag data into distinct categories/classes.
D. Use the AWS Deep Learning AMI and Amazon EC2 GPU instances to create custom models for audio transcription and topic modeling, and use object detection to tag data into distinct categories/classes.
Suggested answer: A

Explanation:

Amazon Rekognition, Amazon Comprehend, and Amazon Transcribe are managed AWS machine learning services that can analyze and extract metadata from images, text, audio, and video content. These services are easy to use, scale automatically, and require no machine learning expertise, so they can help the media company quickly index its assets and enable rapid identification of relevant content by the research team. Using these services is the fastest route to index the assets; the other options involve manual labeling, custom model development, or additional steps.

References:

AWS Media Intelligence Solutions

AWS Machine Learning Services

The Best Services For Running Machine Learning Models On AWS
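A hedged boto3 sketch of how such tagging calls might look; the bucket names, object keys, and job names are hypothetical placeholders, and Transcribe jobs run asynchronously and would be polled separately.

```python
import boto3

rekognition = boto3.client("rekognition")
comprehend = boto3.client("comprehend")
transcribe = boto3.client("transcribe")

# Label an image stored in S3 (bucket and key are hypothetical).
labels = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "example-media-archive",
                        "Name": "images/frame_001.jpg"}},
    MaxLabels=10)
print([label["Name"] for label in labels["Labels"]])

# Extract key phrases from a text asset.
phrases = comprehend.detect_key_phrases(
    Text="Interview transcript discussing the upcoming documentary series.",
    LanguageCode="en")
print([phrase["Text"] for phrase in phrases["KeyPhrases"]])

# Start an asynchronous transcription job for an audio file.
transcribe.start_transcription_job(
    TranscriptionJobName="archive-audio-001",
    Media={"MediaFileUri": "s3://example-media-archive/audio/clip_001.mp3"},
    MediaFormat="mp3",
    LanguageCode="en-US")
```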

A Machine Learning Specialist is working for an online retailer that wants to run analytics on every customer visit, processed through a machine learning pipeline. The data needs to be ingested by Amazon Kinesis Data Streams at up to 100 transactions per second, and the JSON data blob is 100 KB in size.

What is the MINIMUM number of shards in Kinesis Data Streams the Specialist should use to successfully ingest this data?

A. 1 shard
B. 10 shards
C. 100 shards
D. 1,000 shards
Suggested answer: B

Explanation:

Each Amazon Kinesis Data Streams shard supports writes of up to 1 MB per second and 1,000 records per second. The workload ingests 100 records per second at 100 KB per record, which is approximately 10 MB per second, while the record rate (100 per second) is well within a single shard's limit. Throughput therefore drives the sizing: 10 MB/sec divided by 1 MB/sec per shard means a minimum of 10 shards is required to successfully ingest this data.

References:

Amazon Kinesis Data Streams Terminology and Concepts

Amazon Kinesis Data Streams Limits
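The sizing arithmetic can be written out directly (shard write limits taken from the Kinesis Data Streams documentation):

```python
import math

# Per-shard write limits: up to 1 MB/sec or 1,000 records/sec.
records_per_second = 100
record_size_kb = 100

ingest_mb_per_sec = records_per_second * record_size_kb / 1024   # ~9.8 MB/sec
shards_for_throughput = math.ceil(ingest_mb_per_sec / 1.0)        # 1 MB/sec per shard
shards_for_record_rate = math.ceil(records_per_second / 1000)     # 1,000 records/sec per shard

print(max(shards_for_throughput, shards_for_record_rate))         # -> 10
```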

A Machine Learning Specialist is deciding between building a naive Bayesian model or a full Bayesian network for a classification problem. The Specialist computes the Pearson correlation coefficients between each feature and finds that their absolute values range between 0.1 to 0.95.

Which model describes the underlying data in this situation?

A. A naive Bayesian model, since the features are all conditionally independent.
B. A full Bayesian network, since the features are all conditionally independent.
C. A naive Bayesian model, since some of the features are statistically dependent.
D. A full Bayesian network, since some of the features are statistically dependent.
Suggested answer: D

Explanation:

A naive Bayesian model assumes that the features are conditionally independent given the class label. This means that the joint probability of the features and the class can be factorized as the product of the class prior and the feature likelihoods. A full Bayesian network, on the other hand, does not make this assumption and allows for modeling arbitrary dependencies between the features and the class using a directed acyclic graph. In this case, the joint probability of the features and the class is given by the product of the conditional probabilities of each node given its parents in the graph. If the features are statistically dependent, meaning that their correlation coefficients are not close to zero, then a naive Bayesian model would not capture these dependencies and would likely perform worse than a full Bayesian network that can account for them. Therefore, a full Bayesian network describes the underlying data better in this situation.

References:

Naive Bayes and Text Classification I

Bayesian Networks
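A quick way to check for such dependence is to inspect the pairwise Pearson correlation matrix; the sketch below uses synthetic data in which one feature is strongly correlated with another, mirroring the 0.95 coefficient mentioned in the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
f1 = rng.normal(size=1000)
f2 = 0.95 * f1 + 0.3 * rng.normal(size=1000)  # strongly dependent on f1
f3 = rng.normal(size=1000)                    # roughly independent of the others

df = pd.DataFrame({"f1": f1, "f2": f2, "f3": f3})
# Large absolute off-diagonal values indicate statistically dependent features,
# which breaks the conditional-independence assumption of naive Bayes.
print(df.corr(method="pearson").round(2))
```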

A Data Scientist is building a linear regression model and will use resulting p-values to evaluate the statistical significance of each coefficient. Upon inspection of the dataset, the Data Scientist discovers that most of the features are normally distributed. The plot of one feature in the dataset is shown in the graphic.

What transformation should the Data Scientist apply to satisfy the statistical assumptions of the linear regression model?

A. Exponential transformation
B. Logarithmic transformation
C. Polynomial transformation
D. Sinusoidal transformation
Suggested answer: B

Explanation:

The plot in the graphic shows a right-skewed distribution, which violates the assumption of normality for linear regression. To correct this, the Data Scientist should apply a logarithmic transformation to the feature. This will make the distribution more symmetric and closer to a normal distribution, which is a key assumption for linear regression.

References:

Linear Regression

Linear Regression with Amazon Machine Learning

Machine Learning on AWS
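A small sketch of the transformation on synthetic right-skewed data (standing in for the feature in the plot), measuring skewness before and after the log transform:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(7)
feature = rng.lognormal(mean=0.0, sigma=1.0, size=5000)  # right-skewed stand-in

transformed = np.log1p(feature)  # log(1 + x); also safe if zeros are present

print(f"skewness before: {skew(feature):.2f}")      # strongly positive
print(f"skewness after:  {skew(transformed):.2f}")  # much closer to zero
```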

A Machine Learning Specialist is assigned to a Fraud Detection team and must tune an XGBoost model, which is working appropriately for test data. However, with unknown data, it is not working as expected. The existing parameters are provided as follows.

Which parameter tuning guidelines should the Specialist follow to avoid overfitting?

A. Increase the max_depth parameter value.
B. Lower the max_depth parameter value.
C. Update the objective to binary:logistic.
D. Lower the min_child_weight parameter value.
Suggested answer: B

Explanation:

Overfitting occurs when a model performs well on the training data but poorly on unseen data, because it has learned the training data too closely and cannot generalize. To reduce overfitting, the Machine Learning Specialist should lower the max_depth parameter value. The max_depth parameter controls the maximum depth of each tree, so lower values reduce model complexity and make the model less likely to overfit. The XGBoost documentation also suggests other ways to control overfitting, such as adding randomness, using regularization, and using early stopping.

References:

XGBoost Parameters
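An illustrative (not tuned) XGBoost configuration along these lines, trained on synthetic data; the specific parameter values are assumptions for demonstration, with a lower max_depth as the key change plus other regularizing options and early stopping.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 20))
y = (X[:, 0] + rng.normal(0, 1.0, size=2000) > 0).astype(int)  # synthetic labels

dtrain = xgb.DMatrix(X[:1600], label=y[:1600])
dvalid = xgb.DMatrix(X[1600:], label=y[1600:])

params = {
    "objective": "binary:logistic",
    "max_depth": 3,          # shallower trees -> lower model complexity
    "min_child_weight": 5,   # larger values also discourage overfitting
    "subsample": 0.8,        # row subsampling adds randomness
    "eta": 0.1,
}
booster = xgb.train(params, dtrain, num_boost_round=500,
                    evals=[(dvalid, "validation")],
                    early_stopping_rounds=20, verbose_eval=False)
print("best iteration:", booster.best_iteration)
```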
