Amazon MLS-C01 Practice Test - Questions Answers, Page 29


A manufacturing company has a production line with sensors that collect hundreds of quality metrics. The company has stored sensor data and manual inspection results in a data lake for several months. To automate quality control, the machine learning team must build an automated mechanism that determines whether the produced goods are good quality, replacement market quality, or scrap quality based on the manual inspection results.

Which modeling approach will deliver the MOST accurate prediction of product quality?

A. Amazon SageMaker DeepAR forecasting algorithm
B. Amazon SageMaker XGBoost algorithm
C. Amazon SageMaker Latent Dirichlet Allocation (LDA) algorithm
D. A convolutional neural network (CNN) and ResNet
Suggested answer: D

Explanation:

A convolutional neural network (CNN) is a type of deep learning model that can learn to extract features from images and perform tasks such as classification, segmentation, and detection [1]. ResNet is a popular CNN architecture that uses residual connections to overcome the problem of vanishing gradients and enable very deep networks [2]. For the task of predicting product quality based on sensor data, a CNN and ResNet approach can leverage the spatial structure of the data and learn complex patterns that distinguish different quality levels.

References:

[1] Convolutional Neural Networks (CNNs / ConvNets)

[2] PyTorch ResNet: The Basics and a Quick Tutorial
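As a concrete illustration of the approach in the suggested answer, the following is a minimal PyTorch sketch of fine-tuning a ResNet for three quality classes. It assumes the sensor metrics have already been arranged into image-like tensors; the network depth, learning rate, and class count are illustrative choices, not values from the question.

# Minimal sketch: fine-tuning a ResNet for 3-class quality classification.
# Assumes sensor readings have already been reshaped into image-like tensors;
# the tensor shape, class count, and hyperparameters are illustrative.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 3  # good, replacement market, scrap

model = models.resnet18(weights=None)  # use pretrained weights if appropriate
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(batch_x, batch_y):
    """One optimization step on a batch of (N, 3, H, W) tensors and integer labels."""
    optimizer.zero_grad()
    logits = model(batch_x)
    loss = criterion(logits, batch_y)
    loss.backward()
    optimizer.step()
    return loss.item()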

A manufacturing company uses machine learning (ML) models to detect quality issues. The models use images that are taken of the company's product at the end of each production step. The company has thousands of machines at the production site that generate one image per second on average.

The company ran a successful pilot with a single manufacturing machine. For the pilot, ML specialists used an industrial PC that ran AWS IoT Greengrass with a long-running AWS Lambda function that uploaded the images to Amazon S3. The uploaded images invoked a Lambda function that was written in Python to perform inference by using an Amazon SageMaker endpoint that ran a custom model. The inference results were forwarded back to a web service that was hosted at the production site to prevent faulty products from being shipped.

The company scaled the solution out to all manufacturing machines by installing similarly configured industrial PCs on each production machine. However, latency for predictions increased beyond acceptable limits. Analysis shows that the internet connection is at its capacity limit.

How can the company resolve this issue MOST cost-effectively?

A. Set up a 10 Gbps AWS Direct Connect connection between the production site and the nearest AWS Region. Use the Direct Connect connection to upload the images. Increase the size of the instances and the number of instances that are used by the SageMaker endpoint.
B. Extend the long-running Lambda function that runs on AWS IoT Greengrass to compress the images and upload the compressed files to Amazon S3. Decompress the files by using a separate Lambda function that invokes the existing Lambda function to run the inference pipeline.
C. Use auto scaling for SageMaker. Set up an AWS Direct Connect connection between the production site and the nearest AWS Region. Use the Direct Connect connection to upload the images.
D. Deploy the Lambda function and the ML models onto the AWS IoT Greengrass core that is running on the industrial PCs that are installed on each machine. Extend the long-running Lambda function that runs on AWS IoT Greengrass to invoke the Lambda function with the captured images and run the inference on the edge component that forwards the results directly to the web service.
Suggested answer: D

Explanation:

The best option is to deploy the Lambda function and the ML models onto the AWS IoT Greengrass core that runs on the industrial PC installed on each machine. Inference is then performed locally on the edge devices, without uploading the images to Amazon S3 or invoking the SageMaker endpoint, which reduces both latency and network bandwidth consumption. The long-running Lambda function can be extended to pass the captured images to the inference Lambda function on the edge component, which forwards the results directly to the on-site web service. This also simplifies the architecture and removes the dependency on the internet connection for inference.

Option A is not cost-effective, as it requires setting up a 10 Gbps AWS Direct Connect connection and increasing the size and number of instances for the SageMaker endpoint. This will increase the operational costs and complexity.

Option B is not optimal, as it still requires uploading the images to Amazon S3 and invoking the SageMaker endpoint. Compressing and decompressing the images will add additional processing overhead and latency.

Option C is not sufficient, as it still requires uploading the images to Amazon S3 and invoking the SageMaker endpoint. Auto scaling for SageMaker will help to handle the increased workload, but it will not reduce the latency or the network bandwidth consumption. Setting up an AWS Direct Connect connection will improve the network performance, but it will also increase the operational costs and complexity.

References:

AWS IoT Greengrass

Deploying Machine Learning Models to Edge Devices

AWS Certified Machine Learning - Specialty Exam Guide
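To make the edge inference idea in the suggested answer more concrete, here is a simplified Python sketch of what the inference component on the industrial PC might do. The ONNX model format, file path, input tensor name, and local web service URL are assumptions for illustration only; the actual artifact format and endpoints depend on how the custom model is packaged for AWS IoT Greengrass.

# Simplified sketch of local (edge) inference on the industrial PC.
# The model path, input tensor name, and web service URL are assumptions;
# in practice the model artifact is deployed to the AWS IoT Greengrass core.
import json
import requests
import numpy as np
import onnxruntime as ort  # assumption: the custom model was exported to ONNX

session = ort.InferenceSession("/greengrass/models/quality_model.onnx")
WEB_SERVICE_URL = "http://localhost:8080/quality-result"  # on-site web service (assumed)

def handle_image(image_array: np.ndarray, machine_id: str) -> None:
    """Run inference locally and forward the result to the on-site web service."""
    outputs = session.run(None, {"input": image_array.astype(np.float32)})
    result = {"machine_id": machine_id, "prediction": outputs[0].tolist()}
    requests.post(WEB_SERVICE_URL, data=json.dumps(result), timeout=1.0)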

A company distributes an online multiple-choice survey to several thousand people. Respondents to the survey can select multiple options for each question.

A machine learning (ML) engineer needs to comprehensively represent every response from all respondents in a dataset. The ML engineer will use the dataset to train a logistic regression model.

Which solution will meet these requirements?

A. Perform one-hot encoding on every possible option for each question of the survey.
B. Perform binning on all the answers each respondent selected for each question.
C. Use Amazon Mechanical Turk to create categorical labels for each set of possible responses.
D. Use Amazon Textract to create numeric features for each set of possible responses.
Suggested answer: A

Explanation:

In cases where survey questions allow multiple choices per question, one-hot encoding is an effective way to represent responses as binary features. Each possible option for each question is transformed into a separate binary column (1 if selected, 0 if not), providing a comprehensive and machine-readable format that logistic regression models can interpret effectively.

This approach ensures that each respondent's selections are accurately captured in a format suitable for training, offering a straightforward representation for multi-choice responses.
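For illustration, a minimal scikit-learn sketch of encoding a single multi-select survey question follows. The question and option names are invented; in practice the same encoding is repeated for every question and the resulting columns are concatenated into the training matrix.

# Minimal sketch: one-hot (multi-hot) encoding of multi-select survey answers.
# The option names below are invented for illustration.
from sklearn.preprocessing import MultiLabelBinarizer

# Each respondent may pick several options for question 1.
q1_responses = [
    ["option_a", "option_c"],
    ["option_b"],
    ["option_a", "option_b", "option_d"],
]

encoder = MultiLabelBinarizer()
q1_matrix = encoder.fit_transform(q1_responses)  # one binary column per possible option
print(encoder.classes_)  # ['option_a' 'option_b' 'option_c' 'option_d']
print(q1_matrix)         # rows of 0/1 indicators suitable for logistic regression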

A data scientist wants to improve the fit of a machine learning (ML) model that predicts house prices. The data scientist makes a first attempt to fit the model, but the fitted model has poor accuracy on both the training dataset and the test dataset.

Which steps must the data scientist take to improve model accuracy? (Select THREE.)

A. Increase the amount of regularization that the model uses.
B. Decrease the amount of regularization that the model uses.
C. Increase the number of training examples that the model uses.
D. Increase the number of test examples that the model uses.
E. Increase the number of model features that the model uses.
F. Decrease the number of model features that the model uses.
Suggested answer: B, C, E

Explanation:

When a model shows poor accuracy on both the training and test datasets, it often indicates underfitting. To improve the model's accuracy, the data scientist can:

Decrease regularization: Excessive regularization can lead to underfitting by constraining the model too much. Reducing it allows the model to capture more complexity.

Increase the number of training examples: Adding more data can help the model learn better and generalize well, especially if the dataset was previously insufficient.

Increase the number of model features: Adding relevant features can help the model capture more predictive information, thus potentially improving accuracy.

Option A would constrain the model further, option D only changes the evaluation set without affecting the fitted model, and option F would reduce model capacity; none of these addresses underfitting. A minimal sketch of the recommended remedies is shown below.
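The following scikit-learn sketch illustrates the remedies; the regularization strengths and the polynomial feature expansion are illustrative choices, and X_train, y_train, X_test, and y_test are placeholders.

# Minimal sketch of addressing underfitting with scikit-learn.
# Alpha values and feature expansion are illustrative only.
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Underfit baseline: heavy L2 penalty, raw features only.
underfit = Ridge(alpha=100.0)

# Improved attempt: weaker penalty (option B) plus additional derived features (option E).
improved = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=0.1))

# improved.fit(X_train, y_train)          # training on more examples (option C) also helps
# print(improved.score(X_test, y_test))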

A manufacturing company stores production volume data in a PostgreSQL database.

The company needs an end-to-end solution that will give business analysts the ability to prepare data for processing and to predict future production volume based on the previous year's production volume. The solution must not require the company to have coding knowledge.

Which solution will meet these requirements with the LEAST effort?

A. Use AWS Database Migration Service (AWS DMS) to transfer the data from the PostgreSQL database to an Amazon S3 bucket. Create an Amazon EMR cluster to read the S3 bucket and perform the data preparation. Use Amazon SageMaker Studio for the prediction modeling.
B. Use AWS Glue DataBrew to read the data that is in the PostgreSQL database and to perform the data preparation. Use Amazon SageMaker Canvas for the prediction modeling.
C. Use AWS Database Migration Service (AWS DMS) to transfer the data from the PostgreSQL database to an Amazon S3 bucket. Use AWS Glue to read the data in the S3 bucket and to perform the data preparation. Use Amazon SageMaker Canvas for the prediction modeling.
D. Use AWS Glue DataBrew to read the data that is in the PostgreSQL database and to perform the data preparation. Use Amazon SageMaker Studio for the prediction modeling.
Suggested answer: B

Explanation:

AWS Glue DataBrew provides a no-code data preparation interface that enables business analysts to clean and transform data from various sources, including PostgreSQL databases, without needing programming skills. Amazon SageMaker Canvas offers a no-code interface for machine learning model training and predictions, allowing users to predict future production volume without coding expertise.

This solution meets the requirements efficiently by providing end-to-end data preparation and prediction modeling without requiring coding.

A music streaming company is building a pipeline to extract features. The company wants to store the features for offline model training and online inference. The company wants to track feature history and to give the company's data science teams access to the features.

Which solution will meet these requirements with the MOST operational efficiency?

A. Use Amazon SageMaker Feature Store to store features for model training and inference. Create an online store for online inference. Create an offline store for model training. Create an IAM role for data scientists to access and search through feature groups.
B. Use Amazon SageMaker Feature Store to store features for model training and inference. Create an online store for both online inference and model training. Create an IAM role for data scientists to access and search through feature groups.
C. Create one Amazon S3 bucket to store online inference features. Create a second S3 bucket to store offline model training features. Turn on
D. Create two separate Amazon DynamoDB tables to store online inference features and offline model training features. Use time-based versioning on both tables. Query the DynamoDB table for online inference. Move the data from DynamoDB to Amazon S3 when a new SageMaker training job is launched. Create an IAM policy that allows data scientists to access both tables.
Suggested answer: A

Explanation:

Amazon SageMaker Feature Store is a fully managed, purpose-built repository for storing, updating, and sharing machine learning features. It supports both online and offline stores for features, allowing real-time access for online inference and batch access for offline model training. It also tracks feature history, making it easier for data scientists to work with and access relevant feature sets.

This solution provides the necessary storage and access capabilities with high operational efficiency by managing feature history and enabling controlled access through IAM roles, making it a comprehensive choice for the company's requirements.
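For reference, a minimal sketch using the SageMaker Python SDK to create a feature group with both an online and an offline store follows. The feature group name, feature names, S3 URI, and IAM role ARN are placeholders, not values from the question.

# Minimal sketch: a feature group with online and offline stores (SageMaker Python SDK).
# Names, ARNs, and the S3 URI below are placeholders for illustration.
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup
from sagemaker.feature_store.feature_definition import FeatureDefinition, FeatureTypeEnum

session = sagemaker.Session()
feature_group = FeatureGroup(name="track-features", sagemaker_session=session)

feature_group.feature_definitions = [
    FeatureDefinition(feature_name="track_id", feature_type=FeatureTypeEnum.STRING),
    FeatureDefinition(feature_name="event_time", feature_type=FeatureTypeEnum.FRACTIONAL),
    FeatureDefinition(feature_name="tempo", feature_type=FeatureTypeEnum.FRACTIONAL),
]

feature_group.create(
    s3_uri="s3://example-bucket/feature-store/",  # offline store for model training
    record_identifier_name="track_id",
    event_time_feature_name="event_time",
    role_arn="arn:aws:iam::111122223333:role/SageMakerFeatureStoreRole",
    enable_online_store=True,                     # online store for low-latency inference
)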


A company is building a predictive maintenance model for its warehouse equipment. The model must predict the probability of failure of all machines in the warehouse. The company has collected 10,000 event samples within 3 months. The event samples include 100 failure cases that are evenly distributed across 50 different machine types.

How should the company prepare the data for the model to improve the model's accuracy?

A. Adjust the class weight to account for each machine type.
B. Oversample the failure cases by using the Synthetic Minority Oversampling Technique (SMOTE).
C. Undersample the non-failure events. Stratify the non-failure events by machine type.
D. Undersample the non-failure events by using the Synthetic Minority Oversampling Technique (SMOTE).
Suggested answer: B

Explanation:

In predictive maintenance, when a dataset is imbalanced (with far fewer failure cases than non-failure cases), oversampling the minority class helps the model learn from the minority class effectively. The Synthetic Minority Oversampling Technique (SMOTE) generates synthetic samples for the minority class by creating data points between existing minority class instances. This can enhance the model's ability to recognize failure patterns, particularly in imbalanced datasets.

SMOTE increases the effective presence of failure cases in the dataset, providing a balanced learning environment for the model. This is more effective than undersampling, which would risk losing important non-failure data.
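A minimal sketch of applying SMOTE with the imbalanced-learn library is shown below; X and y stand in for the event features and the binary failure label.

# Minimal sketch: rebalancing training data with SMOTE (imbalanced-learn).
from collections import Counter
from imblearn.over_sampling import SMOTE

def rebalance(X, y):
    """Oversample the minority (failure) class with synthetic examples."""
    smote = SMOTE(random_state=42)
    X_resampled, y_resampled = smote.fit_resample(X, y)
    print("before:", Counter(y), "after:", Counter(y_resampled))
    return X_resampled, y_resampled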

An ecommerce company has observed that customers who use the company's website rarely view items that the website recommends to customers. The company wants to recommend items to customers that customers are more likely to want to purchase.

Which solution will meet this requirement in the SHORTEST amount of time?

A. Host the company's website on Amazon EC2 Accelerated Computing instances to increase the website response speed.
B. Host the company's website on Amazon EC2 GPU-based instances to increase the speed of the website's search tool.
C. Integrate Amazon Personalize into the company's website to provide customers with personalized recommendations.
D. Use Amazon SageMaker to train a Neural Collaborative Filtering (NCF) model to make product recommendations.
Suggested answer: C

Explanation:

Amazon Personalize is a managed AWS service specifically designed to deliver personalized recommendations with minimal development time. It uses machine learning algorithms tailored for recommendation systems, making it highly suitable for applications where quick integration is essential. By using Amazon Personalize, the company can leverage existing customer data to generate real-time, personalized product recommendations that align better with customer preferences, enhancing the likelihood of customer engagement with recommended items.

Options involving EC2 instances with GPU or accelerated computing primarily enhance computational performance but do not inherently improve recommendation relevance, while Amazon SageMaker would require more development effort to achieve similar results.
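Once a Personalize campaign has been trained and deployed, retrieving recommendations is a single API call. The following boto3 sketch uses a placeholder campaign ARN and user ID for illustration.

# Minimal sketch: requesting recommendations from an Amazon Personalize campaign.
# The campaign ARN and user ID are placeholders.
import boto3

personalize_runtime = boto3.client("personalize-runtime")

response = personalize_runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:111122223333:campaign/example-campaign",
    userId="user-123",
    numResults=10,
)
for item in response["itemList"]:
    print(item["itemId"])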

A tourism company uses a machine learning (ML) model to make recommendations to customers. The company uses an Amazon SageMaker environment and has set the hyperparameter tuning completion criteria to MaxNumberOfTrainingJobs.

An ML specialist wants to change the hyperparameter tuning completion criteria. The ML specialist wants to stop tuning immediately after an internal algorithm determines that the tuning job is unlikely to improve more than 1% over the objective metric from the best training job.

Which completion criteria will meet this requirement?

A. MaxRuntimeInSeconds
B. TargetObjectiveMetricValue
C. CompleteOnConvergence
D. MaxNumberOfTrainingJobsNotImproving
Suggested answer: C

Explanation:

In Amazon SageMaker, hyperparameter tuning jobs optimize model performance by adjusting hyperparameters. Amazon SageMaker's hyperparameter tuning supports completion criteria settings that enable efficient management of tuning resources. In this scenario, the ML specialist aims to set a completion criterion that will terminate the tuning job as soon as SageMaker detects that further improvements in the objective metric are unlikely to exceed 1%.

The CompleteOnConvergence setting is designed for such requirements. This criterion enables the tuning job to automatically stop when SageMaker determines that additional hyperparameter evaluations are unlikely to improve the objective metric beyond a certain threshold, allowing for efficient tuning completion. The convergence process relies on an internal optimization algorithm that continuously evaluates the objective metric during tuning and stops when performance stabilizes without further improvement.

This is supported by AWS documentation, which explains that CompleteOnConvergence is an efficient way to manage tuning by stopping unnecessary evaluations once the model performance stabilizes within the specified threshold.
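As an illustration, the relevant fragment of a tuning job configuration that enables this behavior might look like the following Python dict (as passed to boto3's create_hyper_parameter_tuning_job). The resource limits and strategy are illustrative, and the exact field structure should be confirmed against the current SageMaker API reference.

# Sketch of the completion-criteria portion of a hyperparameter tuning job config.
# Only the relevant block is shown; other required fields (objective, parameter
# ranges, training job definition) are omitted, and values are illustrative.
tuning_job_config = {
    "Strategy": "Bayesian",
    "ResourceLimits": {"MaxNumberOfTrainingJobs": 50, "MaxParallelTrainingJobs": 5},
    "TuningJobCompletionCriteria": {
        # Stop as soon as the internal algorithm detects convergence, i.e. further
        # training jobs are unlikely to meaningfully improve the objective metric.
        "ConvergenceDetected": {"CompleteOnConvergence": "Enabled"}
    },
}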

A machine learning (ML) specialist uploads a dataset to an Amazon S3 bucket that is protected by server-side encryption with AWS KMS keys (SSE-KMS). The ML specialist needs to ensure that an Amazon SageMaker notebook instance can read the dataset that is in Amazon S3.

Which solution will meet these requirements?

A. Define security groups to allow all HTTP inbound and outbound traffic. Assign the security groups to the SageMaker notebook instance.
B. Configure the SageMaker notebook instance to have access to the VPC. Grant permission in the AWS Key Management Service (AWS KMS) key policy to the notebook's VPC.
C. Assign an IAM role that provides S3 read access for the dataset to the SageMaker notebook. Grant permission in the KMS key policy to the IAM role.
D. Assign the same KMS key that encrypts the data in Amazon S3 to the SageMaker notebook instance.
Suggested answer: C

Explanation:

When an Amazon SageMaker notebook instance needs to access encrypted data in Amazon S3, the ML specialist must ensure that both Amazon S3 access permissions and AWS Key Management Service (KMS) decryption permissions are properly configured. The dataset in this scenario is stored with server-side encryption using an AWS KMS key (SSE-KMS), so the following steps are necessary:

S3 Read Permissions: Attach an IAM role to the SageMaker notebook instance with permissions that allow the s3:GetObject action for the specific S3 bucket storing the data. This will allow the notebook instance to read data from Amazon S3.

KMS Key Policy Permissions: Grant permissions in the KMS key policy to the IAM role assigned to the SageMaker notebook instance. This allows SageMaker to use the KMS key to decrypt data in the S3 bucket.

These steps ensure the SageMaker notebook instance can access the encrypted data stored in S3. The AWS documentation emphasizes that to access SSE-KMS encrypted data, the SageMaker notebook requires appropriate permissions in both the S3 bucket policy and the KMS key policy, making Option C the correct and secure approach.
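To make the two permission grants concrete, the sketch below shows them as Python dicts; the bucket name, account ID, role name, and KMS statement scope are placeholders for illustration.

# Sketch of the two permissions involved, expressed as Python dicts.
# Bucket name, account ID, and role name are placeholders.

# 1) Identity policy attached to the notebook's IAM execution role: read the dataset.
s3_read_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-dataset-bucket",
            "arn:aws:s3:::example-dataset-bucket/*",
        ],
    }],
}

# 2) Statement added to the KMS key policy: let that same role decrypt the objects.
kms_key_policy_statement = {
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/SageMakerNotebookRole"},
    "Action": ["kms:Decrypt", "kms:DescribeKey"],
    "Resource": "*",
}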
