Amazon MLS-C01 Practice Test - Questions Answers, Page 30
A data scientist is designing a repository that will contain many images of vehicles. The repository must scale automatically in size to store new images every day. The repository must support versioning of the images. The data scientist must implement a solution that maintains multiple immediately accessible copies of the data in different AWS Regions.
Which solution will meet these requirements?
Amazon S3 with S3 Cross-Region Replication (CRR)
Amazon Elastic Block Store (Amazon EBS) with snapshots that are shared in a secondary Region
Amazon Elastic File System (Amazon EFS) Standard storage that is configured with Regional availability
AWS Storage Gateway Volume Gateway
An insurance company is creating an application to automate car insurance claims. A machine learning (ML) specialist used an Amazon SageMaker Object Detection - TensorFlow built-in algorithm to train a model to detect scratches and dents in images of cars. After the model was trained, the ML specialist noticed that the model performed better on the training dataset than on the testing dataset.
Which approach should the ML specialist use to improve the performance of the model on the testing data?
Increase the value of the momentum hyperparameter.
Reduce the value of the dropout_rate hyperparameter.
Reduce the value of the learning_rate hyperparameter.
Increase the value of the L2 hyperparameter.
A machine learning (ML) specialist is training a linear regression model. The specialist notices that the model is overfitting. The specialist applies an L1 regularization parameter and runs the model again. This change results in all features having zero weights.
What should the ML specialist do to improve the model results?
Increase the L1 regularization parameter. Do not change any other training parameters.
Decrease the L1 regularization parameter. Do not change any other training parameters.
Introduce a large L2 regularization parameter. Do not change the current L1 regularization value.
Introduce a small L2 regularization parameter. Do not change the current L1 regularization value.
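To see why decreasing the L1 parameter is the fix when all weights collapse to zero, note that L1 regularization effectively applies soft-thresholding to the weights: any weight whose magnitude falls below the regularization strength is clamped to exactly zero. The sketch below is illustrative (the weight values and thresholds are made up), not the SageMaker training internals:

```python
def soft_threshold(w, lam):
    """Proximal operator of the L1 penalty: shrinks a weight toward
    zero by lam, and clamps it to exactly zero when |w| <= lam."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

weights = [0.8, -0.3, 0.05, -1.2]

# A too-large L1 parameter zeroes every weight -- the failure mode
# described in the question.
print([soft_threshold(w, 2.0) for w in weights])  # → [0.0, 0.0, 0.0, 0.0]

# A smaller L1 parameter keeps the informative weights (shrunken)
# while still zeroing the near-zero one.
print([soft_threshold(w, 0.1) for w in weights])
```

With the smaller parameter, only the 0.05 weight is zeroed; the others survive with reduced magnitude, which is the sparsity behavior L1 regularization is meant to provide.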
A machine learning (ML) specialist at a retail company must build a system to forecast the daily sales for one of the company's stores. The company provided the ML specialist with sales data for this store from the past 10 years. The historical dataset includes the total amount of sales on each day for the store. Approximately 10% of the days in the historical dataset are missing sales data.
The ML specialist builds a forecasting model based on the historical dataset. The specialist discovers that the model does not meet the performance standards that the company requires.
Which action will MOST likely improve the performance for the forecasting model?
Aggregate sales from stores in the same geographic area.
Apply smoothing to correct for seasonal variation.
Change the forecast frequency from daily to weekly.
Replace missing values in the dataset by using linear interpolation.
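Linear interpolation fills each missing day by drawing a straight line between the nearest days with known sales. A minimal stdlib-only sketch, using made-up sales figures and `None` to mark missing days:

```python
def interpolate_missing(values):
    """Fill None gaps in a series by linear interpolation between
    the nearest known neighbors on each side (illustrative helper;
    leading/trailing gaps are left unfilled)."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = filled[left] + frac * (filled[right] - filled[left])
    return filled

daily_sales = [100.0, None, 140.0, None, None, None, 180.0]
print(interpolate_missing(daily_sales))
# → [100.0, 120.0, 140.0, 150.0, 160.0, 170.0, 180.0]
```

In practice the same result comes from `pandas.Series.interpolate(method="linear")`; the point is that the forecasting model then trains on a complete daily series instead of one with roughly 10% of its points missing.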
A company plans to build a custom natural language processing (NLP) model to classify and prioritize user feedback. The company hosts the data and all machine learning (ML) infrastructure in the AWS Cloud. The ML team works from the company's office, which has an IPsec VPN connection to one VPC in the AWS Cloud.
The company has set both the enableDnsHostnames attribute and the enableDnsSupport attribute of the VPC to true. The company's DNS resolvers point to the VPC DNS. The company does not allow the ML team to access Amazon SageMaker notebooks through connections that use the public internet. The connection must stay within a private network and within the AWS internal network.
Which solution will meet these requirements with the LEAST development effort?
Create a VPC interface endpoint for the SageMaker notebook in the VPC. Access the notebook through a VPN connection and the VPC endpoint.
Create a bastion host by using Amazon EC2 in a public subnet within the VPC. Log in to the bastion host through a VPN connection. Access the SageMaker notebook from the bastion host.
Create a bastion host by using Amazon EC2 in a private subnet within the VPC with a NAT gateway. Log in to the bastion host through a VPN connection. Access the SageMaker notebook from the bastion host.
Create a NAT gateway in the VPC. Access the SageMaker notebook HTTPS endpoint through a VPN connection and the NAT gateway.
A bank has collected customer data for 10 years in CSV format. The bank stores the data in an on-premises server. A data science team wants to use Amazon SageMaker to build and train a machine learning (ML) model to predict churn probability. The team will use the historical data. The data scientists want to perform data transformations quickly and to generate data insights before the team builds a model for production.
Which solution will meet these requirements with the LEAST development effort?
Upload the data into the SageMaker Data Wrangler console directly. Perform data transformations and generate insights within Data Wrangler.
Upload the data into an Amazon S3 bucket. Allow SageMaker to access the data that is in the bucket. Import the data from the S3 bucket into SageMaker Data Wrangler. Perform data transformations and generate insights within Data Wrangler.
Upload the data into the SageMaker Data Wrangler console directly. Allow SageMaker and Amazon QuickSight to access the data that is in an Amazon S3 bucket. Perform data transformations in Data Wrangler and save the transformed data into a second S3 bucket. Use QuickSight to generate data insights.
Upload the data into an Amazon S3 bucket. Allow SageMaker to access the data that is in the bucket. Import the data from the bucket into SageMaker Data Wrangler. Perform data transformations in Data Wrangler. Save the data into a second S3 bucket. Use a SageMaker Studio notebook to generate data insights.
A machine learning (ML) engineer is preparing a dataset for a classification model. The ML engineer notices that some continuous numeric features have significantly greater values than most other features. A business expert explains that the features are independently informative and that the dataset is representative of the target distribution.
After training, the model's inference accuracy is lower than expected.
Which preprocessing technique will result in the GREATEST increase of the model's inference accuracy?
Normalize the problematic features.
Bootstrap the problematic features.
Remove the problematic features.
Extrapolate synthetic features.
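Normalization helps here because features on a much larger numeric scale dominate distance computations and gradient updates even when they are no more informative. A stdlib-only z-score sketch (the feature values are made up for illustration):

```python
def z_score_normalize(column):
    """Standardize a feature to zero mean and unit variance so that
    large-magnitude features no longer dominate the model's
    distance or gradient terms."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    std = variance ** 0.5
    return [(x - mean) / std for x in column]

# One feature on a far larger scale than the rest (illustrative values).
mileage = [12000.0, 45000.0, 78000.0, 30000.0]
print(z_score_normalize(mileage))  # each value now in roughly [-2, 2]
```

The same transformation is available as `sklearn.preprocessing.StandardScaler`; note it must be fit on the training split only and then applied to the test split.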
A data scientist needs to create a model for predictive maintenance. The model will be based on historical data to identify rare anomalies in the data.
The historical data is stored in an Amazon S3 bucket. The data scientist needs to use Amazon SageMaker Data Wrangler to ingest the data. The data scientist also needs to perform exploratory data analysis (EDA) to understand the statistical properties of the data.
Which solution will meet these requirements with the LEAST amount of compute resources?
Import the data by using the None option.
Import the data by using the Stratified option.
Import the data by using the First K option. Infer the value of K from domain knowledge.
Import the data by using the Randomized option. Infer the random size from domain knowledge.
A company wants to use machine learning (ML) to improve its customer churn prediction model. The company stores data in an Amazon Redshift data warehouse.
A data science team wants to use Amazon Redshift machine learning (Amazon Redshift ML) to build a model and run predictions for new data directly within the data warehouse.
Which combination of steps should the company take to use Amazon Redshift ML to meet these requirements? (Select THREE.)
Define the feature variables and target variable for the churn prediction model.
Use the SQL EXPLAIN_MODEL function to run predictions.
Write a CREATE MODEL SQL statement to create a model.
Use Amazon Redshift Spectrum to train the model.
Manually export the training data to Amazon S3.
Use the SQL prediction function to run predictions.
A business-to-business (B2B) ecommerce company wants to develop a fair and equitable risk mitigation strategy to reject potentially fraudulent transactions. The company wants to reject fraudulent transactions even though doing so may cost it some profitable transactions or customers.
Which solution will meet these requirements with the LEAST operational effort?
Use Amazon SageMaker to approve transactions only for products the company has sold in the past.
Use Amazon SageMaker to train a custom fraud detection model based on customer data.
Use the Amazon Fraud Detector prediction API to approve or deny any activities that Fraud Detector identifies as fraudulent.
Use the Amazon Fraud Detector prediction API to identify potentially fraudulent activities so the company can review the activities and reject fraudulent transactions.