Microsoft DP-100 Practice Test - Questions Answers, Page 5

Question 41

You are a data scientist creating a linear regression model.

You need to determine how closely the data fits the regression line.

Which metric should you review?

A. Root Mean Square Error
B. Coefficient of determination
C. Recall
D. Precision
E. Mean absolute error
Suggested answer: B
Explanation:

The coefficient of determination, often written R², represents the predictive power of the model as a value that typically falls between 0 and 1. Zero means the model explains none of the variance (it does no better than always predicting the mean); 1 means a perfect fit. Interpret R² values with caution: low values can be entirely normal and high values can be suspect.

Incorrect Answers:

A: Root mean squared error (RMSE) creates a single value that summarizes the error in the model. By squaring the difference, the metric disregards the difference between over-prediction and under-prediction.

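C: Recall is the fraction of all actual positive cases that the model correctly identifies. It is a classification metric.

D: Precision is the proportion of true positives among all results the model predicted as positive. It is a classification metric.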

E: Mean absolute error (MAE) measures how close the predictions are to the actual outcomes; thus, a lower score is better.
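
The three regression metrics discussed above can be compared side by side with a small pure-Python sketch. The function name `regression_metrics` and the sample data are hypothetical, chosen only for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for two equal-length sequences of numbers."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)            # variation the model leaves unexplained
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total variation around the mean

    r2 = 1.0 - ss_res / ss_tot                        # coefficient of determination
    rmse = math.sqrt(ss_res / n)                      # root mean squared error
    mae = sum(abs(r) for r in residuals) / n          # mean absolute error
    return r2, rmse, mae


# Hypothetical values, not taken from the question.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.5]

r2, rmse, mae = regression_metrics(actual, predicted)
print(f"R2={r2:.4f}  RMSE={rmse:.4f}  MAE={mae:.4f}")
```

RMSE and MAE are expressed in the units of the target, so they describe the size of the error. R² is unitless and describes how much of the variation the line accounts for, which is why it is the metric for "how closely the data fits the regression line".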

Reference:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model

asked 02/10/2024
JULIUS BALNEG
42 questions

Question 42

You are creating a binary classification by using a two-class logistic regression model.

You need to evaluate the model results for imbalance.

Which evaluation metric should you use?

A. Relative Absolute Error
B. AUC Curve
C. Mean Absolute Error
D. Relative Squared Error
E. Accuracy
F. Root Mean Square Error
Suggested answer: B
Explanation:

One can inspect the true positive rate vs. the false positive rate in the Receiver Operating Characteristic (ROC) curve and the corresponding Area Under the Curve (AUC) value. The closer this curve is to the upper left corner, the better the classifier's performance (that is, maximizing the true positive rate while minimizing the false positive rate). Curves that are close to the diagonal of the plot result from classifiers that tend to make predictions that are close to random guessing.
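
A minimal sketch of why AUC is more informative than accuracy on imbalanced data. It uses the rank interpretation of AUC (the probability that a randomly chosen positive is scored higher than a randomly chosen negative, with ties counted as one half), which equals the area under the ROC curve. The function names and the sample labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive ranked above negative
            elif p == q:
                wins += 0.5      # a tie counts as half
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples classified correctly at a fixed threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical imbalanced set: 8 negatives, 2 positives.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that ranks most positives above most negatives.
model_scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.5, 0.8, 0.7, 0.9]

# A "model" that always predicts the majority class.
constant_scores = [0.0] * len(labels)

print(roc_auc(labels, model_scores))       # 0.9375
print(accuracy(labels, constant_scores))   # 0.8, despite finding no positives
print(roc_auc(labels, constant_scores))    # 0.5, equivalent to random guessing
```

The always-negative classifier scores 80% accuracy only because 80% of the examples are negative; its AUC of 0.5 exposes that it has no ability to separate the classes.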

Reference:

https://docs.microsoft.com/en-us/azure/machine-learning/studio/evaluate-model-performance#evaluating-a-binary-classification-model

asked 02/10/2024
annalise ramdin
41 questions

Question 43

You are a data scientist building a deep convolutional neural network (CNN) for image classification.

The CNN model you build shows signs of overfitting.

You need to reduce overfitting and converge the model to an optimal fit.

Which two actions should you perform? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A. Add an additional dense layer with 512 input units.
B. Add L1/L2 regularization.
C. Use training data augmentation.
D. Reduce the amount of training data.
E. Add an additional dense layer with 64 input units.
Suggested answer: B, C
Explanation:

B: Weight regularization reduces the overfitting of a deep learning neural network model on the training data and improves the performance of the model on new data, such as the holdout test set.

Keras provides a weight regularization API that allows you to add a penalty for weight size to the loss function.

Three different regularizer instances are provided:

L1: Sum of the absolute weights.

L2: Sum of the squared weights.

L1L2: Sum of the absolute and the squared weights.

C: Training data augmentation applies label-preserving transformations (for example flips, crops, and rotations) to the training images. This increases the effective size and variety of the training set and makes it harder for the network to memorize individual examples.

Reducing the amount of training data (option D) makes overfitting worse, not better.

Note: Dropout, which is not among the listed options, is another common way to reduce overfitting. Because a fully connected layer holds most of the parameters, it is prone to overfitting. At each training stage, individual nodes are either "dropped out" of the net with probability 1-p or kept with probability p, so that a reduced network is left; incoming and outgoing edges to a dropped-out node are also removed. By avoiding training all nodes on all training data, dropout decreases overfitting.
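
The L1, L2, and L1L2 penalties named in the explanation can be sketched in plain Python to show how a weight penalty is added to the loss. This is an illustration of the arithmetic only, not Keras code: the function names, the coefficients, the weights, and the `data_loss` value are all hypothetical. In Keras the corresponding regularizers live in `keras.regularizers` and are typically passed to a layer through its `kernel_regularizer` argument.

```python
def l1_penalty(weights, l1=0.01):
    """L1: coefficient times the sum of the absolute weights."""
    return l1 * sum(abs(w) for w in weights)


def l2_penalty(weights, l2=0.01):
    """L2: coefficient times the sum of the squared weights."""
    return l2 * sum(w * w for w in weights)


def l1_l2_penalty(weights, l1=0.01, l2=0.01):
    """L1L2: the L1 and L2 penalties added together."""
    return l1_penalty(weights, l1) + l2_penalty(weights, l2)


# Hypothetical layer weights and unregularized loss.
weights = [0.5, -1.5, 2.0]
data_loss = 0.30

# The optimizer minimizes the data loss plus the penalty, so large
# weights are discouraged unless they reduce the data loss enough.
total_loss = data_loss + l1_l2_penalty(weights, l1=0.01, l2=0.01)
print(round(total_loss, 4))
```

Because the penalty grows with the size of the weights, the network is pushed toward smaller weights, which tends to produce a smoother function that generalizes better.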

Reference:

https://machinelearningmastery.com/how-to-reduce-overfitting-in-deep-learning-with-weight-regularization/

https://en.wikipedia.org/wiki/Convolutional_neural_network

asked 02/10/2024
Mark Oh
37 questions

Question 44

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a model to predict the price of a student's artwork depending on the following variables: the student's length of education, degree type, and art form.

You start by creating a linear regression model.

You need to evaluate the linear regression model.

Solution: Use the following metrics: Mean Absolute Error, Root Mean Absolute Error, Relative Absolute Error, Accuracy, Precision, Recall, F1 score, and AUC.

Does the solution meet the goal?

A. Yes
B. No
Suggested answer: B
Explanation:

Accuracy, Precision, Recall, F1 score, and AUC are metrics for evaluating classification models.

Note: Mean Absolute Error, Root Mean Absolute Error, and Relative Absolute Error are appropriate for a linear regression model. The solution fails because it mixes them with classification metrics.

Reference:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model

asked 02/10/2024
Giorgio Bertocchi
45 questions

Question 45

You are building a binary classification model by using a supplied training set.

The training set is imbalanced between two classes.

You need to resolve the data imbalance.

What are three possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A. Penalize the classification
B. Resample the dataset using undersampling or oversampling
C. Normalize the training feature set
D. Generate synthetic samples in the minority class
E. Use accuracy as the evaluation metric of the model
Suggested answer: A, B, D
Explanation:

A: Try Penalized Models

You can use the same algorithms but give them a different perspective on the problem.

Penalized classification imposes an additional cost on the model for making classification mistakes on the minority class during training. These penalties can bias the model to pay more attention to the minority class.

B: You can change the dataset that you use to build your predictive model to have more balanced data.

This change is called sampling your dataset, and there are two main methods that you can use to even up the classes:

Consider testing under-sampling when you have a lot of data (tens or hundreds of thousands of instances or more).

Consider testing over-sampling when you don't have a lot of data (tens of thousands of records or fewer).

D: Try Generating Synthetic Samples

A simple way to generate synthetic samples is to randomly sample the attributes from instances in the minority class.
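
The three tactics above can be sketched with the standard library alone. All function names and the sample rows are hypothetical. The class-weight formula shown (number of samples divided by number of classes times the class count) is one common "balanced" heuristic for penalized models, used here as an assumption rather than something the question prescribes.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Penalized models: weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Resampling: duplicate random rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def synthetic_sample(minority_rows, rng):
    """Synthetic samples: draw each attribute from a randomly chosen minority row."""
    n_features = len(minority_rows[0])
    return tuple(rng.choice(minority_rows)[j] for j in range(n_features))


# Hypothetical data: 6 majority rows (class 0) and 2 minority rows (class 1).
rows = [(0.1, 1.0), (0.2, 1.1), (0.3, 0.9), (0.4, 1.2),
        (0.5, 1.0), (0.6, 0.8), (2.0, 5.0), (2.2, 5.5)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

weights = balanced_class_weights(labels)
balanced_rows, balanced_labels = random_oversample(rows, labels)
minority = [r for r, y in zip(rows, labels) if y == 1]
new_row = synthetic_sample(minority, random.Random(0))

print(weights)
print(Counter(balanced_labels))
print(new_row)
```

Sampling attributes independently, as in `synthetic_sample`, ignores relationships between attributes. Methods such as SMOTE interpolate between neighbouring minority instances instead.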

Reference:

https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/

asked 02/10/2024
Anton Khodyakov
49 questions

Question 46

HOTSPOT

You write code to retrieve an experiment that is run from your Azure Machine Learning workspace.

Business managers in your organization want to see the importance of the features in the model.

You need to print out the model features and their relative importance in an output that looks similar to the following.

[Image: sample feature importance output]

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

[Image: answer area]

Correct answer: [Image: completed answer area]

Explanation:

Box 1: from_run_id

from_run_id(workspace, experiment_name, run_id)

Creates the client with a factory method, given a run ID.

Returns an instance of the ExplanationClient.

Parameters:

workspace (Workspace): An object that represents a workspace.

experiment_name (str): The name of an experiment.

run_id (str): A GUID that represents a run.

Box 2: list_model_explanations

list_model_explanations returns a dictionary of metadata for all model explanations available.

Returns: A dictionary of explanation metadata such as id, data type, explanation method, model type, and upload time, sorted by upload time.

Box 3: explanation

Reference:

https://docs.microsoft.com/en-us/python/api/azureml-contrib-interpret/azureml.contrib.interpret.explanation.explanation_client.explanationclient?view=azure-ml-py

asked 02/10/2024
Yenziwe Yengwa
47 questions

Question 47

HOTSPOT

You are performing feature scaling by using the scikit-learn Python library for x1, x2, and x3 features.

Original and scaled data is shown in the following image.

[Image: original and scaled data for features x1, x2, and x3]

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.


[Image: answer area]
Correct answer: [Image: completed answer area]
Explanation:

Box 1: StandardScaler

The StandardScaler assumes your data is normally distributed within each feature and will scale them such that the distribution is now centred around 0, with a standard deviation of 1.

Example:

[Image: StandardScaler example]

All features are now on the same scale relative to one another.

Box 2: MinMaxScaler

[Image: MinMaxScaler example]

Notice that the skewness of the distribution is maintained but the 3 distributions are brought into the same scale so that they overlap.

Box 3: Normalizer
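
The difference between the three scalers is easiest to see in the arithmetic each one performs. The sketch below re-implements that arithmetic in plain Python under hypothetical function names; it is not the scikit-learn code. StandardScaler and MinMaxScaler work per feature (column), whereas Normalizer works per sample (row) and by default rescales each row to unit L2 norm.

```python
import math


def standard_scale(column):
    """StandardScaler-style: subtract the mean, divide by the population std."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]


def min_max_scale(column):
    """MinMaxScaler-style: map the minimum to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def l2_normalize(row):
    """Normalizer-style: scale one sample so its L2 norm is 1."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]


# Hypothetical feature column and sample row.
x1 = [2.0, 4.0, 6.0, 8.0]

standardized = standard_scale(x1)   # centred on 0 with unit standard deviation
rescaled = min_max_scale(x1)        # values lie in [0, 1], shape of distribution kept
unit_row = l2_normalize([3.0, 4.0]) # direction kept, length becomes 1

print(standardized)
print(rescaled)
print(unit_row)
```

In a plot, standardized features are centred on zero, min-max scaled features fill the range 0 to 1, and normalized samples fall on the unit circle (or sphere), which is how the three outputs can be told apart.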

References:

http://benalexkeen.com/feature-scaling-with-scikit-learn/

asked 02/10/2024
Kingsley Tibs
44 questions

Question 48

DRAG DROP

You are producing a multiple linear regression model in Azure Machine Learning Studio.

Several independent variables are highly correlated.

You need to select appropriate methods for conducting effective feature engineering on all the data.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.


[Image: list of actions and answer area]
Correct answer: [Image: actions arranged in the correct order]
Explanation:

Step 1: Use the Filter Based Feature Selection module

Filter Based Feature Selection identifies the features in a dataset with the greatest predictive power.

The module outputs a dataset that contains the best feature columns, as ranked by predictive power. It also outputs the names of the features and their scores from the selected metric.

Step 2: Build a counting transform

A counting transform creates a transformation that turns count tables into features, so that you can apply the transformation to multiple datasets.

Step 3: Test the hypothesis using t-Test

References:

https://docs.microsoft.com/bs-latn-ba/azure/machine-learning/studio-module-reference/filter-based-feature-selection

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/build-counting-transform

asked 02/10/2024
James Brion
42 questions

Question 49

HOTSPOT

You are developing a linear regression model in Azure Machine Learning Studio. You run an experiment to compare different algorithms.

The following image displays the results dataset output:

[Image: results dataset output comparing the algorithms]

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the image.

NOTE: Each correct selection is worth one point.


[Image: answer area]
Correct answer: [Image: completed answer area]
Explanation:

Box 1: Boosted Decision Tree Regression

Mean absolute error (MAE) measures how close the predictions are to the actual outcomes; thus, a lower score is better.

Box 2: Online Gradient Descent

If you want the algorithm to find the best parameters for you, set the Create trainer mode option to Parameter Range. You can then specify multiple values for the algorithm to try.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/linear-regression

asked 02/10/2024
Steve Marechal
40 questions

Question 50

HOTSPOT

You are using a decision tree algorithm. You have trained a model that generalizes well at a tree depth equal to 10.

You need to select the bias and variance properties of the model with varying tree depth values.

Which properties should you select for each tree depth? To answer, select the appropriate options in the answer area.


[Image: answer area]
Correct answer: [Image: completed answer area]
Explanation:

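In decision trees, the depth of the tree governs model complexity and therefore the variance. A deep (complex) tree has low bias and high variance; a shallow tree has high bias and low variance. Relative to the depth of 10 that generalizes well, a much shallower tree tends to underfit and a much deeper tree tends to overfit.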

Note: In statistics and machine learning, the bias–variance tradeoff is the property of a set of predictive models whereby models with a lower bias in parameter estimation have a higher variance of the parameter estimates across samples, and vice versa. Increasing the bias will decrease the variance. Increasing the variance will decrease the bias.
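
For squared-error loss, the tradeoff can be written as the standard decomposition of expected prediction error. The notation is an assumption introduced here, not part of the question: the data follow y = f(x) + ε with zero-mean noise of variance σ², and f̂ is the model fitted on a random training sample.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Increasing tree depth shrinks the bias term and grows the variance term; the depth that generalizes best is the one where their sum is smallest.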

References:

https://machinelearningmastery.com/gentle-introduction-to-the-bias-variance-trade-off-in-machine-learning/

asked 02/10/2024
Santanu Roy
35 questions