
Huawei H13-311_V3.5 Practice Test - Questions Answers, Page 4



Question 31


The derivative of the Rectified Linear Unit (ReLU) activation function in the positive interval is always:

A. 0
B. 0.5
C. 1
D. Variable
Suggested answer: C
Explanation:

The Rectified Linear Unit (ReLU) activation function is defined as f(x) = max(0, x). In the positive interval, where x > 0, the derivative of ReLU is always 1. This makes ReLU popular for deep networks because it helps avoid the vanishing gradient problem during backpropagation, ensuring efficient gradient flow.
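A minimal NumPy sketch (illustrative; the helper names are not from the exam) confirms the claim numerically:

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x)
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative: 1 for x > 0, 0 for x < 0 (undefined at x = 0).
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.5, 3.0])
print(relu_grad(x))  # [0. 0. 1. 1.] -- always 1 in the positive interval
```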


Question 32


In a fully-connected structure, a hidden layer with 1,000 neurons is used to process an image with a resolution of 100 x 100. Which of the following is the correct number of parameters?

A. 100,000
B. 10,000
C. 1,000,000
D. 10,000,000
Suggested answer: D
Explanation:

In a fully-connected layer, the number of weight parameters is calculated by multiplying the number of input features by the number of neurons in the hidden layer. For an image of resolution 100 x 100 = 10,000 pixels and a hidden layer of 1,000 neurons, the total number of weights is 10,000 x 1,000 = 10,000,000. (The options count weights only; the 1,000 bias terms would add slightly more.)
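The arithmetic is a one-liner to verify (plain Python, purely illustrative):

```python
# Weight count for a fully-connected layer: inputs x neurons.
inputs = 100 * 100          # 10,000 pixels
neurons = 1_000
weights = inputs * neurons
biases = neurons            # usually excluded from exam-style counts
print(weights)              # 10000000
```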


Question 33


Global gradient descent, stochastic gradient descent, and batch gradient descent are all variants of the gradient descent algorithm. Which of the following statements about these algorithms is true?

A. The batch gradient algorithm can solve the problem of local minimum value.
B. The global gradient algorithm can find the minimum value of the loss function.
C. The stochastic gradient algorithm can find the minimum value of the loss function.
D. The convergence process of the global gradient algorithm is time-consuming.
Suggested answer: D
Explanation:

Global (full-batch) gradient descent evaluates the gradient over the entire dataset before each update, leading to accurate but slow convergence, especially on large datasets. In contrast, stochastic gradient descent updates the model parameters after each sample, which allows faster but noisier convergence. Batch (mini-batch) gradient descent updates the parameters based on small batches of data as a compromise between the two. None of these algorithms can guarantee finding the global minimum in non-convex problems, where local minima may exist.
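The trade-off is easiest to see in code. Below is a minimal NumPy sketch of one update step for each variant on a toy linear-regression problem (the setup and names are illustrative assumptions, not from the exam):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=1000)

def grad(w, Xb, yb):
    # Gradient of mean squared error for linear regression.
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

w = np.zeros(3)
lr = 0.1

# Global (full-batch) GD: one accurate but expensive update per pass.
w -= lr * grad(w, X, y)

# Stochastic GD: one cheap, noisy update per sample.
i = rng.integers(len(y))
w -= lr * grad(w, X[i:i+1], y[i:i+1])

# (Mini-)batch GD: a compromise using a small batch of samples.
idx = rng.choice(len(y), size=32, replace=False)
w -= lr * grad(w, X[idx], y[idx])
```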


Question 34


The sigmoid, tanh, and softsign activation functions cannot avoid the vanishing gradient problem when the network is deep.

A. TRUE
B. FALSE
Suggested answer: A
Explanation:

Activation functions like Sigmoid, tanh, and softsign suffer from the vanishing gradient problem when used in deep networks. This happens because, in these functions, gradients become very small as the input moves away from the origin (either positively or negatively). As a result, the weights of the earlier layers in the network receive very small updates, hindering the learning process in deep networks. This is one reason why activation functions like ReLU, which avoid this issue, are often preferred in deep learning.
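A small NumPy sketch (illustrative) shows how quickly sigmoid gradients shrink and why stacking layers compounds the effect:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)  # peaks at 0.25, shrinks toward 0 away from the origin

# Backprop multiplies one such factor per layer, so a 10-layer chain of
# sigmoids scales the gradient by at most 0.25 ** 10 (about 9.5e-7).
print(sigmoid_grad(np.array([0.0, 2.0, 5.0])))  # ~[0.25, 0.105, 0.0066]
print(0.25 ** 10)
```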


Question 35


Single-layer perceptrons and logistic regression are linear classifiers that can only process linearly separable data.

A. TRUE
B. FALSE
Suggested answer: A
Explanation:

Both single-layer perceptrons and logistic regression are linear classifiers: their decision boundary is a hyperplane, so they can only separate data that is linearly separable. They cannot model non-linear relationships such as XOR. For more complex, non-linearly separable data, multi-layer neural networks or other non-linear classifiers are required.
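The classic counterexample is XOR. The sketch below (plain Python, illustrative; the brute-force grid is a demo device, not a training algorithm) shows that AND is linearly separable while XOR is not:

```python
import itertools

def separable(points):
    # Brute-force search over a small grid of linear classifiers
    # w1*x + w2*y + b > 0 (integer weights suffice for this demo).
    for w1, w2, b in itertools.product(range(-5, 6), repeat=3):
        if all((w1*x + w2*y + b > 0) == label for (x, y), label in points):
            return True
    return False

and_gate = [((0, 0), False), ((0, 1), False), ((1, 0), False), ((1, 1), True)]
xor_gate = [((0, 0), False), ((0, 1), True), ((1, 0), True), ((1, 1), False)]
print(separable(and_gate))  # True  -- AND is linearly separable
print(separable(xor_gate))  # False -- XOR is not: no line fits all four points
```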


Question 36


Nesterov is a variant of the momentum optimizer.
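For reference, the two update rules differ only in where the gradient is evaluated. A toy sketch (illustrative, using the hypothetical loss f(w) = w**2, not from the exam material):

```python
# Classical momentum vs. Nesterov accelerated gradient (NAG) on the toy
# loss f(w) = w**2 (gradient 2w). NAG "looks ahead" to w + mu*v before
# differentiating; otherwise the update is identical.
def grad(w):
    return 2.0 * w

lr, mu = 0.1, 0.9

w, v = 5.0, 0.0
for _ in range(50):                 # classical momentum
    v = mu * v - lr * grad(w)
    w += v
print(f"momentum: {w:.4f}")

w, v = 5.0, 0.0
for _ in range(50):                 # Nesterov: look-ahead gradient
    v = mu * v - lr * grad(w + mu * v)
    w += v
print(f"nesterov: {w:.4f}")        # converges faster on this toy problem
```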


Question 37


Convolutional neural networks (CNNs) cannot be used to process text data.
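For context: text can be treated as a 1-D signal over token embeddings, which is exactly what text CNNs convolve over. A minimal NumPy sketch (all dimensions are arbitrary illustrative choices):

```python
import numpy as np

# A 1-D convolution over token embeddings -- the building block of
# CNN-based text classifiers.
seq_len, embed_dim, kernel_size, n_filters = 7, 4, 3, 2
tokens = np.random.default_rng(0).normal(size=(seq_len, embed_dim))
filters = np.random.default_rng(1).normal(size=(n_filters, kernel_size, embed_dim))

# Slide each filter over consecutive windows of tokens.
feature_map = np.array([
    [np.sum(tokens[t:t+kernel_size] * f) for t in range(seq_len - kernel_size + 1)]
    for f in filters
])
print(feature_map.shape)  # (2, 5): one feature sequence per filter
```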


Question 38


Which of the following activation functions may cause the vanishing gradient problem?


Question 39


Which of the following are use cases of generative adversarial networks?


Question 40


DRAG DROP

Match the input and output of a generative adversarial network (GAN).
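For reference, the standard mapping in a GAN is: the generator takes a random noise vector as input and outputs a synthetic sample; the discriminator takes a sample as input and outputs the probability that it is real. A toy single-layer sketch (illustrative stand-ins for real networks):

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, W):
    # Input: random noise vector z. Output: synthetic (fake) sample.
    return np.tanh(z @ W)

def discriminator(x, w):
    # Input: a sample (real or fake). Output: probability it is real.
    return 1.0 / (1.0 + np.exp(-(x @ w)))

z = rng.normal(size=8)                       # noise input to G
Wg, wd = rng.normal(size=(8, 4)), rng.normal(size=4)
fake = generator(z, Wg)
print(discriminator(fake, wd))               # D's real/fake probability
```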

