Huawei H13-311_V3.5 Practice Test - Questions Answers, Page 4

List of questions
Question 31

The derivative of the Rectified Linear Unit (ReLU) activation function in the positive interval is always:
The Rectified Linear Unit (ReLU) activation function is defined as f(x) = max(0, x). In the positive interval, where x > 0, the derivative of ReLU is always 1. This makes ReLU popular for deep learning networks because it helps avoid the vanishing gradient problem during backpropagation, ensuring efficient gradient flow.
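As a quick illustration (a NumPy sketch added here, not part of the original question), the derivative is exactly 1 for every positive input:

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x)
    return np.maximum(0, x)

def relu_derivative(x):
    # 1 for x > 0, 0 for x < 0 (undefined at x = 0; set to 0 here by convention)
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.5, 3.0])
print(relu(x))             # [0.  0.  0.5 3. ]
print(relu_derivative(x))  # [0. 0. 1. 1.]
```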
Question 32

In a fully-connected structure, a hidden layer with 1000 neurons is used to process an image with the resolution of 100 x 100. Which of the following is the correct number of parameters?
In a fully-connected layer, the number of parameters is calculated by multiplying the number of input features by the number of neurons in the hidden layer. An image with a resolution of 100 x 100 has 100 x 100 = 10,000 pixels, so with a hidden layer of 1,000 neurons the total number of parameters is 10,000 x 1,000 = 10,000,000 (ignoring bias terms).
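A short back-of-the-envelope check in Python (added here for illustration; only the weights are counted, as above):

```python
# Weight count for a fully-connected layer: input features x hidden neurons.
input_features = 100 * 100      # flattened 100 x 100 image = 10,000 pixels
hidden_neurons = 1000

weights = input_features * hidden_neurons
print(weights)                  # 10000000 (adding biases would give 10,001,000)
```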
Question 33

The global gradient descent, stochastic gradient descent, and batch gradient descent algorithms are gradient descent algorithms. Which of the following is true about these algorithms?
The global gradient descent algorithm evaluates the gradient over the entire dataset before each update, leading to accurate but slow convergence, especially for large datasets. In contrast, stochastic gradient descent updates the model parameters after each individual sample, which allows faster but noisier convergence. Batch gradient descent updates the parameters based on smaller batches of data, striking a balance between the two. None of these algorithms can guarantee finding the global minimum in non-convex problems, where local minima may exist.
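The three update schemes differ only in how much data feeds each gradient step. A minimal sketch for a linear least-squares model (illustrative code using the question's terminology; the `gradient` function, learning rate, and batch size are assumptions):

```python
import numpy as np

def gradient(w, X, y):
    # Mean-squared-error gradient for a linear model y ~ X @ w (illustrative only).
    return 2 * X.T @ (X @ w - y) / len(y)

def global_gd_step(w, X, y, lr=0.01):
    # Global (full-dataset) gradient descent: one accurate but expensive update.
    return w - lr * gradient(w, X, y)

def stochastic_gd_step(w, X, y, lr=0.01):
    # Stochastic gradient descent: one noisy update from a single random sample.
    i = np.random.randint(len(y))
    return w - lr * gradient(w, X[i:i + 1], y[i:i + 1])

def batch_gd_step(w, X, y, lr=0.01, batch_size=32):
    # Batch (mini-batch) gradient descent: one update from a small random batch.
    idx = np.random.choice(len(y), size=min(batch_size, len(y)), replace=False)
    return w - lr * gradient(w, X[idx], y[idx])
```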
Question 34

Sigmoid, tanh, and softsign activation functions cannot avoid vanishing gradient problems when the network is deep.
Activation functions like Sigmoid, tanh, and softsign suffer from the vanishing gradient problem when used in deep networks. This happens because, in these functions, gradients become very small as the input moves away from the origin (either positively or negatively). As a result, the weights of the earlier layers in the network receive very small updates, hindering the learning process in deep networks. This is one reason why activation functions like ReLU, which avoid this issue, are often preferred in deep learning.
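A small numerical illustration (added here; layer weights are omitted for simplicity) shows why the effect compounds with depth: the sigmoid derivative never exceeds 0.25, so multiplying it across many layers drives the gradient toward zero.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)                 # peaks at 0.25 when x = 0

print(sigmoid_derivative(0.0))         # 0.25
print(sigmoid_derivative(5.0))         # ~0.0066 -- saturates far from the origin
print(0.25 ** 10)                      # ~9.5e-07 -- product across 10 layers
```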
Question 35

Single-layer perceptrons and logistic regression are linear classifiers that can only process linearly separable data.
Both single-layer perceptrons and logistic regression are linear classifiers: they separate classes with a single linear decision boundary, so they can only process linearly separable data and cannot effectively model non-linear relationships. For more complex, non-linearly separable data, multi-layer neural networks or other non-linear classifiers are required.
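A quick demonstration (a sketch using scikit-learn; the large `C` simply weakens regularization): a linear classifier fits the linearly separable AND problem but cannot fit XOR.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])   # linearly separable
y_xor = np.array([0, 1, 1, 0])   # not linearly separable

print(LogisticRegression(C=1e6).fit(X, y_and).score(X, y_and))  # 1.0
print(LogisticRegression(C=1e6).fit(X, y_xor).score(X, y_xor))  # well below 1.0
```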
Question 36

Nesterov is a variant of the momentum optimizer.
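For reference (an illustrative sketch, not Huawei's notation): both optimizers keep a velocity term; Nesterov differs only in evaluating the gradient at the look-ahead point w + mu * v instead of at the current weights.

```python
def momentum_step(w, v, grad_fn, lr=0.01, mu=0.9):
    # Classical momentum: gradient evaluated at the current weights w.
    v = mu * v - lr * grad_fn(w)
    return w + v, v

def nesterov_step(w, v, grad_fn, lr=0.01, mu=0.9):
    # Nesterov momentum: gradient evaluated at the look-ahead point w + mu * v.
    v = mu * v - lr * grad_fn(w + mu * v)
    return w + v, v
```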
Question 37

Convolutional neural networks (CNNs) cannot be used to process text data.
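For context, 1D convolutions over embedded token sequences (the TextCNN idea) are a standard way to apply CNNs to text; the layer sizes in this PyTorch sketch are purely illustrative.

```python
import torch
import torch.nn as nn

# Illustrative sizes: 5,000-token vocabulary, 128-dim embeddings,
# sentences of 20 tokens, 100 convolution filters of width 3.
embed = nn.Embedding(num_embeddings=5000, embedding_dim=128)
conv = nn.Conv1d(in_channels=128, out_channels=100, kernel_size=3)

tokens = torch.randint(0, 5000, (8, 20))      # a batch of 8 token sequences
x = embed(tokens).transpose(1, 2)             # (batch, channels=128, length=20)
features = torch.relu(conv(x))                # (8, 100, 18) convolutional features
pooled = features.max(dim=2).values           # (8, 100) per-sentence representation
print(pooled.shape)                           # torch.Size([8, 100])
```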
Question 38

Which of the following activation functions may cause the vanishing gradient problem?
Question 39

Which of the following are use cases of generative adversarial networks?
Question 40

DRAG DROP
Match the input and output of a generative adversarial network (GAN).
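As a reference for the matching (a generic sketch with illustrative sizes, not the exam's answer layout): the generator maps random noise to a synthetic sample, and the discriminator maps a sample to a probability that it is real.

```python
import torch
import torch.nn as nn

# Illustrative sizes: 100-dim noise vector, 784-dim (28 x 28) flattened image.
generator = nn.Sequential(
    nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
discriminator = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1), nn.Sigmoid())

noise = torch.randn(16, 100)                # generator input: random noise
fake_images = generator(noise)              # generator output: synthetic samples
realness = discriminator(fake_images)       # discriminator output: P(sample is real)
print(fake_images.shape, realness.shape)    # torch.Size([16, 784]) torch.Size([16, 1])
```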