Question 38 - H13-311_V3.5 discussion


Which of the following activation functions may cause the vanishing gradient problem?

A. Softplus
B. ReLU
C. Sigmoid
D. Tanh
Suggested answer: C, D

Explanation:

Both Sigmoid and Tanh activation functions can cause the vanishing gradient problem. The issue arises because both functions squash their inputs into a narrow output range, so their derivatives become very small for large positive or negative inputs. During backpropagation these small gradients are multiplied layer by layer, so the signal shrinks as it travels backward; in deep neural networks this can prevent the weights of the earlier layers from updating effectively and cause training to stall.

Sigmoid: Outputs values between 0 and 1. For large positive or negative inputs, the gradient becomes very small.

Tanh: Outputs values between -1 and 1. Although its output range is wider than Sigmoid's, its gradient still vanishes for large positive or negative inputs.
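
To make the saturation concrete, here is a minimal NumPy sketch (an illustrative addition, not part of the original explanation; function names are mine) that prints the Sigmoid and Tanh derivatives at a few input values:

```python
# Minimal sketch: how the Sigmoid and Tanh derivatives shrink toward zero
# for large |x|, which is what drives the vanishing gradient problem.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # peaks at 0.25 when x = 0

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2  # peaks at 1.0 when x = 0

for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.2e}  tanh'={tanh_grad(x):.2e}")

# At x = 10 the Sigmoid derivative is about 4.5e-05 and the Tanh derivative
# about 8.2e-09, so gradients backpropagated through many such units vanish.
```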

ReLU, on the other hand, does not suffer from the vanishing gradient problem for positive inputs: it outputs the input directly, so its gradient is 1 and passes through backpropagation unchanged. Softplus is likewise far less prone to this problem than Sigmoid and Tanh.
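
For contrast, a similar sketch (again an illustrative addition, not from the source) shows that the ReLU gradient is exactly 1 for any positive input, and that the Softplus gradient, which equals the Sigmoid function, approaches 1 rather than 0 as the input grows:

```python
# Minimal sketch: ReLU and Softplus gradients do not decay toward zero
# for positive inputs, unlike Sigmoid and Tanh.
import numpy as np

def relu_grad(x):
    return (x > 0).astype(float)     # 1 for x > 0, 0 otherwise

def softplus_grad(x):
    return 1.0 / (1.0 + np.exp(-x))  # derivative of ln(1 + e^x) is sigmoid(x)

xs = np.array([0.5, 2.0, 5.0, 10.0])
print("ReLU'    :", relu_grad(xs))      # [1. 1. 1. 1.]
print("Softplus':", softplus_grad(xs))  # approaches 1 as x grows
```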

HCIA AI

Deep Learning Overview: Explains the vanishing gradient problem in deep networks, especially when using Sigmoid and Tanh activation functions.

AI Development Framework: Covers the use of ReLU to address the vanishing gradient issue and its prevalence in modern neural networks.

asked 26/09/2024
Flora Hundal