Why is ReLU better and more often used than Sigmoid in Neural Networks?

Question

Why is ReLU better and more often used than Sigmoid in Neural Networks?

1 Answer

sharadyadav1986 · Answer 1 · 2023-06-10T17:21:13+0000

Imagine a network whose weights are randomly initialized (or normalized) around zero. Roughly half of the pre-activations are then negative, so about 50% of the units output 0, because ReLU returns 0 for every negative value of x. Fewer neurons fire (sparse activation), which makes the network lighter to compute. Sigmoid, by contrast, returns a nonzero value for every input, so its activations are dense.
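The sparsity claim can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: a single layer, Gaussian weights and inputs centered on zero, and no bias term. The helper `zero_fraction` and its parameters are hypothetical names made up for this example, not part of any library.

```python
import math
import random


def relu(x):
    # ReLU: pass positive values through, clamp negatives to 0.
    return max(0.0, x)


def sigmoid(x):
    # Logistic sigmoid: strictly between 0 and 1 for moderate inputs.
    return 1.0 / (1.0 + math.exp(-x))


def zero_fraction(activation, n_inputs=64, n_units=2000, seed=0):
    """Fraction of units in one random layer whose activation is exactly 0.

    Assumption: weights and inputs are drawn from zero-mean Gaussians,
    and the layer has no bias.
    """
    rng = random.Random(seed)
    inputs = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
    scale = 1.0 / math.sqrt(n_inputs)  # keeps pre-activations near unit scale
    zeros = 0
    for _ in range(n_units):
        weights = [rng.gauss(0.0, scale) for _ in range(n_inputs)]
        pre_activation = sum(w * x for w, x in zip(weights, inputs))
        if activation(pre_activation) == 0.0:
            zeros += 1
    return zeros / n_units


relu_sparsity = zero_fraction(relu)
sigmoid_sparsity = zero_fraction(sigmoid)
print(f"ReLU units at exactly 0:    {relu_sparsity:.1%}")
print(f"Sigmoid units at exactly 0: {sigmoid_sparsity:.1%}")
```

With zero-centered weights the ReLU fraction should land near one half, while the sigmoid fraction stays at zero. A trained network, or one with nonzero biases, can show a different ratio.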
