An activation function is a non-linear transformation, usually applied in neural networks to the output of a linear or convolutional layer. Common activation functions include sigmoid, tanh, and ReLU.
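As a quick illustration of how an activation is applied after a linear layer, here is a minimal NumPy sketch (the layer sizes and values are invented for the example):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

# Toy linear layer: z = W x + b, followed by a non-linear activation.
rng = np.random.default_rng(0)
x = rng.normal(size=4)        # input vector
W = rng.normal(size=(3, 4))   # weight matrix
b = np.zeros(3)               # bias
z = W @ x + b                 # pre-activation (output of the linear layer)

print(sigmoid(z), np.tanh(z), relu(z))  # three common activations, applied element-wise
```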
Questions tagged [activation-function]
172 questions
4
votes
6 answers
Why is activation needed at all in a neural network?
I watched the Risto Siilasmaa video on Machine Learning. It's very well explained, but it raised the question of at what stage we should apply the activation function and why we need it at all. I know that by definition the activation function…
Jane Mänd
- 349
- 3
- 9
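One way to make the question above concrete: without an activation, stacking linear layers collapses into a single linear map, so depth buys nothing. A minimal sketch (the weights are arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(5, 3))
W2 = rng.normal(size=(2, 5))
x = rng.normal(size=3)

# Two "layers" with no activation in between...
two_layers = W2 @ (W1 @ x)
# ...are exactly one linear layer with weight matrix W2 @ W1.
one_layer = (W2 @ W1) @ x

print(np.allclose(two_layers, one_layer))  # True: the composition is still linear
```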
1
vote
1 answer
Does the number of hidden layers affect the activation function?
Suppose there's a network with N hidden layers.
There are 2 cases:
The network is deep
The network is shallow
I've been wondering how N affects choosing the activation function.
Will it affect, for example, Sigmoid more than Leaky ReLU?
Rony
- 11
- 1
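The usual reasoning behind the question above is that saturating activations such as sigmoid shrink gradients multiplicatively as depth grows, while (leaky) ReLU largely does not. A rough sketch of that effect (the depth and pre-activation value are arbitrary):

```python
import numpy as np

def sigmoid_grad(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)            # never larger than 0.25

def leaky_relu_grad(z, alpha=0.01):
    return np.where(z > 0, 1.0, alpha)

depth = 20
z = 2.0  # same pre-activation assumed at every layer, just for illustration
print(sigmoid_grad(z) ** depth)     # ~1e-20: the product of derivatives vanishes
print(leaky_relu_grad(z) ** depth)  # 1.0 on the positive side
```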
1
vote
1 answer
Activation function vs if-else statement
The question is very naive and most of us may know the answer. I have googled it but was not able to find a satisfactory answer, so I am posting it here. Can someone please put this question into the right words?
Activation functions like ReLU, Sigmoid etc…
Sandeep Bhutani
- 894
- 1
- 7
- 24
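ReLU can indeed be written as an if/else, which may be what the question above is getting at; the point is that it is a numerical function applied element-wise to tensors, not program control flow. A small sketch of both views:

```python
import numpy as np

def relu_if_else(x):
    # Scalar "if/else" view of ReLU.
    return x if x > 0 else 0.0

def relu_vectorised(x):
    # The same function expressed as an element-wise tensor operation.
    return np.maximum(0.0, x)

print(relu_if_else(-2.0), relu_if_else(3.0))   # 0.0 3.0
print(relu_vectorised(np.array([-2.0, 3.0])))  # [0. 3.]
```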
1
vote
1 answer
Square-law based RBF kernel
What is the Square-law based RBF kernel (SQ-RBF)? The definition in the table at the Wikipedia article Activation Function looks wrong, since it says
y = 1 - x^2/2        for |x| <= 1
    2 - (2 - x^2)/2  for 1 < |x| <= 2
    0                for |x|…
Fortranner
- 215
- 1
- 5
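For reference, here is a sketch of the continuous piecewise-quadratic form that the SQ-RBF entry presumably intends; this reconstruction is my assumption, not a confirmation of what the Wikipedia table says, which is exactly what the question is asking about:

```python
import numpy as np

def sq_rbf(x):
    # Assumed form: 1 - x^2/2 for |x| <= 1, (2 - |x|)^2 / 2 for 1 < |x| < 2, else 0.
    # This version is continuous and approximates the Gaussian exp(-x^2/2).
    x = np.asarray(x, dtype=float)
    a = np.abs(x)
    return np.where(a <= 1.0, 1.0 - x**2 / 2.0,
           np.where(a < 2.0, (2.0 - a)**2 / 2.0, 0.0))

xs = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0])
print(sq_rbf(xs))  # [1.    0.875 0.5   0.125 0.    0.   ]
```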
0
votes
0 answers
tanh function values are either 1 or -1, how to interpret that distribution
I have a question regarding the tanh function. I trained an NN (with tanh activation functions in the hidden layers) on a multiclass dataset and visualised the tanh values for all samples from the dataset passed through the NN.
The question is…
malocho
- 183
- 1
- 6
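The distribution described in the question above (values piling up at 1 and -1) is what tanh saturation looks like: any pre-activation of even moderate magnitude already maps to roughly ±1. A quick sketch:

```python
import numpy as np

z = np.array([-10.0, -3.0, -0.5, 0.0, 0.5, 3.0, 10.0])
print(np.tanh(z))
# approx. [-1.    -0.995 -0.462  0.     0.462  0.995  1.   ]
# beyond |z| of about 3 the output is effectively saturated at +/-1
```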
0
votes
2 answers
Is there a limit on the number of layers of a neural network?
I heard that neural networks suffer from the vanishing gradient problem even when the ReLU activation function is used.
In ResNet (which has a connecting function for reducing the problem), there is a limit of about 120-190 layers (I heard).
For…
INNO TECH
- 139
- 4
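The "connecting function" mentioned above refers to ResNet's skip (identity) connections: each block computes x + F(x), so the gradient always has a direct additive path around the non-linear branch. A toy sketch of one such block (sizes and weight scales invented for illustration):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def residual_block(x, W1, W2):
    # A plain block would return only the weighted, ReLU-activated branch;
    # the residual block adds the identity shortcut x, which keeps a direct
    # gradient path even when the branch's gradients shrink.
    return x + W2 @ relu(W1 @ x)

rng = np.random.default_rng(2)
d = 4
x = rng.normal(size=d)
W1 = 0.1 * rng.normal(size=(d, d))
W2 = 0.1 * rng.normal(size=(d, d))
print(residual_block(x, W1, W2))
```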
0
votes
1 answer
How does ReLU bring non-linearity, and why is it not an alternative to dropout?
The derivative of the ReLU function is 1 when the input is greater than 0 and 0 when the input is less than or equal to 0. In the backpropagation process it doesn't change the value of d(error)/d(weight) at all: either the gradient is multiplied by 1, or…
mir abir hossain
- 1
- 1
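Concretely, the derivative described in the question above acts as a gate during backpropagation: the upstream gradient is passed through where the pre-activation was positive and zeroed elsewhere. A small sketch:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_backward(upstream_grad, z):
    # d(relu)/dz is 1 where z > 0 and 0 where z <= 0, so the upstream
    # gradient is either kept or dropped element-wise.
    return upstream_grad * (z > 0).astype(float)

z = np.array([-1.5, 0.0, 2.0, 3.0])
upstream = np.array([0.1, 0.2, 0.3, 0.4])
print(relu(z))                     # [0. 0. 2. 3.]
print(relu_backward(upstream, z))  # [0.  0.  0.3 0.4]
```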
0
votes
2 answers
Why use tanh (or any other activation function)?
In machine learning, it is common to use activation functions like
tanh, sigmoid, or ReLU to introduce non-linearity into a neural
network. These non-linearities help the network learn complex
relationships between input features and output…
good_evening
- 55
- 3
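A concrete instance of the point raised above: XOR cannot be represented by any purely linear network, but a two-unit hidden layer with a ReLU non-linearity represents it exactly. The hand-picked weights below are the classic textbook construction, used here only for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-picked weights for a 2-unit hidden layer that solves XOR.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])

with_relu = relu(X @ W1 + b1) @ w2   # [0. 1. 1. 0.] -> exactly XOR
without_relu = (X @ W1 + b1) @ w2    # [2. 1. 1. 0.] -> a linear map cannot produce XOR
print(with_relu, without_relu)
```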