The activation function you choose depends on the application you are building and the data you have to work with, so it is hard to recommend one over the other without taking this into account.
Here is a short summary of the advantages and disadvantages of some common activation functions:
https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/
What does the author mean by "ReLU when I'm dealing with positive values, and a linear function when I'm dealing with general values"?
ReLU is a good choice for inputs > 0, since ReLU outputs 0 for any input < 0 (which can kill the neuron, because the gradient there is also 0).
To remedy this, you could look into using a Leaky ReLU instead, which avoids killing the neuron by keeping a small non-zero slope for inputs <= 0.
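To make the difference concrete, here is a minimal NumPy sketch of the two functions (the slope `alpha=0.01` is just a common default I picked for illustration, not something prescribed by the linked guide):

```python
import numpy as np

def relu(x):
    # ReLU: 0 for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: small negative slope (alpha) instead of a flat zero
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))        # [ 0.     0.     0.     1.5 ]  -> gradient is 0 for x < 0 ("dead" region)
print(leaky_relu(x))  # [-0.02  -0.005  0.     1.5 ]  -> gradient is alpha for x < 0
```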
There are two major drawbacks of linear activation functions:
1. You can't use back-propagation effectively in training, since the derivative is a constant and does not convey which weights influenced the output the most.
2. Linear activation functions are only applicable to shallow networks, since stacking multiple layers of linear functions just yields another linear function (the derivative is still a constant); see the sketch below.
So, if you have a shallow network that does not rely on backpropagation, then you can use a linear activation function.
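As a quick check of point 2, here is a small NumPy sketch (the layer sizes and random weights are arbitrary, chosen only for the demonstration) showing that two stacked linear layers collapse into a single linear map:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
b1, b2 = rng.normal(size=4), rng.normal(size=2)

def two_linear_layers(x):
    # Two layers with the identity (linear) activation in between
    return W2 @ (W1 @ x + b1) + b2

# The equivalent single linear layer
W = W2 @ W1
b = W2 @ b1 + b2

x = rng.normal(size=3)
print(np.allclose(two_linear_layers(x), W @ x + b))  # True
```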
– Krrrl Oct 22 '19 at 13:38