I have about 6 months of experience building and using neural networks, with no prior formal training. As I explore this field further, I see a lot of discussion about how to determine the number of layers/neurons to use, along with some rules of thumb for where to start. In developing my network models, I use a brute-force approach: I increment the layers and neurons by 1, run several cycles of epoch training for each configuration, and then select the "best" model out of those. My understanding at this point is that the layers/neurons represent the relationship between the NN inputs and the NN outputs, and when training my networks this way, I often find the optimal number of layers to be 5 or more. I determine "optimal" based on performance on cross-validation data the network was not trained on.
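For reference, here is a minimal sketch of the kind of brute-force search I mean (I'm using scikit-learn's MLPRegressor purely for illustration; my actual framework, data, and search ranges differ):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

# Toy data standing in for my real dataset (a numeric-prediction task).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=500)

best_score, best_arch = -np.inf, None
for n_layers in range(1, 6):            # increment the number of hidden layers
    for n_neurons in range(5, 51, 5):   # sweep neurons per layer (placeholder range)
        model = MLPRegressor(hidden_layer_sizes=(n_neurons,) * n_layers,
                             max_iter=2000, random_state=0)
        # Score on data the network was not trained on (5-fold cross-validation).
        score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
        if score > best_score:
            best_score, best_arch = score, (n_layers, n_neurons)

print(f"best architecture: {best_arch[0]} layers x {best_arch[1]} neurons, "
      f"CV R^2 = {best_score:.3f}")
```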
Given the above, I have read statements from various sources like "you never need more than 2 layers" and "you don't get better performance out of more than 2 layers", etc. I have also read comments indicating that more than 2 layers is to be expected. Is the idea that more than 2 layers could be "too much" really correct? Should I instead focus on expanding the number of neurons used while capping the number of layers at 2 in my brute-force determination of the optimal architecture?
EDIT:
Is it true that, in the vast majority of cases, no more than 2 layers are warranted, as some seem to be claiming? My specific interest is in the area of numeric prediction; however, comments about other problem domains are welcome as well.