
I have recently started Andrew NG's Machine Learning course on Coursera and I came across this cost function which is:

[Image: the squared-error cost function from the course, $J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$]

Why does the error in the cost function need to be squared? If its purpose is to eliminate the negative sign of the error, then why don't we simply use the absolute value function?

Stephen Rauch
elvisalive

2 Answers


The simple answer is that it's a convenience rather than a necessity. You're more than welcome to take the absolute value, and in many cases it may be better to do so. Squaring the error makes the math easier and has desirable properties for proofs. While you can do essentially the same proofs with an absolute value function, you may have to handle certain edge cases, which just amounts to more writing.
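To make the "you're welcome to use either" point concrete, here is a small sketch (my illustration, not part of the answer) fitting a one-parameter linear model with both costs on toy data. The data, learning rate, and step counts are assumptions; the absolute-value cost is minimized with a subgradient, since `sign` stands in for the derivative of `|.|`.

```python
import numpy as np

# Toy data: y ≈ 3x plus a little noise, so both fits should land near 3.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 3.0 * x + rng.normal(0, 0.1, 100)

def fit(grad, theta=0.0, lr=0.5, steps=500):
    """Plain (sub)gradient descent on a single slope parameter."""
    for _ in range(steps):
        theta -= lr * grad(theta)
    return theta

# d/dθ of the mean squared error (1/2)·mean((θx - y)²)
grad_sq = lambda t: np.mean((t * x - y) * x)
# Subgradient of the mean absolute error mean(|θx - y|)
grad_abs = lambda t: np.mean(np.sign(t * x - y) * x)

theta_sq = fit(grad_sq)    # squared-error fit
theta_abs = fit(grad_abs)  # absolute-error fit; both end up near the true slope 3
```

Both variants work here; the squared-error gradient is smooth everywhere, while the absolute-error version relies on a subgradient at the kink.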

Tophat

We use the cost function to measure the error for a given set of weights, and we want to find the weights that minimize it. The usual approaches to minimizing the cost function are gradient-based: you move in the direction that reduces the error. For that, the cost function has to be differentiable. The absolute value function has no derivative at some points, while quadratic functions like the square are differentiable everywhere. There are other reasons we ended up with this squared function, but the non-differentiability is the reason it's not the absolute value.
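The differentiability point can be checked numerically (a sketch of my own, not from the answer): the one-sided difference quotients of $e^2$ agree at $e = 0$, while those of $|e|$ jump from $-1$ to $+1$, so $|e|$ has no derivative there.

```python
h = 1e-6  # small step for the difference quotients

def right_slope(f, a):  # forward difference
    return (f(a + h) - f(a)) / h

def left_slope(f, a):   # backward difference
    return (f(a) - f(a - h)) / h

sq = lambda e: e ** 2
ab = lambda e: abs(e)

# Squared error: both one-sided slopes at 0 tend to 0 -> differentiable.
print(right_slope(sq, 0.0), left_slope(sq, 0.0))  # ~1e-06, ~-1e-06
# Absolute error: slopes are +1 and -1 -> no derivative at 0.
print(right_slope(ab, 0.0), left_slope(ab, 0.0))  # 1.0, -1.0
```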

Green Falcon
  • I would say that's a bit of a weak reason. For most approaches you can either use linear programming to handle absolute-value cost functions or use a smoothing function that closely approximates them. – Tophat Jan 04 '18 at 16:37
  • @Tophat sorry for the late answer. This formula for the cost function comes from maximum likelihood estimation; that is where it originates. About the absolute value, I just wanted to point out its difficulty. Thanks anyway, I didn't know that :) – Green Falcon Jan 05 '18 at 13:17