
I'm trying to code my own logistic regression algorithm, following Andrew Ng's machine learning lectures, using Octave. What I did was make a CSV file, the first column being some parameter and the second column being the result:

121,1
124,0
97,0
104,0
110,0
...

Overall there are only 24 examples, but I've chosen the points so that they follow some pattern.

Here is my code:

data = load('data.dat');
x = data(:, 1);
y = data(:, 2);
m = length(y);

#plot(x, y, 'rx', 'MarkerSize', 10);
#xlabel('IQ');
#ylabel('Pass/Fail');
#title('Logistic Regression');

x = [ones(size(x, 1), 1) x];  # prepend the intercept column
alpha = 0.00001;              # learning rate
i = 15000;                    # number of gradient descent iterations

g = inline("1 ./ (1 + exp(-z))");  # sigmoid function

theta = zeros(size(x(1, :)))';  # one parameter per column of x
j = zeros(i, 1);                # cost history

for num = 1:i
  z = x * theta;
  h = g(z);
  j = (1./m) * ( -y' * log( h ) - ( 1 - y' ) * log ( 1 - h))
  grad = 1./m * x' * (h - y);
  theta = theta - alpha * grad;
end

However, the output of the sigmoid function is below 0.5 for every example... surely this has to be wrong. I've also tried different learning rates and iteration counts, but to no avail. What is wrong with the code, or with the data?

Help would be appreciated.

dnclem

2 Answers


Considering that you are doing a vector operation, change your cost function to the following:

(1 / m) * sum(((-y) .* (log(h)) - ((1 - y) .* log((1-h)))));

and your gradient to the following:

grad = (1./m) * (x' * (h - y))

The latter change is only to make the operator precedence explicit.
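Putting both suggestions back into your loop would look something like this (a minimal sketch that keeps your variable names; storing the cost per iteration in j(num) is my assumption, since you preallocated j as a vector):

for num = 1:i
  z = x * theta;
  h = g(z);
  % element-wise products, then sum over the m training examples
  j(num) = (1 / m) * sum((-y) .* log(h) - (1 - y) .* log(1 - h));
  grad = (1 / m) * (x' * (h - y));
  theta = theta - alpha * grad;
end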


Based on the discussion in the chat: although the code calculates the cost in a wrong way, the reason the cost does not decrease is that the data is not linearly separable. Logistic regression is a simple algorithm that successfully classifies linearly separable data. Take a look here.
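As a quick sanity check (a sketch I'm adding, not something from the chat): with a single feature, "linearly separable" just means that some threshold splits the two classes, which is easy to test directly:

data = load('data.dat');
x = data(:, 1);
y = data(:, 2);
% one feature is linearly separable iff every negative example lies
% strictly below (or strictly above) every positive example
if max(x(y == 0)) < min(x(y == 1)) || max(x(y == 1)) < min(x(y == 0))
  disp('linearly separable')
else
  disp('not linearly separable')
end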
Green Falcon

In general, there's nothing wrong with the output of the sigmoid being below 0.5. All the sigmoid does is "squash" its input into a range consistent with a probability value (which must be between $0$ and $1$) before the next step of fitting the model.
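For instance (a quick illustration, using an anonymous function equivalent to your inline definition):

g = @(z) 1 ./ (1 + exp(-z));
g([-5 -1 0 1 5])
% ans = 0.0067   0.2689   0.5000   0.7311   0.9933

An output below 0.5 simply means the corresponding input z = x * theta is negative.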

Now if you're saying that the values fail to converge, or get stuck at some small value during the iterations, then I would guess there is a bug in your implementation of the loss/cost/error function, its gradient, or related quantities.

You should definitely update the matrix * operations to the element-wise .* in this line:

j = (1./m) * ( -y' * log( h ) - ( 1 - y' ) * log ( 1 - h))

Also, I'm a bit suspicious of the plus and minus signs in the following lines. You should double-check them against the original mathematical equations, as I don't think the signs look exactly right:

j = (1./m) * ( -y' * log( h ) - ( 1 - y' ) * log ( 1 - h))
grad = 1./m * x' * (h - y);
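For reference, the standard forms of the cost and gradient (as written in the lectures), so you can compare the signs term by term:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$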
A. G.