
I'm trying to code my own logistic regression algorithm, following Andrew Ng's machine learning lectures, using Octave. What I did was make a CSV file, the first column being some parameter and the second column being the result:

121,1
124,0
97,0
104,0
110,0
...

Overall there are only 24 examples, but I've chosen the points so that they follow some pattern.

Here is my code:

data = load('data.dat');
x = data(:, 1);
y = data(:, 2);
m = length(y);

#plot(x, y, 'rx', 'MarkerSize', 10);
#xlabel('IQ');
#ylabel('Pass/Fail');
#title('Logistic Regression');

x = [ones(size(x, 1), 1) x];  # prepend the intercept column
alpha = 0.00001;              # learning rate
i = 15000;                    # number of gradient descent iterations

g = inline("1 ./ (1 + exp(-z))");  # sigmoid function

theta = zeros(size(x(1, :)))';  # one parameter per column of x
j = zeros(i, 1);                # cost history

for num = 1:i
  z = x * theta;
  h = g(z);
  j = (1./m) * ( -y' * log( h ) - ( 1 - y' ) * log ( 1 - h))
  grad = 1./m * x' * (h - y);
  theta = theta - alpha * grad;
end

However, the output of the sigmoid function is below 0.5 for every example... surely this has to be wrong. I've also tried different learning rates and iteration counts, but to no avail. What is wrong with the code, or with the data?

Help would be appreciated.

dnclem

2 Answers


Considering that you are doing a vector operation, change your cost function to the following:

(1 / m) * sum(((-y) .* (log(h)) - ((1 - y) .* log((1-h)))));

and your gradient to the following:

grad = (1./m) * (x' * (h - y))

The latter change is only to make the operator precedence explicit.
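Putting both suggestions back into your loop would look something like this (a minimal sketch that keeps your variable names; storing the cost per iteration in j(num) is my assumption, since you preallocated j as a vector):

for num = 1:i
  z = x * theta;
  h = g(z);
  % element-wise products, then sum over the m training examples
  j(num) = (1 / m) * sum((-y) .* log(h) - (1 - y) .* log(1 - h));
  grad = (1 / m) * (x' * (h - y));
  theta = theta - alpha * grad;
end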


Based on the discussion in the chat: although the code calculates the cost in a wrong way, the reason the cost does not decrease is that the data is not linearly separable. Logistic regression is a simple algorithm that successfully classifies linearly separable data. Take a look here.
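As a quick sanity check (a sketch I'm adding, not something from the chat): with a single feature, "linearly separable" just means that some threshold splits the two classes, which is easy to test directly:

data = load('data.dat');
x = data(:, 1);
y = data(:, 2);
% one feature is linearly separable iff every negative example lies
% strictly below (or strictly above) every positive example
if max(x(y == 0)) < min(x(y == 1)) || max(x(y == 1)) < min(x(y == 0))
  disp('linearly separable')
else
  disp('not linearly separable')
end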
Green Falcon

In general, there's nothing wrong with the output of the sigmoid being below 0.5. All the sigmoid does is "squash" its input into a range consistent with a probability value (which must be between $0$ and $1$) before the next step of fitting the model.
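For instance (a quick illustration, using an anonymous function equivalent to your inline definition):

g = @(z) 1 ./ (1 + exp(-z));
g([-5 -1 0 1 5])
% ans = 0.0067   0.2689   0.5000   0.7311   0.9933

An output below 0.5 simply means the corresponding input z = x * theta is negative.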

Now if you're saying that the values fail to converge, or get stuck at some small value during the iterations, then I would guess there is a bug in your implementation of the loss/cost/error function, its gradient, or related quantities.

You should definitely update the matrix * operations to the element-wise .* in this line:

j = (1./m) * ( -y' * log( h ) - ( 1 - y' ) * log ( 1 - h))

Also, I'm a bit suspicious of the plus and minus signs in the following lines. You should double-check them against the original mathematical equations, as I don't think the signs look exactly right:

j = (1./m) * ( -y' * log( h ) - ( 1 - y' ) * log ( 1 - h))
grad = 1./m * x' * (h - y);
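For reference, the standard forms of the cost and gradient (as written in the lectures), so you can compare the signs term by term:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$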
A. G.