How to use Cross Entropy loss in pytorch for binary prediction?

Question

In the pytorch docs, it says for cross entropy loss:

input has to be a Tensor of size (minibatch, C)

Does this mean that for binary (0,1) prediction, the input must be converted into an (N,2) tensor where the second dimension is equal to (1-p)?

So for instance if I predict 0.75 for a class with target 1 (true), would I have to stack two values (0.75; 0.25) on top of each other as input?

score 9 · Accepted Answer · answered Aug 19 '18 at 23:34

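There is no need for that. PyTorch provides nn.BCELoss (Binary Cross Entropy Loss), which takes a single probability per sample, so the input and the target both have shape (N) and the target is a float tensor of zeros and ones. See the BCELoss entry in the official documentation for details. Here is a quick example: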

import torch
import torch.nn as nn

m = nn.Sigmoid()  # sigmoid layer: maps raw scores to probabilities in (0, 1)
loss = nn.BCELoss()  # binary cross entropy, expects probabilities
input = torch.randn(3, requires_grad=True)  # random raw scores for a batch of 3
target = torch.empty(3).random_(2)  # random float targets, each 0. or 1.
output = loss(m(input), target)  # forward pass
output.backward()  # backward pass
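As a variation on the example above, nn.BCEWithLogitsLoss folds the sigmoid into the loss, which is the numerically more stable form. The sketch below only checks that both routes give the same value; the fixed seed and the hand-written targets are arbitrary choices for illustration, not part of the original answer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make the run repeatable
logits = torch.randn(3)  # raw, unbounded scores for a batch of 3
target = torch.tensor([1.0, 0.0, 1.0])  # float targets in {0, 1}

# Two-step route: explicit sigmoid, then BCELoss on the probabilities.
two_step = nn.BCELoss()(torch.sigmoid(logits), target)

# Fused route: BCEWithLogitsLoss applies the sigmoid internally.
fused = nn.BCEWithLogitsLoss()(logits, target)

print(two_step.item(), fused.item())  # the two values should agree
```

With the fused loss, the network's last layer should output raw scores and the separate nn.Sigmoid layer is dropped during training.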

score 2 · Answer 2 · answered Aug 18 '18 at 12:08

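In the example below, 3 is the batch size and 2 is the number of classes. Each row holds one raw score (logit) per class rather than probabilities, because nn.CrossEntropyLoss applies log-softmax internally.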

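import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()  # expects raw scores of shape (N, C)
input = torch.randn(3, 2, requires_grad=True)  # batch of 3, 2 classes
target = torch.empty(3, dtype=torch.long).random_(2)  # class indices 0 or 1, shape (3)
output = loss(input, target)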
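To tie this back to the 0.75 example from the question, here is a small sketch comparing the two losses on a single sample. It assumes class 1 sits in the second column, so the row is ordered (1-p, p). Because CrossEntropyLoss runs log-softmax on its input, stacking the probabilities directly does not give the intended loss; passing their logarithms does, since softmax of log-probabilities returns the original probabilities.

```python
import math
import torch
import torch.nn as nn

p = 0.75  # predicted probability of class 1, as in the question
target_float = torch.tensor([1.0])  # target for BCELoss (float, shape (1))
target_long = torch.tensor([1])  # target for CrossEntropyLoss (class index)

# BCELoss consumes the probability directly.
bce = nn.BCELoss()(torch.tensor([p]), target_float)

# CrossEntropyLoss wants logits; log-probabilities are valid logits.
log_probs = torch.log(torch.tensor([[1 - p, p]]))  # shape (1, 2), columns = classes 0, 1
ce = nn.CrossEntropyLoss()(log_probs, target_long)

# Stacking the raw probabilities is treated as logits and gives another value.
ce_on_probs = nn.CrossEntropyLoss()(torch.tensor([[1 - p, p]]), target_long)

print(bce.item(), ce.item(), ce_on_probs.item(), -math.log(p))
```

The first two values match -log(0.75), while the third differs, which is why the (N, 2) input should come from the network's raw outputs rather than from stacked probabilities.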

score 0 · Answer 3 · answered Jun 17 '19 at 21:22

If your input can belong to one of two classes and you use CrossEntropyLoss, your network must have two output nodes. The target tensor must then have shape (minibatch) and contain only zeros and ones.

In general, your network must output a tensor of shape (minibatch, C), where C is the number of classes your data can be classified into. The target tensor must have shape (minibatch) and contain only values of type long from the set {0, ..., C-1}.

A real-world example: your network is fed pictures of dogs, cats and pigs. It therefore has 3 output nodes, whose values score how likely the input image is to show a dog (class 0), a cat (class 1) or a pig (class 2); softmax turns these scores into probabilities. Assume you put 10 images into your network at once (minibatch = 10). Then your target tensor could be:

torch.LongTensor([0, 2, 1, 0, 1, 0, 2, 2, 1, 0])

This means that the first image of your batch shows a dog, the second a pig, the third a cat, and so on.
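The shapes described above can be checked with a short sketch. The random tensor stands in for the output of a real network, and the ten labels are illustrative values only; neither comes from an actual model or dataset.

```python
import torch
import torch.nn as nn

num_classes = 3  # dog = 0, cat = 1, pig = 2
batch_size = 10

# Stand-in for the network output: one raw score per class and image.
logits = torch.randn(batch_size, num_classes)  # shape (10, 3)

# One class index per image; torch.tensor infers dtype long from Python ints.
target = torch.tensor([0, 2, 1, 0, 1, 0, 2, 2, 1, 0])  # shape (10)

loss = nn.CrossEntropyLoss()(logits, target)  # scalar, averaged over the batch

# Softmax is only needed when you want to read probabilities or predictions.
probs = torch.softmax(logits, dim=1)  # each row sums to 1
predicted = probs.argmax(dim=1)  # predicted class index per image

print(logits.shape, target.dtype, loss.item(), predicted)
```

Note that the softmax is applied outside the loss here; the loss itself still receives the raw scores.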
