If your input can belong to one of two different classes, your network must have two output nodes. The target tensor must then have the shape (minibatch) and contain only zeros and ones.
In general, your network must output a tensor of shape (minibatch, C), where C is the number of classes your data can be classified into. The target tensor must have the shape (minibatch) and contain only values of type long from the set {0, ..., C-1}.
A real-world example: your network is fed pictures of dogs, cats and pigs. It therefore has 3 output nodes, which (usually) represent the probability of the input image showing a dog (class 0), cat (class 1) or pig (class 2). Let's assume you put 10 images into your network at once (minibatch = 10). Then your target tensor could be:
torch.LongTensor([0, 2, 1, 0, 1, 0, 2, 2, 1, 0])
This means that the first image of your batch shows a dog, the second image a pig, the third image a cat, and so on.
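To check the shapes and dtype yourself, here is a minimal sketch for a batch of 10 images and 3 classes. It assumes the loss in question is torch.nn.CrossEntropyLoss (the answer does not name it), and it uses random numbers as a stand-in for the real network output, so the variable names and values are only illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

minibatch, num_classes = 10, 3

# Stand-in for the network output (assumption: random scores instead of a
# real model). Shape is (minibatch, C).
logits = torch.randn(minibatch, num_classes)

# One class index per image: 0 = dog, 1 = cat, 2 = pig. Shape is (minibatch).
target = torch.tensor([0, 2, 1, 0, 1, 0, 2, 2, 1, 0], dtype=torch.long)

# Assumption: CrossEntropyLoss is the loss being used. It takes the raw
# scores and applies log-softmax internally.
loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, target)

print(logits.shape, target.shape, target.dtype, loss.shape)
```

If the target had a different length than the batch, or contained a value outside {0, ..., C-1}, the loss call would raise an error, which is a quick way to catch a mismatched target tensor.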