I am new to PyTorch and started with this GitHub code. I do not understand the comment on lines 60-61 of that code: "because weights have requires_grad=True, but we don't need to track this in autograd". I understand that we set requires_grad=True on the variables we need to calculate gradients for with autograd, but what does it mean to be "tracked by autograd"?
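For reference, here is a minimal sketch of the kind of update step such a comment usually sits above (the exact code in the linked repository may differ; the names w1, w2 and learning_rate are assumptions):

import torch

# Toy data and weights; requires_grad=True so autograd computes their gradients.
x = torch.randn(64, 1000)
y = torch.randn(64, 10)
w1 = torch.randn(1000, 100, requires_grad=True)
w2 = torch.randn(100, 10, requires_grad=True)
learning_rate = 1e-6

y_pred = x.mm(w1).clamp(min=0).mm(w2)   # forward pass
loss = (y_pred - y).pow(2).sum()
loss.backward()                          # populates w1.grad and w2.grad

# Because the weights have requires_grad=True, but we don't need to track
# the update itself in autograd, the update is wrapped in torch.no_grad():
with torch.no_grad():
    w1 -= learning_rate * w1.grad
    w2 -= learning_rate * w2.grad
    w1.grad.zero_()
    w2.grad.zero_()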

4 Answers
The wrapper with torch.no_grad() temporarily sets all of the requires_grad flags to False. Here is an example from the official PyTorch tutorial:
x = torch.randn(3, requires_grad=True)
print(x.requires_grad)
print((x ** 2).requires_grad)
with torch.no_grad():
print((x ** 2).requires_grad)
Output:
True
True
False
I recommend reading all the tutorials at the link above.
In your example, I guess the author does not want PyTorch to calculate gradients for the newly defined variables w1 and w2, since he just wants to update their values.
torch.no_grad() deactivates the autograd engine. This reduces memory usage and speeds up computations.
Uses of torch.no_grad():
To perform inference without gradient calculation.
To make sure there is no leakage of test data into the model.
It is generally used when performing validation: because no gradients are stored, one can use a larger validation batch size, as sketched below.
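A typical validation sketch (model, val_loader and criterion are assumed to be defined elsewhere):

import torch

def validate(model, val_loader, criterion, device="cpu"):
    model.eval()                      # eval mode for dropout/batchnorm layers
    total_loss = 0.0
    with torch.no_grad():             # no graph is built, saving memory
        for inputs, targets in val_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            total_loss += criterion(outputs, targets).item()
    return total_loss / len(val_loader)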

with torch.no_grad() makes all the operations in the block compute no gradients.
In PyTorch, you can't change w1 and w2 in place, because they are two variables with requires_grad=True. I think in-place changes to w1 and w2 are avoided because they would cause an error in the backpropagation calculation, since an in-place change completely overwrites w1 and w2.
However, if you use no_grad(), the new w1 and new w2 have no gradients, since they are produced by untracked operations. That means you only change the values of w1 and w2, not their gradient part; they still carry the previously defined gradient information, and backpropagation can continue.
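A small illustration of both points (a sketch; the exact error message may vary between PyTorch versions):

import torch

w1 = torch.randn(3, requires_grad=True)
loss = (w1 ** 2).sum()
loss.backward()

# In-place update of a leaf tensor that requires grad raises a RuntimeError,
# roughly: "a leaf Variable that requires grad is being used in an in-place operation"
try:
    w1 -= 0.1 * w1.grad
except RuntimeError as e:
    print(e)

# Inside no_grad() the same in-place update is allowed: only the values of w1
# change; its requires_grad flag and existing .grad are left untouched.
with torch.no_grad():
    w1 -= 0.1 * w1.grad
print(w1.requires_grad)  # True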

Hey, I looked at the code and I can't see any problem with not having the torch.no_grad() line. I mean anyway we clear the grad so that tracking should not matter. Please correct me if I am wrong! – Black Jack 21 Apr 07 '20 at 10:42
I think if we do not use torch.no_grad, then the weight update step will be added to the computational graph of the neural network, which is not desired.
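A quick way to see this (a sketch): without no_grad() an out-of-place update gets a grad_fn, i.e. the update itself becomes a node in the graph, while inside no_grad() it does not.

import torch

w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()

# Without no_grad(): the result of the update is a non-leaf tensor that
# remembers the subtraction, so the update is part of the graph.
w_tracked = w - 0.1 * w.grad
print(w_tracked.grad_fn)    # e.g. <SubBackward0 ...>

# With no_grad(): the update is just a value change, nothing is recorded.
with torch.no_grad():
    w_untracked = w - 0.1 * w.grad
print(w_untracked.grad_fn)  # None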

[…] autograd? Is there a memory allocation reason or what? Thanks! – desmond13 Mar 30 '20 at 08:36
[…] model.eval() would mean that I didn't need to also use torch.no_grad(). Turns out that both have different goals: model.eval() will ensure that layers like batchnorm or dropout work in eval mode instead of training mode, whereas torch.no_grad() is used for the reason specified above in the answer. Ideally, one should use both in the evaluation phase. – Lakshay Sharma Apr 23 '20 at 22:34
torch.no_grad() does not set "all of the requires_grad flags" to False; it only sets these to False for new tensors. requires_grad will not be set to False for the parameters of the model (or, in this example, for the original tensor x). – eric.mitchell Sep 15 '21 at 17:33
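A quick check of that last point (a sketch):

import torch

x = torch.randn(3, requires_grad=True)
with torch.no_grad():
    y = x * 2
    print(x.requires_grad)  # True  - the flag on x itself is unchanged
    print(y.requires_grad)  # False - only newly created tensors are affected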