PyTorch Detach Analysis

Source: Internet
Author: User
Tags: volatile, pytorch
PyTorch detach and detach_

PyTorch's Variable object has two methods, detach and detach_. This article mainly describes what these two methods do and what they can be used for.

detach

This method is described in the official documentation: it returns a new Variable that is detached from the current graph. The returned Variable will never require a gradient, and if the Variable being detached has volatile=True, the detached Variable is volatile as well. There is also a note: the returned Variable and the Variable being detached point to the same tensor.

import torch
from torch.autograd import Variable

t1 = torch.FloatTensor([1., 2.])
v1 = Variable(t1)
t2 = torch.FloatTensor([2., 3.])
v2 = Variable(t2)
v3 = v1 + v2
v3_detached = v3.detach()
v3_detached.data.add_(t1)  # modify the value of the tensor inside v3_detached
print(v3, v3_detached)     # v3 changes as well, since both point to the same tensor
# detach source
def detach(self):
    result = NoGrad()(self)  # this is needed, because it merges version counters
    result._grad_fn = None
    return result
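
As a quick check (a small sketch, not part of the original article), the detached Variable indeed has no grad_fn and does not require a gradient:

import torch
from torch.autograd import Variable

v = Variable(torch.ones(2), requires_grad=True)
d = (v * 2).detach()
print(d.grad_fn)        # None: d has been cut out of the graph
print(d.requires_grad)  # False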
detach_

The official documentation explains that detach_ separates the Variable from the graph that created it, making it a leaf node.

Two things can be seen from the source code. First, the Variable's grad_fn is set to None, so that during backpropagation the graph cannot be traced back through this Variable, and BP stops there. Second, requires_grad is set to False. This second step feels unnecessary, but since the source code is written this way, if you do need a gradient afterwards you can manually set requires_grad back to True.

# detach_ source
def detach_(self):
    """Detaches the Variable from the graph that created it,
    making it a leaf.
    """
    self._grad_fn = None
    self.requires_grad = False
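
As a minimal sketch of that last point (written against the old Variable API used above, not taken from the article), requires_grad can be switched back on by hand after detach_, because the Variable has become a leaf:

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([1., 2.]), requires_grad=True)
y = x * 2
y.detach_()                # y is now a leaf: grad_fn is None, requires_grad is False
y.requires_grad = True     # manually re-enable gradients for later operations
z = (y * y).sum()
z.backward()
print(y.grad)              # gradients reach y
print(x.grad)              # None: the graph was cut by detach_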
What can they be used for?

Suppose we have two networks a and b, related as y = a(x) and z = b(y). Now we want to call z.backward() to compute gradients for the parameters of network b, but we do not want gradients for the parameters of network a. We can do this:

# y = a(x), z = b(y): compute gradients for the parameters in b, but not for those in a

# The first method
y = a(x)
z = b(y.detach())
z.backward()

# The second method
y = a(x)
y.detach_()
z = b(y)
z.backward()

In this case, both detach and detach_ work. But if you later also want to backpropagate through y into a, only the first method works, because the second method has already detached the output of a.
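
As a runnable sketch of the first method (assuming, purely for illustration, that a and b are small nn.Linear modules; these names are not from the article):

import torch
import torch.nn as nn
from torch.autograd import Variable

a = nn.Linear(3, 3)
b = nn.Linear(3, 1)

x = Variable(torch.randn(2, 3))
y = a(x)
z = b(y.detach())     # cut the graph between a and b
z.sum().backward()

print(a.weight.grad)  # None: no gradient reached a's parameters
print(b.weight.grad)  # populated: b's parameters received gradients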
