
How the forward() function gets called in PyTorch

2023-02-20 06:00:28

How PyTorch calls forward()

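The Module class is a model-construction class provided by the nn module. It is the base class of all neural network modules, and we can subclass it to define whatever model we want.

Below, we subclass Module to build the multilayer perceptron mentioned at the beginning of this section.

The MLP class defined here overrides the __init__ and forward functions of the Module class.

They are used to create the model parameters and to define the forward computation, respectively.

Forward computation is also known as forward propagation.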

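import torch
from torch import nn


class MLP(nn.Module):
    # Declare the layers that hold model parameters; here, two fully connected layers
    def __init__(self, **kwargs):
        # Call the constructor of MLP's parent class Module for the necessary
        # initialization. This way, other function arguments can also be given when
        # constructing an instance, such as the model parameters params covered in
        # the section "Accessing, initializing, and sharing model parameters"
        super(MLP, self).__init__(**kwargs)
        self.hidden = nn.Linear(784, 256)  # hidden layer
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)  # output layer

    # Define the forward computation: how to compute the model output from input x
    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)


X = torch.rand(2, 784)
net = MLP()
print(net)
print(net(X))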

Output:

MLP(
  (hidden): Linear(in_features=784, out_features=256, bias=True)
  (act): ReLU()
  (output): Linear(in_features=256, out_features=10, bias=True)
)
tensor([[-0.1798, -0.2253,  0.0206, -0.1067, -0.0889,  0.1818, -0.1474,  0.1845, -0.1870,  0.1970],
        [-0.1843, -0.1562, -0.0090,  0.0351, -0.1538,  0.0992, -0.0883,  0.0911, -0.2293,  0.2360]],
       grad_fn=<ThAddmmBackward>)

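Why does forward() get called? Because Module defines a __call__() function, and that function calls forward() (along with any registered hooks). When net(X) is executed, Python automatically invokes __call__(), which in turn runs forward(X).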
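
The dispatch described above can be sketched without PyTorch at all. The block below is a simplified stand-in, not PyTorch's actual implementation: `TinyModule` and `Doubler` are hypothetical names made up for illustration, and the real `nn.Module.__call__` does more work (for example, running hooks) before and after `forward()`.

```python
class TinyModule:
    """Simplified stand-in for nn.Module (hypothetical, for illustration only)."""

    def __call__(self, *args, **kwargs):
        # The real nn.Module.__call__ also runs registered hooks; this sketch
        # only passes the arguments on to forward().
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses must define forward()")


class Doubler(TinyModule):
    """Hypothetical subclass that records every forward() call."""

    def __init__(self):
        self.calls = []

    def forward(self, x):
        self.calls.append(x)
        return 2 * x


net = Doubler()
result = net(3)  # Python turns net(3) into type(net).__call__(net, 3)
print(result, net.calls)  # prints: 6 [3]
```

Because the instance is callable, user code never needs to write `net.forward(x)` directly. Calling `net(x)` is the usual spelling in PyTorch too, since it goes through `__call__()` and therefore does not skip the extra steps that `__call__()` performs.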

PyTorch function-call variants and a reading of the source code

I recently needed the softmax function and found that there are several different ways to write it. I am recording them here.

import torch

x = torch.randn(2, 3)  # example input

# built-in function from torch._C._VariableFunctions
torch.softmax(x, dim=-1)
# class
softmax = torch.nn.Softmax(dim=-1)
x = softmax(x)
# function
x = torch.nn.functional.softmax(x, dim=-1)

In a quick test, the torch.nn.Softmax class was the slowest, and the other two were about the same.
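
The original test setup is not shown, so the following is only a sketch of how such a comparison could be run. The input shape, the seed, and the number of repetitions are assumptions, and the absolute timings will vary with hardware and PyTorch version. It also checks that the three spellings return the same values.

```python
import timeit

import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(64, 128)  # arbitrary test input; the shape is an assumption

softmax_module = torch.nn.Softmax(dim=-1)

# The three spellings, wrapped so they can be called the same way.
variants = {
    "torch.softmax": lambda: torch.softmax(x, dim=-1),
    "nn.Softmax": lambda: softmax_module(x),
    "F.softmax": lambda: F.softmax(x, dim=-1),
}

# All three spellings should produce the same values.
reference = torch.softmax(x, dim=-1)
all_equal = all(torch.allclose(fn(), reference) for fn in variants.values())
print("identical results:", all_equal)

# Rough timing only; 1000 repetitions is an arbitrary choice.
timings = {name: timeit.timeit(fn, number=1000) for name, fn in variants.items()}
for name, seconds in timings.items():
    print(f"{name:>14}: {seconds:.4f} s per 1000 calls")
```

On small inputs the differences mostly reflect Python-level call overhead, so the ranking can be noisy from run to run.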

The source code of torch.nn.Softmax is shown below. It is a class, and the line return F.softmax(input, self.dim, _stacklevel=5) in its forward() calls torch.nn.functional.softmax.

class Softmax(Module):
    r"""Applies the Softmax function to an n-dimensional input Tensor
    rescaling them so that the elements of the n-dimensional output Tensor
    lie in the range [0,1] and sum to 1.

    Softmax is defined as:

    .. math::
        \text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}

    When the input Tensor is a sparse tensor then the unspecifed
    values are treated as ``-inf``.

    Shape:
        - Input: :math:`(*)` where `*` means, any number of additional
          dimensions
        - Output: :math:`(*)`, same shape as the input

    Returns:
        a Tensor of the same dimension and shape as the input with
        values in the range [0, 1]

    Args:
        dim (int): A dimension along which Softmax will be computed (so every slice
            along dim will sum to 1).

    .. note::
        This module doesn't work directly with NLLLoss,
        which expects the Log to be computed between the Softmax and itself.
        Use `LogSoftmax` instead (it's faster and has better numerical properties).

    Examples::

        >>> m = nn.Softmax(dim=1)
        >>> input = torch.randn(2, 3)
        >>> output = m(input)

    """
    __constants__ = ['dim']
    dim: Optional[int]

    def __init__(self, dim: Optional[int] = None) -> None:
        super(Softmax, self).__init__()
        self.dim = dim

    def __setstate__(self, state):
        self.__dict__.update(state)
        if not hasattr(self, 'dim'):
            self.dim = None

    def forward(self, input: Tensor) -> Tensor:
        return F.softmax(input, self.dim, _stacklevel=5)

    def extra_repr(self) -> str:
        return 'dim={dim}'.format(dim=self.dim)

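The source code of the torch.nn.functional.softmax function is shown below. The line ret = input.softmax(dim) calls the tensor's softmax method, which ends up in the same built-in softmax that torch._C._VariableFunctions exposes as torch.softmax.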

def softmax(input: Tensor, dim: Optional[int] = None, _stacklevel: int = 3, dtype: Optional[DType] = None) -> Tensor:
    r"""Applies a softmax function.

    Softmax is defined as:

    :math:`\text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}`

    It is applied to all slices along dim, and will re-scale them so that the elements
    lie in the range `[0, 1]` and sum to 1.

    See :class:`~torch.nn.Softmax` for more details.

    Args:
        input (Tensor): input
        dim (int): A dimension along which softmax will be computed.
        dtype (:class:`torch.dtype`, optional): the desired data type of returned tensor.
          If specified, the input tensor is casted to :attr:`dtype` before the operation
          is performed. This is useful for preventing data type overflows. Default: None.

    .. note::
        This function doesn't work directly with NLLLoss,
        which expects the Log to be computed between the Softmax and itself.
        Use log_softmax instead (it's faster and has better numerical properties).

    """
    if has_torch_function_unary(input):
        return handle_torch_function(softmax, (input,), input, dim=dim, _stacklevel=_stacklevel, dtype=dtype)
    if dim is None:
        dim = _get_softmax_dim("softmax", input.dim(), _stacklevel)
    if dtype is None:
        ret = input.softmax(dim)
    else:
        ret = input.softmax(dim, dtype=dtype)
    return ret
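
To make the formula in the docstring concrete, here is a minimal pure-Python sketch of softmax applied along the last dimension of a 2-D input. The helper names `softmax_1d` and `softmax_last_dim` are hypothetical, and subtracting the maximum is a common numerical-stability trick; the sketch does not claim to mirror PyTorch's internal kernel.

```python
import math


def softmax_1d(values):
    """Softmax of a flat list of numbers: exp(x_i) / sum_j exp(x_j)."""
    # Subtracting the maximum does not change the result mathematically,
    # but it keeps exp() from overflowing on large inputs.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_last_dim(rows):
    """Apply softmax_1d to each row, mimicking dim=-1 on a 2-D input."""
    return [softmax_1d(row) for row in rows]


probs = softmax_last_dim([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
for row in probs:
    print([round(p, 4) for p in row], "sum =", round(sum(row), 4))
```

Each row of the result lies in the range [0, 1] and sums to 1, which is the property the docstring describes for every slice along dim.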

So why not just call the built-in C function directly?

However, the blog post "A selective excursion into the internals of PyTorch" says:

Note: That bilinear is exported as torch.bilinear is somewhat accidental. Do use the documented interfaces, here torch.nn.functional.bilinear whenever you can!

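In other words, the fact that a built-in C function can be called directly as torch.xxx is accidental, and the post strongly recommends using documented interfaces such as torch.nn.functional.xxx.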

The latest official transformer code also uses torch.nn.functional.softmax, so it is better to stay consistent with it (although it previously used the class...).

Summary

The above is my personal experience. I hope it serves as a useful reference.

