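While studying the transformer, I ran into the nn.Linear() function very frequently, so here is a detailed look at nn.Linear.
Reference: https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html
As the name suggests, nn.Linear represents a linear transformation. Its prototype is the linear function from elementary mathematics: y=kx+b
In deep learning, however, the variables are multi-dimensional tensors, multiplication is matrix multiplication, and addition is matrix addition (with the bias broadcast over the leading dimensions), so the computation nn.Linear() actually performs is:
output = input @ weight.T + bias
@: matrix multiplication in Python
.T: transpose; weight is stored with shape (out_features, in_features), so it is transposed before the multiplication
input: the input Tensor; it may have any number of dimensions, as long as the last one has size in_features
weight: the learnable weight, shape=(out_features, in_features)
bias: the learnable bias, shape=(out_features,)
in_features: the first argument of the nn.Linear constructor, i.e. the size of the last dimension of the input Tensor
out_features: the second argument of the nn.Linear constructor, i.e. the size of the last dimension of the returned Tensor
output: the output Tensor; it has the same shape as input except that the last dimension has size out_features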
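The computation performed by nn.Linear can be checked numerically. The following is a minimal sketch, assuming a small layer with 4 input features and 3 output features (sizes chosen only for illustration). It compares the layer's output with the explicit expression that multiplies the input by the transposed weight and adds the bias.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)          # fixed seed so the run is repeatable

layer = nn.Linear(4, 3)       # in_features=4, out_features=3 (illustrative sizes)
x = torch.randn(5, 4)         # batch of 5 samples, 4 features each

# Explicit computation: multiply by the transposed weight, then add the bias.
manual = x @ layer.weight.T + layer.bias
auto = layer(x)

print(layer.weight.shape)     # torch.Size([3, 4])
print(layer.bias.shape)       # torch.Size([3])
print(torch.allclose(manual, auto, atol=1e-6))
```

The weight is stored as (out_features, in_features), which is why the transpose is needed when the input has its features in the last dimension.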
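Common import: import torch.nn as nn
Initializing nn.Linear():
m = nn.Linear(in_features, out_features, bias=True)
in_features: int, the size of the last dimension of the Tensor passed to forward
out_features: int, the size of the last dimension of the Tensor returned by forward
bias: bool, whether the linear transformation adds a bias term; defaults to True
Running nn.Linear() (i.e. calling its forward function on the instance m created above):
output = m(input)
input: the input Tensor; it may have any number of dimensions
output: the output Tensor; it has the same number of dimensions as input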
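As a small illustration of the bias argument, the sketch below builds one layer with the default bias and one with bias=False, then compares their parameters. The sizes 20 and 40 and the variable names are chosen only for this example.

```python
import torch
import torch.nn as nn

with_bias = nn.Linear(20, 40)               # bias defaults to True
no_bias = nn.Linear(20, 40, bias=False)     # no additive bias is learned

x = torch.randn(128, 20)

print(with_bias.bias.shape)   # torch.Size([40])
print(no_bias.bias)           # None
print(with_bias(x).shape)     # torch.Size([128, 40])
print(no_bias(x).shape)       # torch.Size([128, 40])

# Parameter counts: 20*40 weights, plus 40 biases when bias=True.
n_with = sum(p.numel() for p in with_bias.parameters())
n_without = sum(p.numel() for p in no_bias.parameters())
print(n_with, n_without)      # 840 800
```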
Examples:
2-D Tensor:
m = nn.Linear(20, 40)
input = torch.randn(128, 20)
output = m(input)
print(output.size())  # torch.Size([128, 40])
4-D Tensor:
m = nn.Linear(128, 64)
input = torch.randn(512, 3, 128, 128)
output = m(input)
print(output.size())  # torch.Size([512, 3, 128, 64])
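For inputs with more than two dimensions, only the last dimension is transformed; the leading dimensions behave like batch dimensions. The sketch below illustrates this by comparing a direct call with a flatten, apply, reshape sequence. The tensor is smaller than the 4-D example in the article simply to keep the run quick.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

m = nn.Linear(128, 64)
x = torch.randn(8, 3, 16, 128)           # last dimension must equal in_features

direct = m(x)                            # applied to the 4-D tensor directly
flat = m(x.reshape(-1, 128))             # all leading dimensions merged: 8*3*16 = 384 rows
restored = flat.reshape(8, 3, 16, 64)    # split the leading dimensions again

print(direct.shape)                      # torch.Size([8, 3, 16, 64])
print(flat.shape)                        # torch.Size([384, 64])
print(torch.allclose(direct, restored, atol=1e-5))
```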
# A copy of nn.Linear, renamed myLinear, with print statements added
# to trace which methods are called.
import math

import torch
import torch.nn as nn
from torch import Tensor
from torch.nn.parameter import Parameter, UninitializedParameter
from torch.nn import functional as F
from torch.nn import init
# from .lazy import LazyModuleMixin


class myLinear(nn.Module):
    r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

    This module supports :ref:`TensorFloat32<tf32_on_ampere>`.

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input: :math:`(*, H_{in})` where :math:`*` means any number of
          dimensions including none and :math:`H_{in} = \text{in\_features}`.
        - Output: :math:`(*, H_{out})` where all but the last dimension
          are the same shape as the input and :math:`H_{out} = \text{out\_features}`.

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in\_features})`. The values are
            initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
                :math:`k = \frac{1}{\text{in\_features}}`

    Examples::

        >>> m = nn.Linear(20, 30)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
        torch.Size([128, 30])
    """
    __constants__ = ['in_features', 'out_features']
    in_features: int
    out_features: int
    weight: Tensor

    def __init__(self, in_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super(myLinear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
        if bias:
            self.bias = Parameter(torch.empty(out_features, **factory_kwargs))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        # Setting a=sqrt(5) in kaiming_uniform is the same as initializing with
        # uniform(-1/sqrt(in_features), 1/sqrt(in_features)). For details, see
        # https://github.com/pytorch/pytorch/issues/57109
        print("333")  # trace: reset_parameters was called
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in) if fan_in > 0 else 0
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input: Tensor) -> Tensor:
        print("111")  # trace: forward was called
        print("self.weight.shape =", tuple(self.weight.shape))
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self) -> str:
        print("www")  # trace: extra_repr was called
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )


# m = myLinear(20, 40)
# input = torch.randn(128, 40, 20)
# output = m(input)
# print(output.size())

m = myLinear(128, 64)
input = torch.randn(512, 3, 128, 128)
output = m(input)
print(output.size())  # torch.Size([512, 3, 128, 64])
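The comment in reset_parameters says that kaiming_uniform_ with a=sqrt(5) amounts to sampling from uniform(-1/sqrt(in_features), 1/sqrt(in_features)). The sketch below is a quick empirical check of that statement on the stock nn.Linear; the sizes and the fixed seed are assumptions made only for this example.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)

in_features, out_features = 128, 64      # illustrative sizes
m = nn.Linear(in_features, out_features)

bound = 1 / math.sqrt(in_features)       # expected limit for both weight and bias

w_max = m.weight.abs().max().item()      # largest weight magnitude after initialization
b_max = m.bias.abs().max().item()        # largest bias magnitude after initialization
print(f"bound={bound:.4f}  max|w|={w_max:.4f}  max|b|={b_max:.4f}")
```

Both maxima should stay at or below the bound; this checks the range only, not that the distribution is uniform.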
4. Official source code of nn.Linear:
import math

import torch
from torch import Tensor
from torch.nn.parameter import Parameter, UninitializedParameter
from .. import functional as F
from .. import init
from .module import Module
from .lazy import LazyModuleMixin


class Identity(Module):
    r"""A placeholder identity operator that is argument-insensitive.

    Args:
        args: any argument (unused)
        kwargs: any keyword argument (unused)

    Shape:
        - Input: :math:`(*)`, where :math:`*` means any number of dimensions.
        - Output: :math:`(*)`, same shape as the input.

    Examples::

        >>> m = nn.Identity(54, unused_argument1=0.1, unused_argument2=False)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
        torch.Size([128, 20])

    """
    def __init__(self, *args, **kwargs):
        super(Identity, self).__init__()

    def forward(self, input: Tensor) -> Tensor:
        return input


class Linear(Module):
    r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

    This module supports :ref:`TensorFloat32<tf32_on_ampere>`.

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input: :math:`(*, H_{in})` where :math:`*` means any number of
          dimensions including none and :math:`H_{in} = \text{in\_features}`.
        - Output: :math:`(*, H_{out})` where all but the last dimension
          are the same shape as the input and :math:`H_{out} = \text{out\_features}`.

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in\_features})`. The values are
            initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
                :math:`k = \frac{1}{\text{in\_features}}`

    Examples::

        >>> m = nn.Linear(20, 30)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
        torch.Size([128, 30])
    """
    __constants__ = ['in_features', 'out_features']
    in_features: int
    out_features: int
    weight: Tensor

    def __init__(self, in_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
        if bias:
            self.bias = Parameter(torch.empty(out_features, **factory_kwargs))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        # Setting a=sqrt(5) in kaiming_uniform is the same as initializing with
        # uniform(-1/sqrt(in_features), 1/sqrt(in_features)). For details, see
        # https://github.com/pytorch/pytorch/issues/57109
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in) if fan_in > 0 else 0
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input: Tensor) -> Tensor:
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self) -> str:
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )


# This class exists solely to avoid triggering an obscure error when scripting
# an improperly quantized attention layer. See this issue for details:
# https://github.com/pytorch/pytorch/issues/58969
# TODO: fail fast on quantization API usage error, then remove this class
# and replace uses of it with plain Linear
class NonDynamicallyQuantizableLinear(Linear):
    def __init__(self, in_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        super().__init__(in_features, out_features, bias=bias,
                         device=device, dtype=dtype)


class Bilinear(Module):
    r"""Applies a bilinear transformation to the incoming data:
    :math:`y = x_1^T A x_2 + b`

    Args:
        in1_features: size of each first input sample
        in2_features: size of each second input sample
        out_features: size of each output sample
        bias: If set to False, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input1: :math:`(*, H_{in1})` where :math:`H_{in1}=\text{in1\_features}` and
          :math:`*` means any number of additional dimensions including none. All but the last dimension
          of the inputs should be the same.
        - Input2: :math:`(*, H_{in2})` where :math:`H_{in2}=\text{in2\_features}`.
        - Output: :math:`(*, H_{out})` where :math:`H_{out}=\text{out\_features}`
          and all but the last dimension are the same shape as the input.

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in1\_features}, \text{in2\_features})`.
            The values are initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in1\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
                :math:`k = \frac{1}{\text{in1\_features}}`

    Examples::

        >>> m = nn.Bilinear(20, 30, 40)
        >>> input1 = torch.randn(128, 20)
        >>> input2 = torch.randn(128, 30)
        >>> output = m(input1, input2)
        >>> print(output.size())
        torch.Size([128, 40])
    """
    __constants__ = ['in1_features', 'in2_features', 'out_features']
    in1_features: int
    in2_features: int
    out_features: int
    weight: Tensor

    def __init__(self, in1_features: int, in2_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super(Bilinear, self).__init__()
        self.in1_features = in1_features
        self.in2_features = in2_features
        self.out_features = out_features
        self.weight = Parameter(torch.empty((out_features, in1_features, in2_features), **factory_kwargs))

        if bias:
            self.bias = Parameter(torch.empty(out_features, **factory_kwargs))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        bound = 1 / math.sqrt(self.weight.size(1))
        init.uniform_(self.weight, -bound, bound)
        if self.bias is not None:
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input1: Tensor, input2: Tensor) -> Tensor:
        return F.bilinear(input1, input2, self.weight, self.bias)

    def extra_repr(self) -> str:
        return 'in1_features={}, in2_features={}, out_features={}, bias={}'.format(
            self.in1_features, self.in2_features, self.out_features, self.bias is not None
        )


class LazyLinear(LazyModuleMixin, Linear):
    r"""A :class:`torch.nn.Linear` module where `in_features` is inferred.

    In this module, the `weight` and `bias` are of :class:`torch.nn.UninitializedParameter`
    class. They will be initialized after the first call to ``forward`` is done and the
    module will become a regular :class:`torch.nn.Linear` module. The ``in_features`` argument
    of the :class:`Linear` is inferred from the ``input.shape[-1]``.

    Check the :class:`torch.nn.modules.lazy.LazyModuleMixin` for further documentation
    on lazy modules and their limitations.

    Args:
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in\_features})`. The values are
            initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
                :math:`k = \frac{1}{\text{in\_features}}`


    """

    cls_to_become = Linear  # type: ignore[assignment]
    weight: UninitializedParameter
    bias: UninitializedParameter  # type: ignore[assignment]

    def __init__(self, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        # bias is hardcoded to False to avoid creating tensor
        # that will soon be overwritten.
        super().__init__(0, 0, False)
        self.weight = UninitializedParameter(**factory_kwargs)
        self.out_features = out_features
        if bias:
            self.bias = UninitializedParameter(**factory_kwargs)

    def reset_parameters(self) -> None:
        if not self.has_uninitialized_params() and self.in_features != 0:
            super().reset_parameters()

    def initialize_parameters(self, input) -> None:  # type: ignore[override]
        if self.has_uninitialized_params():
            with torch.no_grad():
                self.in_features = input.shape[-1]
                self.weight.materialize((self.out_features, self.in_features))
                if self.bias is not None:
                    self.bias.materialize((self.out_features,))
                self.reset_parameters()
# TODO: PartialLinear - maybe in sparse?
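The LazyLinear docstring in the source above says that in_features is inferred from input.shape[-1] on the first forward call, after which the module becomes a regular Linear. A minimal sketch of that behaviour, assuming a PyTorch version that ships nn.LazyLinear; the sizes are illustrative only.

```python
import torch
import torch.nn as nn

lazy = nn.LazyLinear(64)                 # only out_features is given
print(type(lazy).__name__)               # LazyLinear

x = torch.randn(10, 128)
y = lazy(x)                              # first forward call materializes weight and bias

print(type(lazy).__name__)               # Linear
print(lazy.in_features)                  # 128
print(lazy.weight.shape)                 # torch.Size([64, 128])
print(y.shape)                           # torch.Size([10, 64])
```

Depending on the PyTorch version, constructing a lazy module may emit a warning that the feature is still under development.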
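1) nn.Linear is a class; instantiate it before use.
2) When instantiating, nn.Linear takes two arguments: in_features is the number of neurons in the previous layer, and out_features is the number of neurons in this layer.
3) You do not need to define w and b yourself. Layers with learnable parameters, such as nn.Linear, randomly generate initial values for w and b when they are instantiated, so afterwards you can read the weight and bias attributes to inspect the generated w and b. w is always generated, while whether b is generated is under your control: the nn.Linear class has a bias parameter that defaults to bias = True. If you do not want to fit the constant b, set bias to False when instantiating.
4) Because w and b are randomly generated, running the same code several times gives different results. To control the randomness, set the seed through the torch.random module, e.g.: torch.random.manual_seed(420)  # manually set the random seed
5) Because the constant b does not have to be defined by hand, the feature tensor does not need a column reserved for multiplication with the constant term; pass in the feature tensor only.
6) There is only one input layer, and its structure (the number of neurons) is determined by the input feature tensor X, so when building a neural network in PyTorch you do not need to define the input layer.
7) After instantiation, pass the feature tensor to the instance.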
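The notes above can be put together in a short sketch: seed the generator, instantiate the layer, inspect weight and bias, then pass in a feature tensor. The data in X is made up for illustration, and the variable names are not from the original article.

```python
import torch
import torch.nn as nn

# 4 samples with 2 features each (made-up data); no constant column is needed.
X = torch.tensor([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])

torch.random.manual_seed(420)            # fix the seed before creating the layer
layer_a = nn.Linear(2, 1)
out_a = layer_a(X)

torch.random.manual_seed(420)            # same seed, so the same initial w and b
layer_b = nn.Linear(2, 1)
out_b = layer_b(X)

print(layer_a.weight.shape, layer_a.bias.shape)   # torch.Size([1, 2]) torch.Size([1])
print(torch.equal(layer_a.weight, layer_b.weight))
print(torch.equal(out_a, out_b))
print(out_a.shape)                                # torch.Size([4, 1])
```

Without the two manual_seed calls, layer_a and layer_b would generally start from different random values.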