PyTorch Code Notes


1. PyTorch model initialization

import torch
import torch.nn as nn

class Encoder(nn.Module):
    # Model definition
    def __init__(self):
        super(Encoder, self).__init__()
        # ......

    def forward(self, x):
        # ......
        return x

def weights_init(layer):
    # Weight initialization. Applied to every submodule in one pass via
    # model.apply(); it can also be done inside the model definition itself.
    if hasattr(layer, 'weight'):
        if len(layer.weight.shape) > 1:
            torch.nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')

if __name__ == '__main__':
    model = Encoder()
    model.apply(weights_init)
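As the comment above notes, initialization can also live inside the model definition rather than in a batch apply(). A minimal sketch of that variant, with a hypothetical nn.Linear layer standing in for the real submodules:

import torch
import torch.nn as nn

class EncoderInline(nn.Module):
    def __init__(self):
        super(EncoderInline, self).__init__()
        self.fc = nn.Linear(64, 128)  # hypothetical layer for illustration
        # Initialize the weight right where the layer is created
        torch.nn.init.kaiming_normal_(self.fc.weight, nonlinearity='relu')

    def forward(self, x):
        return self.fc(x)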

2. Optimization algorithm: optim.Adam()

import torch.optim as optim

optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
'''
params (iterable) - an iterable of parameters to optimize, or a dict defining parameter groups
lr (float, optional) - learning rate (default: 1e-3)
betas (Tuple[float, float], optional) - coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
    beta1: exponential decay rate for the first-moment estimates (e.g. 0.9)
    beta2: exponential decay rate for the second-moment estimates (e.g. 0.999);
           for sparse gradients (e.g. in NLP or computer-vision tasks) this should be set close to 1
eps (float, optional) - term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) - weight decay (L2 penalty) (default: 0)
'''
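For context, a typical training step that actually uses the optimizer looks like the sketch below; the model, loss, and dummy batch are placeholders, not part of the original note:

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)                              # placeholder model
criterion = nn.MSELoss()                              # placeholder loss
optimizer = optim.Adam(model.parameters(), lr=0.001)

x = torch.randn(32, 10)                               # dummy batch
y = torch.randn(32, 1)

optimizer.zero_grad()        # clear gradients from the previous step
loss = criterion(model(x), y)
loss.backward()              # compute gradients
optimizer.step()             # apply the Adam update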

3. Saving and loading models

import torch

# Save a checkpoint
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'loss': loss,
    # ...
}, PATH)

# Load the checkpoint
model = TheModelClass(*args, **kwargs)
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']

model.eval()  # or model.train()
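To resume training (rather than only run inference), the optimizer state is usually checkpointed alongside the model. A sketch assuming an existing optimizer object:

# Save model and optimizer state together
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss,
}, PATH)

# Restore both before continuing training
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
model.train()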

4. LSTM and BiLSTM

Parameter information:

input_size: feature dimension of x
hidden_size: feature dimension of the hidden state
num_layers: number of stacked LSTM layers; default 1
bias: if False, b_ih = 0 and b_hh = 0; default True
batch_first: if True, input and output tensors are shaped (batch, seq, feature)
dropout: applies dropout to the output of every layer except the last; default 0
bidirectional: if True, the LSTM is bidirectional; default False
Input: input, (h0, c0)
Output: output, (hn, cn)

Input shapes:
input: (seq_len, batch, input_size)
h0: (num_layers * num_directions, batch, hidden_size)
c0: (num_layers * num_directions, batch, hidden_size)

Output shapes:
output: (seq_len, batch, hidden_size * num_directions)
hn: (num_layers * num_directions, batch, hidden_size)
cn: (num_layers * num_directions, batch, hidden_size)

seq_len: sequence length (text length)
batch: batch size
input_size: input dimension
num_layers: number of LSTM layers
num_directions: 2 if bidirectional, 1 otherwise
hidden_size: hidden-state dimension

h0-hn and c0-cn are the hidden states and cell states of each LSTM layer, respectively.

import torch
import torch.nn as nn

# 3-layer bidirectional LSTM
lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=3, bidirectional=True)
lstm_input = torch.rand(10, 64, 64)   # (seq_len, batch, input_size)
lstm_out, (h, c) = lstm(lstm_input)
print(lstm_out.shape)
print(h.shape)
print(c.shape)

# 3-layer unidirectional LSTM
lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=3)
lstm_input = torch.rand(10, 64, 64)
lstm_out, (h, c) = lstm(lstm_input)
print(lstm_out.shape)
print(h.shape)
print(c.shape)

Output:

Bidirectional run (num_layers=3, bidirectional=True):
torch.Size([10, 64, 256])
torch.Size([6, 64, 128])
torch.Size([6, 64, 128])

Unidirectional run (num_layers=3):
torch.Size([10, 64, 128])
torch.Size([3, 64, 128])
torch.Size([3, 64, 128])
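The runs above use the default (seq_len, batch, input_size) layout and an implicit zero initial state. A sketch with batch_first=True and an explicit (h0, c0); the sizes are illustrative:

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=3,
               bidirectional=True, batch_first=True)
x = torch.rand(8, 10, 64)       # (batch, seq_len, input_size)
h0 = torch.zeros(6, 8, 128)     # (num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(6, 8, 128)
out, (hn, cn) = lstm(x, (h0, c0))
print(out.shape)   # torch.Size([8, 10, 256]); output follows batch_first
print(hn.shape)    # torch.Size([6, 8, 128]); h/c shapes are unaffected by batch_first
print(cn.shape)    # torch.Size([6, 8, 128])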