PyTorch Code Notes


1. PyTorch model initialization

import torch
import torch.nn as nn

class Encoder(nn.Module):
    # Model definition
    def __init__(self):
        super(Encoder, self).__init__()
        # ......

    def forward(self, x):
        # ......
        return x

def weights_init(layer):
    # Weight initialization. Applied to every submodule in one pass via
    # model.apply(); it can also be done inside the model definition itself.
    if hasattr(layer, 'weight'):
        if len(layer.weight.shape) > 1:
            torch.nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')

if __name__ == '__main__':
    model = Encoder()
    model.apply(weights_init)
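As the comment above notes, initialization can also live inside the model definition rather than in a batch apply(). A minimal sketch of that variant, with a hypothetical nn.Linear layer standing in for the real submodules:

import torch
import torch.nn as nn

class EncoderInline(nn.Module):
    def __init__(self):
        super(EncoderInline, self).__init__()
        self.fc = nn.Linear(64, 128)  # hypothetical layer for illustration
        # Initialize the weight right where the layer is created
        torch.nn.init.kaiming_normal_(self.fc.weight, nonlinearity='relu')

    def forward(self, x):
        return self.fc(x)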

2. Optimization algorithm: optim.Adam()

import torch.optim as optim

optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
'''
params (iterable) - an iterable of parameters to optimize, or a dict defining parameter groups
lr (float, optional) - learning rate (default: 1e-3)
betas (Tuple[float, float], optional) - coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
    beta1: exponential decay rate for the first-moment estimates (e.g. 0.9)
    beta2: exponential decay rate for the second-moment estimates (e.g. 0.999);
           for sparse gradients (e.g. in NLP or computer-vision tasks) this should be set close to 1
eps (float, optional) - term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) - weight decay (L2 penalty) (default: 0)
'''
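For context, a typical training step that actually uses the optimizer looks like the sketch below; the model, loss, and dummy batch are placeholders, not part of the original note:

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)                              # placeholder model
criterion = nn.MSELoss()                              # placeholder loss
optimizer = optim.Adam(model.parameters(), lr=0.001)

x = torch.randn(32, 10)                               # dummy batch
y = torch.randn(32, 1)

optimizer.zero_grad()        # clear gradients from the previous step
loss = criterion(model(x), y)
loss.backward()              # compute gradients
optimizer.step()             # apply the Adam update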

3. Saving and loading models

import torch

# Save a checkpoint
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'loss': loss,
    # ...
}, PATH)

# Load the checkpoint
model = TheModelClass(*args, **kwargs)
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']

model.eval()  # or model.train()
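To resume training (rather than only run inference), the optimizer state is usually checkpointed alongside the model. A sketch assuming an existing optimizer object:

# Save model and optimizer state together
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss,
}, PATH)

# Restore both before continuing training
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
model.train()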

4. LSTM and BiLSTM

Parameter information:

input_size: feature dimension of x
hidden_size: feature dimension of the hidden state
num_layers: number of stacked LSTM layers; default 1
bias: if False, b_ih = 0 and b_hh = 0; default True
batch_first: if True, input and output tensors are shaped (batch, seq, feature)
dropout: applies dropout to the output of every layer except the last; default 0
bidirectional: if True, the LSTM is bidirectional; default False
Input: input, (h0, c0)
Output: output, (hn, cn)

Input shapes:
input: (seq_len, batch, input_size)
h0: (num_layers * num_directions, batch, hidden_size)
c0: (num_layers * num_directions, batch, hidden_size)

Output shapes:
output: (seq_len, batch, hidden_size * num_directions)
hn: (num_layers * num_directions, batch, hidden_size)
cn: (num_layers * num_directions, batch, hidden_size)

seq_len: sequence length (text length)
batch: batch size
input_size: input dimension
num_layers: number of LSTM layers
num_directions: 2 if bidirectional, 1 otherwise
hidden_size: hidden-state dimension

h0-hn and c0-cn are the hidden states and cell states of each LSTM layer, respectively.

import torch
import torch.nn as nn

# 3-layer bidirectional LSTM
lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=3, bidirectional=True)
lstm_input = torch.rand(10, 64, 64)   # (seq_len, batch, input_size)
lstm_out, (h, c) = lstm(lstm_input)
print(lstm_out.shape)
print(h.shape)
print(c.shape)

# 3-layer unidirectional LSTM
lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=3)
lstm_input = torch.rand(10, 64, 64)
lstm_out, (h, c) = lstm(lstm_input)
print(lstm_out.shape)
print(h.shape)
print(c.shape)

Output:

Bidirectional run (num_layers=3, bidirectional=True):
torch.Size([10, 64, 256])
torch.Size([6, 64, 128])
torch.Size([6, 64, 128])

Unidirectional run (num_layers=3):
torch.Size([10, 64, 128])
torch.Size([3, 64, 128])
torch.Size([3, 64, 128])
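The runs above use the default (seq_len, batch, input_size) layout and an implicit zero initial state. A sketch with batch_first=True and an explicit (h0, c0); the sizes are illustrative:

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=3,
               bidirectional=True, batch_first=True)
x = torch.rand(8, 10, 64)       # (batch, seq_len, input_size)
h0 = torch.zeros(6, 8, 128)     # (num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(6, 8, 128)
out, (hn, cn) = lstm(x, (h0, c0))
print(out.shape)   # torch.Size([8, 10, 256]); output follows batch_first
print(hn.shape)    # torch.Size([6, 8, 128]); h/c shapes are unaffected by batch_first
print(cn.shape)    # torch.Size([6, 8, 128])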