Artificial Intelligence (PyTorch) Model Building 12: Building a BiGRU Model with PyTorch and Training It on Normally Distributed Data

Hello everyone, I am 微学AI. Today I present part 12 of this series on building AI models with PyTorch: building a BiGRU model and training it on normally distributed data. This article walks through a PyTorch-based BiGRU project. We first explain the principles behind the BiGRU model, then build it with PyTorch, providing both the model code and sample data. Next, we load the data into the model for training, printing the loss values, and test the model after training to report its accuracy. Finally, we provide the full table of contents and the complete implementation code.

Contents

  1. BiGRU Model Principles
  2. Building the BiGRU Model with PyTorch
  3. Sample Data
  4. Model Training
  5. Model Testing
  6. Complete Code

1. BiGRU Model Principles

BiGRU (bidirectional gated recurrent unit) is an improved recurrent neural network (RNN) architecture consisting of two independent GRU layers, one processing the sequence in the forward direction and the other in the backward direction. This bidirectional structure lets the model use context from both earlier and later parts of the sequence at every time step, which improves performance on many sequence tasks.

The GRU (gated recurrent unit) is an RNN variant that mitigates the vanishing-gradient problem of traditional RNNs by introducing an update gate and a reset gate. The update gate decides when the hidden state should be updated, while the reset gate decides how much past information is allowed to influence the current hidden state.
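
For reference, in the parameterization used by PyTorch's `nn.GRU` (the exact convention varies slightly across the literature), a GRU cell computes:

$r_t = \sigma(W_{ir} x_t + b_{ir} + W_{hr} h_{t-1} + b_{hr})$

$z_t = \sigma(W_{iz} x_t + b_{iz} + W_{hz} h_{t-1} + b_{hz})$

$n_t = \tanh(W_{in} x_t + b_{in} + r_t \odot (W_{hn} h_{t-1} + b_{hn}))$

$h_t = (1 - z_t) \odot n_t + z_t \odot h_{t-1}$

where $r_t$ is the reset gate, $z_t$ is the update gate, $\sigma$ is the sigmoid function, and $\odot$ denotes element-wise multiplication.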

Mathematically, the BiGRU model can be expressed with the following formulas.

First, for an input sequence $X = \{x_1, x_2, \ldots, x_T\}$, the forward computation of the BiGRU model is:

$\overrightarrow{h_t} = \text{GRU}(\overrightarrow{h_{t-1}}, x_t)$

$\overleftarrow{h_t} = \text{GRU}(\overleftarrow{h_{t+1}}, x_t)$

where $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$ are the left-to-right and right-to-left hidden states respectively, $\text{GRU}$ denotes a GRU unit, and $x_t$ is the $t$-th element of the input sequence.

The hidden states from the two directions are then concatenated to form the final hidden state $h_t$:

$h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$

where $[\cdot\,;\cdot]$ denotes vector concatenation.

Finally, the hidden state $h_t$ is passed through a fully connected layer to obtain the output $y_t$:

$y_t = \text{softmax}(W h_t + b)$

where $W$ and $b$ are the weight and bias of the fully connected layer, and $\text{softmax}$ is the softmax activation function.
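
To connect these formulas to the code below, here is a minimal sketch (dimensions chosen only for illustration) showing that `nn.GRU` with `bidirectional=True` already returns the concatenation $[\overrightarrow{h_t}; \overleftarrow{h_t}]$ along the feature axis:

    import torch
    import torch.nn as nn

    # 1 input feature and 32 hidden units, matching the model built later
    gru = nn.GRU(input_size=1, hidden_size=32, batch_first=True, bidirectional=True)
    x = torch.randn(4, 20, 1)  # (batch, seq_len, input_size)
    out, h_n = gru(x)
    print(out.shape)           # torch.Size([4, 20, 64]) = (batch, seq_len, 2 * hidden_size)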

2. Building the BiGRU Model with PyTorch

First, we import the required libraries:

    import torch
    import torch.nn as nn

Next, we define the BiGRU model class:

    class BiGRU(nn.Module):
        def __init__(self, input_size, hidden_size, num_layers, num_classes):
            super(BiGRU, self).__init__()
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.gru = nn.GRU(input_size, hidden_size, num_layers,
                              batch_first=True, bidirectional=True)
            # The bidirectional GRU outputs 2 * hidden_size features per step
            self.fc = nn.Linear(hidden_size * 2, num_classes)

        def forward(self, x):
            # Initialize the hidden state: (num_layers * 2 directions, batch, hidden_size)
            h0 = torch.zeros(self.num_layers * 2, x.size(0), self.hidden_size, device=x.device)
            # Bidirectional GRU
            out, _ = self.gru(x, h0)
            # Take the features of the last time step
            out = out[:, -1, :]
            # Fully connected layer
            out = self.fc(out)
            return out
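
As a quick sanity check (the shapes here match the sample data generated in the next section), we can instantiate the model and push a dummy batch through it:

    m = BiGRU(input_size=1, hidden_size=32, num_layers=1, num_classes=2)
    dummy = torch.randn(4, 20, 1)  # (batch, seq_len, input_size)
    print(m(dummy).shape)          # torch.Size([4, 2])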

3. Sample Data

To keep the problem simple, we use a small synthetic dataset. It contains 100 samples (50 per class), each with 20 time steps and one feature per step, and the labels pose a binary classification problem.

    # Generate sample data
    import numpy as np

    # Normally distributed random numbers with mean 1
    data_0 = np.random.randn(50, 20, 1) + 1
    # Normally distributed random numbers with mean -1
    data_1 = np.random.randn(50, 20, 1) - 1
    # Concatenate into the full dataset
    data = np.concatenate([data_0, data_1], axis=0)
    # Labels: an array of matching size
    labels = np.concatenate([np.zeros((50, 1)), np.ones((50, 1))], axis=0)
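
A quick check of the resulting array shapes (these follow directly from the code above):

    print(data.shape, labels.shape)  # (100, 20, 1) (100, 1)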

4. Model Training

First, we convert the data to PyTorch tensors and split it into a training set and a validation set.

    from sklearn.model_selection import train_test_split

    X_train, X_val, y_train, y_val = train_test_split(data, labels, test_size=0.2, random_state=42)
    X_train = torch.tensor(X_train, dtype=torch.float32)
    y_train = torch.tensor(y_train, dtype=torch.long)
    X_val = torch.tensor(X_val, dtype=torch.float32)
    y_val = torch.tensor(y_val, dtype=torch.long)

Next, we define the training and validation functions:

    def train(model, device, X_train, y_train, optimizer, criterion):
        model.train()
        optimizer.zero_grad()
        output = model(X_train.to(device))
        loss = criterion(output, y_train.squeeze().to(device))
        loss.backward()
        optimizer.step()
        return loss.item()

    def validate(model, device, X_val, y_val, criterion):
        model.eval()
        with torch.no_grad():
            output = model(X_val.to(device))
            loss = criterion(output, y_val.squeeze().to(device))
        return loss.item()
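
Note that `train` passes the whole training set through the model as a single batch, which is fine for 80 samples. For larger datasets, a mini-batch variant along the following lines (a sketch using `torch.utils.data`, not part of the original pipeline) would be more typical:

    from torch.utils.data import TensorDataset, DataLoader

    loader = DataLoader(TensorDataset(X_train, y_train.squeeze()), batch_size=16, shuffle=True)

    def train_batched(model, device, loader, optimizer, criterion):
        model.train()
        total_loss = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = criterion(model(xb.to(device)), yb.to(device))
            loss.backward()
            optimizer.step()
            total_loss += loss.item() * xb.size(0)
        # Average loss over all samples in the epoch
        return total_loss / len(loader.dataset)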

Now we can start training the model:

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    input_size = 1
    hidden_size = 32
    num_layers = 1
    num_classes = 2
    num_epochs = 10
    learning_rate = 0.01

    model = BiGRU(input_size, hidden_size, num_layers, num_classes).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    for epoch in range(num_epochs):
        train_loss = train(model, device, X_train, y_train, optimizer, criterion)
        val_loss = validate(model, device, X_val, y_val, criterion)
        print(f"Epoch [{epoch + 1}/{num_epochs}], Train Loss: {train_loss:.4f}, Validation Loss: {val_loss:.4f}")

5. Model Testing

After training is complete, we can evaluate the model's performance on a test set. Here, we reuse the validation data as the test data.

    def test(model, device, X_test, y_test):
        model.eval()
        with torch.no_grad():
            output = model(X_test.to(device))
            _, predicted = torch.max(output.data, 1)
            correct = (predicted == y_test.squeeze().to(device)).sum().item()
            accuracy = correct / y_test.size(0)
        return accuracy

    test_accuracy = test(model, device, X_val, y_val)
    print(f"Test Accuracy: {test_accuracy * 100:.2f}%")
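
To classify a single new sequence after training, something along these lines works (a sketch; the sample here is drawn from the mean +1 distribution, so class 0 is expected):

    sample = torch.tensor(np.random.randn(1, 20, 1) + 1, dtype=torch.float32)
    model.eval()
    with torch.no_grad():
        pred = model(sample.to(device)).argmax(dim=1)
    print(pred.item())  # expected: 0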

6. Complete Code

The complete code discussed in this article is as follows:

    # Import libraries
    import torch
    import torch.nn as nn
    import numpy as np
    from sklearn.model_selection import train_test_split

    # Define the BiGRU model
    class BiGRU(nn.Module):
        def __init__(self, input_size, hidden_size, num_layers, num_classes):
            super(BiGRU, self).__init__()
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.gru = nn.GRU(input_size, hidden_size, num_layers,
                              batch_first=True, bidirectional=True)
            self.fc = nn.Linear(hidden_size * 2, num_classes)

        def forward(self, x):
            # Initial hidden state: (num_layers * 2 directions, batch, hidden_size)
            h0 = torch.zeros(self.num_layers * 2, x.size(0), self.hidden_size, device=x.device)
            out, _ = self.gru(x, h0)
            # Take the last time step's features
            out = out[:, -1, :]
            out = self.fc(out)
            return out

    # Generate sample data
    # Normally distributed random numbers with mean 1
    data_0 = np.random.randn(50, 20, 1) + 1
    # Normally distributed random numbers with mean -1
    data_1 = np.random.randn(50, 20, 1) - 1
    # Concatenate into the full dataset
    data = np.concatenate([data_0, data_1], axis=0)
    # Labels: an array of matching size
    labels = np.concatenate([np.zeros((50, 1)), np.ones((50, 1))], axis=0)

    # Split into training and validation sets
    X_train, X_val, y_train, y_val = train_test_split(data, labels, test_size=0.2, random_state=42)
    X_train = torch.tensor(X_train, dtype=torch.float32)
    y_train = torch.tensor(y_train, dtype=torch.long)
    X_val = torch.tensor(X_val, dtype=torch.float32)
    y_val = torch.tensor(y_val, dtype=torch.long)

    # Define training and validation functions
    def train(model, device, X_train, y_train, optimizer, criterion):
        model.train()
        optimizer.zero_grad()
        output = model(X_train.to(device))
        loss = criterion(output, y_train.squeeze().to(device))
        loss.backward()
        optimizer.step()
        return loss.item()

    def validate(model, device, X_val, y_val, criterion):
        model.eval()
        with torch.no_grad():
            output = model(X_val.to(device))
            loss = criterion(output, y_val.squeeze().to(device))
        return loss.item()

    # Train the model
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    input_size = 1
    hidden_size = 32
    num_layers = 1
    num_classes = 2
    num_epochs = 10
    learning_rate = 0.01

    model = BiGRU(input_size, hidden_size, num_layers, num_classes).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    for epoch in range(num_epochs):
        train_loss = train(model, device, X_train, y_train, optimizer, criterion)
        val_loss = validate(model, device, X_val, y_val, criterion)
        print(f"Epoch [{epoch + 1}/{num_epochs}], Train Loss: {train_loss:.4f}, Validation Loss: {val_loss:.4f}")

    # Test the model
    def test(model, device, X_test, y_test):
        model.eval()
        with torch.no_grad():
            output = model(X_test.to(device))
            _, predicted = torch.max(output.data, 1)
            correct = (predicted == y_test.squeeze().to(device)).sum().item()
            accuracy = correct / y_test.size(0)
        return accuracy

    test_accuracy = test(model, device, X_val, y_val)
    print(f"Test Accuracy: {test_accuracy * 100:.2f}%")

Output:

    Epoch [1/10], Train Loss: 0.7157, Validation Loss: 0.6330
    Epoch [2/10], Train Loss: 0.6215, Validation Loss: 0.5666
    Epoch [3/10], Train Loss: 0.5390, Validation Loss: 0.4980
    Epoch [4/10], Train Loss: 0.4613, Validation Loss: 0.4214
    Epoch [5/10], Train Loss: 0.3825, Validation Loss: 0.3335
    Epoch [6/10], Train Loss: 0.2987, Validation Loss: 0.2357
    Epoch [7/10], Train Loss: 0.2096, Validation Loss: 0.1381
    Epoch [8/10], Train Loss: 0.1230, Validation Loss: 0.0644
    Epoch [9/10], Train Loss: 0.0581, Validation Loss: 0.0273
    Epoch [10/10], Train Loss: 0.0252, Validation Loss: 0.0125
    Test Accuracy: 100.00%

This article presented a complete PyTorch implementation of a BiGRU model. We explained the principles behind BiGRU, built the model with PyTorch, provided the model code and sample data, and showed how to load the data into the model for training and testing. I hope it helps you understand and implement BiGRU models.


Reposted from: https://blog.csdn.net/weixin_42878111/article/details/131227085. Copyright belongs to the original author, 微学AI.
