Implementing GCN from 0 to 1: A Detailed Code Walkthrough

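I recently needed a graph convolutional network (GCN) for a paper, so I read through some GCN code, including implementations based on PyTorch Geometric Temporal. These are my notes.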

GCN from scratch

I will not go into the theory behind GCN here, since other articles already explain it in detail. Let's go straight to the code. The GCN formula is:

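$\mathrm{GCN}(A, X)=\sigma\left(\widehat{D}^{-\frac{1}{2}} \hat{A} \widehat{D}^{-\frac{1}{2}} X W\right)$

where $A$ is the adjacency matrix; $X$ is the node feature matrix input at time $t$; $\widehat{D}^{-\frac{1}{2}} \hat{A} \widehat{D}^{-\frac{1}{2}}$ is the approximate graph convolution filter, with $\hat{A} = A + I_{N}$ ($I_{N}$ is the N-dimensional identity matrix); $\widehat{D}$ is the degree matrix of $\hat{A}$; $W$ is the weight matrix the network has to learn; and $\sigma\left(\cdot\right)$ is the activation function, here ReLU.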
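To make the filter concrete, here is a small worked example. The 3-node path graph 0 - 1 - 2 is my own toy choice, not something from the original data. Each entry of the filter is $\hat{A}_{ij} / \sqrt{\widehat{D}_{ii} \widehat{D}_{jj}}$:

```latex
A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},\quad
\hat{A} = A + I_{3} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix},\quad
\widehat{D} = \operatorname{diag}(2, 3, 2)

\widehat{D}^{-\frac{1}{2}} \hat{A} \widehat{D}^{-\frac{1}{2}}
= \begin{pmatrix}
\frac{1}{2} & \frac{1}{\sqrt{6}} & 0 \\
\frac{1}{\sqrt{6}} & \frac{1}{3} & \frac{1}{\sqrt{6}} \\
0 & \frac{1}{\sqrt{6}} & \frac{1}{2}
\end{pmatrix}
```

The result is symmetric, and the self-loops added by $I_{3}$ are what keep a node's own features in the aggregation.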

Following the formula step by step, the code that computes the GCN factor is:

import numpy as np

def get_gcn_fact(adj):
    '''
    Compute the GCN factor D^-1/2 (A + I_N) D^-1/2 of a network snapshot.
    :param adj: the adjacency matrix of a specific network snapshot
    :return: the corresponding GCN factor
    '''
    adj_ = adj + np.eye(adj.shape[0])  # A + I_N (size taken from adj, not from a global node_num)
    row_sum = np.asarray(adj_.sum(1)).flatten()  # node degrees, i.e. the diagonal of D
    d_inv_sqrt = np.power(row_sum, -0.5)  # D^-1/2
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.  # replace inf values (zero-degree nodes) with 0
    d_mat_inv_sqrt = np.diag(d_inv_sqrt)  # turn D^-1/2 into a diagonal matrix
    gcn_fact = d_mat_inv_sqrt @ adj_ @ d_mat_inv_sqrt  # D^-1/2 (A + I_N) D^-1/2

    return gcn_fact

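Here the input adj is the adjacency matrix $A$. If the graph topology does not change, the GCN factor is fixed and only needs to be computed once. Otherwise it has to be recomputed for each time step.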
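The factor is only half of the formula, so here is a minimal NumPy sketch of one full GCN layer, ReLU(D^-1/2 (A + I) D^-1/2 X W). The helper names normalize_adjacency and gcn_layer, the toy path graph and the random X and W are my own assumptions for illustration. In a real model W would be learned rather than sampled.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix."""
    adj_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg = adj_hat.sum(axis=1)                     # degrees of A + I
    d_inv_sqrt = np.power(deg, -0.5)
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.0        # guard against zero degrees
    # scaling rows and columns is equivalent to multiplying by diag(d_inv_sqrt) on both sides
    return d_inv_sqrt[:, None] * adj_hat * d_inv_sqrt[None, :]

def gcn_layer(adj, x, w):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    return np.maximum(normalize_adjacency(adj) @ x @ w, 0.0)

# toy graph (assumption): the path 0 - 1 - 2
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))   # 3 nodes, 4 input features
w = rng.normal(size=(4, 2))   # stand-in for the trainable weight matrix
out = gcn_layer(adj, x, w)
print(out.shape)  # (3, 2)
```

For a dynamic graph, the same normalize_adjacency call would simply be repeated on each snapshot's adjacency matrix.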

GCN implementation based on PyTorch Geometric

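The code above works on dense matrices, while the adjacency matrices of most graph datasets are sparse, so this approach wastes memory and is computationally inefficient. Fortunately, PyTorch Geometric (PyG) ships a large number of ready-made graph neural network layers, so we only need to call the library (haha, everyone's favorite part).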
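To put a rough number on the memory argument, the sketch below compares dense storage with the edge list layout of shape (2, num_edges) that PyG uses for edge_index. The 1000-node ring graph and the variable names are assumptions chosen for illustration. Only NumPy is used, so the byte counts describe the raw arrays rather than PyG tensors.

```python
import numpy as np

num_nodes = 1000

# hypothetical sparse graph: a ring where node i is linked to node i + 1
adj = np.zeros((num_nodes, num_nodes), dtype=np.float32)
idx = np.arange(num_nodes)
adj[idx, (idx + 1) % num_nodes] = 1.0
adj[(idx + 1) % num_nodes, idx] = 1.0

# edge list: row 0 holds source nodes, row 1 holds target nodes
edge_index = np.stack(np.nonzero(adj)).astype(np.int64)
edge_weight = adj[edge_index[0], edge_index[1]]

print(edge_index.shape)                           # (2, 2000)
print(adj.nbytes)                                 # 4000000
print(edge_index.nbytes + edge_weight.nbytes)     # 40000
```

For this graph the edge list is 100 times smaller than the dense matrix, and the gap widens as the graph grows while staying sparse.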

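Introduction to the PyG library

Installing PyG: the "Installation" page of the pytorch_geometric documentation lists the available installation methods.

[Figure: the neural network layers provided by PyG]

[Figure: some of the graph convolution layers]

(Hahaha, I have been reading GNN papers nonstop lately, so having these libraries to call directly is a real lifesaver.)

Back to the point. The PyG-based GCN implementation is as follows: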

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, node_features, input_size, output_size):
        super(GCN, self).__init__()
        self.conv1 = GCNConv(node_features, input_size)  # encoder: one graph convolution
        self.MLP = torch.nn.Sequential(  # decoder: fully connected layers
            torch.nn.Linear(input_size, input_size // 2),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(input_size // 2, input_size // 4),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(input_size // 4, output_size))
        self.relu = torch.nn.ReLU()

    def forward(self, x, edge_index, edge_weight):
        '''
        GCN
        '''
        x = self.relu(self.conv1(x, edge_index, edge_weight))  # pass edge_weight only if the dataset has edge weights
        x = F.dropout(x, training=self.training)
        x = self.MLP(x)

        return x

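Here the GCN layer encodes the input data and the MLP (fully connected layers) decodes the extracted features, which gives a simple feature extraction network.

Many existing models concatenate the GCN output with the original input during feature extraction, for example $\left(X, GCN\right)$, so the model above can be modified as follows: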

class GCN(torch.nn.Module):
    def __init__(self, node_features, input_size, output_size):
        super(GCN, self).__init__()
        self.conv1 = GCNConv(node_features, input_size)
        self.linear = torch.nn.Linear(node_features+input_size, input_size)
        self.MLP = torch.nn.Sequential(
            torch.nn.Linear(input_size, input_size // 2),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(input_size // 2, input_size // 4),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(input_size // 4, output_size))
        self.relu = torch.nn.ReLU()

    def forward(self, x, edge_index, edge_weight):
        '''
        (x, GCN)
        '''
        lst = list()
        lst.append(x)  # keep the raw input features

        x = self.relu(self.conv1(x, edge_index, edge_weight))  # pass edge_weight only if the dataset has edge weights
        x = F.dropout(x, training=self.training)
        lst.append(x)

        x = torch.cat(lst, dim=1)
        # x.shape is now [node_num, node_features + input_size]
        x = self.relu(self.linear(x))
        x = F.dropout(x, training=self.training)

        x = self.MLP(x)

        return x

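Complete code

The graph data here is simply generated at random. It only serves to show how the GCN model is used in actual code.

Classification model: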

import torch
import random
import matplotlib.pyplot as plt
from tqdm import tqdm
import numpy as np
import networkx as nx
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

def create_mock_data(number_of_nodes, edge_per_node, in_channels):
    """
    Creating a mock feature matrix and edge index.
    """
    graph = nx.watts_strogatz_graph(number_of_nodes, edge_per_node, 0.5)
    edge_index = torch.LongTensor(np.array([edge for edge in graph.edges()]).T)
    X = torch.FloatTensor(np.random.uniform(-1, 1, (number_of_nodes, in_channels)))
    return X, edge_index

def create_mock_edge_weight(edge_index):
    """
    Creating a mock edge weight tensor.
    """
    return torch.FloatTensor(np.random.uniform(0, 1, (edge_index.shape[1])))

def create_mock_target(number_of_nodes, number_of_classes):
    """
    Creating a mock target vector.
    """
    return torch.LongTensor([random.randint(0, number_of_classes-1) for node in range(number_of_nodes)])

class GCN(torch.nn.Module):
    def __init__(self, node_features, input_size, num_classes):
        super(GCN, self).__init__()
        self.conv1 = GCNConv(node_features, input_size)
        self.MLP = torch.nn.Sequential(
            torch.nn.Linear(input_size, input_size // 2),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(input_size // 2, input_size // 4),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(input_size // 4, num_classes))
        self.relu = torch.nn.ReLU()

    def forward(self, x, edge_index, edge_weight):
        '''
        GCN
        '''
        x = self.relu(self.conv1(x, edge_index, edge_weight))  # pass edge_weight only if the dataset has edge weights
        x = F.dropout(x, training=self.training)
        x = self.MLP(x)

        return F.log_softmax(x, dim=1)

node_features = 100
node_count = 1000
input_size = 32
num_classes = 10
edge_per_node = 15
epochs = 200
learning_rate = 0.01
weight_decay = 5e-4

model = GCN(node_features=node_features, input_size=input_size, num_classes=num_classes)

optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

model.train()

loss_list = []
for epoch in tqdm(range(epochs)):
    optimizer.zero_grad()
    x, edge_index = create_mock_data(node_count, edge_per_node, node_features)
    edge_weight = create_mock_edge_weight(edge_index)
    scores = model(x, edge_index, edge_weight)
    target = create_mock_target(node_count, num_classes)
    loss = F.nll_loss(scores, target)
    loss_list.append(loss.item())
    loss.backward()
    optimizer.step()

plt.plot(loss_list)
plt.xlabel("Epoch")
plt.ylabel("NLL loss")
plt.title("loss")
plt.show()

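[Figure: training loss curve]

Prediction model:

The code for the prediction model is not repeated here. Prediction is very similar to classification: remove the final softmax from the model, change output_size, and swap the loss to match (for example F.mse_loss instead of F.nll_loss).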
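As a rough sketch of what that change looks like, the block below keeps the same MLP decoder but returns raw values and trains with MSE. To stay self-contained it uses plain PyTorch only: a random tensor h stands in for the GCNConv output, and the RegressionHead class, the mock targets and the sizes are my own assumptions rather than the original author's prediction model.

```python
import torch
import torch.nn.functional as F

class RegressionHead(torch.nn.Module):
    """Same MLP decoder as the classifier, but without the final log_softmax."""
    def __init__(self, input_size, output_size):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(input_size, input_size // 2),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size // 2, input_size // 4),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size // 4, output_size))

    def forward(self, h):
        return self.mlp(h)  # raw real-valued predictions

torch.manual_seed(0)
node_count, input_size, output_size = 1000, 32, 1
h = torch.randn(node_count, input_size)         # stand-in for relu(conv1(x, edge_index))
target = torch.randn(node_count, output_size)   # mock continuous targets

head = RegressionHead(input_size, output_size)
optimizer = torch.optim.Adam(head.parameters(), lr=0.01)

# one training step: MSE takes the place of nll_loss
optimizer.zero_grad()
pred = head(h)
loss = F.mse_loss(pred, target)
loss.backward()
optimizer.step()
print(pred.shape, loss.item())
```

In the full model, h would come from the graph convolution instead of torch.randn, and the rest of the training loop stays as in the classification example.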

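If you would like to see the prediction model or other graph neural network models, feel free to discuss in the comments. Corrections are also welcome!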


Reposted from: https://blog.csdn.net/m0_53961910/article/details/127856355
Copyright belongs to the original author, 早睡早起困得早. In case of infringement, please contact us for removal.
