python: thoughts after entering an image denoising competition

[Image: a denoising result from an earlier experiment, used as the header picture]

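1. Impressions

For my undergraduate thesis I implemented image filtering and similar algorithms on an FPGA. In grad school I work on deep-learning-based graphics, but the backbone networks are still convolutional... I thought my coding skills were decent and my fundamentals were fine, so I went in fairly confident:
I figured that reading a few top-conference denoising papers and reproducing or borrowing from them would get me a decent result. But: roughly 1000+ people entered, more than half of them never submitted or only submitted the baseline, and I ended up at 100+. Well, it isn't over yet; it ends tomorrow, and by then my rank will probably be close to 200. I really can't keep up with the grind any more.
There were three main problems: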

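1. The gap between me and the top people is still considerable. That may also have to do with research direction; this isn't my specialty, after all.
2. Money is all my need?? The school servers were either fully queued up or down for maintenance, and I couldn't get a slot on my group's machines either (as someone just doing a competition, I didn't feel right fighting others for them - -). All I wanted was to tune a few hyperparameters, and even that was hard. Even when I got a slot, I had to set batch_size very small to run at all.
3. Me, the lab errand runner... My paper got flat-out rejected. Seriously, what kind of comments were those from the reviewers - -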

I know I'm a noob, but I still wanted to give it a try. Rant over, on to the main topic:

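2. Takeaways

On to what I gained: there were plenty of difficulties, but plenty of gains too.

1. I read several denoising papers from top CV conferences, and got to know and try the low-level side of CV algorithms.
2. I tried reproducing two top-conference papers. The results were not as good as the baseline, though not far off - - (my reproductions may not be quite right, since I only borrowed the ideas rather than copying everything). In the end I hacked on a different paper.
3. From the dataloader, network architecture, and network initialization to the training strategy and finally the loss function, this was the first time I wrote a complete deep learning project myself (before, I just tweaked other people's code frameworks). I hit a lot of pitfalls and learned many new things.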

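3. Tips (selected source code with comments)

3.1 Input

The images have to be cut into tiles: a whole image is too large, and with a slightly bigger network even a 32 GB GPU runs out of memory.
Each image is split into multiple tiles; pseudocode: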

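# An outer loop slices the image into tiles according to its size.
# (a, b) are the row bounds and (c, d) the column bounds of the current tile.
tmp = {}
tmp['imgs'] = data['imgs'][:, :, a:b, c:d]  # batch, channels, image height and width
tmp['gts'] = data['gts'][:, :, a:b, c:d]    # labels (ground truth)
model.set_input(tmp)                         # network input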
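The bounds a, b, c, d in the pseudocode have to come from somewhere. Here is a minimal sketch of one way to generate them. The helper name tile_boxes, the default tile size of 256, and the optional overlap are all assumptions for illustration, not values from the original code.

```python
def tile_boxes(height, width, tile=256, overlap=0):
    """Return (a, b, c, d) bounds so that img[..., a:b, c:d] tiles the image.

    Tiles on the bottom and right edges may be smaller than `tile`.
    `tile` and `overlap` are illustrative defaults, not the author's values.
    """
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    step = tile - overlap
    boxes = []
    a = 0
    while True:
        b = min(a + tile, height)
        c = 0
        while True:
            d = min(c + tile, width)
            boxes.append((a, b, c, d))
            if d == width:
                break
            c += step
        if b == height:
            break
        a += step
    return boxes


# Each box can then drive the slicing shown in the pseudocode, e.g.
#   tmp['imgs'] = data['imgs'][:, :, a:b, c:d]
for a, b, c, d in tile_boxes(500, 300, tile=256):
    print(a, b, c, d)
```

A non-zero overlap gives neighbouring tiles some shared context, which is one common way to reduce visible seams when the tiles are stitched back together; whether that helps here was not tested.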

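3.2 Network

The main ideas my network borrows:

1. Don't learn the pixel values end to end; learn the noise instead (easier for the network to fit?)
2. Use depthwise separable convolutions and moderately increase the channel count (too little GPU memory, and it runs slowly)
3. Try larger convolution kernels (too little GPU memory, and it runs slowly)

(The competition limits model size.) Increasing either the channel count or the kernel size uses more GPU memory, and my hardware wasn't up to it, so I only increased the channels. The implementation details are below.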
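To make the model-size side of ideas 2 and 3 concrete, here is a rough parameter count for a standard convolution versus the textbook depthwise separable form (a k x k depthwise conv followed by a 1 x 1 pointwise conv). The function names are hypothetical, and this counts parameters only, not activation memory, which is what actually fills up the GPU during training.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Parameter count of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Compare a few kernel sizes at 128 input and output channels.
for k in (3, 9, 23):
    print(k, conv_params(128, 128, k), depthwise_separable_params(128, 128, k))
```

At 128 channels, a 23 x 23 depthwise separable layer comes out at 84352 parameters against 147584 for a standard 3 x 3 convolution, which is why large kernels are only affordable under a size limit in the depthwise form.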

Modified from a plain Unet baseline:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Unet2(nn.Module):
    def __init__(self, dim=4):
        super(Unet2, self).__init__()
        self.dims = [32, 64, 128, 256, 512]
        self.ks = [3, 3, 3, 3, 3]
        self.dims_up = self.dims[::-1]
        self.ks_up = self.ks[-2::-1]

        self.first_block = Block2(dim, self.dims[0], self.ks[0])
        # AvgPool2d: psnr 37.683, ssim 0.902, score 30.679, time 52.650
        self.first_pool = nn.MaxPool2d(kernel_size=2)
        for i, dim_in in enumerate(self.dims[:-2]):
            dim_out = self.dims[i + 1]
            setattr(self, 'Block{}'.format(i), Block2(dim_in, dim_out, k=self.ks[i + 1]))
            setattr(self, 'pool{}'.format(i), nn.MaxPool2d(kernel_size=2))

        self.conv_mid = Block2(self.dims[-2], self.dims[-1], self.ks[-1])
        for i, dim_in in enumerate(self.dims_up[:-1]):
            dim_out = self.dims_up[i + 1]
            setattr(self, 'ConvTrans{}'.format(i), nn.ConvTranspose2d(dim_in, dim_out, 2, stride=2, bias=True))
            setattr(self, 'up_Block{}'.format(i), Block2(dim_in, dim_out, k=self.ks_up[i]))

        self.last_conv = nn.Conv2d(self.dims[0], dim, 1, bias=True)

    def forward(self, x):
        n, c, h, w = x.shape
        # Pad height and width up to the next multiple of 32.
        h_pad = 32 - h % 32 if not h % 32 == 0 else 0
        w_pad = 32 - w % 32 if not w % 32 == 0 else 0
        padded_image = F.pad(x, (0, w_pad, 0, h_pad), 'replicate')
        list_pools = []

        x_bk = x
        # 1.first Block
        x = self.first_block(padded_image)
        list_pools.append(x)
        x = self.first_pool(x)
        # 2.Blocks
        for i, dim_in in enumerate(self.dims[:-2]):
            x = getattr(self, 'Block{}'.format(i))(x)
            list_pools.append(x)
            x = getattr(self, 'pool{}'.format(i))(x)

        x = self.conv_mid(x)
        for i, dim_in in enumerate(self.dims_up[:-1]):
            x = getattr(self, 'ConvTrans{}'.format(i))(x)
            # tmp = list_pools.pop()
            x = torch.cat([x, list_pools.pop()], 1)
            x = getattr(self, 'up_Block{}'.format(i))(x)
        # 3.last
        x = self.last_conv(x)
        # Crop the padding and add the input back: the network predicts the residual.
        out = x[:, :, :h, :w] + x_bk

        return out

class Block2(nn.Module):
    def __init__(self, dim_in, dim_out, k=3):
        super(Block2, self).__init__()
        self.conv1 = nn.Conv2d(dim_in, dim_in, kernel_size=k, padding=k // 2, padding_mode='zeros', bias=True)
        self.conv2 = nn.Conv2d(dim_in, dim_out, kernel_size=k, padding=k // 2, padding_mode='zeros', bias=True)

    def forward(self, x):
        x = self.conv1(x)
        x = self.leaky_relu(x)
        x = self.conv2(x)
        x = self.leaky_relu(x)
        return x

    def leaky_relu(self, x, a=0.2):
        out = torch.max(a * x, x)
        return out

The network I used, a hacked ConvNet:

class Our(nn.Module):
    def __init__(self, dim=4):
        super(Our, self).__init__()
        self.dims = [128, 256, 512, 1024]
        self.ks = [3, 3, 3, 3]
        # Not enough memory...
        # self.dims = [16, 32, 64, 128, 256]
        # self.ks = [23, 23, 23, 17, 3]
        ######################################
        self.dims_up = self.dims[::-1]
        self.ks_up = self.ks[-2::-1]

        self.first_block = Block(dim, self.dims[0], self.ks[0])
        self.first_pool = nn.MaxPool2d(kernel_size=2)
        for i, dim_in in enumerate(self.dims[:-2]):
            dim_out = self.dims[i + 1]
            setattr(self, 'Block{}'.format(i), Block(dim_in, dim_out, k=self.ks[i + 1]))
            setattr(self, 'pool{}'.format(i), nn.MaxPool2d(kernel_size=2))

        self.conv_mid = Block(self.dims[-2], self.dims[-1], self.ks[-1])
        for i, dim_in in enumerate(self.dims_up[:-1]):
            dim_out = self.dims_up[i + 1]
            setattr(self, 'ConvTrans{}'.format(i), nn.ConvTranspose2d(dim_in, dim_out, 2, stride=2))
            setattr(self, 'up_Block{}'.format(i), Block(dim_in, dim_out, k=self.ks_up[i]))

        self.last_ln = nn.LayerNorm(self.dims[0], eps=1e-6)
        self.last_conv = nn.Linear(self.dims[0], dim)

    def forward(self, x):
        n, c, h, w = x.shape
        # Pad height and width up to the next multiple of 32.
        h_pad = 32 - h % 32 if not h % 32 == 0 else 0
        w_pad = 32 - w % 32 if not w % 32 == 0 else 0
        padded_image = F.pad(x, (0, w_pad, 0, h_pad), 'replicate')
        list_pools = []
        x_bk = x
        # 1.first Block
        x = self.first_block(padded_image)
        list_pools.append(x)
        x = self.first_pool(x)
        # 2.Blocks
        for i, dim_in in enumerate(self.dims[:-2]):
            x = getattr(self, 'Block{}'.format(i))(x)
            list_pools.append(x)
            x = getattr(self, 'pool{}'.format(i))(x)

        x = self.conv_mid(x)
        for i, dim_in in enumerate(self.dims_up[:-1]):
            x = getattr(self, 'ConvTrans{}'.format(i))(x)
            # tmp = list_pools.pop()
            x = torch.cat([x, list_pools.pop()], 1)
            x = getattr(self, 'up_Block{}'.format(i))(x)
        # 3.last
        # LayerNorm and Linear act on the last dimension, so move channels last and back.
        x = x.permute(0, 2, 3, 1).contiguous()
        x = self.last_ln(x)
        x = self.last_conv(x)
        x = x.permute(0, 3, 1, 2).contiguous()
        out = x[:, :, :h, :w] + x_bk

        return out

class Block(nn.Module):
    def __init__(self, dim_in, dim_out, k=9):
        super(Block, self).__init__()
        self.conv = nn.Conv2d(dim_in, dim_in, groups=dim_in, kernel_size=k, padding=k // 2)
        self.ln = nn.LayerNorm(dim_in, eps=1e-6)
        self.conv1x1up = nn.Linear(dim_in, dim_in * 2)  # nn.Conv2d(dim, dim * 2, 1)
        self.act = nn.GELU()
        self.conv1x1dn = nn.Linear(dim_in * 2, dim_out)  # nn.Conv2d(dim * 2, dim, 1)
        self.w = nn.Parameter(torch.zeros(1))
        # res
        self.res_conv = nn.Conv2d(dim_in, dim_out, 1)

    def forward(self, x):
        identity = x
        x = self.conv(x)
        x = x.permute(0, 2, 3, 1).contiguous()
        x = self.ln(x)
        x = self.conv1x1up(x)
        x = self.act(x)
        x = self.conv1x1dn(x)
        x = x.permute(0, 3, 1, 2).contiguous()
        x = x * self.w
        x = x + self.res_conv(identity)
        return x
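Both forward methods above pad the input up to a multiple of 32 before the first block and crop back to h x w at the end. The sketch below isolates that arithmetic so it can be checked on its own. pad_amount is a hypothetical helper name, not part of the original code; the modulo form is simply a more compact way of writing the same conditional.

```python
def pad_amount(size, multiple=32):
    """Extra pixels needed to round `size` up to the next multiple of `multiple`."""
    return (multiple - size % multiple) % multiple


def pad_amount_conditional(size, multiple=32):
    """The same computation, written as in the forward methods."""
    return multiple - size % multiple if not size % multiple == 0 else 0


for size in (500, 512, 1):
    print(size, pad_amount(size), size + pad_amount(size))
```

By my count of the pooling layers, Unet2 halves the resolution four times and Our three times, so multiples of 16 and 8 respectively would already be enough for the skip connections to line up; 32 is a safe superset of both.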

3.3 Loss function

loss = torch.nn.L1Loss()

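In my tests, L1 still worked best.
The fancier options such as L2 and SSIM did not give good results (this is alchemy, after all; they may simply not suit my network).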
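For readers unfamiliar with the two pixel-wise losses compared above, here is a minimal pure-Python sketch of what they compute, using mean reduction, which is the default for torch.nn.L1Loss and torch.nn.MSELoss. The toy numbers are made up for illustration and say nothing about why L1 worked better in this competition.

```python
def l1_loss(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


target = [0.0, 0.0, 0.0, 0.0]
small_errors = [0.1, 0.1, 0.1, 0.1]  # the error is spread evenly
one_outlier = [0.0, 0.0, 0.0, 0.4]   # the same total error in one pixel

print(l1_loss(small_errors, target), l1_loss(one_outlier, target))
print(l2_loss(small_errors, target), l2_loss(one_outlier, target))
```

L1 scores the two predictions the same, while L2 penalizes the single large error more heavily than the evenly spread one.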

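3.4 Traditional filtering

Ha, I also tried traditional denoising, and while I was at it wrote a bilateral filter in pure python (based on my old matlab code). I have to say, deep learning is still yyds (the GOAT)!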

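import numpy as np


def bilateral_filter(img):
    # Based on the matlab implementation from my own blog:
    # https://blog.csdn.net/qq_38204686/article/details/106929922
    # Expects an H x W x C image with values in 0..255.
    r = 20               # window radius; the kernel size is 2*r + 1
    sigma_space = 15.0   # spatial standard deviation
    sigma_color = 10.0   # similarity (color) standard deviation

    # Spatial weights, centered on index (r, r).
    w_space = np.zeros((2 * r + 1, 2 * r + 1))
    for i in range(-r, r + 1):
        for j in range(-r, r + 1):
            tmp = i * i + j * j
            w_space[i + r, j + r] = np.exp(-float(tmp) / (2 * sigma_space * sigma_space))

    # Color weights, looked up by absolute intensity difference (0..255).
    w_color = np.zeros((1, 256))
    for i in range(256):
        w_color[0, i] = np.exp(-float(i * i) / (2 * sigma_color * sigma_color))

    # Start filtering. Use a float copy so differences of uint8 pixels do not wrap around.
    height, width, channel = img.shape
    src = img.astype(np.float64)
    dst_img = img.copy()  # the r-pixel border keeps its original values
    for h in range(r, height - r):
        # s = time.time()   0.3s
        for w in range(r, width - r):
            for c in range(channel):  # loop over channels
                p_c = src[h, w, c]  # pixel value
                p_win = src[h - r:h + r + 1, w - r:w + r + 1, c]  # all pixels in the window
                c_w = np.abs(p_win - p_c).astype(int)
                c_w = w_color[0, c_w]
                w_tmp = w_space * c_w
                p_sum = p_win * w_tmp
                p_sum = np.sum(p_sum) / np.sum(w_tmp)
                dst_img[h, w, c] = p_sum

    return dst_img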
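To see what the two weights do, it helps to shrink the problem. The following is a simplified 1-D sketch, not the filter above: the function name, the small radius, and the toy signal are all assumptions chosen for illustration. Each output sample is a weighted mean of its neighbours, where the weight is the product of a spatial Gaussian and a Gaussian on the intensity difference.

```python
import math


def bilateral_filter_1d(signal, r=2, sigma_space=2.0, sigma_color=10.0):
    """1-D bilateral filter; the r samples at each end are left unfiltered."""
    out = list(signal)
    for i in range(r, len(signal) - r):
        center = signal[i]
        weighted_sum = 0.0
        weight_total = 0.0
        for k in range(-r, r + 1):
            value = signal[i + k]
            w_space = math.exp(-(k * k) / (2 * sigma_space ** 2))
            w_color = math.exp(-((value - center) ** 2) / (2 * sigma_color ** 2))
            weighted_sum += w_space * w_color * value
            weight_total += w_space * w_color
        out[i] = weighted_sum / weight_total
    return out


# A noisy step edge: small fluctuations around 10, then around 200.
noisy_step = [10, 12, 9, 11, 10, 200, 198, 201, 199, 200]
smoothed = bilateral_filter_1d(noisy_step)
print([round(v, 2) for v in smoothed])
```

Samples on the far side of the edge differ by about 190 in intensity, so their color weight is practically zero and the step stays sharp, while the small fluctuations on each side are averaged out.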

4. Main references

https://zhuanlan.zhihu.com/p/455913104 (ConvNeXt: A ConvNet for the 2020s)
https://zhuanlan.zhihu.com/p/349644858 (How to get GPU time for free)
https://blog.csdn.net/u011447962/article/details/123510680 (CVPR 2022 | RepLKNet)
https://github.com/gbstack/CVPR-2022-papers#SG (CVPR2022 Papers (Papers/Codes/Demos))

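Tags: python, artificial intelligence, computer vision

Reposted from: https://blog.csdn.net/qq_38204686/article/details/124638835
Copyright belongs to the original author, das白. In case of infringement, please contact us for removal.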
