Hands-On Project Walkthrough: Building a Convolutional Neural Network with Deep Learning for Image Recognition and Classification

Project discussion QQ group (source code and Q&A): 617172764

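Preface

As artificial intelligence continues to advance, deep learning has become increasingly important, and many people have started studying machine learning. This article uses a hands-on project to walk through designing and implementing a deep-learning image recognition algorithm from scratch.
To follow this chapter, you should be familiar with the following:

Basic Python syntax
A computer vision library (OpenCV)
A deep learning framework (TensorFlow)
Convolutional neural networks (CNN)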

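1. Background Knowledge

Python: Python is a high-level, interpreted, interactive, object-oriented scripting language.
OpenCV: OpenCV is an open-source, cross-platform computer vision library that implements many general-purpose image processing and computer vision algorithms. In this project it is used mainly for image acquisition and image preprocessing.
TensorFlow: TensorFlow is a computation framework open-sourced by Google, with good support for a wide range of deep learning algorithms. The deep learning algorithm used for image recognition in this project is implemented with TensorFlow.
CNN: A convolutional neural network (CNN) is a class of deep feedforward neural network built around convolution operations, and is one of the representative algorithms of deep learning.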

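2. Dataset Collection

Before image recognition can be performed, a dataset must first be collected and then preprocessed; only then can a deep convolutional neural network learn features from it and produce a classification model. As for the dataset requirements: in a CNN, the number of weight parameters applied to the input is fixed (the fully connected layers expect a flattened feature vector of a fixed length), so every image must be preprocessed to the same fixed size before the model is trained.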
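
To make the fixed-size requirement concrete, here is a minimal sketch in plain Python of a nearest-neighbour resize. The function name `resize_nearest` and the toy "images" are illustrative assumptions; the project itself relies on `cv2.resize`, whose default interpolation is bilinear rather than nearest-neighbour.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list of pixels to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


small = [[1, 2], [3, 4]]        # 2 x 2 toy image
wide = [[1, 2, 3, 4, 5, 6]]     # 1 x 6 toy image
for img in (small, wide):
    out = resize_nearest(img, 4, 4)
    print(len(out), len(out[0]))  # both inputs end up 4 x 4
```

Whatever the original dimensions, every image leaves this step with the same shape, which is what the network's input layer expects.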

Figure 1: Flowchart of the classification network model

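This case study targets garbage classification. The collected dataset contains four classes of images: kitchen waste, recyclable waste, hazardous waste, and other waste, with 200 images per class. (Readers can choose a different dataset type and size to suit their needs.)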

Figure 2: Dataset directory structure
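
Before preprocessing, it can help to confirm that each class folder holds the expected number of images. The sketch below assumes one sub-directory per class under the dataset root; the helper name `count_images_per_class` and the folder names `kitchen` and `recyclable` are hypothetical and not taken from the project.

```python
import os
import tempfile


def count_images_per_class(root):
    """Return {class_dir_name: number_of_files} for each sub-directory of root."""
    counts = {}
    for name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, name)
        if os.path.isdir(class_dir):
            counts[name] = len(os.listdir(class_dir))
    return counts


# Build a throwaway dataset tree to demonstrate the check.
with tempfile.TemporaryDirectory() as root:
    for class_name, n in [("kitchen", 3), ("recyclable", 2)]:  # hypothetical names
        os.makedirs(os.path.join(root, class_name))
        for k in range(n):
            open(os.path.join(root, class_name, "img%d.jpg" % k), "w").close()
    demo_counts = count_images_per_class(root)
    print(demo_counts)
```

For the dataset described in this article, each of the four classes would be expected to report 200.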

The image preprocessing code for the dataset is as follows:

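import os

# Rename the dataset images. Dataset path: self.image_path = "./picture/"
# (These functions are methods of the project's data-handling class.)
def rename(self):
    # Sort so that the class index assigned to each folder is reproducible.
    listdir = sorted(os.listdir(self.image_path))
    for i, class_dir in enumerate(listdir):
        images_list_dir = os.listdir(os.path.join(self.image_path, class_dir))
        for j, image_name in enumerate(images_list_dir):
            # Move every image to the dataset root as "<class index>-<image index>.jpg".
            old_name = os.path.join(self.image_path, class_dir, image_name)
            new_name = os.path.join(self.image_path, "%d-%d" % (i, j) + ".jpg")
            os.rename(old_name, new_name)
    # Remove the now-empty class folders.
    for class_dir in listdir:
        tmp_path = os.path.join(self.image_path, class_dir)
        if os.path.exists(tmp_path):
            os.rmdir(tmp_path)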

import os
import cv2

# Resize every image to 200x200 in place; delete files that cannot be read.
def resize_img(self):
    listdir = os.listdir(self.image_path)
    for file in listdir:
        file_path = os.path.join(self.image_path, file)
        try:
            imread = cv2.imread(file_path)
            resize = cv2.resize(imread, (200, 200))
            cv2.imwrite(file_path, resize)
        except Exception:
            # cv2.imread returns None for unreadable files, so cv2.resize raises.
            os.remove(file_path)
            continue

Figure 3: Sample of the dataset after preprocessing

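import os
import pandas as pd

# Save the image paths and labels to a CSV file.
# Output directory: csv_file_saved_path = "./picture/"
def train_data_to_csv(self):
    files = os.listdir(self.image_path)
    data = []
    for file in files:
        # The first character of the file name is the class index.
        data.append({"path": self.image_path + file, "label": file[0]})

    frame = pd.DataFrame(data, columns=['path', 'label'])
    # One-hot encode the label into columns label_0, label_1, ...
    dummies = pd.get_dummies(frame['label'], prefix='label')
    concat = pd.concat([frame, dummies], axis=1)
    concat.to_csv(csv_file_saved_path + "train.csv")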

Figure 4: Sample of the dataset saved as CSV
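
The label columns in the CSV are a one-hot encoding of the class index stored in each file name. As a minimal sketch of that idea without pandas, the helpers below (`label_from_filename` and `one_hot`, both hypothetical names) parse the index and build the vector by hand. Splitting at the hyphen is a small variation on the article's `file[0]`, which only reads a single digit.

```python
def label_from_filename(filename):
    """Extract the class index from a name such as '2-17.jpg'."""
    return int(filename.split("-")[0])


def one_hot(label, num_classes):
    """Return a list with 1 at position `label` and 0 elsewhere."""
    return [1 if k == label else 0 for k in range(num_classes)]


files = ["0-0.jpg", "2-17.jpg", "3-5.jpg"]
rows = [(name, label_from_filename(name), one_hot(label_from_filename(name), 4))
        for name in files]
for row in rows:
    print(row)
```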

3. Model Training

The code that builds the model is as follows:

import tensorflow as tf

# Build the network. Written against the TensorFlow 1.x API (tf.placeholder, tf.layers).
def build_model():
    with tf.name_scope("input"):
        x = tf.placeholder(tf.float32, [None, 200, 200, 3], "x")
        # 5 output units: labels 0-3 for the four classes; label 4 also appears in the recognition code.
        y = tf.placeholder(tf.float32, [None, 5], "y")

    # Note: tf.layers.dropout and tf.layers.batch_normalization default to training=False,
    # so as written the dropout layers pass their input through unchanged.
    with tf.variable_scope("conv_layer_1"):
        conv1 = tf.layers.conv2d(x, 64, [3, 3], activation=tf.nn.relu, name='conv1')
        max1 = tf.layers.max_pooling2d(conv1, [2, 2], [2, 2])
        bn1 = tf.layers.batch_normalization(max1, name='bn1')
        output1 = tf.layers.dropout(bn1, name='dropout')

    with tf.variable_scope("conv_layer_2"):
        conv2 = tf.layers.conv2d(output1, 64, [3, 3], activation=tf.nn.relu, name='conv2')
        max2 = tf.layers.max_pooling2d(conv2, [2, 2], [2, 2], name='max2')
        bn2 = tf.layers.batch_normalization(max2)
        output2 = tf.layers.dropout(bn2, name='dropout')

    with tf.variable_scope("conv_layer_3"):
        conv3 = tf.layers.conv2d(output2, 64, [3, 3], activation=tf.nn.relu, name='conv3')
        max3 = tf.layers.max_pooling2d(conv3, [2, 2], [2, 2], name='max3')
        bn3 = tf.layers.batch_normalization(max3, name='bn3')
        output3 = bn3

    with tf.variable_scope("conv_layer_4"):
        conv4 = tf.layers.conv2d(output3, 32, [3, 3], activation=tf.nn.relu, name='conv4')
        max4 = tf.layers.max_pooling2d(conv4, [2, 2], [2, 2], name='max4')
        bn4 = tf.layers.batch_normalization(max4, name='bn4')
        output = bn4
        flatten = tf.layers.flatten(output, 'flatten')

    with tf.variable_scope("fc_layer1"):
        fc1 = tf.layers.dense(flatten, 256, activation=tf.nn.relu)
        fc_bn1 = tf.layers.batch_normalization(fc1, name='bn1')
        dropout1 = tf.layers.dropout(fc_bn1, 0.5)

    with tf.variable_scope("fc_layer2"):
        fc2 = tf.layers.dense(dropout1, 128, activation=tf.nn.relu)
        dropout2 = tf.layers.dropout(fc2)

    with tf.variable_scope("fc_layer3"):
        fc3 = tf.layers.dense(dropout2, 64)
        dropout3 = tf.layers.dropout(fc3)

    with tf.variable_scope("fc_layer4"):
        fc4 = tf.layers.dense(dropout3, 32)

    with tf.variable_scope("fc_layer5"):
        fc5 = tf.layers.dense(fc4, 5)  # logits, one per label

    softmax = tf.nn.softmax(fc5, name='softmax')
    predict = tf.argmax(softmax, axis=1)
    # Cross-entropy summed over the batch.
    loss = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits_v2(logits=fc5, labels=y, name='loss'))
    tf.summary.scalar("loss", loss)
    accuracy = tf.reduce_mean(tf.cast(tf.equal(predict, tf.argmax(y, axis=1)), tf.float32))
    tf.summary.scalar("acc", accuracy)
    merged = tf.summary.merge_all()
    return x, y, predict, loss, accuracy, merged, softmax

Figure 5: Neural network architecture
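
The feature-map sizes of this architecture can be traced by hand. The sketch below assumes the `tf.layers` defaults of stride 1 with `'valid'` padding for the 3x3 convolutions and `'valid'` padding for the 2x2, stride-2 pooling; the helper names `conv_valid` and `pool_valid` are illustrative.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_valid(size, window=2, stride=2):
    """Output size of max pooling with 'valid' padding."""
    return (size - window) // stride + 1


size = 200
filters = [64, 64, 64, 32]  # filters per convolution block, as in build_model
shapes = []
for f in filters:
    size = pool_valid(conv_valid(size))
    shapes.append((size, size, f))

flatten_len = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
fc1_params = flatten_len * 256 + 256  # weights plus biases of the first dense layer
print(shapes)       # [(99, 99, 64), (48, 48, 64), (23, 23, 64), (10, 10, 32)]
print(flatten_len)  # 3200
print(fc1_params)   # 819456
```

Because the first dense layer's weight matrix is sized for exactly 3200 inputs, an image of any other resolution would produce a flattened vector of the wrong length, which is why every image is resized to 200x200 beforehand.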

import tensorflow as tf

# Train the model for 100 epochs, logging to TensorBoard and saving checkpoints.
def train(self):
    saver = tf.train.Saver(max_to_keep=3)
    sess = tf.InteractiveSession(config=self.config)
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter("./log", graph=sess.graph)
    for i in range(100):
        j = 1
        all_loss_ = 0
        all_acc_ = 0
        while j <= self.batches and self.has_next_batch:
            train_x, train_y = self.next_batch()
            _, loss_, accuracy_, merged_ = sess.run(
                [self.opt, self.loss, self.accuracy, self.merged],
                feed_dict={self.x: train_x, self.y: train_y})
            all_loss_ += loss_
            all_acc_ += accuracy_
            # Redraw a one-line progress bar for the current epoch.
            print("\repoch %d-- batch: %d-->    " % (i, j) + "=" * j + ">" + "-" * (self.batches - j)
                  + "\t\t loss: %.4f, acc: %.4f" % (loss_, accuracy_), end="")
            j += 1
            writer.add_summary(merged_, i * self.batches + j - 1)
        print("\n===epoch %d===    >    mean loss is : %.4f, mean acc is : %.4f" % (
            i, all_loss_ / self.batches, all_acc_ / self.batches))
        # Evaluate on the first 16 test samples after each epoch.
        test_x, test_y = self.get_test_data()
        test_loss_, test_acc_ = sess.run([self.loss, self.accuracy],
                                         feed_dict={self.x: test_x[0:16], self.y: test_y[0:16]})
        print("===epoch %d===    >    test loss is : %.4f, test acc is : %.4f" % (
            i, test_loss_, test_acc_))
        # Rewind the batch cursor for the next epoch.
        self.start = 0
        self.has_next_batch = True
        if i % 5 == 0:
            saver.save(sess, "./h5_dell/mode.ckpt", i)
    writer.close()
    sess.close()

Figure 6: Output files of the trained model
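
The training loop relies on `next_batch()`, `self.start`, `self.has_next_batch` and `self.batches`, whose implementation is not shown in the article. The class below is a hypothetical stand-in that sketches one plausible way such a batch cursor could behave; it is not the author's code, and `BatchCursor` and `reset` are invented names.

```python
class BatchCursor:
    """Hypothetical stand-in for the next_batch() logic used by train()."""

    def __init__(self, samples, batch_size):
        self.samples = samples
        self.batch_size = batch_size
        self.start = 0
        self.has_next_batch = len(samples) > 0
        # Number of batches per epoch, rounding up for a final partial batch.
        self.batches = (len(samples) + batch_size - 1) // batch_size

    def next_batch(self):
        """Return the next slice of samples and advance the cursor."""
        end = min(self.start + self.batch_size, len(self.samples))
        batch = self.samples[self.start:end]
        self.start = end
        self.has_next_batch = end < len(self.samples)
        return batch

    def reset(self):
        """Rewind at the end of an epoch, as train() does by hand."""
        self.start = 0
        self.has_next_batch = True


cursor = BatchCursor(list(range(10)), batch_size=4)
epoch = []
while cursor.has_next_batch:
    epoch.append(cursor.next_batch())
print(epoch)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

In the real project each batch would hold image arrays and one-hot labels rather than integers.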

4. Image Recognition and Classification

The image recognition and classification code:

import cv2
import numpy as np
import tensorflow as tf

# Recognize an image file or a live camera stream with the trained model.
def predict_value(self, type='image', image_path=None):
    saver = tf.train.Saver()
    sess = tf.InteractiveSession()
    # Restore from the directory that train() saves its checkpoints to.
    saver.restore(sess, tf.train.latest_checkpoint("./h5_dell/"))
    if type == 'image':
        assert image_path is not None
        image = cv2.imread(image_path)
        image = cv2.resize(image, (200, 200))
        image = np.asarray(image, np.float32) / 255.
        image = np.reshape(image, (1, image.shape[0], image.shape[1], image.shape[2]))
        [predict, probab] = sess.run([self.predict, self.probab], feed_dict={self.x: image})
        # The original followed this return with an unreachable check that printed
        # "recognise fail" and set predict = 4 whenever np.max(probab) < 1.
        return predict[0]
    elif type == 'video':
        # BGR text colour for each predicted label 0..4.
        colors = [(0, 0, 255), (0, 255, 255), (0, 255, 0), (255, 0, 255), (255, 0, 255)]
        capture = cv2.VideoCapture(0)
        while True:
            ret, frame = capture.read()
            if not ret:  # the camera returned no frame
                break
            resize = cv2.resize(frame, (200, 200))
            x_ = np.asarray(resize, np.float32) / 255.
            x_ = np.reshape(x_, [1, x_.shape[0], x_.shape[1], x_.shape[2]])
            [predict, probab] = sess.run([self.predict, self.probab], feed_dict={self.x: x_})
            label = int(predict[0])
            # Overlay the predicted label and its probability on the frame.
            cv2.putText(frame, "%d probab: %.3f" % (label, np.max(probab)), (10, 50),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, colors[label], 2, cv2.LINE_AA)
            print(predict)

            cv2.imshow("recognized", frame)
            key = cv2.waitKey(1)
            if key == 27:  # Esc quits
                break
        cv2.destroyAllWindows()
        capture.release()
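
The recognition code takes the arg-max of the softmax probabilities and hints at a "recognise fail" fallback to label 4 when the model is not confident. The sketch below shows that idea in plain Python. The function names and the 0.6 threshold are assumptions chosen for illustration; the article does not specify a usable threshold.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.6, reject_label=4):
    """Return (label, confidence); fall back to reject_label when unsure."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[label]
    if confidence < threshold:
        return reject_label, confidence
    return label, confidence


print(classify([0.1, 4.0, 0.2, 0.0, -1.0]))  # one logit dominates: label 1
print(classify([1.0, 1.1, 0.9, 1.0, 1.0]))   # near-uniform: falls back to label 4
```

A suitable threshold would need to be tuned on held-out images, since a value close to 1 rejects almost every prediction.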

Figure 7: Recognition result for a vegetable image

Figure 8: Recognition result for a beverage can image

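Summary

This chapter used a real project to walk through building and using a neural-network image recognition algorithm step by step: collecting and preprocessing the dataset, training a convolutional neural network, and running image classification, followed by a look at the results. We hope the article is helpful. If you have questions, you are welcome to join the group (IT project discussion group 3: 617172764) to discuss them.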

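Tags: deep learning, neural networks, image processing

This article is reposted from: https://blog.csdn.net/qq_37037348/article/details/125029992
Copyright belongs to the original author, 忘尘的世界. In case of infringement, please contact us for removal.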
