d2l-ai Deep Learning Diary: Preliminaries (Part 1)

Introduction

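    I am currently a third-year undergraduate and would like to study under a graduate advisor to open up more opportunities for further study, so I have started learning deep learning. I am using the d2l-zh textbook, and this post covers its preliminaries chapter. Before starting, I already had a basic grounding in Python and a few other languages.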

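This blog series, "d2l-ai Deep Learning Diary", records my study and exploration of deep learning, following the classic textbook Dive into Deep Learning (《动手学深度学习》). Along the way I hope not only to summarize what I learn but also to share my takeaways and grow together with like-minded readers. It is a way to consolidate knowledge and also part of my preparation for the graduate entrance exam and my path toward further academic study.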

I. Download

Download d2l-zh-pytorch-2.0.0.pdf from Releases · d2l-ai/d2l-zh (github.com) and read it.

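II. Setting up the environment

I will not go into environment setup here; for details see 李沐-深度学习环境配置 d2l、pytorch、Miniconda_李沐的d2l包安装-CSDN博客. Everything below is run in the Jupyter Notebook set up in that article.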

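III. Reading the preface

I am studying computer science at a time when large language models such as ChatGPT are booming and the domestic job market is tough. Anyone aiming for graduate school can hardly avoid deep learning, neural networks, and other difficult topics that used to be taught only at the graduate level. Before writing this post I had already picked up a little about neural networks and tried a few related competitions, but I still felt I did not really understand them, so I am now going to study the subject systematically.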

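IV. Data manipulation

1. Installing dependencies

First install the dependencies and import them. Since Python is already at 3.11, I simply install the latest versions. If installing d2l fails, append --user as shown below.

# Install dependencies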
!pip install torch
!pip install torchvision
!pip install d2l --user
!pip install pandas
import torch
import pandas as pd
import os

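2. Tensors

The first data structure to learn in deep learning is the tensor (torch.Tensor). Many sources make tensors sound complicated, but put simply a tensor is a multi-dimensional array (one-, two-, three-, four-dimensional, and so on) that is convenient to use from Python.

Below are some basic tensor operations. I will not explain much; running them yourself makes most of them clear.

Tensor operations: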

x = torch.arange(12)
print(x)
print(x.shape)
print(x.size())   # size is a method, so it needs parentheses
print(x.numel())  # numel is a method too

Output:

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
torch.Size([12])
torch.Size([12])
12

The usage is self-explanatory. Interestingly, numel is short for "number of elements".

Tensor operations:

x = x.reshape(3, 4)
print(x)
print(x.shape)
print(x.size())
print(x.numel())

Output:

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
torch.Size([3, 4])
torch.Size([3, 4])
12

Tensor operations:

x=torch.zeros(3,4)   # all zeros
print(x)
x=torch.ones(3,4)    # all ones
print(x)
x=torch.rand(3,4)    # uniform random values in [0, 1)
print(x)

Output:

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[0.2749, 0.6663, 0.1761, 0.8468],
        [0.2809, 0.9169, 0.2569, 0.2211],
        [0.1384, 0.9475, 0.9901, 0.0693]])

Tensor operations:

x=torch.tensor([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(x)
y=torch.tensor([[12,11,10],[9,8,7],[6,5,4],[3,2,1]])
print(y)
print("x+y:""\n",x+y,"\n""x-y:""\n",x-y,"\n""x*y:""\n",x*y,"\n""x/y:""\n",x/y,"\n""x**y:""\n",x**y)
print("\n",torch.exp(x))

Output:

tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
tensor([[12, 11, 10],
        [ 9,  8,  7],
        [ 6,  5,  4],
        [ 3,  2,  1]])
x+y:
 tensor([[13, 13, 13],
        [13, 13, 13],
        [13, 13, 13],
        [13, 13, 13]]) 
x-y:
 tensor([[-11,  -9,  -7],
        [ -5,  -3,  -1],
        [  1,   3,   5],
        [  7,   9,  11]]) 
x*y:
 tensor([[12, 22, 30],
        [36, 40, 42],
        [42, 40, 36],
        [30, 22, 12]]) 
x/y:
 tensor([[ 0.0833,  0.1818,  0.3000],
        [ 0.4444,  0.6250,  0.8571],
        [ 1.1667,  1.6000,  2.2500],
        [ 3.3333,  5.5000, 12.0000]]) 
x**y:
 tensor([[     1,   2048,  59049],
        [262144, 390625, 279936],
        [117649,  32768,   6561],
        [  1000,    121,     12]])
 tensor([[2.7183e+00, 7.3891e+00, 2.0086e+01],
        [5.4598e+01, 1.4841e+02, 4.0343e+02],
        [1.0966e+03, 2.9810e+03, 8.1031e+03],
        [2.2026e+04, 5.9874e+04, 1.6275e+05]])

Tensor operations:

print(x)
print(y)
print(torch.cat((x, y), dim=0))   # concatenate along rows
print(torch.cat((x, y), dim=1))   # concatenate along columns
print(x==y)
print(x.sum(),y.sum())

Output:

tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
tensor([[12, 11, 10],
        [ 9,  8,  7],
        [ 6,  5,  4],
        [ 3,  2,  1]])
tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12],
        [12, 11, 10],
        [ 9,  8,  7],
        [ 6,  5,  4],
        [ 3,  2,  1]])
tensor([[ 1,  2,  3, 12, 11, 10],
        [ 4,  5,  6,  9,  8,  7],
        [ 7,  8,  9,  6,  5,  4],
        [10, 11, 12,  3,  2,  1]])
tensor([[False, False, False],
        [False, False, False],
        [False, False, False],
        [False, False, False]])
tensor(78) tensor(78)

Tensor operations:

a=torch.arange(3).reshape(3,1)
b=torch.arange(2).reshape(1,2)
c=a+b   # shapes (3, 1) and (1, 2) are broadcast to (3, 2)
print(a,"\n",b)
c,a==b,a>b

Output:

tensor([[0],
        [1],
        [2]]) 
 tensor([[0, 1]])
(tensor([[0, 1],
         [1, 2],
         [2, 3]]),
 tensor([[ True, False],
         [False,  True],
         [False, False]]),
 tensor([[False, False],
         [ True, False],
         [ True,  True]]))
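The addition and comparisons above work even though a and b have different shapes, because PyTorch broadcasts them to a common shape first. The following is only a minimal sketch of that mechanism; the names a_big and b_big are my own, and torch.broadcast_tensors is used just to make the expanded shapes visible.

```python
import torch

a = torch.arange(3).reshape(3, 1)   # shape (3, 1)
b = torch.arange(2).reshape(1, 2)   # shape (1, 2)

# Broadcasting stretches each size-1 dimension to match the other operand,
# so both tensors behave as if they had shape (3, 2).
a_big, b_big = torch.broadcast_tensors(a, b)
print(a_big.shape, b_big.shape)

# Adding the explicitly expanded tensors gives the same result as a + b.
print(torch.equal(a + b, a_big + b_big))
```

In everyday code the explicit expansion is not needed; a + b does it implicitly.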

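Tensor operations:

print(x)
print(x[-1])       # last row
print(x[0:2])      # rows 0 and 1
print(x[0][0:2])   # first two elements of row 0
print(x[1,2])      # element at row 1, column 2
print(x[0:2])

Output: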

tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
tensor([10, 11, 12])
tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([1, 2])
tensor(6)
tensor([[1, 2, 3],
        [4, 5, 6]])
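The same index syntax can also be used to write values, not only to read them. A small sketch, using a separate tensor z (a name of my own, chosen so that x above is left untouched):

```python
import torch

z = torch.arange(1, 13).reshape(4, 3)   # same values as x above

z[-1, 0] = 0      # overwrite a single element (last row, first column)
z[0:2, :] = 12    # overwrite every element of rows 0 and 1
print(z)
```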

Tensor operations:

print(id(x))   # identity of the Python object x refers to
print(id(y))

Output:

2120572508720
2120572507472
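Printing id() is mostly useful for checking whether an operation creates a new tensor or updates an existing one in place. The sketch below is one way to see the difference; the tensors and the helper name before are illustrative, not taken from the code above.

```python
import torch

x = torch.ones(2, 3)
y = torch.zeros(2, 3)

before = id(y)
y = y + x                        # builds a new tensor and rebinds the name y
rebound = id(y) != before
print(rebound)

before = id(y)
y += x                           # in-place update, y stays the same object
same_after_iadd = id(y) == before
print(same_after_iadd)

before = id(y)
y[:] = x + y                     # slice assignment writes into the existing tensor
same_after_slice = id(y) == before
print(same_after_slice)
```

When a large tensor is updated many times, the in-place forms avoid allocating a fresh result on every step.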

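3. Handling missing values

Besides tensors, the other tool is pandas for working with data files. I used this package a lot in mathematical modeling competitions. The book uses it to handle missing values in a dataset, but I found the explanation unclear, and the code as printed did not run correctly for me.

The book first creates an artificial dataset and stores it in the CSV (comma-separated values) file ../data/house_tiny.csv. Here I build a similar dataset directly as a DataFrame.

# Build the artificial dataset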
data = pd.DataFrame({
    'NumRooms': [3, 2, None, 4, None, 1],
    'Alley': ['Pave', None, 'Pave', 'NA', 'NA', None],
    'Price': [127500, 106000, 178100, None, 140000, 115000],
    'LotSize': [None, 8500, 9600, 7200, 7800, None],
    'YearBuilt': [2000, 1995, 2010, None, 2005, None]
})
print(data)

Output:

    NumRooms Alley     Price  LotSize  YearBuilt
0       3.0  Pave  127500.0      NaN     2000.0
1       2.0  None  106000.0   8500.0     1995.0
2       NaN  Pave  178100.0   9600.0     2010.0
3       4.0    NA       NaN   7200.0        NaN
4       NaN    NA  140000.0   7800.0     2005.0
5       1.0  None  115000.0      NaN        NaN

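We process this dataset as follows:

1. Delete the column with the most missing values.

2. Convert the preprocessed dataset to tensor format.

First, count the missing values in each column and find the column with the most. Note that None counts as a missing value here, while the string 'NA' does not: like 'Pave', it is treated as a regular category.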

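missing_counts = data.isnull().sum()  # number of missing values per column
print("Missing values per column:\n", missing_counts)
column_to_drop = missing_counts.idxmax()  # column with the most missing values
print("Column with the most missing values:", column_to_drop)

Output:

Missing values per column:
 NumRooms     2
Alley        2
Price        1
LotSize      2
YearBuilt    2
dtype: int64
Column with the most missing values: NumRooms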

Several columns tie at 2 missing values, so idxmax returns the first of them (the first one encountered).
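The tie-breaking behaviour can be checked in isolation. This is a small sketch that rebuilds the counts by hand from the output above; the variable tied is my own addition, for the case where all tied columns are of interest.

```python
import pandas as pd

# Missing-value counts copied from the output above.
missing_counts = pd.Series(
    [2, 2, 1, 2, 2],
    index=['NumRooms', 'Alley', 'Price', 'LotSize', 'YearBuilt'],
)

# idxmax returns the label of the first occurrence of the maximum.
first = missing_counts.idxmax()
print(first)

# To see every tied column instead of only the first:
tied = missing_counts[missing_counts == missing_counts.max()].index.tolist()
print(tied)
```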

Delete this column:

data_cleaned = data.drop(columns=[column_to_drop])  # drop the column with the most missing values

Fill the missing values in the numeric columns:

data_cleaned['Price'] = data_cleaned['Price'].fillna(data_cleaned['Price'].mean())
data_cleaned['LotSize'] = data_cleaned['LotSize'].fillna(data_cleaned['LotSize'].mean())
data_cleaned['YearBuilt'] = data_cleaned['YearBuilt'].fillna(data_cleaned['YearBuilt'].mean())
print("After numeric preprocessing:\n",data_cleaned)

Output:

  Alley     Price  LotSize  YearBuilt
0  Pave  127500.0   8275.0     2000.0
1  None  106000.0   8500.0     1995.0
2  Pave  178100.0   9600.0     2010.0
3    NA  133320.0   7200.0     2002.5
4    NA  140000.0   7800.0     2005.0
5  None  115000.0   8275.0     2002.5

The logic is: for the three numeric columns Price, LotSize, and YearBuilt, each missing value is filled with the mean of the non-missing values in that column.
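The same mean-filling can be written once for all numeric columns instead of column by column. The sketch below uses a small made-up DataFrame (df and filled are illustrative names and values, not the dataset above) and relies on mean(numeric_only=True) to skip the string column.

```python
import pandas as pd

df = pd.DataFrame({
    'Alley': ['Pave', None, 'NA'],
    'Price': [100.0, None, 300.0],
    'LotSize': [None, 8000.0, 9000.0],
})

# mean(numeric_only=True) skips the string column; fillna with a Series
# fills each column using the value stored under the matching column name.
filled = df.fillna(df.mean(numeric_only=True))
print(filled)
```

The Alley column is left as it is, because the Series of means has no entry for it.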

Next, process the string column Alley:

data_cleaned = pd.get_dummies(data_cleaned, columns=['Alley'],dummy_na=True)

After processing the categorical column:
       Price  LotSize  YearBuilt  Alley_NA  Alley_Pave  Alley_nan
0  127500.0   8275.0     2000.0     False        True      False
1  106000.0   8500.0     1995.0     False       False       True
2  178100.0   9600.0     2010.0     False        True      False
3  133320.0   7200.0     2002.5      True       False      False
4  140000.0   7800.0     2005.0      True       False      False
5  115000.0   8275.0     2002.5     False       False       True

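This uses one-hot encoding. Put simply, each distinct state of an attribute gets its own indicator column: a row whose Alley is 'NA' has True in Alley_NA, a row whose Alley is 'Pave' has True in Alley_Pave, and a row with a missing value has True in Alley_nan. The string column is thereby turned into Boolean columns. The DataFrame still cannot be converted directly to a PyTorch tensor, though, because it now mixes floating-point and Boolean columns. Casting the Boolean columns to float fixes that: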

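data_cleaned['Alley_NA'] = data_cleaned['Alley_NA'].astype(float)
data_cleaned['Alley_Pave'] = data_cleaned['Alley_Pave'].astype(float)
data_cleaned['Alley_nan'] = data_cleaned['Alley_nan'].astype(float)
tensor_data = torch.tensor(data_cleaned.values, dtype=torch.float32)
print("\nConverted to tensor format:\n", tensor_data)

Note that astype(float) produces float64 columns in pandas, while in PyTorch torch.float is an alias for torch.float32 (torch.float64 also exists, under the alias torch.double). Passing dtype=torch.float32 therefore converts the values to 32-bit floats.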

The output is a float32 tensor of shape (6, 6) whose rows hold the same values as the table above, with True and False stored as 1. and 0.
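As an alternative to casting each indicator column afterwards, get_dummies accepts a dtype argument. The following is a sketch under that assumption, using a small made-up DataFrame (df, encoded and t are illustrative names) rather than the dataset above.

```python
import pandas as pd
import torch

df = pd.DataFrame({
    'Price': [127500.0, 106000.0, 133320.0],
    'Alley': ['Pave', None, 'NA'],
})

# dtype=float makes the indicator columns 0.0/1.0 instead of True/False.
encoded = pd.get_dummies(df, columns=['Alley'], dummy_na=True, dtype=float)
print(encoded.dtypes)

# Every column is float64 now, so the conversion needs no per-column casts.
t = torch.tensor(encoded.to_numpy(), dtype=torch.float32)
print(t.shape, t.dtype)

# torch.float is another name for torch.float32.
print(torch.float == torch.float32)
```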

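V. Linear algebra

The linear algebra section of the textbook gives a brief overview of some basic concepts. In my view the most useful part is the handful of functions that carry out the computations.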

1. Matrix transpose and Hadamard product (elementwise multiplication):

A=torch.arange(12,dtype=float).reshape(3,4)
A,A.T,A * A

Output:

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]], dtype=torch.float64),
 tensor([[ 0.,  4.,  8.],
         [ 1.,  5.,  9.],
         [ 2.,  6., 10.],
         [ 3.,  7., 11.]], dtype=torch.float64),
 tensor([[  0.,   1.,   4.,   9.],
         [ 16.,  25.,  36.,  49.],
         [ 64.,  81., 100., 121.]], dtype=torch.float64))

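Note that in Jupyter Notebook, the value of the expression on the last line of each cell in an .ipynb file is displayed automatically (only the last line, not the others), so print is not needed. I had been writing a whole pile of print calls before.

2. Reducing dimensions with sum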

X=torch.arange(24,dtype=float).reshape(2,3,4)  # definition inferred from the output below
X_sum_axis0=X.sum(axis=0)
X_sum_axis1=X.sum(axis=1)
X_sum_axis01=X.sum(axis=[0,1])
X,X_sum_axis0,X_sum_axis1,X_sum_axis01,X.mean()

Output:

(tensor([[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.]],
 
         [[12., 13., 14., 15.],
          [16., 17., 18., 19.],
          [20., 21., 22., 23.]]], dtype=torch.float64),
 tensor([[12., 14., 16., 18.],
         [20., 22., 24., 26.],
         [28., 30., 32., 34.]], dtype=torch.float64),
 tensor([[12., 15., 18., 21.],
         [48., 51., 54., 57.]], dtype=torch.float64),
 tensor([60., 66., 72., 78.], dtype=torch.float64),
 tensor(11.5000, dtype=torch.float64))
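Each sum above removes the axis it reduces over. When the reduced axis should be kept (with size 1) so that the result can still be broadcast against the original tensor, keepdims=True can be passed. A minimal sketch; col_sum and share are names of my own:

```python
import torch

X = torch.arange(24, dtype=torch.float64).reshape(2, 3, 4)

# A plain sum over axis 1 drops that axis: (2, 3, 4) -> (2, 4).
print(X.sum(axis=1).shape)

# keepdims=True keeps the summed axis with size 1: (2, 3, 4) -> (2, 1, 4).
col_sum = X.sum(axis=1, keepdims=True)
print(col_sum.shape)

# Because the axis is kept, broadcasting can divide each entry by its column sum.
share = X / col_sum
print(share.sum(axis=1))   # every entry should be (close to) 1
```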

3. Vector dot product

x=torch.tensor([1,2,3,4,5,6,7,8,9])
y=torch.tensor([9,8,7,6,5,4,3,2,1])
x, y, torch.dot(x, y)

Output:

(tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]),
 tensor([9, 8, 7, 6, 5, 4, 3, 2, 1]),
 tensor(165))
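The dot product is just the sum of the elementwise products, which can be checked directly. A short sketch (by_hand is an illustrative name):

```python
import torch

x = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
y = torch.tensor([9, 8, 7, 6, 5, 4, 3, 2, 1])

# Multiply elementwise, then add everything up.
by_hand = (x * y).sum()
print(by_hand, torch.dot(x, y))
```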

4. Matrix-vector product

A=torch.arange(12).reshape(3,4)
x=torch.arange(4)
A,x,torch.mv(A,x)

Output:

(tensor([[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]]),
 tensor([0, 1, 2, 3]),
 tensor([14, 38, 62]))

The length of the vector must equal the size of the matrix's last dimension (its number of columns).
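One way to see why the lengths must match is that each entry of the result is the dot product of one row of the matrix with the vector. A small sketch of that view (row_by_row is an illustrative name):

```python
import torch

A = torch.arange(12).reshape(3, 4)
x = torch.arange(4)

# Entry i of the result is the dot product of row i of A with x.
row_by_row = torch.stack([torch.dot(row, x) for row in A])
print(row_by_row)

# The @ operator computes the same matrix-vector product.
print(torch.equal(A @ x, torch.mv(A, x)))
```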

5. Matrix-matrix multiplication

A=torch.arange(12,dtype=float).reshape(3,4)
B = torch.ones((4, 3),dtype=float)
A,B,torch.mm(A, B)

Output:

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]], dtype=torch.float64),
 tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]], dtype=torch.float64),
 tensor([[ 6.,  6.,  6.],
         [22., 22., 22.],
         [38., 38., 38.]], dtype=torch.float64))

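torch.mm requires both operands to have the same dtype. Multiplying an integer tensor (such as the result of torch.arange) by a floating-point tensor (such as the result of torch.ones) raises an error, which is why both A and B are created with dtype=float here.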
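The dtype mismatch can be reproduced in a few lines. This is a sketch of one likely cause of the error, not necessarily the exact situation I ran into; mixed_ok and C are illustrative names.

```python
import torch

A = torch.arange(12).reshape(3, 4)   # int64
B = torch.ones(4, 3)                 # float32, the default for torch.ones

try:
    torch.mm(A, B)
    mixed_ok = True
except RuntimeError as err:
    mixed_ok = False
    print("mixed dtypes failed:", err)

# Casting one operand so that both share a dtype resolves the mismatch.
C = torch.mm(A.float(), B)
print(C)
```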

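6. Norms

I usually think of a norm as a generalization of distance. For writing code there is no need to go very deep; knowing how to call the function is enough:

u = torch.tensor([3.0, -4.0])
torch.norm(u)

Output: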

tensor(5.)
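By default torch.norm computes the L2 (Euclidean) norm of a vector, that is, the square root of the sum of squares. The sketch below spells that out by hand and, for comparison, computes the L1 norm as well (l2 and l1 are illustrative names).

```python
import torch

u = torch.tensor([3.0, -4.0])

l2 = torch.sqrt(torch.sum(u ** 2))   # Euclidean (L2) norm
l1 = torch.abs(u).sum()              # L1 norm: sum of absolute values
print(l2, l1)
```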

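VI. Summary

    This post covered the basics of tensors and some simple function calls, and then used linear algebra to look more closely at one- and two-dimensional tensors (vectors and matrices) and the operations and computations on them in code.

Tags: deep learning, artificial intelligence, python

Reposted from: https://blog.csdn.net/Wyh666a/article/details/142439244
Copyright belongs to the original author 吴耀好. In case of infringement, please contact us for removal.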
