Distributed Storage: Ceph


I. Deployment with ceph-deploy

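Before putting Ceph into use, be aware of one very practical problem: Ceph is unfriendly to clients running old kernels. "Old" here means kernel 3.10.0-862 or earlier. A default CentOS 7.5 (or older) system ships such a kernel and cannot properly use Ceph file storage (CephFS) or block storage (RBD).

The version question needs careful thought when deploying Ceph. In testing, a client operating system with a kernel older than 3.10.0-862 could not use Ceph 16 at all, and the default, never-upgraded kernels of common CentOS releases below 7.5 are all older than 3.10.0-862, so this large group of servers will have problems using the RBD storage provided by Ceph. In addition, Ceph no longer provides the version 16 ceph-common package for CentOS, which means that with a version 16 cluster a typical CentOS 7 client can only use ceph-common 15. That works, but it carries some risk because client and cluster are not on the same version. The recommendation at this point is the highest Ceph 15 release; installing 15 is the same as installing 16, only the Ceph source differs.

Correction: the statement above is not accurate. The choice of Ceph version has nothing to do with the client kernel; every Ceph version has poor support for kernels at or below 3.10.0-862 (CentOS 7.5).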
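A quick way to see which side of the 3.10.0-862 line a client falls on is to compare its running kernel against that baseline. The sketch below is only an illustration: the helper names `kernel_base` and `kernel_newer_than` are hypothetical, and it assumes GNU `sort -V` is available on the client.

```shell
# Reduce a kernel release string to its "version-build" prefix,
# e.g. 3.10.0-862.el7.x86_64 -> 3.10.0-862 (hypothetical helper).
kernel_base() {
    printf '%s\n' "$1" | sed 's/^\([0-9.]*-[0-9]*\).*/\1/'
}

# Succeed only when release $1 is strictly newer than release $2.
# Assumes GNU sort with -V (version sort).
kernel_newer_than() {
    current=$(kernel_base "$1")
    baseline=$(kernel_base "$2")
    # Equal releases are not "newer".
    [ "$current" = "$baseline" ] && return 1
    # After a version sort the newer release is the last line.
    newest=$(printf '%s\n%s\n' "$current" "$baseline" | sort -V | tail -n 1)
    [ "$newest" = "$current" ]
}

if kernel_newer_than "$(uname -r)" "3.10.0-862"; then
    echo "kernel $(uname -r) is newer than 3.10.0-862"
else
    echo "kernel $(uname -r) is 3.10.0-862 or older; CephFS/RBD clients may not work properly"
fi
```

Running this on each intended client before rollout gives a rough list of machines that would need a kernel upgrade first.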

Environment

Ubuntu 18.04

Ceph 16.2.10 (Pacific)

Hostname        Public IP         Internal (cluster) IP   Deployed roles
ceph-master01   172.26.156.217    10.0.0.217              mon, mgr, osd, ceph-deploy
ceph-master02   172.26.156.218    10.0.0.218              mon, mgr, osd
ceph-master03   172.26.156.219    10.0.0.219              mon, mgr, osd

1. System environment initialization

1.1 Set hostnames and name resolution

master01:
hostnamectl  set-hostname  ceph-master01
vi /etc/hostname
ceph-master01

master02:
hostnamectl  set-hostname  ceph-master02
vi /etc/hostname
ceph-master02

master03:
hostnamectl  set-hostname  ceph-master03
vi /etc/hostname
ceph-master03

vi /etc/hosts
10.0.0.217 ceph-master01.example.local ceph-master01
10.0.0.218 ceph-master02.example.local ceph-master02
10.0.0.219 ceph-master03.example.local ceph-master03
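After editing the hosts file on each server, it can help to confirm that no node name was missed. The following is a minimal sketch; the function name `missing_hosts` is made up for this example, and it only checks that a name appears as a field after the IP column.

```shell
# Print every expected hostname that has no entry in the given hosts file.
# missing_hosts is a hypothetical helper, not part of any Ceph tooling.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines; match the name as a whole field after the IP column.
        awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" 2>/dev/null || echo "$name"
    done
}

# Lists the node names that still need an entry (prints nothing when all are present).
missing_hosts /etc/hosts ceph-master01 ceph-master02 ceph-master03
```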

1.2 Time synchronization

Run on all servers:

# Set the time zone
timedatectl set-timezone Asia/Shanghai

# Synchronize the time
root@ubuntu:~# apt install ntpdate
root@ubuntu:~# ntpdate  ntp.aliyun.com
 1 Sep 20:54:39 ntpdate[9120]: adjust time server 203.107.6.88 offset 0.003441 sec
root@ubuntu:~# crontab  -e 
crontab: installing new crontab
root@ubuntu:~# crontab  -l 
* * * * * ntpdate  ntp.aliyun.com
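Monitors are sensitive to clock skew, so it is worth checking that the offset reported by ntpdate is small. The sketch below parses a result line like the one shown above; `offset_within` is a hypothetical helper, and the 0.05 second limit is an assumption based on the commonly documented default of `mon_clock_drift_allowed`, so verify it for your release.

```shell
# Succeed when the absolute offset in an ntpdate result line is <= limit (seconds).
# offset_within is a hypothetical helper; exit status 2 means no offset was found.
offset_within() {
    line=$1
    limit=$2
    printf '%s\n' "$line" | awk -v limit="$limit" '
        { for (i = 1; i < NF; i++) if ($i == "offset") off = $(i + 1) }
        END {
            if (off == "") exit 2
            if (off < 0) off = -off
            exit ((off + 0 <= limit + 0) ? 0 : 1)
        }'
}

# Sample line taken from the ntpdate run above.
sample="1 Sep 20:54:39 ntpdate[9120]: adjust time server 203.107.6.88 offset 0.003441 sec"
if offset_within "$sample" 0.05; then
    echo "clock offset is within the assumed limit"
else
    echo "clock offset exceeds the assumed limit"
fi
```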

1.3 Configure the base apt sources and the Ceph sources

Run the following commands on all servers to switch the sources automatically:

# Base sources
sed -i "s@http://.*archive.ubuntu.com@http://mirrors.tuna.tsinghua.edu.cn@g" /etc/apt/sources.list
sed -i "s@http://.*security.ubuntu.com@http://mirrors.tuna.tsinghua.edu.cn@g" /etc/apt/sources.list

# Ceph sources
echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" >> /etc/apt/sources.list.d/ceph.list
echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus  bionic main" >> /etc/apt/sources.list.d/ceph.list

# Import the Ceph release key; without it the Ceph sources cannot be used
wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -

# If the Ceph repository is served over https, install the packages below; otherwise the https source cannot be used
apt install -y apt-transport-https ca-certificates curl software-properties-common

apt update
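Because the source lines are appended with `>>`, running the step twice leaves duplicate entries in ceph.list. One possible way to make the step repeatable is sketched below; `append_once` is a hypothetical helper name, not an apt feature.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper for illustration.
append_once() {
    line=$1
    file=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root), with the same source lines as above:
#   append_once "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" /etc/apt/sources.list.d/ceph.list
#   append_once "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic main" /etc/apt/sources.list.d/ceph.list
```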

1.4 Disable SELinux and the firewall

# ufw disable

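1.5 Create the cluster deployment user cephadmin

It is recommended to deploy and run the Ceph cluster as a dedicated ordinary user; that user only needs to be able to run privileged commands through sudo non-interactively. Newer versions of ceph-deploy accept any user that can run sudo, root included, but an ordinary user is still recommended. Installing the Ceph cluster automatically creates a ceph user (by default the cluster runs its service processes, such as ceph-osd, as the ceph user), so use an ordinary user other than ceph, for example cephuser or cephadmin, to deploy and manage the cluster.

Create the cephadmin user on the ceph-deploy node and on every storage, mon, and mgr node.

groupadd -r -g 2088 cephadmin && useradd -r -m -s /bin/bash -u 2088 -g 2088 cephadmin && echo cephadmin:chinadci888. | chpasswd

Allow the cephadmin user to run privileged commands through sudo on every server: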

~# echo "cephadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
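An alternative to appending to /etc/sudoers directly is a drop-in file under /etc/sudoers.d, which Ubuntu includes by default. This is a sketch of that variant, not the method used in this article; `write_sudoers_dropin` is a hypothetical helper.

```shell
# Write a passwordless-sudo rule for one user into a drop-in directory.
# write_sudoers_dropin is a hypothetical helper; $2 is normally /etc/sudoers.d.
write_sudoers_dropin() {
    user=$1
    dir=$2
    file="$dir/$user"
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$user" > "$file"
    # sudo expects restrictive permissions on its configuration files.
    chmod 0440 "$file"
}

# Usage (as root):
#   write_sudoers_dropin cephadmin /etc/sudoers.d
#   visudo -cf /etc/sudoers.d/cephadmin   # syntax check of the new file
```

Keeping the rule in its own file makes it easy to remove later without editing the main sudoers file.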

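1.6 Distribute SSH keys

The deploy node needs passwordless SSH to every mon, mgr, and osd server. This article uses only three servers, with mon, mgr, and osd deployed together on each, so passwordless login is set up for just those three.

master01 (deploy node):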

su - cephadmin
ssh-keygen
ssh-copy-id   cephadmin@ceph-master01
ssh-copy-id   cephadmin@ceph-master02
ssh-copy-id   cephadmin@ceph-master03
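Once the keys are copied, a non-interactive login test shows whether any node would still prompt for a password. The sketch below is an assumption-laden illustration: `unreachable_hosts` and the `SSH_CMD` override are hypothetical names introduced here, while `BatchMode` and `ConnectTimeout` are standard ssh options.

```shell
# SSH_CMD can be overridden for dry runs (hypothetical hook); defaults to ssh.
SSH_CMD=${SSH_CMD:-ssh}

# Print every host that refuses a non-interactive login for the given user.
unreachable_hosts() {
    user=$1
    shift
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true >/dev/null 2>&1 \
            || echo "$host"
    done
}

# Usage on the deploy node, as cephadmin:
#   unreachable_hosts cephadmin ceph-master01 ceph-master02 ceph-master03
```

An empty result means ceph-deploy should be able to reach all three nodes without prompting.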

2. Deploying Ceph

2.1 Install the Ceph deployment tool

cephadmin@ceph-master01:~$ apt-cache madison ceph-deploy
ceph-deploy |2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic/main amd64 Packages
ceph-deploy |2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic/main i386 Packages
ceph-deploy |1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
ceph-deploy |1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages
cephadmin@ceph-master01:~$ sudo apt install ceph-deploy
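The madison listing above offers two ceph-deploy versions from different repositories. To see at a glance which one is the highest, the version column can be extracted and sorted. This is a rough sketch: `highest_madison_version` is a hypothetical helper, and GNU `sort -V` only approximates Debian version ordering (it ignores epochs and `~` rules).

```shell
# Read "pkg | version | source" lines on stdin and print the highest version.
# highest_madison_version is a hypothetical helper; ordering via GNU sort -V is an approximation.
highest_madison_version() {
    awk -F'|' 'NF >= 2 { gsub(/[ \t]/, "", $2); print $2 }' | sort -V | tail -n 1
}

# Sample input copied from the listing above; on a real node use:
#   apt-cache madison ceph-deploy | highest_madison_version
highest_madison_version <<'EOF'
ceph-deploy |2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic/main amd64 Packages
ceph-deploy |1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
EOF
```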

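2.2 Initialize the mon nodes

On Ubuntu, Python 2 must be installed separately on every server (required on all mon, mgr, and osd nodes):

cephadmin@ceph-master01:~$ sudo apt install python2.7 -y
cephadmin@ceph-master01:~$ sudo ln -sv /usr/bin/python2.7 /usr/bin/python2

ceph-master01:

ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 ceph-master01 ceph-master02 ceph-master03

--cluster-network: the network used for internal communication between cluster nodes

--public-network: the network used by business clients; it is kept on its own network, separate from the cluster traffic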
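Before running `ceph-deploy new`, it is easy to double-check that each node address really falls inside the network passed on the command line. The following sketch does the CIDR arithmetic in plain shell; `ip_to_int` and `in_cidr` are hypothetical helpers and handle IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer (hypothetical helper).
ip_to_int() {
    oldifs=$IFS
    IFS=.
    set -- $1
    IFS=$oldifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when address $1 lies inside network $2 given as a.b.c.d/bits.
in_cidr() {
    ip=$1
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    ip_int=$(ip_to_int "$ip")
    net_int=$(ip_to_int "$net")
    [ $(( $ip_int & $mask )) -eq $(( $net_int & $mask )) ]
}

# Addresses of ceph-master01 against the two networks used in this article.
for pair in "172.26.156.217 172.26.0.0/16" "10.0.0.217 10.0.0.0/24"; do
    set -- $pair
    if in_cidr "$1" "$2"; then
        echo "$1 is inside $2"
    else
        echo "$1 is NOT inside $2"
    fi
done
```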

~$ sudo mkdir /etc/ceph-cluster
~$ sudo chown  cephadmin:cephadmin /etc/ceph-cluster
~$ cd /etc/ceph-cluster/
cephadmin@ceph-master01:/etc/ceph-cluster$ ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 ceph-master01  ceph-master02 ceph-master03
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7efd0a772e10>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  ssh_copykey                   : True
[ceph_deploy.cli][INFO  ]  mon                           : ['ceph-master01', 'ceph-master02', 'ceph-master03']
[ceph_deploy.cli][INFO  ]  func                          : <function new at 0x7efd07a2bbd0>
[ceph_deploy.cli][INFO  ]  public_network                : 172.26.0.0/16
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster_network               : 10.0.0.0/24
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  fsid                          : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph-master01][INFO  ] Running command: sudo /bin/ip link show
[ceph-master01][INFO  ] Running command: sudo /bin/ip addr show
[ceph-master01][DEBUG ] IP addresses found: [u'172.26.156.217', u'10.0.0.217']
[ceph_deploy.new][DEBUG ] Resolving host ceph-master01
[ceph_deploy.new][DEBUG ] Monitor ceph-master01 at 172.26.156.217
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph-master02][DEBUG ] connected to host: ceph-master01
[ceph-master02][INFO  ] Running command: ssh -CT -o BatchMode=yes ceph-master02
[ceph-master02][DEBUG ] connection detected need for sudo
[ceph-master02][DEBUG ] connected to host: ceph-master02
[ceph-master02][DEBUG ] detect platform information from remote host
[ceph-master02][DEBUG ] detect machine type
[ceph-master02][DEBUG ] find the location of an executable
[ceph-master02][INFO  ] Running command: sudo /bin/ip link show
[ceph-master02][INFO  ] Running command: sudo /bin/ip addr show
[ceph-master02][DEBUG ] IP addresses found: [u'10.0.0.218', u'172.26.156.218']
[ceph_deploy.new][DEBUG ] Resolving host ceph-master02
[ceph_deploy.new][DEBUG ] Monitor ceph-master02 at 172.26.156.218
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph-master03][DEBUG ] connected to host: ceph-master01
[ceph-master03][INFO  ] Running command: ssh -CT -o BatchMode=yes ceph-master03
[ceph-master03][DEBUG ] connection detected need for sudo
[ceph-master03][DEBUG ] connected to host: ceph-master03
[ceph-master03][DEBUG ] detect platform information from remote host
[ceph-master03][DEBUG ] detect machine type
[ceph-master03][DEBUG ] find the location of an executable
[ceph-master03][INFO  ] Running command: sudo /bin/ip link show
[ceph-master03][INFO  ] Running command: sudo /bin/ip addr show
[ceph-master03][DEBUG ] IP addresses found: [u'172.26.156.219', u'10.0.0.219']
[ceph_deploy.new][DEBUG ] Resolving host ceph-master03
[ceph_deploy.new][DEBUG ] Monitor ceph-master03 at 172.26.156.219
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-master01', 'ceph-master02', 'ceph-master03']
[ceph_deploy.new][DEBUG ] Monitor addrs are [u'172.26.156.217', u'172.26.156.218', u'172.26.156.219']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...

cephadmin@ceph-master01:/etc/ceph-cluster$ ll
total 36
drwxr-xr-x  2 cephadmin cephadmin  4096 Sep  2 16:50 ./
drwxr-xr-x 91 root      root       4096 Sep  2 16:22 ../
-rw-rw-r--  1 cephadmin cephadmin   326 Sep  2 16:50 ceph.conf
-rw-rw-r--  1 cephadmin cephadmin 17603 Sep  2 16:50 ceph-deploy-ceph.log
-rw-------  1 cephadmin cephadmin    73 Sep  2 16:50 ceph.mon.keyring
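The listing above shows the files that `ceph-deploy new` is expected to leave in the working directory. A small check like the one below can confirm that the two files needed by the later steps exist and are not empty; `missing_cluster_files` is a hypothetical helper, and the directory defaults to the one used in this article.

```shell
# Print the names of expected files that are missing or empty in a directory.
# missing_cluster_files is a hypothetical helper for illustration.
missing_cluster_files() {
    dir=$1
    for f in ceph.conf ceph.mon.keyring; do
        [ -s "$dir/$f" ] || echo "$f"
    done
}

# Prints nothing when both files are present.
missing_cluster_files "${1:-/etc/ceph-cluster}"
```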

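This step is mandatory; otherwise the later installation steps of the Ceph cluster will fail.

cephadmin@ceph-master01:/etc/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-master01 ceph-master02 ceph-master03

--no-adjust-repos  # do not modify the existing apt sources (by default the official repository is used)
--nogpgcheck       # skip the GPG check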
cephadmin@ceph-master01:/etc/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy install --no-adjust-repos --nogpgcheck ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  testing                       : None
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f59e4913e60>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  dev_commit                    : None
[ceph_deploy.cli][INFO  ]  install_mds                   : False
[ceph_deploy.cli][INFO  ]  stable                        : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  adjust_repos                  : False
[ceph_deploy.cli][INFO  ]  func                          : <function install at 0x7f59e51c5b50>
[ceph_deploy.cli][INFO  ]  install_mgr                   : False
[ceph_deploy.cli][INFO  ]  install_all                   : False
[ceph_deploy.cli][INFO  ]  repo                          : False
[ceph_deploy.cli][INFO  ]  host                          : ['ceph-master01', 'ceph-master02', 'ceph-master03']
[ceph_deploy.cli][INFO  ]  install_rgw                   : False
[ceph_deploy.cli][INFO  ]  install_tests                 : False
[ceph_deploy.cli][INFO  ]  repo_url                      : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  install_osd                   : False
[ceph_deploy.cli][INFO  ]  version_kind                  : stable
[ceph_deploy.cli][INFO  ]  install_common                : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  dev                           : master
[ceph_deploy.cli][INFO  ]  nogpgcheck                    : True
[ceph_deploy.cli][INFO  ]  local_mirror                  : None
[ceph_deploy.cli][INFO  ]  release                       : None
[ceph_deploy.cli][INFO  ]  install_mon                   : False
[ceph_deploy.cli][INFO  ]  gpg_url                       : None
[ceph_deploy.install][DEBUG ] Installing stable version mimic on cluster ceph hosts ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-master01 ...
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph-master01][INFO  ] installing Ceph on ceph-master01
[ceph-master01][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-master01][DEBUG ] Hit:1 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic InRelease
[ceph-master01][DEBUG ] Hit:2 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic InRelease
[ceph-master01][DEBUG ] Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic InRelease
[ceph-master01][DEBUG ] Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates InRelease
[ceph-master01][DEBUG ] Hit:5 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-backports InRelease
[ceph-master01][DEBUG ] Hit:6 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security InRelease
[ceph-master01][DEBUG ] Reading package lists...
[ceph-master01][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ca-certificates apt-transport-https
[ceph-master01][DEBUG ] Reading package lists...
[ceph-master01][DEBUG ] Building dependency tree...
[ceph-master01][DEBUG ] Reading state information...
[ceph-master01][DEBUG ] ca-certificates is already the newest version (20211016~18.04.1).
[ceph-master01][DEBUG ] apt-transport-https is already the newest version (1.6.14).
[ceph-master01][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master01][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-master01][DEBUG ] Hit:1 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic InRelease
[ceph-master01][DEBUG ] Hit:2 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic InRelease
[ceph-master01][DEBUG ] Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic InRelease
[ceph-master01][DEBUG ] Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates InRelease
[ceph-master01][DEBUG ] Hit:5 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-backports InRelease
[ceph-master01][DEBUG ] Hit:6 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security InRelease
[ceph-master01][DEBUG ] Reading package lists...
[ceph-master01][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph ceph-osd ceph-mds ceph-mon radosgw
[ceph-master01][DEBUG ] Reading package lists...
[ceph-master01][DEBUG ] Building dependency tree...
[ceph-master01][DEBUG ] Reading state information...
[ceph-master01][DEBUG ] ceph is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] ceph-mds is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] ceph-mon is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] ceph-osd is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] radosgw is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master01][INFO  ] Running command: sudo ceph --version
[ceph-master01][DEBUG ] ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-master02 ...
[ceph-master02][DEBUG ] connection detected need for sudo
[ceph-master02][DEBUG ] connected to host: ceph-master02
[ceph-master02][DEBUG ] detect platform information from remote host
[ceph-master02][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph-master02][INFO  ] installing Ceph on ceph-master02
[ceph-master02][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-master02][DEBUG ] Hit:1 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic InRelease
[ceph-master02][DEBUG ] Hit:2 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates InRelease
[ceph-master02][DEBUG ] Get:3 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic InRelease [8,572 B]
[ceph-master02][DEBUG ] Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-backports InRelease
[ceph-master02][DEBUG ] Get:5 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic InRelease [8,560 B]
[ceph-master02][DEBUG ] Hit:6 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security InRelease
[ceph-master02][DEBUG ] Fetched 17.1 kB in 1s (13.1 kB/s)
[ceph-master02][DEBUG ] Reading package lists...
[ceph-master02][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ca-certificates apt-transport-https
[ceph-master02][DEBUG ] Reading package lists...
[ceph-master02][DEBUG ] Building dependency tree...
[ceph-master02][DEBUG ] Reading state information...
[ceph-master02][DEBUG ] ca-certificates is already the newest version (20211016~18.04.1).
[ceph-master02][DEBUG ] apt-transport-https is already the newest version (1.6.14).
[ceph-master02][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master02][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-master02][DEBUG ] Hit:1 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic InRelease
[ceph-master02][DEBUG ] Hit:2 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates InRelease
[ceph-master02][DEBUG ] Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-backports InRelease
[ceph-master02][DEBUG ] Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security InRelease
[ceph-master02][DEBUG ] Get:5 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic InRelease [8,572 B]
[ceph-master02][DEBUG ] Get:6 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic InRelease [8,560 B]
[ceph-master02][DEBUG ] Fetched 17.1 kB in 1s (12.5 kB/s)
[ceph-master02][DEBUG ] Reading package lists...
[ceph-master02][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph ceph-osd ceph-mds ceph-mon radosgw
[ceph-master02][DEBUG ] Reading package lists...
[ceph-master02][DEBUG ] Building dependency tree...
[ceph-master02][DEBUG ] Reading state information...
[ceph-master02][DEBUG ] ceph is already the newest version (16.2.10-1bionic).
[ceph-master02][DEBUG ] ceph-mds is already the newest version (16.2.10-1bionic).
[ceph-master02][DEBUG ] ceph-mon is already the newest version (16.2.10-1bionic).
[ceph-master02][DEBUG ] ceph-osd is already the newest version (16.2.10-1bionic).
[ceph-master02][DEBUG ] radosgw is already the newest version (16.2.10-1bionic).
[ceph-master02][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master02][INFO  ] Running command: sudo ceph --version
[ceph-master02][DEBUG ] ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-master03 ...
[ceph-master03][DEBUG ] connection detected need for sudo
[ceph-master03][DEBUG ] connected to host: ceph-master03
[ceph-master03][DEBUG ] detect platform information from remote host
[ceph-master03][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph-master03][INFO  ] installing Ceph on ceph-master03
[ceph-master03][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-master03][DEBUG ] Hit:1 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic InRelease
[ceph-master03][DEBUG ] Hit:2 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates InRelease
[ceph-master03][DEBUG ] Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-backports InRelease
[ceph-master03][DEBUG ] Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security InRelease
[ceph-master03][DEBUG ] Get:5 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic InRelease [8,572 B]
[ceph-master03][DEBUG ] Get:6 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic InRelease [8,560 B]
[ceph-master03][DEBUG ] Fetched 17.1 kB in 2s (8,636 B/s)
[ceph-master03][DEBUG ] Reading package lists...
[ceph-master03][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ca-certificates apt-transport-https
[ceph-master03][DEBUG ] Reading package lists...
[ceph-master03][DEBUG ] Building dependency tree...
[ceph-master03][DEBUG ] Reading state information...
[ceph-master03][DEBUG ] ca-certificates is already the newest version (20211016~18.04.1).
[ceph-master03][DEBUG ] apt-transport-https is already the newest version (1.6.14).
[ceph-master03][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master03][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-master03][DEBUG ] Hit:1 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic InRelease
[ceph-master03][DEBUG ] Hit:2 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates InRelease
[ceph-master03][DEBUG ] Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-backports InRelease
[ceph-master03][DEBUG ] Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security InRelease
[ceph-master03][DEBUG ] Get:5 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic InRelease [8,572 B]
[ceph-master03][DEBUG ] Get:6 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic InRelease [8,560 B]
[ceph-master03][DEBUG ] Fetched 17.1 kB in 1s (14.3 kB/s)
[ceph-master03][DEBUG ] Reading package lists...
[ceph-master03][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph ceph-osd ceph-mds ceph-mon radosgw
[ceph-master03][DEBUG ] Reading package lists...
[ceph-master03][DEBUG ] Building dependency tree...
[ceph-master03][DEBUG ] Reading state information...
[ceph-master03][DEBUG ] ceph is already the newest version (16.2.10-1bionic).
[ceph-master03][DEBUG ] ceph-mds is already the newest version (16.2.10-1bionic).
[ceph-master03][DEBUG ] ceph-mon is already the newest version (16.2.10-1bionic).
[ceph-master03][DEBUG ] ceph-osd is already the newest version (16.2.10-1bionic).
[ceph-master03][DEBUG ] radosgw is already the newest version (16.2.10-1bionic).
[ceph-master03][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master03][INFO  ] Running command: sudo ceph --version
[ceph-master03][DEBUG ] ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
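Each node's install log ends with a `ceph --version` line. To confirm that all nodes ended up on the same release, those lines can be reduced to the set of distinct version numbers. The sketch below assumes the output format shown in the log; `distinct_versions` is a hypothetical helper.

```shell
# Read `ceph --version` output lines on stdin and print each distinct version once.
# distinct_versions is a hypothetical helper for illustration.
distinct_versions() {
    awk '$1 == "ceph" && $2 == "version" { print $3 }' | sort -u
}

# Sample input copied from the install log; on the deploy node one could instead run:
#   for h in ceph-master01 ceph-master02 ceph-master03; do ssh "$h" ceph --version; done | distinct_versions
distinct_versions <<'EOF'
ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
EOF
```

More than one line of output would indicate that the nodes are running mixed releases.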

This process installs component packages such as ceph-base and ceph-common on the specified Ceph nodes serially, one server at a time.


2.3 Install the ceph-mon service

2.3.1 Install ceph-mon on the ceph-mon nodes
cephadmin@ceph-master01:/etc/ceph-cluster# apt-cache madison ceph-mon
  ceph-mon |16.2.10-1bionic | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main amd64 Packages
  ceph-mon |14.2.22-1bionic | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic/main amd64 Packages
  ceph-mon |12.2.13-0ubuntu0.18.04.10 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates/main amd64 Packages
  ceph-mon |12.2.13-0ubuntu0.18.04.10 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security/main amd64 Packages
  ceph-mon |12.2.4-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/main amd64 Packages
cephadmin@ceph-master01:/etc/ceph-cluster$ 
root@ceph-master01:~# apt install ceph-mon
root@ceph-master02:~# apt install ceph-mon
root@ceph-master03:~# apt install ceph-mon

# The package may already be installed


2.3.2 Add the ceph-mon service to the Ceph cluster
cephadmin@ceph-master01:/etc/ceph-cluster# pwd
/etc/ceph-cluster
cephadmin@ceph-master01:/etc/ceph-cluster# cat ceph.conf
[global]
fsid = f69afe6f-e559-4df7-998a-c5dc3e300209
public_network = 172.26.0.0/16
cluster_network = 10.0.0.0/24
mon_initial_members = ceph-master01, ceph-master02, ceph-master03
mon_host = 172.26.156.217,172.26.156.218,172.26.156.219     # the mon services are added to the nodes through this configuration file
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
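Before creating the monitors, a quick consistency check on ceph.conf can catch a missing entry: the number of names in `mon_initial_members` should match the number of addresses in `mon_host`. The following is a sketch only; `count_list_items` is a hypothetical helper that reads the simple `key = a, b, c` layout shown above.

```shell
# Print how many comma-separated values a key has in a ceph.conf style file.
# count_list_items is a hypothetical helper; $1 = key, $2 = file.
count_list_items() {
    awk -F= -v key="$1" '
        {
            k = $1
            gsub(/[ \t]/, "", k)
            if (k == key) {
                v = $2
                sub(/#.*/, "", v)      # drop a trailing comment
                gsub(/[ \t]/, "", v)   # drop spaces around the values
                print split(v, parts, ",")
                exit
            }
        }' "$2"
}

conf=${1:-ceph.conf}
if [ -f "$conf" ]; then
    members=$(count_list_items mon_initial_members "$conf")
    hosts=$(count_list_items mon_host "$conf")
    if [ "$members" = "$hosts" ]; then
        echo "mon_initial_members and mon_host both list $members entries"
    else
        echo "mismatch: mon_initial_members=$members mon_host=$hosts"
    fi
fi
```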
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create-initial
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fe450df12d0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0x7fe450dcebd0>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-master01 ...
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 18.04 bionic
[ceph-master01][DEBUG ] determining if provided host has same hostname in remote
[ceph-master01][DEBUG ] get remote short hostname
[ceph-master01][DEBUG ] deploying mon to ceph-master01
[ceph-master01][DEBUG ] get remote short hostname
[ceph-master01][DEBUG ] remote hostname: ceph-master01
[ceph-master01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-master01][DEBUG ] create the mon path if it does not exist
[ceph-master01][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-master01/done
[ceph-master01][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-master01/done
[ceph-master01][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-master01.mon.keyring
[ceph-master01][DEBUG ] create the monitor keyring file
[ceph-master01][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-master01 --keyring /var/lib/ceph/tmp/ceph-ceph-master01.mon.keyring --setuser 64045 --setgroup 64045
[ceph-master01][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-master01.mon.keyring
[ceph-master01][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-master01][DEBUG ] create the init path if it does not exist
[ceph-master01][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph-master01][INFO  ] Running command: sudo systemctl enable ceph-mon@ceph-master01
[ceph-master01][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-master01.service → /lib/systemd/system/ceph-mon@.service.
[ceph-master01][INFO  ] Running command: sudo systemctl start ceph-mon@ceph-master01
[ceph-master01][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status
[ceph-master01][DEBUG ] ********************************************************************************
[ceph-master01][DEBUG ] status for monitor: mon.ceph-master01
[ceph-master01][DEBUG ] {
[ceph-master01][DEBUG ]   "election_epoch": 0,
[ceph-master01][DEBUG ]   "extra_probe_peers": [
[ceph-master01][DEBUG ]     {
[ceph-master01][DEBUG ]       "addrvec": [
[ceph-master01][DEBUG ]         {
[ceph-master01][DEBUG ]           "addr": "172.26.156.218:3300",
[ceph-master01][DEBUG ]           "nonce": 0,
[ceph-master01][DEBUG ]           "type": "v2"
[ceph-master01][DEBUG ]         },
[ceph-master01][DEBUG ]         {
[ceph-master01][DEBUG ]           "addr": "172.26.156.218:6789",
[ceph-master01][DEBUG ]           "nonce": 0,
[ceph-master01][DEBUG ]           "type": "v1"
[ceph-master01][DEBUG ]         }
[ceph-master01][DEBUG ]       ]
[ceph-master01][DEBUG ]     },
[ceph-master01][DEBUG ]     {
[ceph-master01][DEBUG ]       "addrvec": [
[ceph-master01][DEBUG ]         {
[ceph-master01][DEBUG ]           "addr": "172.26.156.219:3300",
[ceph-master01][DEBUG ]           "nonce": 0,
[ceph-master01][DEBUG ]           "type": "v2"
[ceph-master01][DEBUG ]         },
[ceph-master01][DEBUG ]         {
[ceph-master01][DEBUG ]           "addr": "172.26.156.219:6789",
[ceph-master01][DEBUG ]           "nonce": 0,
[ceph-master01][DEBUG ]           "type": "v1"
[ceph-master01][DEBUG ]         }
[ceph-master01][DEBUG ]       ]
[ceph-master01][DEBUG ]     }
[ceph-master01][DEBUG ]   ],
[ceph-master01][DEBUG ]   "feature_map": {
[ceph-master01][DEBUG ]     "mon": [
[ceph-master01][DEBUG ]       {
[ceph-master01][DEBUG ]         "features": "0x3f01cfb9fffdffff",
[ceph-master01][DEBUG ]         "num": 1,
[ceph-master01][DEBUG ]         "release": "luminous"
[ceph-master01][DEBUG ]       }
[ceph-master01][DEBUG ]     ]
[ceph-master01][DEBUG ]   },
[ceph-master01][DEBUG ]   "features": {
[ceph-master01][DEBUG ]     "quorum_con": "0",
[ceph-master01][DEBUG ]     "quorum_mon": [],
[ceph-master01][DEBUG ]     "required_con": "0",
[ceph-master01][DEBUG ]     "required_mon": []
[ceph-master01][DEBUG ]   },
[ceph-master01][DEBUG ]   "monmap": {
[ceph-master01][DEBUG ]     "created": "2022-09-05T02:52:15.915768Z",
[ceph-master01][DEBUG ]     "disallowed_leaders: ": "",
[ceph-master01][DEBUG ]     "election_strategy": 1,
[ceph-master01][DEBUG ]     "epoch": 0,
[ceph-master01][DEBUG ]     "features": {
[ceph-master01][DEBUG ]       "optional": [],
[ceph-master01][DEBUG ]       "persistent": []
[ceph-master01][DEBUG ]     },
[ceph-master01][DEBUG ]     "fsid": "f69afe6f-e559-4df7-998a-c5dc3e300209",
[ceph-master01][DEBUG ]     "min_mon_release": 0,
[ceph-master01][DEBUG ]     "min_mon_release_name": "unknown",
[ceph-master01][DEBUG ]     "modified": "2022-09-05T02:52:15.915768Z",
[ceph-master01][DEBUG ]     "mons": [
[ceph-master01][DEBUG ]       {
[ceph-master01][DEBUG ]         "addr": "172.26.156.217:6789/0",
[ceph-master01][DEBUG ]         "crush_location": "{}",
[ceph-master01][DEBUG ]         "name": "ceph-master01",
[ceph-master01][DEBUG ]         "priority": 0,
[ceph-master01][DEBUG ]         "public_addr": "172.26.156.217:6789/0",
[ceph-master01][DEBUG ]         "public_addrs": {
[ceph-master01][DEBUG ]           "addrvec": [
[ceph-master01][DEBUG ]             {
[ceph-master01][DEBUG ]               "addr": "172.26.156.217:3300",
[ceph-master01][DEBUG ]               "nonce": 0,
[ceph-master01][DEBUG ]               "type": "v2"
[ceph-master01][DEBUG ]             },
[ceph-master01][DEBUG ]             {
[ceph-master01][DEBUG ]               "addr": "172.26.156.217:6789",
[ceph-master01][DEBUG ]               "nonce": 0,
[ceph-master01][DEBUG ]               "type": "v1"
[ceph-master01][DEBUG ]             }
[ceph-master01][DEBUG ]           ]
[ceph-master01][DEBUG ]         },
[ceph-master01][DEBUG ]         "rank": 0,
[ceph-master01][DEBUG ]         "weight": 0
[ceph-master01][DEBUG ]       },
[ceph-master01][DEBUG ]       {
[ceph-master01][DEBUG ]         "addr": "0.0.0.0:0/1",
[ceph-master01][DEBUG ]         "crush_location": "{}",
[ceph-master01][DEBUG ]         "name": "ceph-master02",
[ceph-master01][DEBUG ]         "priority": 0,
[ceph-master01][DEBUG ]         "public_addr": "0.0.0.0:0/1",
[ceph-master01][DEBUG ]         "public_addrs": {
[ceph-master01][DEBUG ]           "addrvec": [
[ceph-master01][DEBUG ]             {
[ceph-master01][DEBUG ]               "addr": "0.0.0.0:0",
[ceph-master01][DEBUG ]               "nonce": 1,
[ceph-master01][DEBUG ]               "type": "v1"
[ceph-master01][DEBUG ]             }
[ceph-master01][DEBUG ]           ]
[ceph-master01][DEBUG ]         },
[ceph-master01][DEBUG ]         "rank": 1,
[ceph-master01][DEBUG ]         "weight": 0
[ceph-master01][DEBUG ]       },
[ceph-master01][DEBUG ]       {
[ceph-master01][DEBUG ]         "addr": "0.0.0.0:0/2",
[ceph-master01][DEBUG ]         "crush_location": "{}",
[ceph-master01][DEBUG ]         "name": "ceph-master03",
[ceph-master01][DEBUG ]         "priority": 0,
[ceph-master01][DEBUG ]         "public_addr": "0.0.0.0:0/2",
[ceph-master01][DEBUG ]         "public_addrs": {
[ceph-master01][DEBUG ]           "addrvec": [
[ceph-master01][DEBUG ]             {
[ceph-master01][DEBUG ]               "addr": "0.0.0.0:0",
[ceph-master01][DEBUG ]               "nonce": 2,
[ceph-master01][DEBUG ]               "type": "v1"
[ceph-master01][DEBUG ]             }
[ceph-master01][DEBUG ]           ]
[ceph-master01][DEBUG ]         },
[ceph-master01][DEBUG ]         "rank": 2,
[ceph-master01][DEBUG ]         "weight": 0
[ceph-master01][DEBUG ]       }
[ceph-master01][DEBUG ]     ],
[ceph-master01][DEBUG ]     "stretch_mode": false,
[ceph-master01][DEBUG ]     "tiebreaker_mon": ""
[ceph-master01][DEBUG ]   },
[ceph-master01][DEBUG ]   "name": "ceph-master01",
[ceph-master01][DEBUG ]   "outside_quorum": [
[ceph-master01][DEBUG ]     "ceph-master01"
[ceph-master01][DEBUG ]   ],
[ceph-master01][DEBUG ]   "quorum": [],
[ceph-master01][DEBUG ]   "rank": 0,
[ceph-master01][DEBUG ]   "state": "probing",
[ceph-master01][DEBUG ]   "stretch_mode": false,
[ceph-master01][DEBUG ]   "sync_provider": []
[ceph-master01][DEBUG ] }
[ceph-master01][DEBUG ] ********************************************************************************
[ceph-master01][INFO  ] monitor: mon.ceph-master01 is running
[ceph-master01][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-master02 ...
[ceph-master02][DEBUG ] connection detected need for sudo
[ceph-master02][DEBUG ] connected to host: ceph-master02 
[ceph-master02][DEBUG ] detect platform information from remote host
[ceph-master02][DEBUG ] detect machine type
[ceph-master02][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 18.04 bionic
[ceph-master02][DEBUG ] determining if provided host has same hostname in remote
[ceph-master02][DEBUG ] get remote short hostname
[ceph-master02][DEBUG ] deploying mon to ceph-master02
[ceph-master02][DEBUG ] get remote short hostname
[ceph-master02][DEBUG ] remote hostname: ceph-master02
[ceph-master02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-master02][DEBUG ] create the mon path if it does not exist
[ceph-master02][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-master02/done
[ceph-master02][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-master02/done
[ceph-master02][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-master02.mon.keyring
[ceph-master02][DEBUG ] create the monitor keyring file
[ceph-master02][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-master02 --keyring /var/lib/ceph/tmp/ceph-ceph-master02.mon.keyring --setuser 64045 --setgroup 64045
[ceph-master02][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-master02.mon.keyring
[ceph-master02][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-master02][DEBUG ] create the init path if it does not exist
[ceph-master02][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph-master02][INFO  ] Running command: sudo systemctl enable ceph-mon@ceph-master02
[ceph-master02][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-master02.service → /lib/systemd/system/ceph-mon@.service.
[ceph-master02][INFO  ] Running command: sudo systemctl start ceph-mon@ceph-master02
[ceph-master02][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master02.asok mon_status
[ceph-master02][DEBUG ] ********************************************************************************
[ceph-master02][DEBUG ] status for monitor: mon.ceph-master02
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"election_epoch":1, 
[ceph-master02][DEBUG ]"extra_probe_peers":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addrvec":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.217:3300", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v2"[ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.217:6789", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v1"[ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]][ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addrvec":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.219:3300", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v2"[ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.219:6789", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v1"[ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]][ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]], 
[ceph-master02][DEBUG ]"feature_map":{[ceph-master02][DEBUG ]"mon":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"features":"0x3f01cfb9fffdffff", 
[ceph-master02][DEBUG ]"num":1, 
[ceph-master02][DEBUG ]"release":"luminous"[ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]][ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]"features":{[ceph-master02][DEBUG ]"quorum_con":"0", 
[ceph-master02][DEBUG ]"quorum_mon":[], 
[ceph-master02][DEBUG ]"required_con":"0", 
[ceph-master02][DEBUG ]"required_mon":[][ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]"monmap":{[ceph-master02][DEBUG ]"created":"2022-09-05T02:52:20.691459Z", 
[ceph-master02][DEBUG ]"disallowed_leaders: ":"", 
[ceph-master02][DEBUG ]"election_strategy":1, 
[ceph-master02][DEBUG ]"epoch":0, 
[ceph-master02][DEBUG ]"features":{[ceph-master02][DEBUG ]"optional":[], 
[ceph-master02][DEBUG ]"persistent":[][ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]"fsid":"f69afe6f-e559-4df7-998a-c5dc3e300209", 
[ceph-master02][DEBUG ]"min_mon_release":0, 
[ceph-master02][DEBUG ]"min_mon_release_name":"unknown", 
[ceph-master02][DEBUG ]"modified":"2022-09-05T02:52:20.691459Z", 
[ceph-master02][DEBUG ]"mons":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.217:6789/0", 
[ceph-master02][DEBUG ]"crush_location":"{}", 
[ceph-master02][DEBUG ]"name":"ceph-master01", 
[ceph-master02][DEBUG ]"priority":0, 
[ceph-master02][DEBUG ]"public_addr":"172.26.156.217:6789/0", 
[ceph-master02][DEBUG ]"public_addrs":{[ceph-master02][DEBUG ]"addrvec":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.217:3300", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v2"[ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.217:6789", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v1"[ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]][ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]"rank":0, 
[ceph-master02][DEBUG ]"weight":0[ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.218:6789/0", 
[ceph-master02][DEBUG ]"crush_location":"{}", 
[ceph-master02][DEBUG ]"name":"ceph-master02", 
[ceph-master02][DEBUG ]"priority":0, 
[ceph-master02][DEBUG ]"public_addr":"172.26.156.218:6789/0", 
[ceph-master02][DEBUG ]"public_addrs":{[ceph-master02][DEBUG ]"addrvec":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.218:3300", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v2"[ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"172.26.156.218:6789", 
[ceph-master02][DEBUG ]"nonce":0, 
[ceph-master02][DEBUG ]"type":"v1"[ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]][ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]"rank":1, 
[ceph-master02][DEBUG ]"weight":0[ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"0.0.0.0:0/2", 
[ceph-master02][DEBUG ]"crush_location":"{}", 
[ceph-master02][DEBUG ]"name":"ceph-master03", 
[ceph-master02][DEBUG ]"priority":0, 
[ceph-master02][DEBUG ]"public_addr":"0.0.0.0:0/2", 
[ceph-master02][DEBUG ]"public_addrs":{[ceph-master02][DEBUG ]"addrvec":[[ceph-master02][DEBUG ]{[ceph-master02][DEBUG ]"addr":"0.0.0.0:0", 
[ceph-master02][DEBUG ]"nonce":2, 
[ceph-master02][DEBUG ]"type":"v1"[ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]][ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]"rank":2, 
[ceph-master02][DEBUG ]"weight":0[ceph-master02][DEBUG ]}[ceph-master02][DEBUG ]], 
[ceph-master02][DEBUG ]"stretch_mode": false, 
[ceph-master02][DEBUG ]"tiebreaker_mon":""[ceph-master02][DEBUG ]}, 
[ceph-master02][DEBUG ]"name":"ceph-master02", 
[ceph-master02][DEBUG ]"outside_quorum":[], 
[ceph-master02][DEBUG ]"quorum":[], 
[ceph-master02][DEBUG ]"rank":1, 
[ceph-master02][DEBUG ]"state":"electing", 
[ceph-master02][DEBUG ]"stretch_mode": false, 
[ceph-master02][DEBUG ]"sync_provider":[][ceph-master02][DEBUG ]}[ceph-master02][DEBUG ] ********************************************************************************
[ceph-master02][INFO  ] monitor: mon.ceph-master02 is running
[ceph-master02][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master02.asok mon_status
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-master03 ...
[ceph-master03][DEBUG ] connection detected need for sudo
[ceph-master03][DEBUG ] connected to host: ceph-master03 
[ceph-master03][DEBUG ] detect platform information from remote host
[ceph-master03][DEBUG ] detect machine type
[ceph-master03][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 18.04 bionic
[ceph-master03][DEBUG ] determining if provided host has same hostname in remote
[ceph-master03][DEBUG ] get remote short hostname
[ceph-master03][DEBUG ] deploying mon to ceph-master03
[ceph-master03][DEBUG ] get remote short hostname
[ceph-master03][DEBUG ] remote hostname: ceph-master03
[ceph-master03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-master03][DEBUG ] create the mon path if it does not exist
[ceph-master03][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-master03/done
[ceph-master03][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-master03/done
[ceph-master03][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-master03.mon.keyring
[ceph-master03][DEBUG ] create the monitor keyring file
[ceph-master03][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-master03 --keyring /var/lib/ceph/tmp/ceph-ceph-master03.mon.keyring --setuser 64045 --setgroup 64045
[ceph-master03][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-master03.mon.keyring
[ceph-master03][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-master03][DEBUG ] create the init path if it does not exist
[ceph-master03][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph-master03][INFO  ] Running command: sudo systemctl enable ceph-mon@ceph-master03
[ceph-master03][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-master03.service → /lib/systemd/system/ceph-mon@.service.
[ceph-master03][INFO  ] Running command: sudo systemctl start ceph-mon@ceph-master03
[ceph-master03][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master03.asok mon_status
[ceph-master03][DEBUG ] ********************************************************************************
[ceph-master03][DEBUG ] status for monitor: mon.ceph-master03
[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"election_epoch":0, 
[ceph-master03][DEBUG ]"extra_probe_peers":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addrvec":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"172.26.156.217:3300", 
[ceph-master03][DEBUG ]"nonce":0, 
[ceph-master03][DEBUG ]"type":"v2"[ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"172.26.156.217:6789", 
[ceph-master03][DEBUG ]"nonce":0, 
[ceph-master03][DEBUG ]"type":"v1"[ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]][ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addrvec":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"172.26.156.218:3300", 
[ceph-master03][DEBUG ]"nonce":0, 
[ceph-master03][DEBUG ]"type":"v2"[ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"172.26.156.218:6789", 
[ceph-master03][DEBUG ]"nonce":0, 
[ceph-master03][DEBUG ]"type":"v1"[ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]][ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]], 
[ceph-master03][DEBUG ]"feature_map":{[ceph-master03][DEBUG ]"mon":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"features":"0x3f01cfb9fffdffff", 
[ceph-master03][DEBUG ]"num":1, 
[ceph-master03][DEBUG ]"release":"luminous"[ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]][ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]"features":{[ceph-master03][DEBUG ]"quorum_con":"0", 
[ceph-master03][DEBUG ]"quorum_mon":[], 
[ceph-master03][DEBUG ]"required_con":"0", 
[ceph-master03][DEBUG ]"required_mon":[][ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]"monmap":{[ceph-master03][DEBUG ]"created":"2022-09-05T02:52:25.483539Z", 
[ceph-master03][DEBUG ]"disallowed_leaders: ":"", 
[ceph-master03][DEBUG ]"election_strategy":1, 
[ceph-master03][DEBUG ]"epoch":0, 
[ceph-master03][DEBUG ]"features":{[ceph-master03][DEBUG ]"optional":[], 
[ceph-master03][DEBUG ]"persistent":[][ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]"fsid":"f69afe6f-e559-4df7-998a-c5dc3e300209", 
[ceph-master03][DEBUG ]"min_mon_release":0, 
[ceph-master03][DEBUG ]"min_mon_release_name":"unknown", 
[ceph-master03][DEBUG ]"modified":"2022-09-05T02:52:25.483539Z", 
[ceph-master03][DEBUG ]"mons":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"172.26.156.219:6789/0", 
[ceph-master03][DEBUG ]"crush_location":"{}", 
[ceph-master03][DEBUG ]"name":"ceph-master03", 
[ceph-master03][DEBUG ]"priority":0, 
[ceph-master03][DEBUG ]"public_addr":"172.26.156.219:6789/0", 
[ceph-master03][DEBUG ]"public_addrs":{[ceph-master03][DEBUG ]"addrvec":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"172.26.156.219:3300", 
[ceph-master03][DEBUG ]"nonce":0, 
[ceph-master03][DEBUG ]"type":"v2"[ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"172.26.156.219:6789", 
[ceph-master03][DEBUG ]"nonce":0, 
[ceph-master03][DEBUG ]"type":"v1"[ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]][ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]"rank":0, 
[ceph-master03][DEBUG ]"weight":0[ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"0.0.0.0:0/1", 
[ceph-master03][DEBUG ]"crush_location":"{}", 
[ceph-master03][DEBUG ]"name":"ceph-master01", 
[ceph-master03][DEBUG ]"priority":0, 
[ceph-master03][DEBUG ]"public_addr":"0.0.0.0:0/1", 
[ceph-master03][DEBUG ]"public_addrs":{[ceph-master03][DEBUG ]"addrvec":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"0.0.0.0:0", 
[ceph-master03][DEBUG ]"nonce":1, 
[ceph-master03][DEBUG ]"type":"v1"[ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]][ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]"rank":1, 
[ceph-master03][DEBUG ]"weight":0[ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"0.0.0.0:0/2", 
[ceph-master03][DEBUG ]"crush_location":"{}", 
[ceph-master03][DEBUG ]"name":"ceph-master02", 
[ceph-master03][DEBUG ]"priority":0, 
[ceph-master03][DEBUG ]"public_addr":"0.0.0.0:0/2", 
[ceph-master03][DEBUG ]"public_addrs":{[ceph-master03][DEBUG ]"addrvec":[[ceph-master03][DEBUG ]{[ceph-master03][DEBUG ]"addr":"0.0.0.0:0", 
[ceph-master03][DEBUG ]"nonce":2, 
[ceph-master03][DEBUG ]"type":"v1"[ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]][ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]"rank":2, 
[ceph-master03][DEBUG ]"weight":0[ceph-master03][DEBUG ]}[ceph-master03][DEBUG ]], 
[ceph-master03][DEBUG ]"stretch_mode": false, 
[ceph-master03][DEBUG ]"tiebreaker_mon":""[ceph-master03][DEBUG ]}, 
[ceph-master03][DEBUG ]"name":"ceph-master03", 
[ceph-master03][DEBUG ]"outside_quorum":[[ceph-master03][DEBUG ]"ceph-master03"[ceph-master03][DEBUG ]], 
[ceph-master03][DEBUG ]"quorum":[], 
[ceph-master03][DEBUG ]"rank":0, 
[ceph-master03][DEBUG ]"state":"probing", 
[ceph-master03][DEBUG ]"stretch_mode": false, 
[ceph-master03][DEBUG ]"sync_provider":[][ceph-master03][DEBUG ]}[ceph-master03][DEBUG ] ********************************************************************************
[ceph-master03][INFO  ] monitor: mon.ceph-master03 is running
[ceph-master03][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master03.asok mon_status
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph-master01
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01 
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph-master01][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status
[ceph_deploy.mon][WARNIN] mon.ceph-master01 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph-master01][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status
[ceph_deploy.mon][WARNIN] mon.ceph-master01 monitor is not yet in quorum, tries left: 4
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[ceph-master01][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status
[ceph_deploy.mon][INFO  ] mon.ceph-master01 monitor has reached quorum!
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph-master02
[ceph-master02][DEBUG ] connection detected need for sudo
[ceph-master02][DEBUG ] connected to host: ceph-master02 
[ceph-master02][DEBUG ] detect platform information from remote host
[ceph-master02][DEBUG ] detect machine type
[ceph-master02][DEBUG ] find the location of an executable
[ceph-master02][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master02.asok mon_status
[ceph_deploy.mon][INFO  ] mon.ceph-master02 monitor has reached quorum!
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph-master03
[ceph-master03][DEBUG ] connection detected need for sudo
[ceph-master03][DEBUG ] connected to host: ceph-master03 
[ceph-master03][DEBUG ] detect platform information from remote host
[ceph-master03][DEBUG ] detect machine type
[ceph-master03][DEBUG ] find the location of an executable
[ceph-master03][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master03.asok mon_status
[ceph_deploy.mon][INFO  ] mon.ceph-master03 monitor has reached quorum!
[ceph_deploy.mon][INFO  ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO  ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO  ] Storing keys in temp directory /tmp/tmpP6crY0
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01 
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] get remote short hostname
[ceph-master01][DEBUG ] fetch remote file
[ceph-master01][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-master01.asok mon_status
[ceph-master01][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.admin
[ceph-master01][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-mds
[ceph-master01][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-mgr
[ceph-master01][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-osd
[ceph-master01][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO  ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpP6crY0
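2.3.2 Verify the mon nodes

Verify that the ceph-mon service has been installed and started automatically on the designated mon nodes. One of the jobs of ceph-mon is authentication: initialization writes keyring files for the bootstrap-mds/mgr/osd/rgw services into the working directory on the ceph-deploy node. These initialization files grant highly privileged access to the Ceph cluster, so keep them safe; they are later distributed to the individual service nodes.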

cephadmin@ceph-master01:/etc/ceph-cluster# ps -ef | grep ceph-mon
ceph       28179      1  0 10:52 ?        00:00:05 /usr/bin/ceph-mon -f --cluster ceph --id ceph-master01 --setuser ceph --setgroup ceph
cephadm+   28519  28038  0 11:10 pts/0    00:00:00 grep --color=auto ceph-mon
cephadmin@ceph-master01:/etc/ceph-cluster# systemctl  status ceph-mon.target 
● ceph-mon.target - ceph target allowing to start/stop all ceph-mon@.service instances at once
   Loaded: loaded (/lib/systemd/system/ceph-mon.target; enabled; vendor preset: enabled)
   Active: active since Mon 2022-09-05 09:46:11 CST; 1h 24min ago
cephadmin@ceph-master01:/etc/ceph-cluster# ll  
total 248
drwxr-xr-x  2 cephadmin cephadmin   4096 Sep  5 10:52 ./
drwxr-xr-x 92 root      root        4096 Sep  5 09:46 ../
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mds.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mgr.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-osd.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-rgw.keyring
-rw-------  1 cephadmin cephadmin    151 Sep  5 10:52 ceph.client.admin.keyring
-rw-rw-r--  1 cephadmin cephadmin    326 Sep  2 16:50 ceph.conf
-rw-rw-r--  1 cephadmin cephadmin 209993 Sep  5 10:52 ceph-deploy-ceph.log
-rw-------  1 cephadmin cephadmin     73 Sep  2 16:50 ceph.mon.keyring
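
The mon_status dumps in the deployment log report a state field (probing, electing) while the monitors are still forming quorum. As a minimal sketch, assuming the mon_status JSON is supplied on standard input, the helpers below pull out that field and treat leader or peon as "in quorum". The function names mon_state and mon_in_quorum are hypothetical; they are not part of ceph or ceph-deploy, and the parsing is plain text matching rather than a real JSON parser.

```shell
# mon_state: print the value of the "state" field found in mon_status
# JSON read from stdin (the last one, if the key appears several times).
mon_state() {
    tr -d '\n' | sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# mon_in_quorum: succeed only if the reported state is leader or peon,
# the two states a monitor holds once it has joined the quorum.
mon_in_quorum() {
    case "$(mon_state)" in
        leader|peon) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a mon node the input could come from the same command ceph-deploy runs in the log, for example: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status | mon_state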


Running ceph -s at this point reports a health warning. The fix used here disables insecure global_id reclaim, which clears the warning about monitors allowing it.

Run the following on one of the mon nodes:

ceph config set mon auth_allow_insecure_global_id_reclaim false

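2.4 Distribute the admin keyring

From the ceph-deploy node, copy the configuration file and the admin keyring to every node in the Ceph cluster that needs to run ceph management commands. After that, ceph commands used to manage the cluster no longer have to specify the ceph-mon address and the ceph.client.admin.keyring file each time. The ceph-mon nodes also need the cluster configuration file and authentication files kept in sync.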

cephadmin@ceph-master01:~# sudo apt install ceph-common -y  # already installed on the node hosts during initialization

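Push the admin keyring to the deploy node; by default it is placed under /etc/ceph/. ceph.client.admin.keyring only needs to exist on hosts where ceph client commands are run, much like a Kubernetes kubeconfig file, so here it is pushed to the ceph-deploy node used for day-to-day administration.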

cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy admin ceph-master01 


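By default, ceph.client.admin.keyring has mode 600 and is owned by user and group root. If the cephadmin user runs ceph commands directly on a cluster node, the command reports that it cannot find /etc/ceph/ceph.client.admin.keyring, because the user lacks permission to read the file.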

cephadmin@ceph-master01:~# sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
cephadmin@ceph-master02:~# sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
cephadmin@ceph-master03:~# sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring

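2.5 Deploy the manager

The manager daemon (ceph-mgr) is required from Ceph Luminous (12) onward; earlier releases could run without it.

2.5.1 Deploy the ceph-mgr nodes

These nodes are also monitor nodes, so all Ceph packages are already installed. If a mgr node were a different server from the monitors, the command below would install the package.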

cephadmin@ceph-master01:~# sudo apt install ceph-mgr
Reading package lists... Done
Building dependency tree       
Reading state information... Done
ceph-mgr is already the newest version (16.2.10-1bionic).
0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.

cephadmin@ceph-master02:~# sudo apt install ceph-mgr
Reading package lists... Done
Building dependency tree       
Reading state information... Done
ceph-mgr is already the newest version (16.2.10-1bionic).
0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.

cephadmin@ceph-master03:~# sudo apt install ceph-mgr
Reading package lists... Done
Building dependency tree       
Reading state information... Done
ceph-mgr is already the newest version (16.2.10-1bionic).
0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.

Create the mgr nodes

cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy mgr create ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('ceph-master01', 'ceph-master01'), ('ceph-master02', 'ceph-master02'), ('ceph-master03', 'ceph-master03')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f97e641fe60>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x7f97e687f250>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-master01:ceph-master01 ceph-master02:ceph-master02 ceph-master03:ceph-master03
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01 
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-master01
[ceph-master01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-master01][WARNIN] mgr keyring does not exist yet, creating one
[ceph-master01][DEBUG ] create a keyring file
[ceph-master01][DEBUG ] create path recursively if it doesn't exist
[ceph-master01][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-master01 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-master01/keyring
[ceph-master01][INFO  ] Running command: sudo systemctl enable ceph-mgr@ceph-master01
[ceph-master01][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-master01.service → /lib/systemd/system/ceph-mgr@.service.
[ceph-master01][INFO  ] Running command: sudo systemctl start ceph-mgr@ceph-master01
[ceph-master01][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph-master02][DEBUG ] connection detected need for sudo
[ceph-master02][DEBUG ] connected to host: ceph-master02 
[ceph-master02][DEBUG ] detect platform information from remote host
[ceph-master02][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-master02
[ceph-master02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-master02][WARNIN] mgr keyring does not exist yet, creating one
[ceph-master02][DEBUG ] create a keyring file
[ceph-master02][DEBUG ] create path recursively if it doesn't exist
[ceph-master02][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-master02 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-master02/keyring
[ceph-master02][INFO  ] Running command: sudo systemctl enable ceph-mgr@ceph-master02
[ceph-master02][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-master02.service → /lib/systemd/system/ceph-mgr@.service.
[ceph-master02][INFO  ] Running command: sudo systemctl start ceph-mgr@ceph-master02
[ceph-master02][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph-master03][DEBUG ] connection detected need for sudo
[ceph-master03][DEBUG ] connected to host: ceph-master03 
[ceph-master03][DEBUG ] detect platform information from remote host
[ceph-master03][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-master03
[ceph-master03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-master03][WARNIN] mgr keyring does not exist yet, creating one
[ceph-master03][DEBUG ] create a keyring file
[ceph-master03][DEBUG ] create path recursively if it doesn't exist
[ceph-master03][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-master03 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-master03/keyring
[ceph-master03][INFO  ] Running command: sudo systemctl enable ceph-mgr@ceph-master03
[ceph-master03][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-master03.service → /lib/systemd/system/ceph-mgr@.service.
[ceph-master03][INFO  ] Running command: sudo systemctl start ceph-mgr@ceph-master03
[ceph-master03][INFO  ] Running command: sudo systemctl enable ceph.target
2.5.2 Verify the ceph-mgr nodes
cephadmin@ceph-master01:/etc/ceph-cluster# ps -ef | grep ceph-mgr
cephadmin@ceph-master01:/etc/ceph-cluster# systemctl  status  ceph-mgr@ceph-master01

cephadmin@ceph-master02:/etc/ceph-cluster# ps -ef | grep ceph-mgr
cephadmin@ceph-master02:/etc/ceph-cluster# systemctl  status  ceph-mgr@ceph-master02

cephadmin@ceph-master03:/etc/ceph-cluster# ps -ef | grep ceph-mgr
cephadmin@ceph-master03:/etc/ceph-cluster# systemctl  status  ceph-mgr@ceph-master03


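2.6 Deploy the OSDs

2.6.1 Initialize the storage nodes

Run this on the deploy node to install the specified release of the Ceph packages. In this article the node hosts share servers with the master hosts, so the packages are already installed; run it when a new node joins the cluster: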

cephadmin@ceph-master01:~# ceph-deploy install --release pacific  ceph-master01 ceph-master02 ceph-master03


List the disks on each Ceph node:

cephadmin@ceph-master01:~# ceph-deploy disk list ceph-master01  ceph-master02 ceph-master03  # alternatively, run fdisk -l on a node to see all of its unpartitioned, unused disks


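Use ceph-deploy disk zap to wipe the Ceph data disks on each Ceph node.

The disks on the storage nodes ceph-master01, ceph-master02, and ceph-master03 are wiped as follows; the zap can safely be run more than once: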

ceph-deploy disk zap ceph-master01 /dev/sdb
ceph-deploy disk zap ceph-master01 /dev/sdc
ceph-deploy disk zap ceph-master01 /dev/sdd
ceph-deploy disk zap ceph-master02 /dev/sdb
ceph-deploy disk zap ceph-master02 /dev/sdc
ceph-deploy disk zap ceph-master02 /dev/sdd
ceph-deploy disk zap ceph-master03 /dev/sdb
ceph-deploy disk zap ceph-master03 /dev/sdc
ceph-deploy disk zap ceph-master03 /dev/sdd
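
The nine zap commands above follow a host by device pattern, so they can be generated rather than typed. The sketch below only prints the commands and executes nothing; print_zap_commands is a hypothetical helper name, and the host and device lists are the ones used in this article.

```shell
# print_zap_commands "HOST..." "DEVICE..."
# Prints one "ceph-deploy disk zap" line per host/device pair, host by
# host. Nothing is executed, so the output can be reviewed first.
print_zap_commands() {
    local hosts="$1" devices="$2" host dev
    # Unquoted on purpose: each list is split on whitespace.
    for host in $hosts; do
        for dev in $devices; do
            printf 'ceph-deploy disk zap %s %s\n' "$host" "$dev"
        done
    done
}

print_zap_commands "ceph-master01 ceph-master02 ceph-master03" \
                   "/dev/sdb /dev/sdc /dev/sdd"
```

Because zap destroys whatever is on the disk, printing first and running the lines only after checking the device names is the safer order.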


2.6.2 How OSDs map to disks


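# With two SSDs in the server, block-db and block-wal can each be placed on its own SSD
ceph-deploy osd create {node} --data /dev/sdc --block-db /dev/sda --block-wal /dev/sdb
# With only one SSD, specify just block-db on the SSD; with no WAL location given, the WAL is also written to the faster SSD and shares it with the DB
ceph-deploy osd create {node} --data /path/to/data --block-db /dev/sda 
# The third form (WAL only) serves no purpose
ceph-deploy osd create {node} --data /path/to/data --block-wal /dev/sda 

This article uses the simplest layout, a single disk per OSD. A high-performance Ceph cluster can instead use the SSD-backed layout, with the SSD holding the metadata (block-db) and the WAL.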
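
The layouts above differ only in which optional flags are passed to ceph-deploy osd create. As a rough sketch, the helper below assembles the command line from a node, a data device, and optional DB and WAL devices. build_osd_create is a hypothetical name introduced for illustration; it prints the command and does not run it.

```shell
# build_osd_create NODE DATA_DEV [DB_DEV] [WAL_DEV]
# Prints the ceph-deploy command for one OSD. DB_DEV and WAL_DEV are
# optional; the matching flag is added only when a device is given.
build_osd_create() {
    local node="$1" data="$2" db="${3:-}" wal="${4:-}"
    local cmd="ceph-deploy osd create $node --data $data"
    if [ -n "$db" ]; then
        cmd="$cmd --block-db $db"
    fi
    if [ -n "$wal" ]; then
        cmd="$cmd --block-wal $wal"
    fi
    printf '%s\n' "$cmd"
}

# Single disk per OSD, the layout used in this article.
build_osd_create ceph-master01 /dev/sdb
# Data on /dev/sdc, DB and WAL each on its own SSD.
build_osd_create ceph-master01 /dev/sdc /dev/sda /dev/sdb
```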

2.6.3 Add the OSDs
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master01 --data /dev/sdb
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master01 --data /dev/sdc
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master01 --data /dev/sdd
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master02 --data /dev/sdb
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master02 --data /dev/sdc
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master02 --data /dev/sdd
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master03 --data /dev/sdb
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master03 --data /dev/sdc
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master03 --data /dev/sdd


2.6.4 Verifying the Ceph cluster
cephadmin@ceph-master01:/etc/ceph-cluster# ceph -s 
  cluster:
    id:     f69afe6f-e559-4df7-998a-c5dc3e300209
    health: HEALTH_OK
    
  services:
    mon: 3 daemons, quorum ceph-master01,ceph-master02,ceph-master03 (age 31m)
    mgr: ceph-master03(active, since 27h), standbys: ceph-master01, ceph-master02
    osd: 9 osds: 9 up (since 27h), 9 in (since 28h)
    
  data:
    pools:   2 pools, 33 pgs
    objects: 1 objects, 100 MiB
    usage:   370 MiB used, 450 GiB / 450 GiB avail
    pgs:     33 active+clean

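2.7 Testing data upload and download

To read or write data, a client first connects to a pool in the RADOS cluster; the pool's CRUSH rule then locates the object from its name. To test this, first create a pool named mypool with 32 PGs.
$ ceph -h    # cluster administration client
$ rados -h   # lower-level client that works on objects in RADOS directly
Create the pool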
cephadmin@ceph-master01:~# ceph osd pool create mypool 32 32 
pool 'mypool' created

cephadmin@ceph-master01:/etc/ceph-cluster# sudo ceph osd pool ls  
device_health_metrics
mypool
Or:
cephadmin@ceph-master01:/etc/ceph-cluster# rados lspools
device_health_metrics
mypool
Or:
cephadmin@ceph-master01:/etc/ceph-cluster# ceph osd lspools
1 device_health_metrics
2 mypool
Upload data

No block device, file system or object gateway client has been set up on this cluster yet, but the rados command can already store and fetch objects directly:

cephadmin@ceph-master01:~# sudo rados put msg1 /var/log/syslog  --pool=mypool
List the objects
cephadmin@ceph-master01:/etc/ceph-cluster# rados ls --pool=mypool
msg1
Show where the object is stored
cephadmin@ceph-master01:/etc/ceph-cluster# ceph osd map mypool msg1 
osdmap e114 pool 'mypool'(2) object 'msg1' -> pg 2.c833d430 (2.10) -> up ([15,13,0], p15) acting ([15,13,0], p15)

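The object name hashes to c833d430 in the pool with id 2, which places it in PG 2.10: pool id 2, PG id 10. The up set is OSDs 15, 13 and 0 with osd.15 as the primary, and the acting set is the same three OSDs. Three OSDs means the data is kept in 3 replicas; the CRUSH algorithm computes which OSDs hold the three copies for this PG.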
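
To make the fields of the `ceph osd map` line concrete, here is a small sketch that extracts them with `sed`. The helper names (`pg_id`, `acting_set`, `primary`) are made up for this example, and the sample line is the output shown above. The last command illustrates how the PG id relates to the object hash, under the assumption that `pg_num` is a power of two (32 here), in which case the PG id is the low bits of the hash.

```shell
#!/bin/sh
# Sketch: pull the PG id, acting set and primary OSD out of one line of
# "ceph osd map" output. The sample line is copied from this article.
line="osdmap e114 pool 'mypool'(2) object 'msg1' -> pg 2.c833d430 (2.10) -> up ([15,13,0], p15) acting ([15,13,0], p15)"

pg_id()      { printf '%s\n' "$1" | sed -n 's/.*-> pg [^ ]* (\([^)]*\)).*/\1/p'; }
acting_set() { printf '%s\n' "$1" | sed -n 's/.*acting (\[\([^]]*\)], p[0-9]*).*/\1/p'; }
primary()    { printf '%s\n' "$1" | sed -n 's/.*acting (\[[^]]*], p\([0-9]*\)).*/\1/p'; }

echo "pg:      $(pg_id "$line")"
echo "acting:  $(acting_set "$line")"
echo "primary: osd.$(primary "$line")"

# Assumption: pg_num = 32 (a power of two), so the PG id is hash & (pg_num - 1).
# The result is printed in hexadecimal, the notation used in "2.10".
printf 'pg id from hash: %x\n' $(( 0xc833d430 & (32 - 1) ))
```

The PG id in `2.10` is hexadecimal, so it is PG number 16 of the 32 PGs in the pool.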

Download the object
cephadmin@ceph-master01:/etc/ceph-cluster# sudo rados get msg1 --pool=mypool /opt/my.txt 
cephadmin@ceph-master01:/etc/ceph-cluster# ll /opt/my.txt 
-rw-r--r-- 1 root root 155733 Sep  7 20:51 /opt/my.txt
cephadmin@ceph-master01:/etc/ceph-cluster# head  /opt/my.txt
Sep  7 06:25:06 ceph-master01 rsyslogd:  [origin software="rsyslogd" swVersion="8.32.0" x-pid="998" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Sep  7 06:26:01 ceph-master01 CRON[10792]: (root) CMD (ntpdate  ntp.aliyun.com)
Sep  7 06:26:01 ceph-master01 CRON[10791]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:27:01 ceph-master01 CRON[10794]: (root) CMD (ntpdate  ntp.aliyun.com)
Sep  7 06:27:01 ceph-master01 CRON[10793]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:28:01 ceph-master01 CRON[10797]: (root) CMD (ntpdate  ntp.aliyun.com)
Sep  7 06:28:01 ceph-master01 CRON[10796]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:29:01 ceph-master01 CRON[10799]: (root) CMD (ntpdate  ntp.aliyun.com)
Sep  7 06:29:01 ceph-master01 CRON[10798]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:30:01 ceph-master01 CRON[10801]: (root) CMD (ntpdate  ntp.aliyun.com)
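Modify the object

An object cannot be edited in place: download it, change it, then upload it again to overwrite the original.

cephadmin@ceph-master01:/etc/ceph-cluster# sudo rados put msg1 /etc/passwd --pool=mypool
Delete the object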
cephadmin@ceph-master01:/etc/ceph-cluster# sudo rados rm msg1 --pool=mypool
cephadmin@ceph-master01:/etc/ceph-cluster# rados ls --pool=mypool

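3. Using Ceph RBD

3.1 RBD architecture

Ceph can serve RADOSGW (object storage gateway), RBD (block storage) and CephFS (file system storage) at the same time. RBD stands for RADOS Block Device and is one of the most commonly used storage types. An RBD block device can be mounted like a disk, supports snapshots, multiple replicas, cloning and consistency, and stores its data striped across multiple OSDs in the Ceph cluster.

Striping automatically balances I/O load across multiple physical disks: a contiguous piece of data is split into many small parts that are stored on different disks. Multiple processes can then access different parts of the data at once without contending for a disk, and sequential access gets the maximum I/O parallelism, which gives very good performance.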


3.2 Creating the storage pool

# Create the pool
root@ceph-master01:~# ceph osd pool create rbd-data1 32 32
pool 'rbd-data1' created
# Enable the rbd application on the pool
root@ceph-master01:~# ceph osd pool application enable rbd-data1 rbd
enabled application 'rbd' on pool 'rbd-data1'
# Initialize the pool for RBD
root@ceph-master01:~# rbd pool init -p rbd-data1

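3.3 Creating an image

An RBD pool cannot be used as a block device directly. Images must first be created in it as needed, and each image is then used as a block device. The rbd command creates, lists and deletes images, and also handles cloning, snapshot creation, rollback to a snapshot and snapshot listing. The commands below create an image named data-img1 in the RBD pool rbd-data1.

3.3.1 Creating the image
root@ceph-master01:~# rbd create data-img1 --size 3G --pool rbd-data1 --image-format 2 --image-feature layering
# List the images
root@ceph-master01:~# rbd ls --pool rbd-data1 -l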
NAME       SIZE   PARENT  FMT  PROT  LOCK
data-img1  3 GiB            2
3.3.2 Showing image details
root@ceph-master01:~# rbd --image data-img1 --pool rbd-data1 info
rbd image 'data-img1':
    size 3 GiB in 768 objects
    order 22 (4 MiB objects)    # 3 GiB split into 768 objects of 4 MiB each
    snapshot_count: 0
    id: 284d64e8f879d    # image id
    block_name_prefix: rbd_data.284d64e8f879d
    format: 2
    features: layering    # image features
    op_features:
    flags:
    create_timestamp: Fri Sep 16 20:34:47 2022
    access_timestamp: Fri Sep 16 20:34:47 2022
    modify_timestamp: Fri Sep 16 20:34:47 2022
# Show the details as JSON
root@ceph-master01:~# rbd ls --pool rbd-data1 -l --format json --pretty-format
[
    {
        "image": "data-img1",
        "id": "284d64e8f879d",
        "size": 3221225472,
        "format": 2
    }
]
3.3.3 Image features

Features that RBD enables on new images by default: layering, exclusive-lock, object-map, fast-diff and deep-flatten.

# Enable a feature on a given image in a given pool
$ rbd feature enable exclusive-lock --pool rbd-data1 --image data-img1
$ rbd feature enable object-map --pool rbd-data1 --image data-img1
$ rbd feature enable fast-diff --pool rbd-data1 --image data-img1
# Disable a feature on a given image in a given pool
$ rbd feature disable fast-diff --pool rbd-data1 --image data-img1
# Verify the image features
$ rbd --image data-img1 --pool rbd-data1 info

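3.4 Using RBD from a client

A client needs two things to use RBD:

1. The ceph-common package installed

2. A Ceph user to authenticate as

3.4.1 Installing ceph-common on the client

To map and mount Ceph RBD, a client needs the Ceph client package ceph-common. It is not in the CentOS yum repositories, so a separate yum repository must be configured. The newest release that can be installed on CentOS is Octopus (version 15).

# Configure the yum repositories:
$ yum install epel-release
$ yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
# Install ceph-common
$ yum install -y ceph-common
# Verify ceph-common
[root@zd_spring_156_101 ~]# rpm -qa | grep ceph-common
python3-ceph-common-15.2.17-0.el7.x86_64
ceph-common-15.2.17-0.el7.x86_64
3.4.2 Copying the authentication files

# Copy the files into /etc/ceph on the client, where the client reads them by default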

[cephadmin@ceph-deploy ceph-cluster]$ scp ceph.conf ceph.client.admin.keyring [email protected]:/etc/ceph/
3.4.3 Mapping the image on the client
# Map the image
[root@xianchaonode1 ~]# rbd -p rbd-data1 map data-img1
/dev/rbd0

# Verify the mapping on the client
[root@xianchaonode1 ~]# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
rbd0   253:0    0     3G  0 disk
sr0     11:0    1   4.2G  0 rom
sda      8:0    0   200G  0 disk
├─sda2   8:2    0 199.8G  0 part /
└─sda1   8:1    0   200M  0 part /boot
3.4.4 Mounting and using the device
# Create a file system on the device
[root@xianchaonode1 ~]# mkfs.xfs /dev/rbd0
Discarding blocks...Done.
meta-data=/dev/rbd0              isize=512    agcount=8, agsize=98304 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=786432, imaxpct=25
         =                       sunit=16     swidth=16 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@xianchaonode1 ~]# mount /dev/rbd0 /mnt/
[root@xianchaonode1 ~]# echo 111 >> /mnt/test.txt
[root@xianchaonode1 ~]# cat /mnt/test.txt
111
[root@xianchaonode1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        7.9G     0  7.9G   0% /dev
tmpfs           7.9G     0  7.9G   0% /dev/shm
tmpfs           7.9G  795M  7.1G  10% /run
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/sda2       200G   62G  138G  31% /
tmpfs           1.6G     0  1.6G   0% /run/user/0
/dev/rbd0       3.0G   33M  3.0G   2% /mnt
[root@xianchaonode1 ~]#

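4. Using CephFS

CephFS (the Ceph file system) provides a shared, POSIX-compliant file system. Clients mount it over the Ceph protocol and use the Ceph cluster as the storage back end; see http://docs.ceph.org.cn/cephfs/. CephFS requires the Metadata Server (MDS) service, whose daemon is ceph-mds. The ceph-mds process manages the metadata of the files stored on CephFS and coordinates access to the Ceph storage cluster.

On a local Linux file system, operations such as ls read metadata kept on the disk itself: file name, creation date, size, inode and storage location. In CephFS the file data is split into discrete objects and distributed across the cluster, so the metadata is not stored alongside it. It is kept in a separate metadata pool. Clients cannot access the metadata pool directly; reads and writes go through the MDS (metadata server). On a read, the MDS loads the metadata from the metadata pool, caches it in memory (to answer later requests from other clients quickly) and returns it to the client. On a write, the MDS caches the metadata in memory and syncs it to the metadata pool.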


4.1 Deploying the MDS service

CephFS requires the MDS service. It can run on the mon nodes.

root@ceph-master01:~# apt-cache madison ceph-mds
root@ceph-master01:~# apt install ceph-mds
root@ceph-master01:~# ceph-deploy mds create  ceph-master01 ceph-master02 ceph-master03


# Check the active/standby state
ceph -s
ceph fs status


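4.2 Creating the CephFS metadata and data pools

Before CephFS can be used, a file system must be created in the cluster with a metadata pool and a data pool assigned to it. The steps below create a test file system named mycephfs that uses cephfs-metadata as its metadata pool and cephfs-data as its data pool.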

root@ceph-master01:~# ceph osd pool create cephfs-metadata 32 32
pool 'cephfs-metadata' created
root@ceph-master01:~# ceph osd pool create cephfs-data 64 64
pool 'cephfs-data' created

4.3 Creating the CephFS file system and verifying it

root@ceph-master01:~# ceph fs new mycephfs cephfs-metadata cephfs-data
new fs with metadata pool 5 and data pool 6
root@ceph-master01:~# ceph fs ls
name: mycephfs, metadata pool: cephfs-metadata, data pools: [cephfs-data ]
root@ceph-master01:~# ceph fs status mycephfs
mycephfs - 0 clients
========
      POOL         TYPE     USED  AVAIL  
cephfs-metadata  metadata     0    142G  
  cephfs-data      data       0    142G  

4.4 Creating a CephFS client account

# Create the account
root@ceph-master01:/etc/ceph-cluster# ceph auth add client.yanyan mon 'allow r' mds 'allow rw' osd 'allow rwx pool=cephfs-data'
added key for client.yanyan

# Verify the account
root@ceph-master01:/etc/ceph-cluster# ceph auth get client.yanyan
[client.yanyan]
    key = AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==
    caps mds = "allow rw"
    caps mon = "allow r"
    caps osd = "allow rwx pool=cephfs-data"
exported keyring for client.yanyan
root@ceph-master01:/etc/ceph-cluster# ceph auth get client.yanyan -o ceph.client.yanyan.keyring
exported keyring for client.yanyan

root@ceph-master01:/etc/ceph-cluster# ll
total 416
drwxr-xr-x  2 cephadmin cephadmin   4096 Sep 20 17:21 ./
drwxr-xr-x 92 root      root        4096 Sep  5 09:46 ../
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mds.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mgr.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-osd.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-rgw.keyring
-rw-------  1 cephadmin cephadmin    151 Sep  5 10:52 ceph.client.admin.keyring
-rw-r--r--  1 root      root         150 Sep 20 17:21 ceph.client.yanyan.keyring
-rw-rw-r--  1 cephadmin cephadmin    398 Sep  7 20:01 ceph.conf
-rw-rw-r--  1 cephadmin cephadmin 368945 Sep  7 20:02 ceph-deploy-ceph.log
-rw-------  1 cephadmin cephadmin     73 Sep  2 16:50 ceph.mon.keyring
-rw-r--r--  1 root      root           9 Sep 12 13:06 pass.txt
-rw-r--r--  1 root      root        1645 Oct 16  2015 release.asc
root@ceph-master01:/etc/ceph-cluster# ceph auth print-key client.yanyan > yanyan.key
root@ceph-master01:/etc/ceph-cluster# ll
total 420
drwxr-xr-x  2 cephadmin cephadmin   4096 Sep 20 17:21 ./
drwxr-xr-x 92 root      root        4096 Sep  5 09:46 ../
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mds.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mgr.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-osd.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-rgw.keyring
-rw-------  1 cephadmin cephadmin    151 Sep  5 10:52 ceph.client.admin.keyring
-rw-r--r--  1 root      root         150 Sep 20 17:21 ceph.client.yanyan.keyring
-rw-rw-r--  1 cephadmin cephadmin    398 Sep  7 20:01 ceph.conf
-rw-rw-r--  1 cephadmin cephadmin 368945 Sep  7 20:02 ceph-deploy-ceph.log
-rw-------  1 cephadmin cephadmin     73 Sep  2 16:50 ceph.mon.keyring
-rw-r--r--  1 root      root           9 Sep 12 13:06 pass.txt
-rw-r--r--  1 root      root        1645 Oct 16  2015 release.asc
-rw-r--r--  1 root      root          40 Sep 20 17:21 yanyan.key

root@ceph-master01:/etc/ceph-cluster# cat ceph.client.yanyan.keyring
[client.yanyan]
    key = AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==
    caps mds = "allow rw"
    caps mon = "allow r"
    caps osd = "allow rwx pool=cephfs-data"
root@ceph-master01:/etc/ceph-cluster# 

4.5 Installing the Ceph client

# Example for a CentOS client
yum install epel-release -y
yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm
yum install ceph-common -y

4.6 Copying the authentication files

root@ceph-master01:~# cd /etc/ceph-cluster/
root@ceph-master01:/etc/ceph-cluster# scp ceph.conf ceph.client.yanyan.keyring yanyan.key [email protected]:/etc/ceph/

Verify the client's credentials

[root@zd_spring_156_101 ceph]# ceph  --user yanyan  -s 


4.7 Installing ceph-common on the client

# Configure the yum repositories:
$ yum install epel-release
$ yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
# Install ceph-common
$ yum install -y ceph-common
# Verify ceph-common
[root@zd_spring_156_101 ~]# rpm -qa | grep ceph-common
python3-ceph-common-15.2.17-0.el7.x86_64
ceph-common-15.2.17-0.el7.x86_64

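4.8 Mounting and using CephFS

A client can mount CephFS in two ways: from kernel space or from user space. A kernel-space mount needs the ceph kernel module (kernel 3.10.0-862 or later, the default kernel of CentOS 7.5). A user-space mount needs ceph-fuse. If the kernel is too old to include the ceph module, install ceph-fuse and mount with it; otherwise the kernel module is the recommended way. The default kernels of CentOS 7.5 and later were verified to include the ceph module; the default kernels of CentOS 7.3 and earlier were not tested.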
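
The choice between the two mount methods can be scripted. The sketch below compares the running kernel release with the threshold quoted in this article; `MIN_KERNEL`, `version_ge` and `pick_client` are illustrative names, the threshold is the article's observation rather than an official Ceph limit, and the comparison relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# Sketch: choose between the kernel client and ceph-fuse from the kernel release.
# MIN_KERNEL is the threshold quoted in this article, not an official Ceph limit.
MIN_KERNEL="3.10.0-862"

# version_ge A B: succeeds when version A sorts at or after version B.
# Relies on "sort -V" (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

pick_client() {
    if version_ge "$1" "$MIN_KERNEL"; then
        echo "kernel"       # mount -t ceph ...
    else
        echo "ceph-fuse"    # user-space mount
    fi
}

pick_client "$(uname -r)"
```

A version check is only a heuristic. Running `modinfo ceph` on the client shows directly whether the kernel module is present.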

4.8.1 Mounting CephFS from kernel space
# Mount with the key passed inline (ceph-common not required)
[root@other165 ~]# cat /etc/ceph/yanyan.key
AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==
[root@other165 ~]# mount -t ceph 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt -o name=yanyan,secret=AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==
# Mount with a key file (requires ceph-common)
[root@other165 ~]# mount -t ceph 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt -o name=yanyan,secretfile=/etc/ceph/yanyan.key


4.8.2 Mounting automatically at boot
# cat /etc/fstab
172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt ceph defaults,name=yanyan,secretfile=/etc/ceph/yanyan.key,_netdev 0 0


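4.9 Mounting CephFS from user space

If the kernel is too old to include the ceph module, install ceph-fuse and mount with it. Where the module is available, the kernel mount is still recommended.

4.9.1 Installing ceph-fuse
# Configure the yum repositories:
$ yum install epel-release
$ yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
# Install ceph-fuse
$ yum install ceph-fuse -y
4.9.2 Mounting CephFS with ceph-fuse
# The configuration and keyring are read from /etc/ceph/ by default
ceph-fuse --name client.yanyan -m 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789 /mnt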


4.9.3 Mounting automatically at boot

Given the user id, ceph-fuse automatically loads the matching keyring and the ceph.conf configuration file.

vim /etc/fstab
none /data fuse.ceph ceph.id=yanyan,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults 0 0

5. Using Ceph from Kubernetes

5.1 Static RBD volumes

5.1.1 Mounting RBD through a PV/PVC
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  rbd:
    monitors:
    - '172.26.156.217:6789'
    - '172.26.156.218:6789'
    - '172.26.156.219:6789'
    pool: k8stest     # must be created beforehand
    image: rbda       # must be created beforehand
    user: admin       # must exist in Ceph
    secretRef:
      name: ceph-secret   # must be created beforehand
    fsType: xfs
    readOnly: false
  persistentVolumeReclaimPolicy: Retain   # Recycle is only implemented for NFS and hostPath volumes
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
5.1.2 Mounting RBD directly from a pod
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: ng-deploy-80
  template:
    metadata:
      labels:
        app: ng-deploy-80
    spec:
      nodeName: xianchaonode1
      containers:
      - name: ng-deploy-80
        image: nginx
        ports:
        - containerPort: 80

        volumeMounts:
        - name: rbd-data1
          mountPath: /usr/share/nginx/html/rbd
      volumes:
        - name: rbd-data1
          rbd:
            monitors:
            - '172.26.156.217:6789'
            - '172.26.156.218:6789'
            - '172.26.156.219:6789' 
            pool: shijie-rbd-pool1
            image: shijie-img-img1
            fsType: xfs
            readOnly: false
            user: magedu-shijie
            secretRef:
              name: ceph-secret-magedu-shijie

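5.1 Dynamic RBD StorageClass

Volumes can be created dynamically by the kube-controller-manager component, which suits stateful services that need many volumes. The Ceph admin user's key is stored as a Kubernetes Secret so that Kubernetes can call Ceph with admin permissions to create volumes. Images no longer have to be created in advance; Kubernetes asks Ceph to create them when they are needed.

5.1.1 Creating the RBD pool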
root@ceph-master01:/etc/ceph# ceph osd pool create k8s-rbd 32 32 
pool 'k8s-rbd' created
root@ceph-master01:/etc/ceph# ceph osd pool application enable  k8s-rbd rbd 
enabled application 'rbd' on pool 'k8s-rbd'
root@ceph-master01:/etc/ceph# rbd pool init -p k8s-rbd 
5.1.2 Creating the admin user Secret

This Secret gives Kubernetes the permissions it needs to create RBD images.

# Print the Ceph admin key, base64-encoded
root@ceph-master01:/etc/ceph# ceph auth print-key client.admin | base64
QVFCM1pCVmpMOE4wRUJBQVJlRzBxM3JwVkYvOERkbk11cnlaTkE9PQ==
# Secret manifest for the Ceph admin user
[root@xianchaomaster1 pod-rbd]# vi case1-secret-admin.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-admin
  namespace: default
type: "kubernetes.io/rbd"
data:
  key: QVFCM1pCVmpMOE4wRUJBQVJlRzBxM3JwVkYvOERkbk11cnlaTkE9PQ==
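
Encoding the key and writing the manifest can be combined into one step. The following is a minimal sketch, not part of Ceph or Kubernetes tooling: `rbd_secret_yaml` is a hypothetical helper that renders a Secret of type `kubernetes.io/rbd` from a raw cephx key. The sample key is the admin key that appears elsewhere in this article; on a real cluster it would come from `ceph auth print-key client.admin`.

```shell
#!/bin/sh
# Sketch: render a kubernetes.io/rbd Secret manifest from a raw cephx key.
# Usage: rbd_secret_yaml SECRET_NAME RAW_KEY
rbd_secret_yaml() {
    # printf '%s' avoids a trailing newline, which would change the base64 value
    encoded=$(printf '%s' "$2" | base64 | tr -d '\n')
    cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $1
type: "kubernetes.io/rbd"
data:
  key: $encoded
EOF
}

rbd_secret_yaml ceph-secret-admin 'AQB3ZBVjL8N0EBAAReG0q3rpVF/8DdnMuryZNA=='
```

The output can be redirected to a file or piped to `kubectl apply -f -`. Using `echo` instead of `printf '%s'` would append a newline to the key before encoding, and the resulting Secret would not authenticate.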
5.1.3 Creating the Secret for an ordinary user

This Secret is used to access the volume for reading and writing data.

root@ceph-master01:/etc/ceph# ceph auth get-or-create client.k8s-rbd mon 'allow r' osd 'allow * pool=k8s-rbd'
[client.k8s-rbd]
    key = AQAMgkZjDyhsMhAAEH8F0Gwe3L+aiP/wAkqdyA==
root@ceph-master01:/etc/ceph# ceph auth print-key client.k8s-rbd | base64
QVFBTWdrWmpEeWhzTWhBQUVIOEYwR3dlM0wrYWlQL3dBa3FkeUE9PQ==
vi case2-secret-client.yaml
apiVersion: v1
kind: Secret
metadata:
  name: k8s-rbd
type: "kubernetes.io/rbd"
data:
  key: QVFBTWdrWmpEeWhzTWhBQUVIOEYwR3dlM0wrYWlQL3dBa3FkeUE9PQ==
5.1.4 Creating the StorageClass

Create a dynamic StorageClass that provisions PVs for pods on demand.

vi case3-ceph-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-storage-class
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"  # set to "true" to make this the default StorageClass
provisioner: kubernetes.io/rbd
reclaimPolicy: Retain      # the default is Delete, which is risky
parameters:
  monitors: 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789
  adminId: admin
  adminSecretName: ceph-secret-admin
  adminSecretNamespace: default
  pool: k8s-rbd
  userId: k8s-rbd
  userSecretName: k8s-rbd
5.1.5 Creating a PVC from the StorageClass
vi case4-mysql-pvc.yaml  
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-storage-class
  resources:
    requests:
      storage: '5Gi'
# Verify the PV/PVC:
kubectl  get pvc
kubectl  get pv


# Verify that Ceph created the image automatically

rbd ls --pool k8s-rbd


5.1.6 Verifying with a single-instance MySQL pod
vi case5-mysql-deploy-svc.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - image: mysql:5.6.46
        name: mysql
        env:
          # Use secret in real usage
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: mysql-data-pvc
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: mysql-service-label
  name: mysql-service
spec:
  type: NodePort
  ports:
  - name: http
    port: 3306
    protocol: TCP
    targetPort: 3306
    nodePort: 33306
  selector:
    app: mysql

# Connect to MySQL and create a test database to verify


# Delete and recreate the MySQL pod to verify that the data on RBD persists

 kubectl delete -f case5-mysql-deploy-svc.yaml
 kubectl apply -f case5-mysql-deploy-svc.yaml


# Schedule the pod onto a different node to verify that it can still mount the RBD volume


 kubectl delete -f case5-mysql-deploy-svc.yaml
 kubectl apply -f case5-mysql-deploy-svc.yaml

The volume still mounts successfully.


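5.2 Static CephFS volumes

5.2.1 Mounting CephFS through a PV/PVC

Note that one CephFS file system is shared by many directories, so create a subdirectory in CephFS for each Deployment ahead of time. Mount the file system on any Linux host and create the /data2 directory first; otherwise pods can only mount the CephFS root directory /.

mount -t ceph 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt -o name=admin,secret=AQB3ZBVjL8N0EBAAReG0q3rpVF/8DdnMuryZNA==

# Create the PV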
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cephfs-pv
  labels:
    app: static-cephfs-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  cephfs:
    monitors:
      - 172.26.156.217:6789
      - 172.26.156.218:6789
      - 172.26.156.219:6789
    path: /data2/    # /data2 must already exist in CephFS
    user: admin
    secretRef:
      name: ceph-secret-admin
    readOnly: false
  persistentVolumeReclaimPolicy: Retain    # Recycle is only implemented for NFS and hostPath volumes
  storageClassName: slow
---
# Create the PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc-claim
spec:
  selector:
    matchLabels:
      app: static-cephfs-pv
  storageClassName: slow
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
#deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx2
spec:
  selector:
    matchLabels:
      k8s-app: nginx2
  replicas: 2
  template:
    metadata:
      labels:
        k8s-app: nginx2
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - name: pvc-recycle
          mountPath: /usr/share/nginx/html/nginx2
      volumes:
      - name: pvc-recycle
        persistentVolumeClaim:
          claimName: cephfs-pvc-claim
---
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: nginx2
  name: ng-deploy-80-service
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
    nodePort: 23380
  selector:
    k8s-app: nginx2
5.2.2 Mounting CephFS directly from a pod

No PV/PVC is needed; the Deployment mounts CephFS directly.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels: #rs or deployment
      app: ng-deploy-80
  template:
    metadata:
      labels:
        app: ng-deploy-80
    spec:
      containers:
      - name: ng-deploy-80
        image: nginx
        ports:
        - containerPort: 80

        volumeMounts:
        - name: magedu-staticdata-cephfs
          mountPath: /usr/share/nginx/html/cephfs
      volumes:
        - name: magedu-staticdata-cephfs
          cephfs:
            monitors:
            - '172.26.156.217:6789'
            - '172.26.156.218:6789'
            - '172.26.156.219:6789'
            path: /
            user: admin
            secretRef:
              name: ceph-secret-admin
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: ng-deploy-80-service-label
  name: ng-deploy-80-service
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
    nodePort: 33380
  selector:
    app: ng-deploy-80

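5.4 Dynamic CephFS StorageClass

Kubernetes has no built-in StorageClass support for CephFS, but the community provides a comparable solution in external-storage/cephfs.

In testing, this CephFS StorageClass no longer works from Kubernetes 1.20 on; provisioning fails with the error tracked in the GitHub issue below. The workaround found online is to add --feature-gates=RemoveSelfLink=false to kube-apiserver.yaml, but that feature gate has been removed in later Kubernetes releases, so the ceph-csi approach is used from here on.

CephFS StorageClass deployment guides (did not work):

https://www.cnblogs.com/leffss/p/15630641.html

https://www.cnblogs.com/estarhaohao/p/15965785.html

GitHub issue: https://github.com/kubernetes/kubernetes/issues/94660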


5.5 Dynamic provisioning with ceph-csi

Ceph-CSI RBD

https://www.modb.pro/db/137721

Ceph-CSI CephFS

The latest release, 3.7, ran into problems, so ceph-csi 3.4 is used here.

git clone https://github.com/ceph/ceph-csi.git -b release-v3.4
cd ceph-csi/deploy/cephfs/kubernetes

Edit the ConfigMap. clusterID is the Ceph fsid.

vi csi-config-map.yaml 
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [{"clusterID":"f69afe6f-e559-4df7-998a-c5dc3e300209",
        "monitors":["172.26.156.217:6789","172.26.156.218:6789","172.26.156.219:6789"]}]
metadata:
  name: ceph-csi-config

ceph-csi deploys into the default namespace; here it is moved to kube-system.

sed -i "s/namespace: default/namespace: kube-system/g" $(grep -rl "namespace: default" ./)

Deploy ceph-csi CephFS. The images are hosted on k8s.gcr.io; if some of them fail to pull, search Docker Hub for replacements.

kubectl get po -n kube-system  |grep csi-cephfs
csi-cephfsplugin-8xt97                          3/3     Running   0          6d10h
csi-cephfsplugin-bmxwr                          3/3     Running   0          6d10h
csi-cephfsplugin-n74cd                          3/3     Running   0          6d10h
csi-cephfsplugin-provisioner-79d84c9598-fb6bg   6/6     Running   0          6d10h
csi-cephfsplugin-provisioner-79d84c9598-g579j   6/6     Running   0          6d10h
csi-cephfsplugin-provisioner-79d84c9598-n8w2j   6/6     Running   0          6d10h

Create the CephFS StorageClass

ceph-csi needs cephx credentials to talk to the Ceph cluster; the admin user is used here.

vi secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: default
stringData:
  adminID: admin
  adminKey: AQB3ZBVjL8N0EBAAReG0q3rpVF/8DdnMuryZNA==

Create the StorageClass. The fsName used here is mycephfs, the name given when the CephFS file system was created in Ceph; it is not a pool name.

#ceph fs new mycephfs cephfs-metadata cephfs-data

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-csi-cephfs
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: f69afe6f-e559-4df7-998a-c5dc3e300209
  fsName: mycephfs
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - discard

Create the PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-csi-cephfs

The PV is created automatically and bound to the PVC.


Create a test Deployment

vi Deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cephfs-test
  labels:
    component: cephfs-test
spec:
  replicas: 2
  strategy:
    type: Recreate
  selector:
    matchLabels:
      component: cephfs-test
  template:
    metadata:
      labels:
        component: cephfs-test
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        volumeMounts:
        - name: config
          mountPath: "/data"
      volumes:
      - name: config
        persistentVolumeClaim:
          claimName: csi-cephfs-pvc
          readOnly: false

By default csi-cephfs creates a subvolume group named csi.

# ceph fs subvolumegroup ls cephfs
[
    {
        "name": "_deleting"
    },
    {
        "name": "csi"
    }
]

Every PV created through csi-cephfs lives under the directory of the csi subvolume group.

 kubectl get pv|grep default/csi-cephfs-pvc
pvc-0f36fd44-40f1-4ac3-aebe-0264a2fb50ea   1Gi        RWX            Delete           Bound    default/csi-cephfs-pvc              ceph-csi-cephfs            6d11h

# kubectl describe pv pvc-0f36fd44-40f1-4ac3-aebe-0264a2fb50ea | egrep 'subvolumeName|subvolumePath'
                           subvolumeName=csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6
                           subvolumePath=/volumes/csi/csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6/e423daf3-017b-4a7e-8713-bd05bab695ee

# cd /mnt/cephfs-test/
# tree -L 4 ./
./
└── volumes
    ├── csi
    │   ├── csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6
    │   │   └── e423daf3-017b-4a7e-8713-bd05bab695ee
    │   └── csi-vol-1ac1f4c1-ef8a-11eb-a990-a63fe71a40b6
    │       └── 3773a567-a8cb-4bae-9181-38f4e3065436
    ├── _csi:csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6.meta
    ├── _csi:csi-vol-1ac1f4c1-ef8a-11eb-a990-a63fe71a40b6.meta
    └── _deleting

7 directories, 2 files

II. Concepts

1. How stored data, objects, PGs, PGPs, pools, OSDs and disks relate


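2. Understanding FileStore, BlueStore and the journal

Ceph supports several storage back ends; the two heard of most often are FileStore and BlueStore. Before Luminous (the L release), FileStore was the default back end. Because of FileStore's shortcomings, BlueStore was designed and developed from scratch, and it has been the default back end since Luminous.

For whole-block writes, BlueStore writes the data straight to disk with AIO, which avoids FileStore's double write (journal first, then apply to the actual disk). For random I/O, it writes directly into RocksDB, a high-performance key-value store, in WAL form.

That is why many tuning guides online say to dedicate an SSD to the journal when deploying Ceph. That advice applies only to FileStore; BlueStore has no such journal. With BlueStore, use the following optimization instead:

# With two SSDs in the server, put block.db on one and block.wal on the other
ceph-deploy osd create ceph-node1 --data /dev/sdc --block-db /dev/sda --block-wal /dev/sdb
# With a single SSD, specify only --block-db; when no WAL location is given, the WAL is also written to the faster SSD and shares it with the DB
ceph-deploy osd create ceph-node1 --data /dev/sdb --block-db /dev/sda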

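III. Operational tasks

1. The correct way to remove an OSD

Releases before Luminous

  • Adjust the OSD's CRUSH weight
ceph osd crush reweight osd.0 0.5
ceph osd crush reweight osd.0 0.2
ceph osd crush reweight osd.0 0

Note: to migrate gradually, lower the CRUSH weight to 0 in several steps. This moves the data off this OSD onto other OSDs a little at a time, until nothing is mapped to it and the migration has finished. The command changes not only the OSD's CRUSH weight but also the weight of its host, so the CRUSH distribution of the whole cluster is adjusted. Once the OSD's CRUSH weight is 0, none of the later removal steps for this OSD affects data placement in the cluster.

  • Stop the OSD process
systemctl stop ceph-osd@0

Stopping the OSD process tells the cluster that this OSD is gone and no longer serving. Because it already has no weight, overall placement is unaffected and no data migrates.

  • Mark the OSD out
ceph osd out osd.0

Marking the OSD out tells the cluster that it no longer maps any data. Again, because it has no weight, placement is unaffected and no data migrates.

  • Remove the OSD from the CRUSH map
ceph osd crush remove osd.0

This removes the OSD from the CRUSH map. Its weight is already 0, so the host weight does not change and no data migrates.

  • Delete the OSD
ceph osd rm osd.0

This deletes the OSD's record from the cluster.

  • Delete the OSD's auth entry (otherwise its id stays occupied)
ceph auth del osd.0

This removes the OSD's key from the auth database.

Testing confirms that this sequence triggers only one data migration. It is only a reordering of the steps, but on a production cluster it saves one full round of migration. Production clusters normally mark failed OSDs out automatically; consider controlling this yourself, at the cost of denser monitoring. A cluster needs monitoring anyway, and leaving data migration entirely to the cluster is not realistic and only causes more failures.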

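Luminous and later

  • Lower the OSD's CRUSH weight
ceph osd crush reweight osd.0 0.5
ceph osd crush reweight osd.0 0.2
ceph osd crush reweight osd.0 0

Note: to drain the OSD gradually, lower its CRUSH weight to 0 in several steps. Each step moves data off this OSD and onto the others, until nothing maps to it and the migration has finished. Changing the OSD's CRUSH weight also changes the weight of its host, so the cluster-wide CRUSH distribution is adjusted at this point. Once the OSD's CRUSH weight is 0, none of the later removal steps affects data placement.

  • Stop the OSD process
 systemctl stop ceph-osd@0.service

Stopping the OSD process tells the cluster that this OSD is gone and no longer serving. Because the OSD already has no weight, overall placement is unaffected and no data migrates.

  • Mark the OSD out
ceph osd out osd.0

Marking the OSD out tells the cluster that it no longer maps data. Again, because it has no weight, placement is unaffected and no data migrates.

  • Purge the OSD
ceph osd purge {id} --yes-i-really-mean-it

If the OSD's configuration also appears in ceph.conf, the administrator must remove that entry by hand after deleting the OSD.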
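
The two procedures above share the first steps and differ only at the end. As a rough sketch, the hypothetical helper `osd_removal_commands` below returns the commands in the order described, for either release family. It only produces strings; in practice you would wait for the cluster to finish rebalancing after each `crush reweight` before running the next command. The unit name `ceph-osd@<id>` follows the form used elsewhere in this article.

```python
from typing import List, Sequence


def osd_removal_commands(osd_id: int,
                         weights: Sequence[float] = (0.5, 0.2, 0.0),
                         luminous_or_later: bool = True) -> List[str]:
    """Return the removal commands for one OSD, in the order described above.

    Hypothetical helper for illustration; it does not execute anything.
    """
    if not weights or weights[-1] != 0:
        raise ValueError("the last CRUSH weight must be 0 so the OSD is drained")
    name = f"osd.{osd_id}"
    # Step down the CRUSH weight; this is the only step that moves data
    cmds = [f"ceph osd crush reweight {name} {w:g}" for w in weights]
    cmds.append(f"systemctl stop ceph-osd@{osd_id}")
    cmds.append(f"ceph osd out {name}")
    if luminous_or_later:
        # One command replaces crush remove + rm + auth del
        cmds.append(f"ceph osd purge {osd_id} --yes-i-really-mean-it")
    else:
        cmds += [
            f"ceph osd crush remove {name}",
            f"ceph osd rm {name}",
            f"ceph auth del {name}",
        ]
    return cmds


if __name__ == "__main__":
    for cmd in osd_removal_commands(0):
        print(cmd)
```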

2. Wiping an old OSD node and adding it to a new Ceph cluster

Goal: wipe an existing Ceph OSD and add it to a new Ceph cluster.
First look up the old OSD's metadata; later steps need it.

[root@ceph-207 ~]# ceph-volume lvm list
====== osd.1 =======
[block] /dev/ceph-58ef1d0f-272b-4273-82b1-689946254645/osd-block-e0efe172-778e-46e1-baa2-cd56408aac34

block device /dev/ceph-58ef1d0f-272b-4273-82b1-689946254645/osd-block-e0efe172-778e-46e1-baa2-cd56408aac34
block uuid hCx4XW-OjKC-OC8Y-jEg2-NKYo-Pb6f-y9Nfl3
cephx lockbox secret
cluster fsid b7e4cb56-9cc8-4e44-ab87-24d4253d0951
cluster name ceph
crush device class None
encrypted 0
osd fsid e0efe172-778e-46e1-baa2-cd56408aac34
osd id 1
osdspec affinity
type block
vdo 0
devices /dev/sdb

Activating it directly in the new cluster fails:

ceph-volume lvm activate 1 e0efe172-778e-46e1-baa2-cd56408aac34
Two kinds of errors have come up so far:

osd.1 21 heartbeat_check: no reply from 192.168.8.206:6804 osd.0 ever on either front or back, first ping sent 2020-11-26T16:00:04.842947+0800 (oldest deadline 2020-11-26T16:00:24.842947+0800)

stderr: Calculated size of logical volume is 0 extents. Needs to be larger.

--> Was unable to complete a new OSD, will rollback changes

 

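Wipe the data and add the disk to the new Ceph cluster:

1. Stop the OSD service. The 1 after @ is the osd id reported by ceph-volume lvm list.

systemctl stop ceph-osd@1
2. Zap the OSD's LVM volume. The 1 is again the osd id.

ceph-volume lvm zap --osd-id 1
3. Query the LVs with lvs, then remove the leftover LV and VG.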

[root@ceph-207 ~]# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-e0efe172-778e-46e1-baa2-cd56408aac34 ceph-58ef1d0f-272b-4273-82b1-689946254645 -wi-a----- <16.00g
home cl -wi-ao---- <145.12g
root cl -wi-ao---- 50.00g
swap cl -wi-ao---- <3.88g
[root@ceph-207 ~]# vgremove ceph-58ef1d0f-272b-4273-82b1-689946254645
Do you really want to remove volume group "ceph-58ef1d0f-272b-4273-82b1-689946254645" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume ceph-58ef1d0f-272b-4273-82b1-689946254645/osd-block-e0efe172-778e-46e1-baa2-cd56408aac34? [y/n]: y
Logical volume "osd-block-e0efe172-778e-46e1-baa2-cd56408aac34" successfully remove
4. Add the host's disk to the new Ceph cluster

ceph-volume lvm create --data /dev/sdb
5. Check the OSD tree and the disk layout

[root@ceph-207 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.07997 root default
-3 0.03508 host ceph-206
0 hdd 0.01559 osd.0 up 1.00000 1.00000
1 hdd 0.01949 osd.1 up 1.00000 1.00000
-5 0.04489 host ceph-207
2 hdd 0.01559 osd.2 up 1.00000 1.00000
[root@ceph-207 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 200G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 199G 0 part
├─cl-root 253:0 0 50G 0 lvm /
├─cl-swap 253:1 0 3.9G 0 lvm [SWAP]
└─cl-home 253:2 0 145.1G 0 lvm /home
sdb 8:16 0 16G 0 disk
└─ceph--c221ed63--d87a--4bbd--a503--d8f2ed9e806b-osd--block--530376b8--c7bc--4d64--bc0c--4f8692559562 253:3 0 16G 0 lvm
sr0
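
The procedure above starts by reading the osd id out of the `ceph-volume lvm list` output. As a minimal sketch, the code below pulls a few fields out of that plain-text listing and builds the stop and zap commands from steps 1 and 2. The names `parse_lvm_list`, `stop_and_zap_commands`, and `WANTED_KEYS` are hypothetical, and the parser assumes the listing describes a single OSD with the field labels shown in this article.

```python
from typing import Dict, List

# Field labels as printed in the listing above; only the ones the later
# steps need are collected. Assumption: one OSD per listing.
WANTED_KEYS = ("osd fsid", "osd id", "cluster fsid", "devices")


def parse_lvm_list(text: str) -> Dict[str, str]:
    """Extract selected fields from plain-text `ceph-volume lvm list` output."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        stripped = line.strip()
        for key in WANTED_KEYS:
            if stripped.startswith(key):
                # Everything after the label is the value
                fields[key] = stripped[len(key):].strip()
                break
    return fields


def stop_and_zap_commands(fields: Dict[str, str]) -> List[str]:
    """Commands for steps 1 and 2, built from the parsed osd id."""
    osd_id = fields["osd id"]
    return [
        f"systemctl stop ceph-osd@{osd_id}",
        f"ceph-volume lvm zap --osd-id {osd_id}",
    ]


if __name__ == "__main__":
    listing = (
        "====== osd.1 =======\n"
        "  osd fsid e0efe172-778e-46e1-baa2-cd56408aac34\n"
        "  osd id 1\n"
        "  devices /dev/sdb\n"
    )
    for cmd in stop_and_zap_commands(parse_lvm_list(listing)):
        print(cmd)
```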

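3. How to change the Ceph configuration file

To change the generated ceph.conf, add the options you need to the copy on the deployment host, then push the file out with ceph-deploy. Do not edit "/etc/ceph/ceph.conf" on individual nodes; editing ceph.conf on the deployment host and pushing it is more convenient and safer.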

vi /etc/ceph-cluster/ceph.conf
[global]
fsid = f69afe6f-e559-4df7-998a-c5dc3e300209
public_network = 172.26.0.0/16
cluster_network = 10.0.0.0/24
mon_initial_members = ceph-master01, ceph-master02, ceph-master03
mon_host = 172.26.156.217,172.26.156.218,172.26.156.219
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

[mon]
mon_clock_drift_allowed = 0.10
mon clock drift warn backoff = 10

Tested: option names work with or without underscores.
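
To illustrate the point about underscores, here is a small sketch that reads a [mon] section with Python's standard `configparser` and maps both spellings to one key. The helpers `normalize_option` and `load_section` are hypothetical; this is not how Ceph itself parses ceph.conf, only a way to compare options written in either style.

```python
import configparser
from typing import Dict


def normalize_option(name: str) -> str:
    """Map 'mon clock drift allowed' and 'mon_clock_drift_allowed' to one key."""
    return "_".join(name.replace("_", " ").split())


def load_section(conf_text: str, section: str) -> Dict[str, str]:
    """Read one section of an INI-style config and normalize its option names."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return {normalize_option(key): value for key, value in parser[section].items()}


if __name__ == "__main__":
    conf = (
        "[mon]\n"
        "mon_clock_drift_allowed = 0.10\n"
        "mon clock drift warn backoff = 10\n"
    )
    print(load_section(conf, "mon"))
```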

After editing, push the configuration to every machine in the cluster

cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy --overwrite-conf config push  ceph-master01  ceph-master02 ceph-master03

Restart the mon daemon on every node

root@ceph-master01:~# systemctl restart ceph-mon@ceph-master01.service
root@ceph-master02:~# systemctl restart ceph-mon@ceph-master02.service
root@ceph-master03:~# systemctl restart ceph-mon@ceph-master03.service

Check the cluster status

cephadmin@ceph-master01:/etc/ceph-cluster# ceph -s 
  cluster:
    id:     f69afe6f-e559-4df7-998a-c5dc3e300209
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph-master01,ceph-master02,ceph-master03 (age 3m)
    mgr: ceph-master03(active, since 26h), standbys: ceph-master01, ceph-master02
    osd: 9 osds: 9 up (since 26h), 9 in (since 27h)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 1 objects, 100 MiB
    usage:   370 MiB used, 450 GiB / 450 GiB avail
    pgs:     33 active+clean

IV. Troubleshooting log

1. bash: python2: command not found

[Screenshot not available: Ceph.assets/image-20220902164421447.png]

Cause: python2.7 is not installed on the ceph-master02 node.

Fix:

cephadmin@ceph-master01:~$ sudo apt install python2.7 -y
cephadmin@ceph-master01:~$ sudo ln -sv /usr/bin/python2.7 /usr/bin/python2

2.[ceph_deploy][ERROR ] RuntimeError: AttributeError: module ‘platform’ has no attribute ‘linux_distribution’

ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 master01

[Screenshot not available: Ceph.assets/image-20220825151515834.png]

Cause:

The host running ceph-deploy is Ubuntu 20.04. Its Python is newer than 3.7, and platform.linux_distribution is no longer available after Python 3.7.

Fix:

Edit /usr/lib/python3/dist-packages/ceph_deploy/hosts/remotes.py so that the start of platform_information reads as follows:

def platform_information(_linux_distribution=None):
    """ detect platform information from remote host """
    # Original lines, disabled by wrapping them in a string literal:
    """
    linux_distribution = _linux_distribution or platform.linux_distribution
    distro, release, codename = linux_distribution()
    """
    # Replacement: fall back to None when platform.linux_distribution is missing
    distro = release = codename = None
    try:
        linux_distribution = _linux_distribution or platform.linux_distribution
        distro, release, codename = linux_distribution()
    except AttributeError:
        pass

Before:

[Screenshot not available: Ceph.assets/image-20220825152858037.png]

After:

[Screenshot not available: Ceph.assets/image-20220825152510116.png]


3. apt-cache madison ceph-deploy only offers the old 1.5.38 version
cephadmin@ceph-master01:~$ sudo apt-cache madison ceph-deploy
ceph-deploy | 1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
ceph-deploy | 1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages
cephadmin@ceph-master01:~$ 

4. RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  debug                         : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : zap
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f8d53187280>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  host                          : ceph-master01
[ceph_deploy.cli][INFO  ]  func                          : <function disk at 0x7f8d5315d350>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  disk                          : ['/dev/sdd']
[ceph_deploy.osd][DEBUG ] zapping /dev/sdd on ceph-master01
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph-master01][DEBUG ] zeroing last few blocks of device
[ceph-master01][DEBUG ] find the location of an executable
[ceph-master01][INFO  ] Running command: sudo /usr/sbin/ceph-volume lvm zap /dev/sdd
[ceph-master01][WARNIN] --> Zapping: /dev/sdd
[ceph-master01][WARNIN] --> Zapping lvm member /dev/sdd. lv_path is /dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357
[ceph-master01][WARNIN] Running command: /bin/dd if=/dev/zero of=/dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357 bs=1M count=10 conv=fsync
[ceph-master01][WARNIN]  stderr: 10+0 records in
[ceph-master01][WARNIN] 10+0 records out
[ceph-master01][WARNIN]  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0201594 s, 520 MB/s
[ceph-master01][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN] -->  RuntimeError: could not complete wipefs on device: /dev/sdd
[ceph-master01][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd

Cause: the OSD was not fully removed, so the disk cannot be zapped.

Fix: 1. Unmount the disk (it may be mounted and in use). 2. Remove the OSD completely.

5. Zapping a disk fails with [ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  debug                         : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : zap
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f8880b7c280>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  host                          : ceph-master01
[ceph_deploy.cli][INFO  ]  func                          : <function disk at 0x7f8880b52350>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  disk                          : ['/dev/sdd']
[ceph_deploy.osd][DEBUG ] zapping /dev/sdd on ceph-master01
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph-master01][DEBUG ] zeroing last few blocks of device
[ceph-master01][DEBUG ] find the location of an executable
[ceph-master01][INFO  ] Running command: sudo /usr/sbin/ceph-volume lvm zap /dev/sdd
[ceph-master01][WARNIN] --> Zapping: /dev/sdd
[ceph-master01][WARNIN] --> Zapping lvm member /dev/sdd. lv_path is /dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357
[ceph-master01][WARNIN] Running command: /bin/dd if=/dev/zero of=/dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357 bs=1M count=10 conv=fsync
[ceph-master01][WARNIN]  stderr: 10+0 records in
[ceph-master01][WARNIN] 10+0 records out
[ceph-master01][WARNIN] 10485760 bytes (10 MB, 10 MiB) copied, 0.0244706 s, 429 MB/s
[ceph-master01][WARNIN]  stderr: 
[ceph-master01][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-master01][WARNIN] -->  RuntimeError: could not complete wipefs on device: /dev/sdd
[ceph-master01][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd

Fix:

Running the zap command by hand gives the same "Device or resource busy" error, which means the disk is still in use.

cephadmin@ceph-master01:/etc/ceph-cluster# sudo /usr/sbin/ceph-volume lvm zap /dev/sdd
--> Zapping: /dev/sdd
--> Zapping lvm member /dev/sdd. lv_path is /dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357
Running command: /bin/dd if=/dev/zero of=/dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357 bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
 stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0241062 s, 435 MB/s
--> --destroy was not specified, but zapping a whole device will remove the partition table
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
-->  RuntimeError: could not complete wipefs on device: /dev/sdd

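Option 1: wipe the start of the disk, then reboot (substitute the affected device for /dev/sdb)

dd if=/dev/zero of=/dev/sdb bs=512K count=1
reboot

Option 2: remove the device-mapper entry with dmsetup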

root@ceph-master01:~# lsblk 
sdi                                                                                               8:128  0 447.1G  0 disk 
└─ceph--3511f2c6--2be6--40fd--901d--3b75e433afa5-osd--block--ca994912--f215--4612--97fa--abe33b07985b
                                                                                                253:7    0 447.1G  0 lvm
# Remove the mapping with dmsetup
root@ceph-master01:~# dmsetup remove ceph--3511f2c6--2be6--40fd--901d--3b75e433afa5-osd--block--ca994912--f215--4612--97fa--abe33b07985b

6. mons are allowing insecure global_id reclaim
If the AUTH_INSECURE_GLOBAL_ID_RECLAIM health alert has not been raised and the auth_expose_insecure_global_id_reclaim setting has not been disabled (it is enabled by default), then no clients that need upgrading are currently connected, and it is safe to disallow insecure global_id reclaim:
ceph config set mon auth_allow_insecure_global_id_reclaim false
# If clients that still need upgrading are connected, you can mute the alert temporarily:
ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w   # 1 week
# Not recommended, but you can also disable the warning indefinitely:
ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false

7. 1 pool(s) do not have an application enabled

Running ceph -s shows the warning "pool(s) do not have an application enabled".

[Screenshot not available: Ceph.assets/image-20220907191339251.png]

Cause: the health check command ceph health detail reports "application not enabled on pool 'mypool'".

[Screenshot not available: Ceph.assets/image-20220907191508997.png]

This means no application (rbd, cephfs, etc.) has been set on the pool mypool. Enabling one clears the warning:

cephadmin@ceph-master01:~# ceph osd pool application enable  mypool  rbd
enabled application 'rbd' on pool 'mypool'
cephadmin@ceph-master01:~# ceph -s 
  cluster:
    id:     f69afe6f-e559-4df7-998a-c5dc3e300209
    health: HEALTH_WARN
            clock skew detected on mon.ceph-master02
 
  services:
    mon: 3 daemons, quorum ceph-master01,ceph-master02,ceph-master03 (age 26h)
    mgr: ceph-master03(active, since 26h), standbys: ceph-master01, ceph-master02
    osd: 9 osds: 9 up (since 25h), 9 in (since 26h)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 1 objects, 100 MiB
    usage:   370 MiB used, 450 GiB / 450 GiB avail
    pgs:     33 active+clean

8. clock skew detected on mon.ceph-master02

Running ntpdate ntp.aliyun.com several times did not clear the clock-skew warning.

[Screenshot not available: Ceph.assets/image-20220907201432575.png]

Cause:

cephadmin@ceph-master01:~# ceph health detail
HEALTH_WARN clock skew detected on mon.ceph-master02
[WRN] MON_CLOCK_SKEW: clock skew detected on mon.ceph-master02
    mon.ceph-master02 clock skew 0.0662306s > max 0.05s (latency 0.00108482s)

The warning is raised as soon as a monitor's clock is off by more than 0.05 seconds.
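
To make the threshold check concrete, the sketch below parses a MON_CLOCK_SKEW detail line like the one above and compares the reported skew with a candidate limit. The names `parse_clock_skew` and `exceeds` are hypothetical, and the regular expression is written against the single line format shown in this article; other Ceph releases may word the line differently.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "mon.ceph-master02 clock skew 0.0662306s > max 0.05s"
_SKEW_RE = re.compile(r"(mon\.\S+) clock skew (-?[0-9.]+)s > max ([0-9.]+)s")


def parse_clock_skew(line: str) -> Optional[Tuple[str, float, float]]:
    """Return (monitor, skew_seconds, max_seconds), or None if the line does not match."""
    match = _SKEW_RE.search(line)
    if match is None:
        return None
    return match.group(1), float(match.group(2)), float(match.group(3))


def exceeds(skew: float, allowed: float) -> bool:
    """True when the absolute skew is above the allowed drift."""
    return abs(skew) > allowed


if __name__ == "__main__":
    line = "mon.ceph-master02 clock skew 0.0662306s > max 0.05s (latency 0.00108482s)"
    mon, skew, limit = parse_clock_skew(line)
    print(mon, "over default limit:", exceeds(skew, limit))
    print(mon, "over a 0.10 s limit:", exceeds(skew, 0.10))
```

With the values from the output above, the skew is over the 0.05 s default but under a 0.10 s limit, which is why raising the allowed drift silences the warning.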

Fix:

Raise the default thresholds

mon clock drift allowed  # clock drift allowed between monitors

mon clock drift warn backoff  # exponential backoff for clock drift warnings

cephadmin@ceph-master01:/etc/ceph-cluster# vi /etc/ceph/ceph.conf
[mon]
mon_clock_drift_allowed = 0.10
mon clock drift warn backoff = 10
# Restart all mon nodes
root@ceph-master01:~# systemctl restart ceph-mon@ceph-master01.service
root@ceph-master02:~# systemctl restart ceph-mon@ceph-master02.service
root@ceph-master03:~# systemctl restart ceph-mon@ceph-master03.service

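9. Clients with old kernels cannot map Ceph RBD block storage

[Screenshot not available: Ceph.assets/image-20220908212236350.png]

Environment:

ceph 16.2.10

Client ceph-common 15.2.17

Client kernel 3.10.0-862

Client operating system CentOS 7.5

Every article says: "to map an RBD image on an old kernel, create the image with only the layering feature enabled."

rbd create myimg2 --size 3G --pool myrbd1 --image-format 2 --image-feature layering

But "old kernel" comes in degrees: old and very old. Running the map command rbd -p myrbd1 map myimg2 gives different results:

CentOS 7.5 with its default, un-upgraded kernel 3.10.0-862: fails.

CentOS 7.7 with its default, un-upgraded kernel 3.10.0-1062.4.3: works.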
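
Before mapping an image from a client, it can help to compare the client's kernel release with the two data points above. The sketch below is only that comparison: `kernel_key` and `classify` are hypothetical helpers, the two thresholds are the versions reported in this article, and treating every kernel newer than 3.10.0-1062.4.3 as working is an extrapolation, not something tested here.

```python
import platform
import re
from typing import Tuple

KNOWN_BAD = "3.10.0-862"        # CentOS 7.5 default kernel: rbd map failed
KNOWN_GOOD = "3.10.0-1062.4.3"  # CentOS 7.7 default kernel: rbd map worked


def kernel_key(release: str) -> Tuple[int, ...]:
    """Turn '3.10.0-862.el7.x86_64' into (3, 10, 0, 862) for comparison."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:-(\d+(?:\.\d+)*))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    base = tuple(int(part) for part in match.group(1, 2, 3))
    build = match.group(4)
    extra = tuple(int(part) for part in build.split(".")) if build else ()
    return base + extra


def classify(release: str) -> str:
    """Place a kernel release relative to the two versions tested above."""
    key = kernel_key(release)
    if key <= kernel_key(KNOWN_BAD):
        return "at or below the failing version"
    if key >= kernel_key(KNOWN_GOOD):
        return "at or above the working version"
    return "between the tested versions"


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`
    print(platform.release(), "->", classify(platform.release()))
```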

10. 1 filesystem is online with fewer MDS than max_mds

[Screenshot not available: Ceph.assets/image-20220921140448449.png]

Cause:

The MDS service has not been created and started.

Fix:

ceph-deploy mds create  ceph-master01 ceph-master02 ceph-master03

Reposted from: https://blog.csdn.net/weixin_43876317/article/details/127627885
Copyright belongs to the original author, 张现伟的成长之路. If this infringes your rights, please contact us for removal.
