Table of Contents
- Ceph
- I. Deploying with ceph-deploy
- II. Understanding Ceph
- III. Requirements
- IV. Troubleshooting record
- 1. bash: python2: command not found
- 2. [ceph_deploy][ERROR ] RuntimeError: AttributeError: module 'platform' has no attribute 'linux_distribution'
- 3. apt-cache madison ceph-deploy shows only the old 1.5.38 version
- 4. RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd
- 5. [ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd when wiping a disk with zap
- 6. mons are allowing insecure global_id reclaim
- 7. 1 pool(s) do not have an application enabled
- 8. clock skew detected on mon.ceph-master02
- 9. Old kernels cannot mount Ceph RBD block storage
- 10. 1 filesystem is online with fewer MDS than max_mds
Ceph
I. Deploying with ceph-deploy
Before putting Ceph into production, be aware of a very practical problem: Ceph is unfriendly to clients running old kernels. "Old" here means kernels at or below 3.10.0-862; stock CentOS 7.5 and earlier ship such kernels, and those clients cannot properly use CephFS (file storage) or RBD (block storage).
Choose your Ceph version carefully before deploying. In testing, clients with a kernel at or below 3.10.0-862 simply cannot use a Ceph 16 cluster, and the default, never-upgraded kernels on CentOS 7.5 and earlier all fall below that line, so a whole fleet of such servers will have trouble consuming RBD storage from Ceph. On top of that, Ceph no longer publishes a v16 ceph-common package for CentOS, which means that against a v16 cluster a typical CentOS 7 client can only install the v15 ceph-common. That combination does work, but mixing client and cluster versions carries some risk, so the current recommendation is the latest Ceph 15 release; installing 15 is the same as installing 16, only the apt source differs.
Correction: the statement above is not accurate. The Ceph release you choose has nothing to do with the client kernel; every Ceph release supports client kernels at or below 3.10.0-862 (CentOS 7.5) poorly.
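A quick way to check whether a client host falls under this limitation is to compare its running kernel against the 3.10.0-862 cutoff. A minimal sketch (the `kernel_ok` helper name and the fallback suggestion are illustrative, not part of the original procedure):

```shell
# kernel_ok: succeeds (exit 0) when the given kernel version string is
# strictly newer than the 3.10.0-862 cutoff described above.
kernel_ok() {
  kver="$1"
  min="3.10.0-862"
  # sort -V orders version strings; if $min sorts first and the two strings
  # differ, $kver is newer than the cutoff.
  [ "$(printf '%s\n%s\n' "$min" "$kver" | sort -V | head -n1)" = "$min" ] \
    && [ "$kver" != "$min" ]
}

if kernel_ok "$(uname -r)"; then
  echo "kernel is newer than 3.10.0-862"
else
  echo "kernel too old for kernel-mode RBD/CephFS mounts; consider ceph-fuse or rbd-nbd"
fi
```

Run it on each prospective client before deciding how that host will consume the cluster.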
Environment
Ubuntu 18.04
Ceph 16.2.10
Hostname | IP | Roles |
---|---|---|
ceph-master01 | public IP: 172.26.156.217 / cluster IP: 10.0.0.217 | mon, mgr, osd, ceph-deploy |
ceph-master02 | public IP: 172.26.156.218 / cluster IP: 10.0.0.218 | mon, mgr, osd |
ceph-master03 | public IP: 172.26.156.219 / cluster IP: 10.0.0.219 | mon, mgr, osd |
1. System initialization
1.1 Set hostnames and name resolution
master01:
```shell
hostnamectl set-hostname ceph-master01
vi /etc/hostname
ceph-master01
```
master02:
```shell
hostnamectl set-hostname ceph-master02
vi /etc/hostname
ceph-master02
```
master03:
```shell
hostnamectl set-hostname ceph-master03
vi /etc/hostname
ceph-master03
```
On all nodes:
```shell
vi /etc/hosts
10.0.0.217 ceph-master01.example.local ceph-master01
10.0.0.218 ceph-master02.example.local ceph-master02
10.0.0.219 ceph-master03.example.local ceph-master03
```
1.2 Time synchronization
Run on every server:
```shell
# Set the time zone
timedatectl set-timezone Asia/Shanghai
# Sync the clock
root@ubuntu:~# apt install ntpdate
root@ubuntu:~# ntpdate ntp.aliyun.com
 1 Sep 20:54:39 ntpdate[9120]: adjust time server 203.107.6.88 offset 0.003441 sec
root@ubuntu:~# crontab -e
crontab: installing new crontab
root@ubuntu:~# crontab -l
* * * * * ntpdate ntp.aliyun.com
```
1.3 Configure the apt base and Ceph repositories
Run the following on all servers to switch the sources automatically:
```shell
# Base repositories
sed -i "s@http://.*archive.ubuntu.com@http://mirrors.tuna.tsinghua.edu.cn@g" /etc/apt/sources.list
sed -i "s@http://.*security.ubuntu.com@http://mirrors.tuna.tsinghua.edu.cn@g" /etc/apt/sources.list
# Ceph repositories
echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" >> /etc/apt/sources.list.d/ceph.list
echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic main" >> /etc/apt/sources.list.d/ceph.list
# Import the Ceph repository key, otherwise the Ceph source cannot be used
wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -
# The Ceph repository is served over https, so these packages are required
apt install -y apt-transport-https ca-certificates curl software-properties-common
apt update
```
1.4 Disable SELinux and the firewall
(On Ubuntu there is no SELinux to disable; only the ufw firewall needs to be stopped.)
```shell
ufw disable
```
1.5 Create the cephadmin user for cluster deployment
It is recommended to deploy and run the Ceph cluster as a dedicated regular user; the user only needs to be able to run privileged commands non-interactively via sudo. Newer versions of ceph-deploy accept any sudo-capable user, including root, but a regular user is still preferred. Note that the Ceph installation automatically creates a ceph user (the cluster runs its service daemons, such as ceph-osd, as this user by default), so use a different name, e.g. cephuser or cephadmin, to deploy and manage the cluster.
Create the cephadmin user on all nodes: the storage (OSD), mon, and mgr nodes, including the ceph-deploy node.
```shell
groupadd -r -g 2088 cephadmin && useradd -r -m -s /bin/bash -u 2088 -g 2088 cephadmin && echo cephadmin:chinadci888. | chpasswd
```
Allow the cephadmin user to run privileged commands via sudo on every server:
```shell
echo "cephadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
```
1.6 Distribute SSH keys
The deploy node needs passwordless SSH to every mon, mgr, and OSD node. In this article there are only three servers and the mon, mgr, and OSD roles are co-located on all of them, so the key only needs to be pushed to these three.
master01 (deploy node):
```shell
su - cephadmin
ssh-keygen
ssh-copy-id cephadmin@ceph-master01
ssh-copy-id cephadmin@ceph-master02
ssh-copy-id cephadmin@ceph-master03
```
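The three ssh-copy-id calls can also be written as a loop; a small sketch with a dry-run guard (the `DRY_RUN` convention is illustrative, not part of the original procedure):

```shell
# Push the cephadmin public key to every node. With DRY_RUN set, only print
# the commands that would run; unset DRY_RUN to actually copy the key.
DRY_RUN=1
for host in ceph-master01 ceph-master02 ceph-master03; do
  cmd="ssh-copy-id cephadmin@${host}"
  if [ -n "${DRY_RUN}" ]; then
    echo "${cmd}"
  else
    ${cmd}
  fi
done
```

This form scales more comfortably if more OSD nodes are added later.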
2. Deploying Ceph
2.1 Install the ceph-deploy tool
```shell
cephadmin@ceph-master01:~$ apt-cache madison ceph-deploy
ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic/main amd64 Packages
ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic/main i386 Packages
ceph-deploy | 1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
ceph-deploy | 1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages
cephadmin@ceph-master01:~$ sudo apt install ceph-deploy
```
2.2 Initialize the mon nodes
On Ubuntu, Python 2 must be installed separately on each server (required on all mon, mgr, and OSD nodes):
```shell
cephadmin@ceph-master01:~$ sudo apt install python2.7 -y
cephadmin@ceph-master01:~$ sudo ln -sv /usr/bin/python2.7 /usr/bin/python2
```
On ceph-master01:
```shell
ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 ceph-master01 ceph-master02 ceph-master03
```
--cluster-network: the network used for internal cluster traffic (OSD replication, recovery, heartbeats).
--public-network: the dedicated network that clients and other consumers use; keeping it separate from the cluster network avoids replication traffic competing with client I/O.
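These two options end up as the corresponding network settings in the generated ceph.conf (values taken from this deployment):

```ini
[global]
# clients reach mon/osd daemons over the public network
public_network = 172.26.0.0/16
# OSD replication and heartbeat traffic stays on the cluster network
cluster_network = 10.0.0.0/24
```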
```shell
~$ mkdir /etc/ceph-cluster
~$ sudo chown cephadmin:cephadmin /etc/ceph-cluster
~$ cd /etc/ceph-cluster/
cephadmin@ceph-master01:/etc/ceph-cluster$ ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username        : None
[ceph_deploy.cli][INFO  ]  verbose         : False
[ceph_deploy.cli][INFO  ]  overwrite_conf  : False
[ceph_deploy.cli][INFO  ]  quiet           : False
[ceph_deploy.cli][INFO  ]  cd_conf         : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7efd0a772e10>
[ceph_deploy.cli][INFO  ]  cluster         : ceph
[ceph_deploy.cli][INFO  ]  ssh_copykey     : True
[ceph_deploy.cli][INFO  ]  mon             : ['ceph-master01', 'ceph-master02', 'ceph-master03']
[ceph_deploy.cli][INFO  ]  func            : <function new at 0x7efd07a2bbd0>
[ceph_deploy.cli][INFO  ]  public_network  : 172.26.0.0/16
[ceph_deploy.cli][INFO  ]  ceph_conf       : None
[ceph_deploy.cli][INFO  ]  cluster_network : 10.0.0.0/24
[ceph_deploy.cli][INFO  ]  default_release : False
[ceph_deploy.cli][INFO  ]  fsid            : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph-master01][INFO  ] Running command: sudo /bin/ip link show
[ceph-master01][INFO  ] Running command: sudo /bin/ip addr show
[ceph-master01][DEBUG ] IP addresses found: [u'172.26.156.217', u'10.0.0.217']
[ceph_deploy.new][DEBUG ] Resolving host ceph-master01
[ceph_deploy.new][DEBUG ] Monitor ceph-master01 at 172.26.156.217
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph-master02][DEBUG ] connected to host: ceph-master01
[ceph-master02][INFO  ] Running command: ssh -CT -o BatchMode=yes ceph-master02
[ceph-master02][DEBUG ] connection detected need for sudo
[ceph-master02][DEBUG ] connected to host: ceph-master02
[ceph-master02][DEBUG ] detect platform information from remote host
[ceph-master02][DEBUG ] detect machine type
[ceph-master02][DEBUG ] find the location of an executable
[ceph-master02][INFO  ] Running command: sudo /bin/ip link show
[ceph-master02][INFO  ] Running command: sudo /bin/ip addr show
[ceph-master02][DEBUG ] IP addresses found: [u'10.0.0.218', u'172.26.156.218']
[ceph_deploy.new][DEBUG ] Resolving host ceph-master02
[ceph_deploy.new][DEBUG ] Monitor ceph-master02 at 172.26.156.218
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph-master03][DEBUG ] connected to host: ceph-master01
[ceph-master03][INFO  ] Running command: ssh -CT -o BatchMode=yes ceph-master03
[ceph-master03][DEBUG ] connection detected need for sudo
[ceph-master03][DEBUG ] connected to host: ceph-master03
[ceph-master03][DEBUG ] detect platform information from remote host
[ceph-master03][DEBUG ] detect machine type
[ceph-master03][DEBUG ] find the location of an executable
[ceph-master03][INFO  ] Running command: sudo /bin/ip link show
[ceph-master03][INFO  ] Running command: sudo /bin/ip addr show
[ceph-master03][DEBUG ] IP addresses found: [u'172.26.156.219', u'10.0.0.219']
[ceph_deploy.new][DEBUG ] Resolving host ceph-master03
[ceph_deploy.new][DEBUG ] Monitor ceph-master03 at 172.26.156.219
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-master01', 'ceph-master02', 'ceph-master03']
[ceph_deploy.new][DEBUG ] Monitor addrs are [u'172.26.156.217', u'172.26.156.218', u'172.26.156.219']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
cephadmin@ceph-master01:/etc/ceph-cluster$ ll
total 36
drwxr-xr-x  2 cephadmin cephadmin  4096 Sep  2 16:50 ./
drwxr-xr-x 91 root      root       4096 Sep  2 16:22 ../
-rw-rw-r--  1 cephadmin cephadmin   326 Sep  2 16:50 ceph.conf
-rw-rw-r--  1 cephadmin cephadmin 17603 Sep  2 16:50 ceph-deploy-ceph.log
-rw-------  1 cephadmin cephadmin    73 Sep  2 16:50 ceph.mon.keyring
```
The following step must be executed; otherwise later steps of the cluster installation will fail.
```shell
cephadmin@ceph-master01:/etc/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-master01 ceph-master02 ceph-master03
```
--no-adjust-repos  # do not modify the existing apt sources (by default the official upstream repo would be configured)
--nogpgcheck       # skip GPG signature verification
```shell
cephadmin@ceph-master01:/etc/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy install --no-adjust-repos --nogpgcheck ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose         : False
[ceph_deploy.cli][INFO  ]  testing         : None
[ceph_deploy.cli][INFO  ]  cd_conf         : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f59e4913e60>
[ceph_deploy.cli][INFO  ]  cluster         : ceph
[ceph_deploy.cli][INFO  ]  dev_commit      : None
[ceph_deploy.cli][INFO  ]  install_mds     : False
[ceph_deploy.cli][INFO  ]  stable          : None
[ceph_deploy.cli][INFO  ]  default_release : False
[ceph_deploy.cli][INFO  ]  username        : None
[ceph_deploy.cli][INFO  ]  adjust_repos    : False
[ceph_deploy.cli][INFO  ]  func            : <function install at 0x7f59e51c5b50>
[ceph_deploy.cli][INFO  ]  install_mgr     : False
[ceph_deploy.cli][INFO  ]  install_all     : False
[ceph_deploy.cli][INFO  ]  repo            : False
[ceph_deploy.cli][INFO  ]  host            : ['ceph-master01', 'ceph-master02', 'ceph-master03']
[ceph_deploy.cli][INFO  ]  install_rgw     : False
[ceph_deploy.cli][INFO  ]  install_tests   : False
[ceph_deploy.cli][INFO  ]  repo_url        : None
[ceph_deploy.cli][INFO  ]  ceph_conf       : None
[ceph_deploy.cli][INFO  ]  install_osd     : False
[ceph_deploy.cli][INFO  ]  version_kind    : stable
[ceph_deploy.cli][INFO  ]  install_common  : False
[ceph_deploy.cli][INFO  ]  overwrite_conf  : False
[ceph_deploy.cli][INFO  ]  quiet           : False
[ceph_deploy.cli][INFO  ]  dev             : master
[ceph_deploy.cli][INFO  ]  nogpgcheck      : True
[ceph_deploy.cli][INFO  ]  local_mirror    : None
[ceph_deploy.cli][INFO  ]  release         : None
[ceph_deploy.cli][INFO  ]  install_mon     : False
[ceph_deploy.cli][INFO  ]  gpg_url         : None
[ceph_deploy.install][DEBUG ] Installing stable version mimic on cluster ceph hosts ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-master01 ...
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: Ubuntu 18.04 bionic
[ceph-master01][INFO  ] installing Ceph on ceph-master01
[ceph-master01][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-master01][DEBUG ] Hit:1 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic InRelease
[ceph-master01][DEBUG ] Hit:2 https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic InRelease
[ceph-master01][DEBUG ] Hit:3 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic InRelease
[ceph-master01][DEBUG ] Hit:4 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates InRelease
[ceph-master01][DEBUG ] Hit:5 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-backports InRelease
[ceph-master01][DEBUG ] Hit:6 http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security InRelease
[ceph-master01][DEBUG ] Reading package lists...
[ceph-master01][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ca-certificates apt-transport-https
[ceph-master01][DEBUG ] ca-certificates is already the newest version (20211016~18.04.1).
[ceph-master01][DEBUG ] apt-transport-https is already the newest version (1.6.14).
[ceph-master01][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master01][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph ceph-osd ceph-mds ceph-mon radosgw
[ceph-master01][DEBUG ] ceph is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] ceph-mds is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] ceph-mon is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] ceph-osd is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] radosgw is already the newest version (16.2.10-1bionic).
[ceph-master01][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
[ceph-master01][INFO  ] Running command: sudo ceph --version
[ceph-master01][DEBUG ] ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-master02 ...
... (ceph-master02 and ceph-master03 produce the same apt update / install output, each ending with "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)") ...
```
This step installs ceph-base, ceph-common, and related packages on the specified ceph nodes serially, one server at a time:
(image: Ceph.assets/image-20220905102551098.png)
2.3 Install the ceph-mon service
2.3.1 Install ceph-mon on the mon nodes
```shell
cephadmin@ceph-master01:/etc/ceph-cluster$ apt-cache madison ceph-mon
ceph-mon | 16.2.10-1bionic | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main amd64 Packages
ceph-mon | 14.2.22-1bionic | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-nautilus bionic/main amd64 Packages
ceph-mon | 12.2.13-0ubuntu0.18.04.10 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-updates/main amd64 Packages
ceph-mon | 12.2.13-0ubuntu0.18.04.10 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic-security/main amd64 Packages
ceph-mon | 12.2.4-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/main amd64 Packages
root@ceph-master01:~# apt install ceph-mon
root@ceph-master02:~# apt install ceph-mon
root@ceph-master03:~# apt install ceph-mon
```
# ceph-mon may already have been installed by the earlier ceph-deploy install step
(image: Ceph.assets/image-20220905104407863.png)
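Because both the nautilus and pacific repositories were added earlier, apt simply picks the highest available version (16.2.10 here). If you want to guarantee that the pacific packages always win over nautilus and the Ubuntu archive, an optional apt pin could be used; the file name and pin below are an assumption, not part of the original setup:

```
# /etc/apt/preferences.d/ceph.pref (hypothetical)
Package: ceph* radosgw librados* librbd*
Pin: version 16.2.*
Pin-Priority: 1001
```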
2.3.2 Add the ceph-mon service to the cluster
```shell
cephadmin@ceph-master01:/etc/ceph-cluster$ pwd
/etc/ceph-cluster
cephadmin@ceph-master01:/etc/ceph-cluster$ cat ceph.conf
[global]
fsid = f69afe6f-e559-4df7-998a-c5dc3e300209
public_network = 172.26.0.0/16
cluster_network = 10.0.0.0/24
mon_initial_members = ceph-master01, ceph-master02, ceph-master03
mon_host = 172.26.156.217,172.26.156.218,172.26.156.219  # the mon services are added to the nodes via this configuration file
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
cephadmin@ceph-master01:/etc/ceph-cluster$ ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username        : None
[ceph_deploy.cli][INFO  ]  verbose         : False
[ceph_deploy.cli][INFO  ]  overwrite_conf  : False
[ceph_deploy.cli][INFO  ]  subcommand      : create-initial
[ceph_deploy.cli][INFO  ]  quiet           : False
[ceph_deploy.cli][INFO  ]  cd_conf         : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fe450df12d0>
[ceph_deploy.cli][INFO  ]  cluster         : ceph
[ceph_deploy.cli][INFO  ]  func            : <function mon at 0x7fe450dcebd0>
[ceph_deploy.cli][INFO  ]  ceph_conf       : None
[ceph_deploy.cli][INFO  ]  keyrings        : None
[ceph_deploy.cli][INFO  ]  default_release : False
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-master01 ceph-master02 ceph-master03
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-master01 ...
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 18.04 bionic
[ceph-master01][DEBUG ] determining if provided host has same hostname in remote
[ceph-master01][DEBUG ] get remote short hostname
[ceph-master01][DEBUG ] deploying mon to ceph-master01
[ceph-master01][DEBUG ] remote hostname: ceph-master01
[ceph-master01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-master01][DEBUG ] create the mon path if it does not exist
[ceph-master01][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-master01/done
[ceph-master01][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-master01/done
[ceph-master01][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-master01.mon.keyring
[ceph-master01][DEBUG ] create the monitor keyring file
[ceph-master01][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-master01 --keyring /var/lib/ceph/tmp/ceph-ceph-master01.mon.keyring --setuser 64045 --setgroup 64045
[ceph-master01][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-master01.mon.keyring
[ceph-master01][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-master01][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph-master01][INFO  ] Running command: sudo systemctl enable ceph-mon@ceph-master01
[ceph-master01][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-master01.service → /lib/systemd/system/ceph-mon@.service.
[ceph-master01][INFO  ] Running command: sudo systemctl start ceph-mon@ceph-master01
[ceph-master01][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status
[ceph-master01][DEBUG ] status for monitor: mon.ceph-master01
... (mon_status JSON: monmap epoch 0 with fsid f69afe6f-e559-4df7-998a-c5dc3e300209, ceph-master01 at 172.26.156.217:3300/6789 v2/v1, the other two mons not yet resolved, "state": "probing", empty quorum) ...
[ceph-master01][INFO  ] monitor: mon.ceph-master01 is running
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-master02 ...
... (the same mon deployment steps and probing mon_status output then repeat for ceph-master02 and ceph-master03) ...
```
"addr": "172.26.156.217:3300", [ceph-master02][DEBUG ] "nonce": 0, [ceph-master02][DEBUG ] "type": "v2"[ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] {[ceph-master02][DEBUG ] "addr": "172.26.156.217:6789", [ceph-master02][DEBUG ] "nonce": 0, [ceph-master02][DEBUG ] "type": "v1"[ceph-master02][DEBUG ] }[ceph-master02][DEBUG ] ][ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] "rank": 0, [ceph-master02][DEBUG ] "weight": 0[ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] {[ceph-master02][DEBUG ] "addr": "172.26.156.218:6789/0", [ceph-master02][DEBUG ] "crush_location": "{}", [ceph-master02][DEBUG ] "name": "ceph-master02", [ceph-master02][DEBUG ] "priority": 0, [ceph-master02][DEBUG ] "public_addr": "172.26.156.218:6789/0", [ceph-master02][DEBUG ] "public_addrs": {[ceph-master02][DEBUG ] "addrvec": [[ceph-master02][DEBUG ] {[ceph-master02][DEBUG ] "addr": "172.26.156.218:3300", [ceph-master02][DEBUG ] "nonce": 0, [ceph-master02][DEBUG ] "type": "v2"[ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] {[ceph-master02][DEBUG ] "addr": "172.26.156.218:6789", [ceph-master02][DEBUG ] "nonce": 0, [ceph-master02][DEBUG ] "type": "v1"[ceph-master02][DEBUG ] }[ceph-master02][DEBUG ] ][ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] "rank": 1, [ceph-master02][DEBUG ] "weight": 0[ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] {[ceph-master02][DEBUG ] "addr": "0.0.0.0:0/2", [ceph-master02][DEBUG ] "crush_location": "{}", [ceph-master02][DEBUG ] "name": "ceph-master03", [ceph-master02][DEBUG ] "priority": 0, [ceph-master02][DEBUG ] "public_addr": "0.0.0.0:0/2", [ceph-master02][DEBUG ] "public_addrs": {[ceph-master02][DEBUG ] "addrvec": [[ceph-master02][DEBUG ] {[ceph-master02][DEBUG ] "addr": "0.0.0.0:0", [ceph-master02][DEBUG ] "nonce": 2, [ceph-master02][DEBUG ] "type": "v1"[ceph-master02][DEBUG ] }[ceph-master02][DEBUG ] ][ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] "rank": 2, [ceph-master02][DEBUG ] "weight": 0[ceph-master02][DEBUG ] }[ceph-master02][DEBUG ] ], 
[ceph-master02][DEBUG ] "stretch_mode": false, [ceph-master02][DEBUG ] "tiebreaker_mon": ""[ceph-master02][DEBUG ] }, [ceph-master02][DEBUG ] "name": "ceph-master02", [ceph-master02][DEBUG ] "outside_quorum": [], [ceph-master02][DEBUG ] "quorum": [], [ceph-master02][DEBUG ] "rank": 1, [ceph-master02][DEBUG ] "state": "electing", [ceph-master02][DEBUG ] "stretch_mode": false, [ceph-master02][DEBUG ] "sync_provider": [][ceph-master02][DEBUG ] }[ceph-master02][DEBUG ] ********************************************************************************[ceph-master02][INFO ] monitor: mon.ceph-master02 is running[ceph-master02][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master02.asok mon_status[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-master03 ...[ceph-master03][DEBUG ] connection detected need for sudo[ceph-master03][DEBUG ] connected to host: ceph-master03 [ceph-master03][DEBUG ] detect platform information from remote host[ceph-master03][DEBUG ] detect machine type[ceph-master03][DEBUG ] find the location of an executable[ceph_deploy.mon][INFO ] distro info: Ubuntu 18.04 bionic[ceph-master03][DEBUG ] determining if provided host has same hostname in remote[ceph-master03][DEBUG ] get remote short hostname[ceph-master03][DEBUG ] deploying mon to ceph-master03[ceph-master03][DEBUG ] get remote short hostname[ceph-master03][DEBUG ] remote hostname: ceph-master03[ceph-master03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf[ceph-master03][DEBUG ] create the mon path if it does not exist[ceph-master03][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-master03/done[ceph-master03][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-master03/done[ceph-master03][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-master03.mon.keyring[ceph-master03][DEBUG ] create the monitor keyring file[ceph-master03][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i 
ceph-master03 --keyring /var/lib/ceph/tmp/ceph-ceph-master03.mon.keyring --setuser 64045 --setgroup 64045[ceph-master03][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-master03.mon.keyring[ceph-master03][DEBUG ] create a done file to avoid re-doing the mon deployment[ceph-master03][DEBUG ] create the init path if it does not exist[ceph-master03][INFO ] Running command: sudo systemctl enable ceph.target[ceph-master03][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-master03[ceph-master03][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-master03.service → /lib/systemd/system/ceph-mon@.service.[ceph-master03][INFO ] Running command: sudo systemctl start ceph-mon@ceph-master03[ceph-master03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master03.asok mon_status[ceph-master03][DEBUG ] ********************************************************************************[ceph-master03][DEBUG ] status for monitor: mon.ceph-master03[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "election_epoch": 0, [ceph-master03][DEBUG ] "extra_probe_peers": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addrvec": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "172.26.156.217:3300", [ceph-master03][DEBUG ] "nonce": 0, [ceph-master03][DEBUG ] "type": "v2"[ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "172.26.156.217:6789", [ceph-master03][DEBUG ] "nonce": 0, [ceph-master03][DEBUG ] "type": "v1"[ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ][ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addrvec": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "172.26.156.218:3300", [ceph-master03][DEBUG ] "nonce": 0, [ceph-master03][DEBUG ] "type": "v2"[ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "172.26.156.218:6789", [ceph-master03][DEBUG ] "nonce": 0, [ceph-master03][DEBUG ] 
"type": "v1"[ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ][ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ], [ceph-master03][DEBUG ] "feature_map": {[ceph-master03][DEBUG ] "mon": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "features": "0x3f01cfb9fffdffff", [ceph-master03][DEBUG ] "num": 1, [ceph-master03][DEBUG ] "release": "luminous"[ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ][ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] "features": {[ceph-master03][DEBUG ] "quorum_con": "0", [ceph-master03][DEBUG ] "quorum_mon": [], [ceph-master03][DEBUG ] "required_con": "0", [ceph-master03][DEBUG ] "required_mon": [][ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] "monmap": {[ceph-master03][DEBUG ] "created": "2022-09-05T02:52:25.483539Z", [ceph-master03][DEBUG ] "disallowed_leaders: ": "", [ceph-master03][DEBUG ] "election_strategy": 1, [ceph-master03][DEBUG ] "epoch": 0, [ceph-master03][DEBUG ] "features": {[ceph-master03][DEBUG ] "optional": [], [ceph-master03][DEBUG ] "persistent": [][ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] "fsid": "f69afe6f-e559-4df7-998a-c5dc3e300209", [ceph-master03][DEBUG ] "min_mon_release": 0, [ceph-master03][DEBUG ] "min_mon_release_name": "unknown", [ceph-master03][DEBUG ] "modified": "2022-09-05T02:52:25.483539Z", [ceph-master03][DEBUG ] "mons": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "172.26.156.219:6789/0", [ceph-master03][DEBUG ] "crush_location": "{}", [ceph-master03][DEBUG ] "name": "ceph-master03", [ceph-master03][DEBUG ] "priority": 0, [ceph-master03][DEBUG ] "public_addr": "172.26.156.219:6789/0", [ceph-master03][DEBUG ] "public_addrs": {[ceph-master03][DEBUG ] "addrvec": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "172.26.156.219:3300", [ceph-master03][DEBUG ] "nonce": 0, [ceph-master03][DEBUG ] "type": "v2"[ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "172.26.156.219:6789", [ceph-master03][DEBUG ] "nonce": 0, [ceph-master03][DEBUG ] 
"type": "v1"[ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ][ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] "rank": 0, [ceph-master03][DEBUG ] "weight": 0[ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "0.0.0.0:0/1", [ceph-master03][DEBUG ] "crush_location": "{}", [ceph-master03][DEBUG ] "name": "ceph-master01", [ceph-master03][DEBUG ] "priority": 0, [ceph-master03][DEBUG ] "public_addr": "0.0.0.0:0/1", [ceph-master03][DEBUG ] "public_addrs": {[ceph-master03][DEBUG ] "addrvec": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "0.0.0.0:0", [ceph-master03][DEBUG ] "nonce": 1, [ceph-master03][DEBUG ] "type": "v1"[ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ][ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] "rank": 1, [ceph-master03][DEBUG ] "weight": 0[ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "0.0.0.0:0/2", [ceph-master03][DEBUG ] "crush_location": "{}", [ceph-master03][DEBUG ] "name": "ceph-master02", [ceph-master03][DEBUG ] "priority": 0, [ceph-master03][DEBUG ] "public_addr": "0.0.0.0:0/2", [ceph-master03][DEBUG ] "public_addrs": {[ceph-master03][DEBUG ] "addrvec": [[ceph-master03][DEBUG ] {[ceph-master03][DEBUG ] "addr": "0.0.0.0:0", [ceph-master03][DEBUG ] "nonce": 2, [ceph-master03][DEBUG ] "type": "v1"[ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ][ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] "rank": 2, [ceph-master03][DEBUG ] "weight": 0[ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ], [ceph-master03][DEBUG ] "stretch_mode": false, [ceph-master03][DEBUG ] "tiebreaker_mon": ""[ceph-master03][DEBUG ] }, [ceph-master03][DEBUG ] "name": "ceph-master03", [ceph-master03][DEBUG ] "outside_quorum": [[ceph-master03][DEBUG ] "ceph-master03"[ceph-master03][DEBUG ] ], [ceph-master03][DEBUG ] "quorum": [], [ceph-master03][DEBUG ] "rank": 0, [ceph-master03][DEBUG ] "state": "probing", [ceph-master03][DEBUG ] "stretch_mode": false, [ceph-master03][DEBUG ] 
"sync_provider": [][ceph-master03][DEBUG ] }[ceph-master03][DEBUG ] ********************************************************************************[ceph-master03][INFO ] monitor: mon.ceph-master03 is running[ceph-master03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master03.asok mon_status[ceph_deploy.mon][INFO ] processing monitor mon.ceph-master01[ceph-master01][DEBUG ] connection detected need for sudo[ceph-master01][DEBUG ] connected to host: ceph-master01 [ceph-master01][DEBUG ] detect platform information from remote host[ceph-master01][DEBUG ] detect machine type[ceph-master01][DEBUG ] find the location of an executable[ceph-master01][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status[ceph_deploy.mon][WARNIN] mon.ceph-master01 monitor is not yet in quorum, tries left: 5[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying[ceph-master01][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status[ceph_deploy.mon][WARNIN] mon.ceph-master01 monitor is not yet in quorum, tries left: 4[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying[ceph-master01][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master01.asok mon_status[ceph_deploy.mon][INFO ] mon.ceph-master01 monitor has reached quorum![ceph_deploy.mon][INFO ] processing monitor mon.ceph-master02[ceph-master02][DEBUG ] connection detected need for sudo[ceph-master02][DEBUG ] connected to host: ceph-master02 [ceph-master02][DEBUG ] detect platform information from remote host[ceph-master02][DEBUG ] detect machine type[ceph-master02][DEBUG ] find the location of an executable[ceph-master02][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master02.asok mon_status[ceph_deploy.mon][INFO ] mon.ceph-master02 monitor has reached 
quorum![ceph_deploy.mon][INFO ] processing monitor mon.ceph-master03[ceph-master03][DEBUG ] connection detected need for sudo[ceph-master03][DEBUG ] connected to host: ceph-master03 [ceph-master03][DEBUG ] detect platform information from remote host[ceph-master03][DEBUG ] detect machine type[ceph-master03][DEBUG ] find the location of an executable[ceph-master03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-master03.asok mon_status[ceph_deploy.mon][INFO ] mon.ceph-master03 monitor has reached quorum![ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum[ceph_deploy.mon][INFO ] Running gatherkeys...[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpP6crY0[ceph-master01][DEBUG ] connection detected need for sudo[ceph-master01][DEBUG ] connected to host: ceph-master01 [ceph-master01][DEBUG ] detect platform information from remote host[ceph-master01][DEBUG ] detect machine type[ceph-master01][DEBUG ] get remote short hostname[ceph-master01][DEBUG ] fetch remote file[ceph-master01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-master01.asok mon_status[ceph-master01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.admin[ceph-master01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-mds[ceph-master01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-mgr[ceph-master01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. 
--keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-osd[ceph-master01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-master01/keyring auth get client.bootstrap-rgw[ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring[ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpP6crY0
2.3.2 Verify the mon nodes
Verify that the ceph-mon service has been installed and started automatically on each mon node; one of ceph-mon's roles is authentication. The ceph-deploy initialization directory now contains the ceph.bootstrap-mds/mgr/osd/rgw keyring files. These bootstrap keyrings carry the highest privileges over the ceph cluster, so keep them safe; they will be distributed to the respective service nodes later.
cephadmin@ceph-master01:/etc/ceph-cluster# ps -ef | grep ceph-mon
ceph      28179      1  0 10:52 ?        00:00:05 /usr/bin/ceph-mon -f --cluster ceph --id ceph-master01 --setuser ceph --setgroup ceph
cephadm+  28519  28038  0 11:10 pts/0    00:00:00 grep --color=auto ceph-mon
cephadmin@ceph-master01:/etc/ceph-cluster# systemctl status ceph-mon.target
● ceph-mon.target - ceph target allowing to start/stop all ceph-mon@.service instances at once
   Loaded: loaded (/lib/systemd/system/ceph-mon.target; enabled; vendor preset: enabled)
   Active: active since Mon 2022-09-05 09:46:11 CST; 1h 24min ago
cephadmin@ceph-master01:/etc/ceph-cluster# ll
total 248
drwxr-xr-x  2 cephadmin cephadmin   4096 Sep  5 10:52 ./
drwxr-xr-x 92 root      root        4096 Sep  5 09:46 ../
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mds.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mgr.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-osd.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-rgw.keyring
-rw-------  1 cephadmin cephadmin    151 Sep  5 10:52 ceph.client.admin.keyring
-rw-rw-r--  1 cephadmin cephadmin    326 Sep  2 16:50 ceph.conf
-rw-rw-r--  1 cephadmin cephadmin 209993 Sep  5 10:52 ceph-deploy-ceph.log
-rw-------  1 cephadmin cephadmin     73 Sep  2 16:50 ceph.mon.keyring
![](Ceph.assets/image-20220905111057439.png)
Running `ceph -s` shows a health warning:
![](Ceph.assets/image-20220906174218646.png)
Run the following on one of the mon nodes to clear it:
ceph config set mon auth_allow_insecure_global_id_reclaim false
2.4 Distribute the admin keyring
From the ceph-deploy node, copy the configuration file and the admin keyring to every node that needs to run ceph management commands against the cluster. This avoids having to specify the ceph-mon address and the ceph.client.admin.keyring file every time a ceph command is used for cluster administration. The ceph-mon nodes likewise need the cluster configuration file and keyring synchronized to them.
cephadmin@ceph-master01:~# sudo apt install ceph-common -y   # already installed on the nodes during initialization
Push the admin keyring out with ceph-deploy; by default it lands in /etc/ceph/. ceph.client.admin.keyring only needs to exist on hosts that will run ceph client commands (much like a kubeconfig file in k8s), so here it is sent to the ceph-deploy node used for day-to-day administration:
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy admin ceph-master01
![](Ceph.assets/image-20220905155019458.png)
By default the ceph.client.admin.keyring file has mode 600 and is owned by root:root. If the cephadmin user runs ceph commands directly on a cluster node, it will report that /etc/ceph/ceph.client.admin.keyring cannot be found, because the user lacks permission to read it:
cephadmin@ceph-master01:~# sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
cephadmin@ceph-master02:~# sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
cephadmin@ceph-master03:~# sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
2.5 Deploy the manager
Ceph has manager daemons since Luminous (12); earlier releases do not.
2.5.1 Deploy the ceph-mgr nodes
Since these hosts are also monitor nodes, all ceph packages are already installed; if the mgr role lived on a separate server, this step would install them:
cephadmin@ceph-master01:~# sudo apt install ceph-mgr
Reading package lists... Done
Building dependency tree
Reading state information... Done
ceph-mgr is already the newest version (16.2.10-1bionic).
0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
cephadmin@ceph-master02:~# sudo apt install ceph-mgr
Reading package lists... Done
Building dependency tree
Reading state information... Done
ceph-mgr is already the newest version (16.2.10-1bionic).
0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
cephadmin@ceph-master03:~# sudo apt install ceph-mgr
Reading package lists... Done
Building dependency tree
Reading state information... Done
ceph-mgr is already the newest version (16.2.10-1bionic).
0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.
Create the mgr daemons:
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy mgr create ceph-master01 ceph-master02 ceph-master03[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-master01 ceph-master02 ceph-master03[ceph_deploy.cli][INFO ] ceph-deploy options:[ceph_deploy.cli][INFO ] username : None[ceph_deploy.cli][INFO ] verbose : False[ceph_deploy.cli][INFO ] mgr : [('ceph-master01', 'ceph-master01'), ('ceph-master02', 'ceph-master02'), ('ceph-master03', 'ceph-master03')][ceph_deploy.cli][INFO ] overwrite_conf : False[ceph_deploy.cli][INFO ] subcommand : create[ceph_deploy.cli][INFO ] quiet : False[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f97e641fe60>[ceph_deploy.cli][INFO ] cluster : ceph[ceph_deploy.cli][INFO ] func : <function mgr at 0x7f97e687f250>[ceph_deploy.cli][INFO ] ceph_conf : None[ceph_deploy.cli][INFO ] default_release : False[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-master01:ceph-master01 ceph-master02:ceph-master02 ceph-master03:ceph-master03[ceph-master01][DEBUG ] connection detected need for sudo[ceph-master01][DEBUG ] connected to host: ceph-master01 [ceph-master01][DEBUG ] detect platform information from remote host[ceph-master01][DEBUG ] detect machine type[ceph_deploy.mgr][INFO ] Distro info: Ubuntu 18.04 bionic[ceph_deploy.mgr][DEBUG ] remote host will use systemd[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-master01[ceph-master01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf[ceph-master01][WARNIN] mgr keyring does not exist yet, creating one[ceph-master01][DEBUG ] create a keyring file[ceph-master01][DEBUG ] create path recursively if it doesn't exist[ceph-master01][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-master01 mon allow profile mgr 
osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-master01/keyring[ceph-master01][INFO ] Running command: sudo systemctl enable ceph-mgr@ceph-master01[ceph-master01][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-master01.service → /lib/systemd/system/ceph-mgr@.service.[ceph-master01][INFO ] Running command: sudo systemctl start ceph-mgr@ceph-master01[ceph-master01][INFO ] Running command: sudo systemctl enable ceph.target[ceph-master02][DEBUG ] connection detected need for sudo[ceph-master02][DEBUG ] connected to host: ceph-master02 [ceph-master02][DEBUG ] detect platform information from remote host[ceph-master02][DEBUG ] detect machine type[ceph_deploy.mgr][INFO ] Distro info: Ubuntu 18.04 bionic[ceph_deploy.mgr][DEBUG ] remote host will use systemd[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-master02[ceph-master02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf[ceph-master02][WARNIN] mgr keyring does not exist yet, creating one[ceph-master02][DEBUG ] create a keyring file[ceph-master02][DEBUG ] create path recursively if it doesn't exist[ceph-master02][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-master02 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-master02/keyring[ceph-master02][INFO ] Running command: sudo systemctl enable ceph-mgr@ceph-master02[ceph-master02][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-master02.service → /lib/systemd/system/ceph-mgr@.service.[ceph-master02][INFO ] Running command: sudo systemctl start ceph-mgr@ceph-master02[ceph-master02][INFO ] Running command: sudo systemctl enable ceph.target[ceph-master03][DEBUG ] connection detected need for sudo[ceph-master03][DEBUG ] connected to host: ceph-master03 [ceph-master03][DEBUG ] detect platform information from remote host[ceph-master03][DEBUG ] 
detect machine type[ceph_deploy.mgr][INFO ] Distro info: Ubuntu 18.04 bionic[ceph_deploy.mgr][DEBUG ] remote host will use systemd[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-master03[ceph-master03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf[ceph-master03][WARNIN] mgr keyring does not exist yet, creating one[ceph-master03][DEBUG ] create a keyring file[ceph-master03][DEBUG ] create path recursively if it doesn't exist[ceph-master03][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-master03 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-master03/keyring[ceph-master03][INFO ] Running command: sudo systemctl enable ceph-mgr@ceph-master03[ceph-master03][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-master03.service → /lib/systemd/system/ceph-mgr@.service.[ceph-master03][INFO ] Running command: sudo systemctl start ceph-mgr@ceph-master03[ceph-master03][INFO ] Running command: sudo systemctl enable ceph.target
2.5.2 Verify the ceph-mgr nodes
cephadmin@ceph-master01:/etc/ceph-cluster# ps -ef | grep ceph-mgr
cephadmin@ceph-master01:/etc/ceph-cluster# systemctl status ceph-mgr@ceph-master01
cephadmin@ceph-master02:/etc/ceph-cluster# ps -ef | grep ceph-mgr
cephadmin@ceph-master02:/etc/ceph-cluster# systemctl status ceph-mgr@ceph-master02
cephadmin@ceph-master03:/etc/ceph-cluster# ps -ef | grep ceph-mgr
cephadmin@ceph-master03:/etc/ceph-cluster# systemctl status ceph-mgr@ceph-master03
![](Ceph.assets/image-20220905174031363.png)
2.6 Deploy the OSDs
2.6.1 Initialize the storage nodes
On the deploy node, install the specified ceph release on the storage nodes. Here the node role is co-located with the masters, so the packages are already installed; when attaching a new node, run:
cephadmin@ceph-master01:~# ceph-deploy install --release pacific ceph-master01 ceph-master02 ceph-master03
![](Ceph.assets/image-20220905194727177.png)
List the disks on each ceph node:
cephadmin@ceph-master01:~# ceph-deploy disk list ceph-master01 ceph-master02 ceph-master03
# you can also run fdisk -l on the nodes to see all unpartitioned, unused disks
![](Ceph.assets/image-20220905200800064.png)
Use `ceph-deploy disk zap` to wipe the ceph data disks on each node. Wiping the disks of ceph-master01, ceph-master02 and ceph-master03 proceeds as follows; the command is safe to run repeatedly:
ceph-deploy disk zap ceph-master01 /dev/sdb
ceph-deploy disk zap ceph-master01 /dev/sdc
ceph-deploy disk zap ceph-master01 /dev/sdd
ceph-deploy disk zap ceph-master02 /dev/sdb
ceph-deploy disk zap ceph-master02 /dev/sdc
ceph-deploy disk zap ceph-master02 /dev/sdd
ceph-deploy disk zap ceph-master03 /dev/sdb
ceph-deploy disk zap ceph-master03 /dev/sdc
ceph-deploy disk zap ceph-master03 /dev/sdd
![](Ceph.assets/image-20220905201426609.png)
2.6.2 OSD-to-disk layout options
![](Ceph.assets/image-20220906175748320.png)
# With two SSDs in a server, block-db and block-wal can each go on its own SSD
ceph-deploy osd create {node} --data /dev/sdc --block-db /dev/sda --block-wal /dev/sdb
# With a single SSD, specifying only the db puts it on the SSD; the unspecified wal is also written to the faster SSD, sharing it with the db
ceph-deploy osd create {node} --data /path/to/data --block-db /dev/sda
# The third form is of little practical use
ceph-deploy osd create {node} --data /path/to/data --block-wal /dev/sda
Here we take the simplest approach, a single disk per OSD; a high-performance cluster can use the second layout, keeping the metadata (block-db) and the WAL log on SSD.
2.6.3 Add OSDs
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master01 --data /dev/sdb
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master01 --data /dev/sdc
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master01 --data /dev/sdd
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master02 --data /dev/sdb
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master02 --data /dev/sdc
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master02 --data /dev/sdd
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master03 --data /dev/sdb
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master03 --data /dev/sdc
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy osd create ceph-master03 --data /dev/sdd
![](Ceph.assets/image-20220906180142755.png)
2.6.4 Verify the ceph cluster
cephadmin@ceph-master01:/etc/ceph-cluster# ceph -s
  cluster:
    id:     f69afe6f-e559-4df7-998a-c5dc3e300209
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-master01,ceph-master02,ceph-master03 (age 31m)
    mgr: ceph-master03(active, since 27h), standbys: ceph-master01, ceph-master02
    osd: 9 osds: 9 up (since 27h), 9 in (since 28h)
  data:
    pools:   2 pools, 33 pgs
    objects: 1 objects, 100 MiB
    usage:   370 MiB used, 450 GiB / 450 GiB avail
    pgs:     33 active+clean
2.7 Test uploading and downloading data
To store or retrieve data, a client must first connect to a storage pool on the RADOS cluster; the object is then located by name through the pool's CRUSH rule. To exercise the cluster's data path, first create a test pool named mypool with 32 PGs.
$ ceph -h     # client/admin command
$ rados -h    # a lower-level client command
Create a pool
cephadmin@ceph-master01:~# ceph osd pool create mypool 32 32
pool 'mypool' created
cephadmin@ceph-master01:/etc/ceph-cluster# sudo ceph osd pool ls
device_health_metrics
mypool
or:
cephadmin@ceph-master01:/etc/ceph-cluster# rados lspools
device_health_metrics
mypool
or:
cephadmin@ceph-master01:/etc/ceph-cluster# ceph osd lspools
1 device_health_metrics
2 mypool
Upload data
The cluster does not yet serve block storage or CephFS, and no object-storage clients are deployed, but ceph's rados command can access the object store directly:
cephadmin@ceph-master01:~# sudo rados put msg1 /var/log/syslog --pool=mypool
List objects
cephadmin@ceph-master01:/etc/ceph-cluster# rados ls --pool=mypool
msg1
Object placement info
cephadmin@ceph-master01:/etc/ceph-cluster# ceph osd map mypool msg1
osdmap e114 pool 'mypool' (2) object 'msg1' -> pg 2.c833d430 (2.10) -> up ([15,13,0], p15) acting ([15,13,0], p15)
表示文件放在了存储池 id 为 2 的 c833d430 的 PG 上,10 为当前 PG 的 id, 2.10 表示数据是在 id 为 2 的存储池当中 id 为 10 的 PG 中存储,在线的 OSD 编号 15,13,10,主 OSD 为 5,活动的 OSD 15,13,10,三个 OSD 表示数据放一共 3 个副本,PG 中的 OSD 是 ceph 的 crush算法计算出三份数据保存在哪些 OSD。
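That mapping line can be picked apart mechanically. A small sketch (plain sed and parameter expansion, using the output above as a fixed sample) that extracts the PG id, the acting set and the primary OSD; the sed patterns are only illustrative, and on a live cluster `--format json` output is easier to parse:

```shell
# sample line from `ceph osd map mypool msg1` above
line="osdmap e114 pool 'mypool' (2) object 'msg1' -> pg 2.c833d430 (2.10) -> up ([15,13,0], p15) acting ([15,13,0], p15)"

# the parenthesised pg id sits just before "-> up"
pg=$(printf '%s\n' "$line" | sed -n 's/.*(\([0-9.]*\)) -> up.*/\1/p')
# the acting set is the bracketed OSD list after "acting"
acting=$(printf '%s\n' "$line" | sed -n 's/.*acting (\[\([0-9,]*\)\].*/\1/p')
# the first OSD in the acting set is the primary
primary=${acting%%,*}

echo "pg=$pg acting=$acting primary=$primary"   # pg=2.10 acting=15,13,0 primary=15
```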
Download the object

```
cephadmin@ceph-master01:/etc/ceph-cluster# sudo rados get msg1 --pool=mypool /opt/my.txt
cephadmin@ceph-master01:/etc/ceph-cluster# ll /opt/my.txt
-rw-r--r-- 1 root root 155733 Sep  7 20:51 /opt/my.txt
cephadmin@ceph-master01:/etc/ceph-cluster# head /opt/my.txt
Sep  7 06:25:06 ceph-master01 rsyslogd: [origin software="rsyslogd" swVersion="8.32.0" x-pid="998" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Sep  7 06:26:01 ceph-master01 CRON[10792]: (root) CMD (ntpdate ntp.aliyun.com)
Sep  7 06:26:01 ceph-master01 CRON[10791]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:27:01 ceph-master01 CRON[10794]: (root) CMD (ntpdate ntp.aliyun.com)
Sep  7 06:27:01 ceph-master01 CRON[10793]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:28:01 ceph-master01 CRON[10797]: (root) CMD (ntpdate ntp.aliyun.com)
Sep  7 06:28:01 ceph-master01 CRON[10796]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:29:01 ceph-master01 CRON[10799]: (root) CMD (ntpdate ntp.aliyun.com)
Sep  7 06:29:01 ceph-master01 CRON[10798]: (CRON) info (No MTA installed, discarding output)
Sep  7 06:30:01 ceph-master01 CRON[10801]: (root) CMD (ntpdate ntp.aliyun.com)
```
Modify an object

Objects cannot be edited in place; download, modify, then upload again to overwrite:

```
cephadmin@ceph-master01:/etc/ceph-cluster# sudo rados put msg1 /etc/passwd --pool=mypool
```
Delete an object

```
cephadmin@ceph-master01:/etc/ceph-cluster# sudo rados rm msg1 --pool=mypool
cephadmin@ceph-master01:/etc/ceph-cluster# rados ls --pool=mypool
```
3. Ceph RBD in detail

3.1 RBD architecture

Ceph simultaneously provides RADOSGW (the object storage gateway), RBD (block storage) and CephFS (file system storage). RBD is short for RADOS Block Device and is one of the most commonly used storage types: an RBD device can be mounted like a disk, supports snapshots, multiple replicas, cloning and consistency, and its data is striped across multiple OSDs in the cluster.

Striping automatically balances I/O load across multiple physical disks: a contiguous chunk of data is split into many small pieces stored on different disks. Multiple processes can then access different parts of the data at the same time without contending for a single disk, and sequential access gains maximal I/O parallelism, which yields very good performance.
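RBD stripes an image into fixed-size objects (4 MiB by default). Which object a given byte offset of the image lands in, and where inside that object, is plain integer arithmetic — a minimal sketch:

```shell
# default RBD object size: 4 MiB
obj_size=$((4 * 1024 * 1024))

# where does byte offset 10 MiB of an image live?
offset=$((10 * 1024 * 1024))
obj_index=$((offset / obj_size))    # which object
obj_offset=$((offset % obj_size))   # offset inside that object
echo "byte $offset -> object $obj_index, offset $obj_offset"
# byte 10485760 -> object 2, offset 2097152
```

Because consecutive objects are distributed to different PGs (and thus different OSDs) by CRUSH, large sequential I/O is naturally spread across the cluster.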
3.2 Create a storage pool

```
# create the pool
root@ceph-master01:~# ceph osd pool create rbd-data1 32 32
pool 'rbd-data1' created
# enable the rbd application on the pool
root@ceph-master01:~# ceph osd pool application enable rbd-data1 rbd
enabled application 'rbd' on pool 'rbd-data1'
# initialize the pool for rbd
root@ceph-master01:~# rbd pool init -p rbd-data1
```
3.3 Create an image

An rbd pool cannot be used as a block device directly; images must first be created in it on demand, and the image file is what gets used as the block device. The rbd command creates, lists and deletes images, and also clones images, creates snapshots, rolls back to snapshots, and so on. For example, the following command creates an image named data-img1 in the rbd-data1 pool.

3.3.1 Create the image

```
root@ceph-master01:~# rbd create data-img1 --size 3G --pool rbd-data1 --image-format 2 --image-feature layering
# list images
root@ceph-master01:~# rbd ls --pool rbd-data1 -l
NAME       SIZE   PARENT  FMT  PROT  LOCK
data-img1  3 GiB            2
```
3.3.2 Show image details

```
root@ceph-master01:~# rbd --image data-img1 --pool rbd-data1 info
rbd image 'data-img1':
        size 3 GiB in 768 objects
        order 22 (4 MiB objects)      # 3G = 768 objects of 4 MiB each
        snapshot_count: 0
        id: 284d64e8f879d             # image id
        block_name_prefix: rbd_data.284d64e8f879d
        format: 2
        features: layering            # image features
        op_features:
        flags:
        create_timestamp: Fri Sep 16 20:34:47 2022
        access_timestamp: Fri Sep 16 20:34:47 2022
        modify_timestamp: Fri Sep 16 20:34:47 2022

# show the details as JSON
root@ceph-master01:~# rbd ls --pool rbd-data1 -l --format json --pretty-format
[
    {
        "image": "data-img1",
        "id": "284d64e8f879d",
        "size": 3221225472,
        "format": 2
    }
]
```
3.3.3 Image features

The features RBD enables by default include: layering, exclusive-lock, object-map, fast-diff, deep-flatten.

```
# enable a feature on an image in a pool
$ rbd feature enable exclusive-lock --pool rbd-data1 --image data-img1
$ rbd feature enable object-map --pool rbd-data1 --image data-img1
$ rbd feature enable fast-diff --pool rbd-data1 --image data-img1
# disable a feature on an image in a pool
$ rbd feature disable fast-diff --pool rbd-data1 --image data-img1
# verify the image features
$ rbd --image data-img1 --pool rbd-data1 info
```
3.4 Using RBD from a client

A client needs two things to use RBD:

1. the ceph-common package installed
2. a Ceph user keyring
3.4.1 Install ceph-common on the client

To mount and use Ceph RBD, the client needs the ceph-common package. It is not in the stock CentOS yum repositories, so a dedicated yum source must be configured; on CentOS 7 the newest installable release is Octopus (v15).

```
# configure the yum source
$ yum install epel-release
$ yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
# install ceph-common
$ yum install -y ceph-common
# verify ceph-common
[root@zd_spring_156_101 ~]# rpm -qa | grep ceph-common
python3-ceph-common-15.2.17-0.el7.x86_64
ceph-common-15.2.17-0.el7.x86_64
```
3.4.2 Copy the authentication files

```
# scp to /etc/ceph on the client server; the client reads that directory by default
[cephadmin@ceph-deploy ceph-cluster]$ scp ceph.conf ceph.client.admin.keyring root@172.26.156.17:/etc/ceph/
```
3.4.3 Map the image on the client

```
# map the rbd image
[root@xianchaonode1 ~]# rbd -p rbd-data1 map data-img1
/dev/rbd0
# verify the mapped device on the client
[root@xianchaonode1 ~]# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
rbd0   253:0    0     3G  0 disk
sr0     11:0    1   4.2G  0 rom
sda      8:0    0   200G  0 disk
├─sda2   8:2    0 199.8G  0 part /
└─sda1   8:1    0   200M  0 part /boot
```
3.4.4 Mount and use on the client

```
# make a filesystem on the device
[root@xianchaonode1 ~]# mkfs.xfs /dev/rbd0
Discarding blocks...Done.
meta-data=/dev/rbd0              isize=512    agcount=8, agsize=98304 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=786432, imaxpct=25
         =                       sunit=16     swidth=16 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@xianchaonode1 ~]# mount /dev/rbd0 /mnt/
[root@xianchaonode1 ~]# echo 111 >> /mnt/test.txt
[root@xianchaonode1 ~]# cat /mnt/test.txt
111
[root@xianchaonode1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        7.9G     0  7.9G   0% /dev
tmpfs           7.9G     0  7.9G   0% /dev/shm
tmpfs           7.9G  795M  7.1G  10% /run
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/sda2       200G   62G  138G  31% /
tmpfs           1.6G     0  1.6G   0% /run/user/0
/dev/rbd0       3.0G   33M  3.0G   2% /mnt
```
4. CephFS in detail

CephFS (the Ceph file system) provides POSIX-compliant shared file-system storage; clients mount it over the Ceph protocol and use the cluster as their data store.
4.1 Deploy the MDS service

CephFS requires the MDS service, which can be co-located on the mon nodes.

```
root@ceph-master01:~# apt-cache madison ceph-mds
root@ceph-master01:~# apt install ceph-mds
root@ceph-master01:~# ceph-deploy mds create ceph-master01 ceph-master02 ceph-master03
```
```
# check status and the active/standby MDS
ceph -s
ceph fs status
```
4.2 Create the CephFS metadata and data pools

Before using CephFS, a file system must be created in the cluster with dedicated metadata and data pools. Below, a file system named mycephfs is created for testing, using cephfs-metadata as its metadata pool and cephfs-data as its data pool.

```
root@ceph-master01:~# ceph osd pool create cephfs-metadata 32 32
pool 'cephfs-metadata' created
root@ceph-master01:~# ceph osd pool create cephfs-data 64 64
pool 'cephfs-data' created
```
4.3 Create the CephFS and verify

```
root@ceph-master01:~# ceph fs new mycephfs cephfs-metadata cephfs-data
new fs with metadata pool 5 and data pool 6
root@ceph-master01:~# ceph fs ls
name: mycephfs, metadata pool: cephfs-metadata, data pools: [cephfs-data ]
root@ceph-master01:~# ceph fs status mycephfs
mycephfs - 0 clients
========
      POOL         TYPE     USED  AVAIL
cephfs-metadata  metadata     0    142G
  cephfs-data      data       0    142G
```
4.4 Create a CephFS client account

```
# create the account
root@ceph-master01:/etc/ceph-cluster# ceph auth add client.yanyan mon 'allow r' mds 'allow rw' osd 'allow rwx pool=cephfs-data'
added key for client.yanyan
# verify the account
root@ceph-master01:/etc/ceph-cluster# ceph auth get client.yanyan
[client.yanyan]
        key = AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==
        caps mds = "allow rw"
        caps mon = "allow r"
        caps osd = "allow rwx pool=cephfs-data"
exported keyring for client.yanyan
# export the keyring
root@ceph-master01:/etc/ceph-cluster# ceph auth get client.yanyan -o ceph.client.yanyan.keyring
exported keyring for client.yanyan
root@ceph-master01:/etc/ceph-cluster# ll
total 416
drwxr-xr-x  2 cephadmin cephadmin   4096 Sep 20 17:21 ./
drwxr-xr-x 92 root      root        4096 Sep  5 09:46 ../
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mds.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-mgr.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-osd.keyring
-rw-------  1 cephadmin cephadmin    113 Sep  5 10:52 ceph.bootstrap-rgw.keyring
-rw-------  1 cephadmin cephadmin    151 Sep  5 10:52 ceph.client.admin.keyring
-rw-r--r--  1 root      root         150 Sep 20 17:21 ceph.client.yanyan.keyring
-rw-rw-r--  1 cephadmin cephadmin    398 Sep  7 20:01 ceph.conf
-rw-rw-r--  1 cephadmin cephadmin 368945 Sep  7 20:02 ceph-deploy-ceph.log
-rw-------  1 cephadmin cephadmin     73 Sep  2 16:50 ceph.mon.keyring
-rw-r--r--  1 root      root           9 Sep 12 13:06 pass.txt
-rw-r--r--  1 root      root        1645 Oct 16  2015 release.asc
# export the bare key
root@ceph-master01:/etc/ceph-cluster# ceph auth print-key client.yanyan > yanyan.key
root@ceph-master01:/etc/ceph-cluster# ll
total 420
# ... same listing as above, plus:
-rw-r--r--  1 root      root          40 Sep 20 17:21 yanyan.key
root@ceph-master01:/etc/ceph-cluster# cat ceph.client.yanyan.keyring
[client.yanyan]
        key = AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==
        caps mds = "allow rw"
        caps mon = "allow r"
        caps osd = "allow rwx pool=cephfs-data"
```
4.5 Install the Ceph client

```
# on a CentOS client
yum install epel-release -y
yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm
yum install ceph-common -y
```
4.6 Copy the authentication files

```
root@ceph-master01:~# cd /etc/ceph-cluster/
root@ceph-master01:/etc/ceph-cluster# scp ceph.conf ceph.client.yanyan.keyring yanyan.key root@172.26.156.165:/etc/ceph/
```
Verify client authentication:
[root@zd_spring_156_101 ceph]# ceph --user yanyan -s
4.7 Install ceph-common on the client

```
# configure the yum source
$ yum install epel-release
$ yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
# install ceph-common
$ yum install -y ceph-common
# verify ceph-common
[root@zd_spring_156_101 ~]# rpm -qa | grep ceph-common
python3-ceph-common-15.2.17-0.el7.x86_64
ceph-common-15.2.17-0.el7.x86_64
```
4.8 Mount and use CephFS

A client can mount CephFS in two ways: in kernel space or in user space. A kernel-space mount requires the kernel's ceph module (kernel 3.10.0-862 or newer; the default CentOS 7.5 kernel qualifies), while a user-space mount requires ceph-fuse. If the kernel is too old to have the ceph module (the default kernels of CentOS 7.5 and later were verified to have it; defaults below CentOS 7.3 were not tested), install ceph-fuse and mount in user space, but the kernel-module mount is recommended.
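The kernel-version cutoff above can be scripted. A hedged sketch — `mount_method` is a hypothetical helper that compares a kernel release string against 3.10.0-862 using GNU `sort -V` and prints which mount type to use:

```shell
# decide mount method from a kernel release string; releases older than
# 3.10.0-862 lack a usable ceph kernel module (per the note above)
mount_method() {
    min="3.10.0-862"
    # sort -V orders version strings; if $1 sorts first and differs, it is older
    lowest=$(printf '%s\n%s\n' "$1" "$min" | sort -V | head -1)
    if [ "$lowest" = "$1" ] && [ "$1" != "$min" ]; then
        echo "fuse"      # too old: use ceph-fuse
    else
        echo "kernel"    # new enough: mount -t ceph
    fi
}

mount_method "3.10.0-693.el7.x86_64"   # prints "fuse"   (old CentOS 7.4 kernel)
mount_method "5.4.0-131-generic"       # prints "kernel" (recent Ubuntu kernel)
```

On a real client, pass `$(uname -r)` to the helper.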
4.8.1 Kernel-space mount of CephFS

```
# mount with the key itself (ceph-common not required)
[root@other165 ~]# cat /etc/ceph/yanyan.key
AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==
[root@other165 ~]# mount -t ceph 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt -o name=yanyan,secret=AQDnhSljvlhoLxAAWrV9uY1kXq5/C0jAziaB9Q==

# mount with a key file (requires ceph-common)
[root@other165 ~]# mount -t ceph 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt -o name=yanyan,secretfile=/etc/ceph/yanyan.key
```
4.8.2 Mount at boot

```
# cat /etc/fstab
172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt ceph defaults,name=yanyan,secretfile=/etc/ceph/yanyan.key,_netdev 0 0
```
4.9 User-space mount of CephFS

If the kernel is too old to have the ceph module, install ceph-fuse and mount in user space; the kernel-module mount is still the recommended option.
4.9.1 Install ceph-fuse

```
# configure the yum source
$ yum install epel-release
$ yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
# install ceph-fuse
$ yum install ceph-fuse -y
```
4.9.2 Mount CephFS with ceph-fuse

```
# reads /etc/ceph/ by default
ceph-fuse --name client.yanyan -m 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789 /mnt
```
4.9.3 Mount at boot

When a user is named, its keyring and ceph.conf are loaded automatically from /etc/ceph.

```
vim /etc/fstab
none /data fuse.ceph ceph.id=yanyan,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults 0 0
```
5. Using Ceph from Kubernetes

5.1 Static RBD provisioning

5.1.1 Mount RBD via PV/PVC

```
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  rbd:
    monitors:
      - '172.26.156.217:6789'
      - '172.26.156.218:6789'
      - '172.26.156.219:6789'
    pool: k8stest          # must exist
    image: rbda            # must exist
    user: admin            # must exist
    secretRef:
      name: ceph-secret    # must exist
    fsType: xfs
    readOnly: false
  persistentVolumeReclaimPolicy: Recycle
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```
5.1.2 Mount RBD directly in a pod

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:    # rs or deployment
      app: ng-deploy-80
  template:
    metadata:
      labels:
        app: ng-deploy-80
    spec:
      nodeName: xianchaonode1
      containers:
        - name: ng-deploy-80
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - name: rbd-data1
              mountPath: /usr/share/nginx/html/rbd
      volumes:
        - name: rbd-data1
          rbd:
            monitors:
              - '172.26.156.217:6789'
              - '172.26.156.218:6789'
              - '172.26.156.219:6789'
            pool: shijie-rbd-pool1
            image: shijie-img-img1
            fsType: xfs
            readOnly: false
            user: magedu-shijie
            secretRef:
              name: ceph-secret-magedu-shijie
```
5.2 Dynamic RBD StorageClass

Volumes can be created dynamically by the kube-controller-manager component, which suits stateful services that need many volumes. The ceph admin key is stored as a k8s secret so that k8s can call Ceph with admin rights to create volumes on demand — images no longer need to be created in advance; k8s asks Ceph to create them when they are needed.
5.2.1 Create the rbd pool

```
root@ceph-master01:/etc/ceph# ceph osd pool create k8s-rbd 32 32
pool 'k8s-rbd' created
root@ceph-master01:/etc/ceph# ceph osd pool application enable k8s-rbd rbd
enabled application 'rbd' on pool 'k8s-rbd'
root@ceph-master01:/etc/ceph# rbd pool init -p k8s-rbd
```
5.2.2 Create the admin user secret

This gives k8s the rights to create rbd images.

```
# print the ceph admin key base64-encoded
root@ceph-master01:/etc/ceph# ceph auth print-key client.admin | base64
QVFCM1pCVmpMOE4wRUJBQVJlRzBxM3JwVkYvOERkbk11cnlaTkE9PQ==

# the admin secret manifest
[root@xianchaomaster1 pod-rbd]# vi case1-secret-admin.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-admin
  namespace: default
type: "kubernetes.io/rbd"
data:
  key: QVFCM1pCVmpMOE4wRUJBQVJlRzBxM3JwVkYvOERkbk11cnlaTkE9PQ==
```
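The Secret's `data.key` must be the base64 encoding of the raw cephx key with no trailing newline. A quick round-trip check (using the admin key printed above) catches copy-and-paste mistakes:

```shell
# raw key as printed by `ceph auth print-key client.admin` above
raw='AQB3ZBVjL8N0EBAAReG0q3rpVF/8DdnMuryZNA=='

# encode without a trailing newline -- printf, not echo, so no stray \n
b64=$(printf '%s' "$raw" | base64 | tr -d '\n')
echo "$b64"   # QVFCM1pCVmpMOE4wRUJBQVJlRzBxM3JwVkYvOERkbk11cnlaTkE9PQ==

# decode it back and compare
decoded=$(printf '%s' "$b64" | base64 -d)
[ "$decoded" = "$raw" ] && echo "secret matches"
```

Encoding with a plain `echo` (which appends `\n`) produces a different base64 string and a Secret that silently fails to authenticate.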
5.2.3 Create the normal user secret

Used by pods to read and write the provisioned volume.

```
root@ceph-master01:/etc/ceph# ceph auth get-or-create client.k8s-rbd mon 'allow r' osd 'allow * pool=k8s-rbd'
[client.k8s-rbd]
        key = AQAMgkZjDyhsMhAAEH8F0Gwe3L+aiP/wAkqdyA==
root@ceph-master01:/etc/ceph# ceph auth print-key client.k8s-rbd | base64
QVFBTWdrWmpEeWhzTWhBQUVIOEYwR3dlM0wrYWlQL3dBa3FkeUE9PQ==

vi case2-secret-client.yaml
apiVersion: v1
kind: Secret
metadata:
  name: k8s-rbd
type: "kubernetes.io/rbd"
data:
  key: QVFBTWdrWmpEeWhzTWhBQUVIOEYwR3dlM0wrYWlQL3dBa3FkeUE9PQ==
```
5.2.4 Create the StorageClass

Create a dynamic StorageClass that provisions PVs for pods on demand.

```
vi case3-ceph-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-storage-class
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"   # not the default class
provisioner: kubernetes.io/rbd
reclaimPolicy: Retain        # the default, Delete, is dangerous
parameters:
  monitors: 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789
  adminId: admin
  adminSecretName: ceph-secret-admin
  adminSecretNamespace: default
  pool: k8s-rbd
  userId: k8s-rbd
  userSecretName: k8s-rbd
```
5.2.5 Create a PVC backed by the StorageClass

```
vi case4-mysql-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-storage-class
  resources:
    requests:
      storage: '5Gi'

# verify the PV/PVC
kubectl get pvc
kubectl get pv
```
```
# verify that ceph created the image automatically
rbd ls --pool k8s-rbd
```
5.2.6 Verify with a single-instance MySQL pod

```
vi case5-mysql-deploy-svc.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - image: mysql:5.6.46
          name: mysql
          env:
            # Use secret in real usage
            - name: MYSQL_ROOT_PASSWORD
              value: "123456"
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-data-pvc
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: mysql-service-label
  name: mysql-service
spec:
  type: NodePort
  ports:
    - name: http
      port: 3306
      protocol: TCP
      targetPort: 3306
      nodePort: 33306
  selector:
    app: mysql
```
# connect and verify; create a test database
# delete the mysql pod and recreate it, verifying the rbd data persists

```
kubectl delete -f case5-mysql-deploy-svc.yaml
kubectl apply -f case5-mysql-deploy-svc.yaml
```
# schedule the pod to a different node and verify the rbd can still be mounted
```
kubectl delete -f case5-mysql-deploy-svc.yaml
kubectl apply -f case5-mysql-deploy-svc.yaml
```

The mount still succeeds.
5.3 Static CephFS provisioning

5.3.1 Mount CephFS via PV/PVC

Note: one CephFS is shared by many directories, so subdirectories must be created in it ahead of time, one per deployment. Mount the CephFS on any Linux host and create the /data2 directory there first; otherwise pods can only mount the CephFS root /:

```
mount -t ceph 172.26.156.217:6789,172.26.156.218:6789,172.26.156.219:6789:/ /mnt -o name=admin,secret=AQB3ZBVjL8N0EBAAReG0q3rpVF/8DdnMuryZNA==
mkdir /mnt/data2
```

```
# the PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cephfs-pv
  labels:
    app: static-cephfs-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  cephfs:
    monitors:
      - 172.26.156.217:6789
      - 172.26.156.218:6789
      - 172.26.156.219:6789
    path: /data2/          # /data2 must already exist in the CephFS
    user: admin
    secretRef:
      name: ceph-secret-admin
    readOnly: false
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
---
# the PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc-claim
spec:
  selector:
    matchLabels:
      app: static-cephfs-pv
  storageClassName: slow
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
# the deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx2
spec:
  selector:
    matchLabels:
      k8s-app: nginx2
  replicas: 2
  template:
    metadata:
      labels:
        k8s-app: nginx2
    spec:
      containers:
        - name: nginx
          image: nginx
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
              protocol: TCP
          volumeMounts:
            - name: pvc-recycle
              mountPath: /usr/share/nginx/html/nginx2
      volumes:
        - name: pvc-recycle
          persistentVolumeClaim:
            claimName: cephfs-pvc-claim
---
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: nginx2
  name: ng-deploy-80-service
spec:
  type: NodePort
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 80
      nodePort: 23380
  selector:
    k8s-app: nginx2
```
5.3.2 Mount CephFS directly in a pod

No PV/PVC is needed; the deployment mounts CephFS directly.

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:    # rs or deployment
      app: ng-deploy-80
  template:
    metadata:
      labels:
        app: ng-deploy-80
    spec:
      containers:
        - name: ng-deploy-80
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - name: magedu-staticdata-cephfs
              mountPath: /usr/share/nginx/html/cephfs
      volumes:
        - name: magedu-staticdata-cephfs
          cephfs:
            monitors:
              - '172.26.156.217:6789'
              - '172.26.156.218:6789'
              - '172.26.156.219:6789'
            path: /
            user: admin
            secretRef:
              name: ceph-secret-admin
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: ng-deploy-80-service-label
  name: ng-deploy-80-service
spec:
  type: NodePort
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 80
      nodePort: 33380
  selector:
    app: ng-deploy-80
```
5.4 Dynamic CephFS StorageClass

Upstream never directly supported a CephFS StorageClass, but the community provided a similar solution, external-storage/cephfs.

Testing shows this CephFS StorageClass no longer works on k8s 1.20 and later. Deploying it fails; the workaround circulating online is to add --feature-gates=RemoveSelfLink=false to kube-apiserver.yaml, but that flag was removed after k8s 1.20, so ceph-csi is used instead.

CephFS StorageClass deployment write-ups (unsuccessful):
https://www.cnblogs.com/leffss/p/15630641.html
https://www.cnblogs.com/estarhaohao/p/15965785.html
github issues: https://github.com/kubernetes/kubernetes/issues/94660
5.5 Dynamic provisioning with ceph-csi
Ceph-CSI RBD
https://www.modb.pro/db/137721
Ceph-CSI CephFS
The latest 3.7 release hit problems, so CSI 3.4 is used instead.

```
git clone https://github.com/ceph/ceph-csi.git -b release-v3.4
cd ceph-csi/deploy/cephfs/kubernetes
```
Edit the ConfigMap object; clusterID is the ceph fsid.

```
vi csi-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [
      {
        "clusterID": "f69afe6f-e559-4df7-998a-c5dc3e300209",
        "monitors": [
          "172.26.156.217:6789","172.26.156.218:6789","172.26.156.219:6789"
        ]
      }
    ]
metadata:
  name: ceph-csi-config
```
ceph-csi deploys to the default namespace by default; move it to kube-system:

```
sed -i "s/namespace: default/namespace: kube-system/g" $(grep -rl "namespace: default" ./)
```
Deploy ceph-csi CephFS. The images come from k8s.gcr.io; some fail to pull and can be replaced with equivalents found on Docker Hub.

```
kubectl get po -n kube-system | grep csi-cephfs
csi-cephfsplugin-8xt97                          3/3     Running   0          6d10h
csi-cephfsplugin-bmxwr                          3/3     Running   0          6d10h
csi-cephfsplugin-n74cd                          3/3     Running   0          6d10h
csi-cephfsplugin-provisioner-79d84c9598-fb6bg   6/6     Running   0          6d10h
csi-cephfsplugin-provisioner-79d84c9598-g579j   6/6     Running   0          6d10h
csi-cephfsplugin-provisioner-79d84c9598-n8w2j   6/6     Running   0          6d10h
```
Create the CephFS storageClass

ceph-csi needs cephx credentials to talk to the Ceph cluster; the admin user is used here.

```
vi secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: default
stringData:
  adminID: admin
  adminKey: AQB3ZBVjL8N0EBAAReG0q3rpVF/8DdnMuryZNA==
```
Create the storageClass object. The Ceph FS name used here is mycephfs (the name given when the file system was created in ceph; it is not a pool).
#ceph fs new mycephfs cephfs-metadata cephfs-data
```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-csi-cephfs
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: f69afe6f-e559-4df7-998a-c5dc3e300209
  fsName: mycephfs
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - discard
```
Create the PVC

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-csi-cephfs
```
A PV was created automatically and bound to the PVC.
Create a test Deployment

```
vi Deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cephfs-test
  labels:
    component: cephfs-test
spec:
  replicas: 2
  strategy:
    type: Recreate
  selector:
    matchLabels:
      component: cephfs-test
  template:
    metadata:
      labels:
        component: cephfs-test
    spec:
      containers:
        - name: nginx
          image: nginx
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
          volumeMounts:
            - name: config
              mountPath: "/data"
      volumes:
        - name: config
          persistentVolumeClaim:
            claimName: csi-cephfs-pvc
            readOnly: false
```
csi-cephfs creates a subvolume group named csi by default:

```
# ceph fs subvolumegroup ls cephfs
[
    {
        "name": "_deleting"
    },
    {
        "name": "csi"
    }
]
```
Every PV created through csi-cephfs lives under the csi subvolume group's directory:

```
# kubectl get pv | grep default/csi-cephfs-pvc
pvc-0f36fd44-40f1-4ac3-aebe-0264a2fb50ea   1Gi   RWX   Delete   Bound   default/csi-cephfs-pvc   ceph-csi-cephfs   6d11h
# kubectl describe pv pvc-0f36fd44-40f1-4ac3-aebe-0264a2fb50ea | egrep 'subvolumeName|subvolumePath'
    subvolumeName=csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6
    subvolumePath=/volumes/csi/csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6/e423daf3-017b-4a7e-8713-bd05bab695ee
# cd /mnt/cephfs-test/
# tree -L 4 .
.
└── volumes
    ├── csi
    │   ├── csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6
    │   │   └── e423daf3-017b-4a7e-8713-bd05bab695ee
    │   └── csi-vol-1ac1f4c1-ef8a-11eb-a990-a63fe71a40b6
    │       └── 3773a567-a8cb-4bae-9181-38f4e3065436
    ├── _csi:csi-vol-056e44c5-eddf-11eb-a990-a63fe71a40b6.meta
    ├── _csi:csi-vol-1ac1f4c1-ef8a-11eb-a990-a63fe71a40b6.meta
    └── _deleting

7 directories, 2 files
```
II. Concepts

1. How data, objects, PGs/PGPs, pools, OSDs and physical disks relate
2. FileStore, BlueStore and the journal

Ceph supports several storage backends; the two most often mentioned are FileStore and BlueStore. Up to and including the L release, FileStore was the default backend, but because of its shortcomings BlueStore was designed and developed to replace it, and after L BlueStore is the default.

For full-block writes, BlueStore writes the data straight to disk via AIO, avoiding FileStore's double write (journal first, then apply to the data disk). Random I/O goes through the WAL, written directly into the high-performance RocksDB key-value store.

So the tuning guides found online that dedicate an SSD to a Journal only apply to releases before L; from L onward there is no journal pre-write. Use the following approach instead:

```
# with two SSDs, put block-db and block-wal on separate SSDs
ceph-deploy osd create ceph-node1 --data /dev/sdc --block-db /dev/sda --block-wal /dev/sdb
# with one SSD, specify only the db on the SSD; with no wal location given,
# the wal is automatically written to the faster SSD too, shared with the db
ceph-deploy osd create ceph-node1 --data /dev/sdb --block-db /dev/sda
```
III. Common tasks

1. The right way to remove an OSD

Before Luminous

- Lower the osd's crush weight

```
ceph osd crush reweight osd.0 0.5
ceph osd crush reweight osd.0 0.2
ceph osd crush reweight osd.0 0
```

Note: to drain gradually, lower the crush weight to 0 in several steps. This moves data off the osd onto other nodes bit by bit until nothing is left on it and migration completes. Changing the osd's crush weight also changes the host's weight, and with it the cluster-wide CRUSH distribution; once the osd's crush weight is 0, none of the removal steps that follow affect data placement.
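The gradual reweight described above can be wrapped in a loop. A sketch that only prints the commands unless DRYRUN=0; the backfill wait between steps is a hypothetical settle check, not an official procedure:

```shell
# drain an OSD by stepping its crush weight down gradually
DRYRUN=${DRYRUN:-1}
osd=osd.0
for w in 0.5 0.2 0; do
    cmd="ceph osd crush reweight $osd $w"
    if [ "$DRYRUN" = 1 ]; then
        echo "$cmd"
    else
        $cmd
        # wait for rebalancing to settle before the next step
        while ceph -s | grep -q backfill; do sleep 60; done
    fi
done
```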
- Stop the osd process

```
systemctl stop ceph-osd@0.service
```

Stopping the process tells the cluster this osd is gone and no longer serving; since it has no weight, overall placement is unaffected and no migration occurs.

- Mark the osd out

```
ceph osd out osd.0
```

Taking the osd out tells the cluster it no longer maps data and no longer serves; again, with no weight, placement is unaffected and there is no migration.

- Remove it from the crush map

```
ceph osd crush remove osd.0
```

Since the OSD's weight was already 0, removing it from crush does not change the host's weight, so there is no migration.

- Remove the osd

```
ceph osd rm osd.0
```

This deletes the OSD's record from the cluster.

- Delete the OSD's authentication (otherwise the id stays occupied)

```
ceph auth del osd.0
```

This removes the OSD's entry from the auth database.

Verified: this ordering triggers only one migration. It is just a reordering of steps, but for a production cluster it is one migration fewer. In production, nodes also get marked out automatically; consider controlling that yourself and tighten monitoring — letting the cluster handle data migration entirely on its own is not an option and only invites more faults.
Luminous and later

- Lower the osd's crush weight

```
ceph osd crush reweight osd.0 0.5
ceph osd crush reweight osd.0 0.2
ceph osd crush reweight osd.0 0
```

Note: as above, to drain gradually, lower the crush weight to 0 in several steps; once the osd's crush weight is 0, none of the removal steps that follow affect data placement.

- Stop the osd process

```
systemctl stop ceph-osd@0.service
```

Stopping the process tells the cluster this osd is gone and no longer serving; since it has no weight, overall placement is unaffected and no migration occurs.

- Mark the osd out

```
ceph osd out osd.0
```

Taking the osd out tells the cluster it no longer maps data; again, with no weight, there is no migration.
- Purge the device

```
ceph osd purge {id} --yes-i-really-mean-it
```

If the OSD's configuration still appears in the ceph.conf file, remove that entry manually after deleting the OSD.
2. Reformat an old OSD node's data and join it to a new ceph cluster

An existing ceph osd needs to be wiped and added to a new ceph cluster. First query the old osd data; later steps use it.

```
[root@ceph-207 ~]# ceph-volume lvm list
====== osd.1 =======
  [block]       /dev/ceph-58ef1d0f-272b-4273-82b1-689946254645/osd-block-e0efe172-778e-46e1-baa2-cd56408aac34
      block device              /dev/ceph-58ef1d0f-272b-4273-82b1-689946254645/osd-block-e0efe172-778e-46e1-baa2-cd56408aac34
      block uuid                hCx4XW-OjKC-OC8Y-jEg2-NKYo-Pb6f-y9Nfl3
      cephx lockbox secret
      cluster fsid              b7e4cb56-9cc8-4e44-ab87-24d4253d0951
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  e0efe172-778e-46e1-baa2-cd56408aac34
      osd id                    1
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sdb
```

Joining the cluster directly fails:

```
ceph-volume lvm activate 1 e0efe172-778e-46e1-baa2-cd56408aac34
```

Two kinds of errors were seen so far:

```
osd.1 21 heartbeat_check: no reply from 192.168.8.206:6804 osd.0 ever on either front or back, first ping sent 2020-11-26T16:00:04.842947+0800 (oldest deadline 2020-11-26T16:00:24.842947+0800)

stderr: Calculated size of logical volume is 0 extents. Needs to be larger.
--> Was unable to complete a new OSD, will rollback changes
```

Wipe the data and join the new ceph cluster:

```
# 1. stop the osd service; the 1 after @ is the osd id from `ceph-volume lvm list`
systemctl stop ceph-osd@1
# 2. zap the osd lvm; 1 is again the osd id
ceph-volume lvm zap --osd-id 1
# 3. inspect lvs and delete the lv/vg
[root@ceph-207 ~]# lvs
  LV                                             VG                                        Attr       LSize
  osd-block-e0efe172-778e-46e1-baa2-cd56408aac34 ceph-58ef1d0f-272b-4273-82b1-689946254645 -wi-a----- <16.00g
  home                                           cl                                        -wi-ao---- <145.12g
  root                                           cl                                        -wi-ao---- 50.00g
  swap                                           cl                                        -wi-ao---- <3.88g
[root@ceph-207 ~]# vgremove ceph-58ef1d0f-272b-4273-82b1-689946254645
Do you really want to remove volume group "ceph-58ef1d0f-272b-4273-82b1-689946254645" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume ceph-58ef1d0f-272b-4273-82b1-689946254645/osd-block-e0efe172-778e-46e1-baa2-cd56408aac34? [y/n]: y
  Logical volume "osd-block-e0efe172-778e-46e1-baa2-cd56408aac34" successfully removed
# 4. add the host's disk to the new ceph cluster
ceph-volume lvm create --data /dev/sdb
# 5. check the osd tree and disk mounts
[root@ceph-207 ~]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.07997  root default
-3         0.03508      host ceph-206
 0    hdd  0.01559          osd.0          up   1.00000  1.00000
 1    hdd  0.01949          osd.1          up   1.00000  1.00000
-5         0.04489      host ceph-207
 2    hdd  0.01559          osd.2          up   1.00000  1.00000
[root@ceph-207 ~]# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0   200G  0 disk
├─sda1            8:1    0     1G  0 part /boot
└─sda2            8:2    0   199G  0 part
  ├─cl-root     253:0    0    50G  0 lvm  /
  ├─cl-swap     253:1    0   3.9G  0 lvm  [SWAP]
  └─cl-home     253:2    0 145.1G  0 lvm  /home
sdb               8:16   0    16G  0 disk
└─ceph--c221ed63--d87a--4bbd--a503--d8f2ed9e806b-osd--block--530376b8--c7bc--4d64--bc0c--4f8692559562
                253:3    0    16G  0 lvm
sr0
```
3. How to modify the Ceph configuration

If the generated ceph.conf needs changes, edit it on the deploy node and push it out with ceph-deploy. Do not edit /etc/ceph/ceph.conf directly on individual nodes; editing ceph.conf on the deploy machine and pushing it is both safer and more convenient.

```
vi /etc/ceph-cluster/ceph.conf
[global]
fsid = f69afe6f-e559-4df7-998a-c5dc3e300209
public_network = 172.26.0.0/16
cluster_network = 10.0.0.0/24
mon_initial_members = ceph-master01, ceph-master02, ceph-master03
mon_host = 172.26.156.217,172.26.156.218,172.26.156.219
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

[mon]
mon_clock_drift_allowed = 0.10
mon clock drift warn backoff = 10
```
Verified: option names work both with and without underscores.

After the change, push the configuration to every machine in the cluster:

```
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy --overwrite-conf config push ceph-master01 ceph-master02 ceph-master03
```
Restart the mon on every machine:

```
root@ceph-master01:~# systemctl restart ceph-mon@ceph-master01.service
root@ceph-master02:~# systemctl restart ceph-mon@ceph-master02.service
root@ceph-master03:~# systemctl restart ceph-mon@ceph-master03.service
```
Check the cluster status:

```
cephadmin@ceph-master01:/etc/ceph-cluster# ceph -s
  cluster:
    id:     f69afe6f-e559-4df7-998a-c5dc3e300209
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-master01,ceph-master02,ceph-master03 (age 3m)
    mgr: ceph-master03(active, since 26h), standbys: ceph-master01, ceph-master02
    osd: 9 osds: 9 up (since 26h), 9 in (since 27h)

  data:
    pools:   2 pools, 33 pgs
    objects: 1 objects, 100 MiB
    usage:   370 MiB used, 450 GiB / 450 GiB avail
    pgs:     33 active+clean
```
IV. Troubleshooting log
1. bash: python2: command not found
Cause: python2.7 was not installed on the ceph-master02 node.

Fix:

```
cephadmin@ceph-master01:~$ sudo apt install python2.7 -y
cephadmin@ceph-master01:~$ sudo ln -sv /usr/bin/python2.7 /usr/bin/python2
```
2.[ceph_deploy][ERROR ] RuntimeError: AttributeError: module ‘platform’ has no attribute ‘linux_distribution’
# ceph-deploy new --cluster-network 10.0.0.0/24 --public-network 172.26.0.0/16 master01
Cause:

The deploy node runs Ubuntu 20.04; platform.linux_distribution was removed from Python after 3.7, so it is no longer available there.

Fix:

Edit /usr/lib/python3/dist-packages/ceph_deploy/hosts/remotes.py as shown below:

```
def platform_information(_linux_distribution=None):
    """ detect platform information from remote host """
    """
    linux_distribution = _linux_distribution or platform.linux_distribution
    distro, release, codename = linux_distribution()
    """
    distro = release = codename = None
    try:
        linux_distribution = _linux_distribution or platform.linux_distribution
        distro, release, codename = linux_distribution()
    except AttributeError:
        pass
```
3. apt-cache madison shows only the old ceph-deploy 1.5.38

```
cephadmin@ceph-master01:~$ sudo apt-cache madison ceph-deploy
ceph-deploy | 1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages
ceph-deploy | 1.5.38-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages
cephadmin@ceph-master01:~$
```
4. RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd
```
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username          : None
[ceph_deploy.cli][INFO ]  verbose           : False
[ceph_deploy.cli][INFO ]  debug             : False
[ceph_deploy.cli][INFO ]  overwrite_conf    : False
[ceph_deploy.cli][INFO ]  subcommand        : zap
[ceph_deploy.cli][INFO ]  quiet             : False
[ceph_deploy.cli][INFO ]  cd_conf           : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f8d53187280>
[ceph_deploy.cli][INFO ]  cluster           : ceph
[ceph_deploy.cli][INFO ]  host              : ceph-master01
[ceph_deploy.cli][INFO ]  func              : <function disk at 0x7f8d5315d350>
[ceph_deploy.cli][INFO ]  ceph_conf         : None
[ceph_deploy.cli][INFO ]  default_release   : False
[ceph_deploy.cli][INFO ]  disk              : ['/dev/sdd']
[ceph_deploy.osd][DEBUG ] zapping /dev/sdd on ceph-master01
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 18.04 bionic
[ceph-master01][DEBUG ] zeroing last few blocks of device
[ceph-master01][DEBUG ] find the location of an executable
[ceph-master01][INFO ] Running command: sudo /usr/sbin/ceph-volume lvm zap /dev/sdd
[ceph-master01][WARNIN] --> Zapping: /dev/sdd
[ceph-master01][WARNIN] --> Zapping lvm member /dev/sdd. lv_path is /dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357
[ceph-master01][WARNIN] Running command: /bin/dd if=/dev/zero of=/dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357 bs=1M count=10 conv=fsync
[ceph-master01][WARNIN]  stderr: 10+0 records in
[ceph-master01][WARNIN] 10+0 records out
[ceph-master01][WARNIN]  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0201594 s, 520 MB/s
[ceph-master01][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
(the two lines above repeat 8 times in total)
[ceph-master01][WARNIN] --> RuntimeError: could not complete wipefs on device: /dev/sdd
[ceph-master01][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd
```
Cause: the OSD was not fully removed from the cluster, so the device cannot be zapped.
Fix: 1. unmount the disk (the mount may still be in use); 2. fully remove the OSD first.
5. Wiping a disk with zap fails with [ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd
```
cephadmin@ceph-master01:/etc/ceph-cluster# ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap ceph-master01 /dev/sdd
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username          : None
[ceph_deploy.cli][INFO ]  verbose           : False
[ceph_deploy.cli][INFO ]  debug             : False
[ceph_deploy.cli][INFO ]  overwrite_conf    : False
[ceph_deploy.cli][INFO ]  subcommand        : zap
[ceph_deploy.cli][INFO ]  quiet             : False
[ceph_deploy.cli][INFO ]  cd_conf           : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f8880b7c280>
[ceph_deploy.cli][INFO ]  cluster           : ceph
[ceph_deploy.cli][INFO ]  host              : ceph-master01
[ceph_deploy.cli][INFO ]  func              : <function disk at 0x7f8880b52350>
[ceph_deploy.cli][INFO ]  ceph_conf         : None
[ceph_deploy.cli][INFO ]  default_release   : False
[ceph_deploy.cli][INFO ]  disk              : ['/dev/sdd']
[ceph_deploy.osd][DEBUG ] zapping /dev/sdd on ceph-master01
[ceph-master01][DEBUG ] connection detected need for sudo
[ceph-master01][DEBUG ] connected to host: ceph-master01
[ceph-master01][DEBUG ] detect platform information from remote host
[ceph-master01][DEBUG ] detect machine type
[ceph-master01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 18.04 bionic
[ceph-master01][DEBUG ] zeroing last few blocks of device
[ceph-master01][DEBUG ] find the location of an executable
[ceph-master01][INFO ] Running command: sudo /usr/sbin/ceph-volume lvm zap /dev/sdd
[ceph-master01][WARNIN] --> Zapping: /dev/sdd
[ceph-master01][WARNIN] --> Zapping lvm member /dev/sdd. lv_path is /dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357
[ceph-master01][WARNIN] Running command: /bin/dd if=/dev/zero of=/dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357 bs=1M count=10 conv=fsync
[ceph-master01][WARNIN]  stderr: 10+0 records in
[ceph-master01][WARNIN] 10+0 records out
[ceph-master01][WARNIN] 10485760 bytes (10 MB, 10 MiB) copied, 0.0244706 s, 429 MB/s
[ceph-master01][WARNIN]  stderr:
[ceph-master01][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-master01][WARNIN]  stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
[ceph-master01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
(the two lines above repeat 8 times in total)
[ceph-master01][WARNIN] --> RuntimeError: could not complete wipefs on device: /dev/sdd
[ceph-master01][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdd
```
Fix:
Running the command by hand reproduces Device or resource busy, which means the disk is still in use:

```
cephadmin@ceph-master01:/etc/ceph-cluster# sudo /usr/sbin/ceph-volume lvm zap /dev/sdd
--> Zapping: /dev/sdd
--> Zapping lvm member /dev/sdd. lv_path is /dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357
Running command: /bin/dd if=/dev/zero of=/dev/ceph-657ad072-e5d8-4812-a561-19cac0b02e0c/osd-block-b05d277c-b899-4f81-9d00-e7a8cf46b357 bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
 stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0241062 s, 435 MB/s
--> --destroy was not specified, but zapping a whole device will remove the partition table
 stderr: wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
--> failed to wipefs device, will try again to workaround probable race condition
(the two lines above repeat 8 times in total)
--> RuntimeError: could not complete wipefs on device: /dev/sdd
```
Option 1: zero the disk header, then reboot (replace /dev/sdb with the target device):

```
dd if=/dev/zero of=/dev/sdb bs=512K count=1
reboot
```

Option 2: remove the device-mapper mapping with dmsetup:

```
root@ceph-master01:~# lsblk
sdi      8:128  0 447.1G  0 disk
└─ceph--3511f2c6--2be6--40fd--901d--3b75e433afa5-osd--block--ca994912--f215--4612--97fa--abe33b07985b
         253:7  0 447.1G  0 lvm
# remove the stale mapping with dmsetup
root@ceph-master01:~# dmsetup remove ceph--3511f2c6--2be6--40fd--901d--3b75e433afa5-osd--block--ca994912--f215--4612--97fa--abe33b07985b
```
6. mons are allowing insecure global_id reclaim
If AUTH_INSECURE_GLOBAL_ID_RECLAIM has not yet raised a health alert and the auth_expose_insecure_global_id_reclaim setting has not been disabled (it is enabled by default), then no clients that still need upgrading are currently connected, and insecure global_id reclaim can safely be disallowed:

```
ceph config set mon auth_allow_insecure_global_id_reclaim false
# If there are still clients that need upgrading, the alert can be muted temporarily:
ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w   # 1 week
# Not recommended, but the warning can also be disabled indefinitely:
ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false
```
7. 1 pool(s) do not have an application enabled
Running ceph -s shows the pool(s) do not have an application enabled warning.
Cause: running the health check command ceph health detail shows application not enabled on pool 'mypool'.
This means the mypool pool has no application type (rbd, cephfs, etc.) set on it; assigning one clears the warning:
```
cephadmin@ceph-master01:~# ceph osd pool application enable mypool rbd
enabled application 'rbd' on pool 'mypool'
cephadmin@ceph-master01:~# ceph -s
  cluster:
    id:     f69afe6f-e559-4df7-998a-c5dc3e300209
    health: HEALTH_WARN
            clock skew detected on mon.ceph-master02

  services:
    mon: 3 daemons, quorum ceph-master01,ceph-master02,ceph-master03 (age 26h)
    mgr: ceph-master03(active, since 26h), standbys: ceph-master01, ceph-master02
    osd: 9 osds: 9 up (since 25h), 9 in (since 26h)

  data:
    pools:   2 pools, 33 pgs
    objects: 1 objects, 100 MiB
    usage:   370 MiB used, 450 GiB / 450 GiB avail
    pgs:     33 active+clean
```
8.clock skew detected on mon.ceph-master02
Running ntpdate ntp.aliyun.com several times to sync the clocks still leaves a skew warning.
Cause:

```
cephadmin@ceph-master01:~# ceph health detail
HEALTH_WARN clock skew detected on mon.ceph-master02
[WRN] MON_CLOCK_SKEW: clock skew detected on mon.ceph-master02
    mon.ceph-master02 clock skew 0.0662306s > max 0.05s (latency 0.00108482s)
```

The warning is raised whenever the clock difference between monitors exceeds 0.05 s.
Fix:
Raise the default thresholds:
mon_clock_drift_allowed  # clock drift allowed between monitors
mon_clock_drift_warn_backoff  # backoff exponent for clock-drift warnings

```
cephadmin@ceph-master01:/etc/ceph-cluster# vi /etc/ceph/ceph.conf
[mon]
mon_clock_drift_allowed = 0.10
mon_clock_drift_warn_backoff = 10
# restart all mon daemons
root@ceph-master01:~# systemctl restart ceph-mon@ceph-master01.service
root@ceph-master02:~# systemctl restart ceph-mon@ceph-master02.service
root@ceph-master03:~# systemctl restart ceph-mon@ceph-master03.service
```
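The check itself is easy to model: a MON_CLOCK_SKEW warning fires when the measured skew exceeds mon_clock_drift_allowed, so raising the limit from the default 0.05 s to 0.10 s silences the 0.066 s skew seen here. An illustrative sketch, not Ceph's actual monitor code:

```python
def clock_skew_warns(skew_s: float, mon_clock_drift_allowed: float = 0.05) -> bool:
    """Mimic the monitor's check: warn when skew exceeds the allowed drift."""
    return skew_s > mon_clock_drift_allowed

# The 0.0662306 s skew reported by `ceph health detail`:
print(clock_skew_warns(0.0662306))        # → True  (default 0.05 s limit)
print(clock_skew_warns(0.0662306, 0.10))  # → False (after raising the limit)
```

Note that raising the threshold only hides the symptom; keeping NTP/chrony in sync across the monitors is the real fix.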
9. Old kernels cannot map Ceph RBD block storage
Environment:
ceph 16.2.10
client ceph-common 15.2.17
client kernel 3.10.0-862
client OS CentOS 7.5
Every article tells you that "an old kernel can map RBD storage as long as the image is created with only the layering feature enabled":

```
rbd create myimg2 --size 3G --pool myrbd1 --image-format 2 --image-feature layering
```

But "old kernel" covers a range. Running the map command rbd -p myrbd1 map myimg2:
CentOS 7.5 with the stock, un-upgraded 3.10.0-862 kernel fails.
CentOS 7.7 with the stock 3.10.0-1062.4.3 kernel works.
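To decide which side of that line a given client falls on, the kernel release string has to be compared numerically, not lexically ("862" sorts after "1062" as a string). A hypothetical helper using the 3.10.0-862 cutoff observed above:

```python
import re

def kernel_tuple(release: str) -> tuple:
    """Split a kernel release like '3.10.0-1062.4.3' into a numeric tuple,
    dropping non-numeric parts such as 'el7' or 'x86_64'."""
    return tuple(int(n) for n in re.split(r"[.-]", release) if n.isdigit())

# Cutoff observed in the tests above: 3.10.0-862 fails, 3.10.0-1062.4.3 works.
CUTOFF = kernel_tuple("3.10.0-862")

def rbd_map_likely_ok(release: str) -> bool:
    return kernel_tuple(release) > CUTOFF

print(rbd_map_likely_ok("3.10.0-862"))       # → False (stock CentOS 7.5)
print(rbd_map_likely_ok("3.10.0-1062.4.3"))  # → True  (stock CentOS 7.7)
```

On a client, the release string comes from `uname -r`; the exact cutoff is an observation from these two systems, not a documented kernel boundary.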
10. 1 filesystem is online with fewer MDS than max_mds
Cause:
The mds service has not been created and started.
Fix:

```
ceph-deploy mds create ceph-master01 ceph-master02 ceph-master03
```
Source: https://blog.csdn.net/weixin_43876317/article/details/127627885