(Continued from the previous part.)
(III) Expanding the cluster
compute (mon0): add an OSD daemon (osd2) and a metadata server (mds0)
controller (osd0): add a monitor (mon1)
network (osd1): add a monitor (mon2)
Note: multiple monitors can form a quorum
1. Add an OSD node on compute
(1) Create the osd2 directory on the compute node
ssh compute
mkdir -p /var/lib/ceph/osd/ceph-osd2
fdisk /dev/sdc
mkfs.xfs -f /dev/sdc1
mount /dev/sdc1 /var/lib/ceph/osd/ceph-osd2
mount -o remount,user_xattr /dev/sdc1 /var/lib/ceph/osd/ceph-osd2
vi /etc/fstab
/dev/sdc1 /var/lib/ceph/osd/ceph-osd2 xfs defaults,user_xattr 0 0
(2) On the admin node (compute), prepare and activate the OSD
cd /home/mengfei/my-cluster
ceph-deploy osd prepare compute:/var/lib/ceph/osd/ceph-osd2
ceph-deploy osd activate compute:/var/lib/ceph/osd/ceph-osd2
root@compute:/home/mengfei/my-cluster# ceph-deploy osd prepare compute:/var/lib/ceph/osd/ceph-osd2
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.20): /usr/bin/ceph-deploy osd prepare compute:/var/lib/ceph/osd/ceph-osd2
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks compute:/var/lib/ceph/osd/ceph-osd2:
[compute][DEBUG ] connected to host: compute
[compute][DEBUG ] detect platform information from remote host
[compute][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] Deploying osd to compute
[compute][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[compute][INFO ] Running command: udevadm trigger --subsystem-match=block --action=add
[ceph_deploy.osd][DEBUG ] Preparing host compute disk /var/lib/ceph/osd/ceph-osd2 journal None activate False
[compute][INFO ] Running command: ceph-disk -v prepare --fs-type xfs --cluster ceph -- /var/lib/ceph/osd/ceph-osd2
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[compute][WARNIN] DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/osd/ceph-osd2
[compute][INFO ] checking OSD status...
[compute][INFO ] Running command: ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host compute is now ready for osd use.
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph-deploy osd activate compute:/var/lib/ceph/osd/ceph-osd2
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.20): /usr/bin/ceph-deploy osd activate compute:/var/lib/ceph/osd/ceph-osd2
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks compute:/var/lib/ceph/osd/ceph-osd2:
[compute][DEBUG ] connected to host: compute
[compute][DEBUG ] detect platform information from remote host
[compute][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] activating host compute disk /var/lib/ceph/osd/ceph-osd2
[ceph_deploy.osd][DEBUG ] will use init type: upstart
[compute][INFO ] Running command: ceph-disk -v activate --mark-init upstart --mount /var/lib/ceph/osd/ceph-osd2
[compute][WARNIN] DEBUG:ceph-disk:Cluster uuid is 8b2af1e6-92eb-4d74-9ca5-057522bb738f
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[compute][WARNIN] DEBUG:ceph-disk:Cluster name is ceph
[compute][WARNIN] DEBUG:ceph-disk:OSD uuid is 032998d3-03b5-458d-b32b-de48305e5b59
[compute][WARNIN] DEBUG:ceph-disk:Allocating OSD id...
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise 032998d3-03b5-458d-b32b-de48305e5b59
[compute][WARNIN] DEBUG:ceph-disk:OSD id is 2
[compute][WARNIN] DEBUG:ceph-disk:Initializing OSD...
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-osd2/activate.monmap
[compute][WARNIN] got monmap epoch 1
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 2 --monmap /var/lib/ceph/osd/ceph-osd2/activate.monmap --osd-data /var/lib/ceph/osd/ceph-osd2 --osd-journal /var/lib/ceph/osd/ceph-osd2/journal --osd-uuid 032998d3-03b5-458d-b32b-de48305e5b59 --keyring /var/lib/ceph/osd/ceph-osd2/keyring
[compute][WARNIN] 2014-11-28 14:32:34.800238 b6822740 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
[compute][WARNIN] 2014-11-28 14:32:35.280160 b6822740 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
[compute][WARNIN] 2014-11-28 14:32:35.304026 b6822740 -1 filestore(/var/lib/ceph/osd/ceph-osd2) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
[compute][WARNIN] 2014-11-28 14:32:35.370476 b6822740 -1 created object store /var/lib/ceph/osd/ceph-osd2 journal /var/lib/ceph/osd/ceph-osd2/journal for osd.2 fsid 8b2af1e6-92eb-4d74-9ca5-057522bb738f
[compute][WARNIN] 2014-11-28 14:32:35.370543 b6822740 -1 auth: error reading file: /var/lib/ceph/osd/ceph-osd2/keyring: can't open /var/lib/ceph/osd/ceph-osd2/keyring: (2) No such file or directory
[compute][WARNIN] 2014-11-28 14:32:35.370712 b6822740 -1 created new key in keyring /var/lib/ceph/osd/ceph-osd2/keyring
[compute][WARNIN] DEBUG:ceph-disk:Marking with init system upstart
[compute][WARNIN] DEBUG:ceph-disk:Authorizing OSD key...
[compute][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.2 -i /var/lib/ceph/osd/ceph-osd2/keyring osd allow * mon allow profile osd
[compute][WARNIN] added key for osd.2
[compute][WARNIN] DEBUG:ceph-disk:ceph osd.2 data dir is ready at /var/lib/ceph/osd/ceph-osd2
[compute][WARNIN] DEBUG:ceph-disk:Creating symlink /var/lib/ceph/osd/ceph-2 -> /var/lib/ceph/osd/ceph-osd2
[compute][WARNIN] DEBUG:ceph-disk:Starting ceph osd.2...
[compute][WARNIN] INFO:ceph-disk:Running command: /sbin/initctl emit --no-wait -- ceph-osd cluster=ceph id=2
[compute][INFO ] checking OSD status...
[compute][INFO ] Running command: ceph --cluster=ceph osd stat --format=json
root@compute:/home/mengfei/my-cluster#
(3) After adding the OSD node, watch the cluster rebalance
ceph osd tree
ceph -w
ceph -s
ceph osd dump
root@compute:/home/mengfei/my-cluster# ceph osd tree    (weight defaults to 0)
# id weight type name up/down reweight
-1 0 root default
-2 0 host controller
0 0 osd.0 up 1
-3 0 host network
1 0 osd.1 up 1
-4 0 host compute
2 0 osd.2 up 1
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph -w    (since the weights have not been changed yet, the status below shows 192 creating+incomplete)
cluster 8b2af1e6-92eb-4d74-9ca5-057522bb738f
health HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192 pgs stuck unclean; 50 requests are blocked > 32 sec
monmap e3: 3 mons at {compute=192.168.128.101:6789/0,controller=192.168.128.100:6789/0,network=192.168.128.102:6789/0}, election epoch 6, quorum 0,1,2 controller,compute,network
mdsmap e5: 1/1/1 up {0=compute=up:creating}
osdmap e23: 3 osds: 3 up, 3 in
pgmap v50: 192 pgs, 3 pools, 0 bytes data, 0 objects
398 MB used, 2656 MB / 3054 MB avail
192 creating+incomplete
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph osd dump
epoch 23
fsid 8b2af1e6-92eb-4d74-9ca5-057522bb738f
created 2014-11-27 16:22:54.085639
modified 2014-11-28 16:30:06.501906
flags
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
max_osd 3
osd.0 up in weight 0 up_from 15 up_thru 15 down_at 12 last_clean_interval [4,11) 192.168.128.100:6800/3272 192.168.128.100:6801/3272 192.168.128.100:6802/3272 192.168.128.100:6803/3272 exists,up f4707c04-aeca-46fe-bf0e-f7e2d43d0524
osd.1 up in weight 0 up_from 14 up_thru 0 down_at 13 last_clean_interval [8,12) 192.168.128.102:6800/3272 192.168.128.102:6801/3272 192.168.128.102:6802/3272 192.168.128.102:6803/3272 exists,up c8b2811c-fb19-49c3-b630-374a4db7073e
osd.2 up in weight 0 up_from 22 up_thru 0 down_at 21 last_clean_interval [19,19) 192.168.128.101:6801/16367 192.168.128.101:6802/16367 192.168.128.101:6803/16367 192.168.128.101:6804/16367 exists,up 032998d3-03b5-458d-b32b-de48305e5b59
root@compute:/home/mengfei/my-cluster#
2. Add a metadata server on compute
Note: at least one metadata server is required to use the CephFS filesystem
Note: Ceph currently supports only a single metadata server in production; you can try running several, but that is not commercially supported
ceph-deploy mds create compute
ceph mds stat    check status
ceph mds dump    dump the mds map
root@compute:/home/mengfei/my-cluster# ceph-deploy mds create compute
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.20): /usr/bin/ceph-deploy mds create compute
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts compute:compute
[compute][DEBUG ] connected to host: compute
[compute][DEBUG ] detect platform information from remote host
[compute][DEBUG ] detect machine type
[ceph_deploy.mds][INFO ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.mds][DEBUG ] remote host will use upstart
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to compute
[compute][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[compute][DEBUG ] create path if it doesn't exist
[compute][INFO ] Running command: ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.compute osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-compute/keyring
[compute][INFO ] Running command: initctl emit ceph-mds cluster=ceph id=compute
Unhandled exception in thread started by
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph mds stat
e3: 1/1/1 up {0=compute=up:creating}
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph mds dump
dumped mdsmap epoch 3
epoch 3
flags 0
created 2014-11-27 16:22:54.081490
modified 2014-11-28 14:45:35.509558
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 0
last_failure_osd_epoch 0
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap}
max_mds 1
in 0
up {0=4306}
failed
stopped
data_pools 0
metadata_pool 1
inline_data disabled
4306: 192.168.128.101:6805/7363 'compute' mds.0.1 up:creating seq 1
root@compute:/home/mengfei/my-cluster#
Deleting the metadata server (note: stopping the mds prints the message below; max_mds must be lowered first):
root@compute:/home/mengfei# ceph mds stop 0
Error EBUSY: must decrease max_mds or else MDS will immediately reactivate
root@compute:/home/mengfei#
root@compute:/home/mengfei# ceph mds set_max_mds 0    (lower the max value)
max_mds = 0
root@compute:/home/mengfei#
root@compute:/home/mengfei# ceph mds stop 0
telling mds.0 192.168.128.101:6800/26057 to deactivate
root@compute:/home/mengfei#
3. Add monitors mon1 and mon2 on the controller (osd0) and network (osd1) nodes
Note: Ceph uses the Paxos algorithm and needs multiple monitors to form a quorum (e.g. 1; 2 of 3; 3 of 4; 3 of 5; 4 of 6; and so on)
ceph-deploy admin controller network    (note: this redistributes the configuration file shown below)
ceph-deploy mon create controller network
Note: these commands initially failed with /var/run/ceph/ceph-mon.controller.asok not found; the root cause was an incomplete ceph.conf.
After adding the missing settings and pushing the file to all nodes, they ran cleanly.
vi /home/mengfei/my-cluster/ceph.conf    (these settings are not exhaustive; they will be revised later)
[global]
fsid = 8b2af1e6-92eb-4d74-9ca5-057522bb738f
mon_initial_members = compute,controller,network
mon_host = 192.168.128.101,192.168.128.100,192.168.128.102
public network = 192.168.128.0/24
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
#filestore_xattr_use_omap = true
[osd]
osd journal size = 100
filestore_xattr_use_omap = true
osd pool default size = 3
osd pool default min_size = 1
osd crush chooseleaf type = 1
[osd.0]
host = controller
[osd.1]
host = network
[osd.2]
host = compute
[mon.a]
host = compute
mon_addr = 192.168.128.101:6789
[mon.b]
host = controller
mon_addr = 192.168.128.100:6789
[mon.c]
host = network
mon_addr = 192.168.128.102:6789
[mds.a]
host = compute
root@compute:/home/mengfei/my-cluster# ceph-deploy mon create controller network
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.20): /usr/bin/ceph-deploy mon create controller network
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts controller network
[ceph_deploy.mon][DEBUG ] detecting platform for host controller ...
[controller][DEBUG ] connected to host: controller
[controller][DEBUG ] detect platform information from remote host
[controller][DEBUG ] detect machine type
[ceph_deploy.mon][INFO ] distro info: Ubuntu 14.04 trusty
[controller][DEBUG ] determining if provided host has same hostname in remote
[controller][DEBUG ] get remote short hostname
[controller][DEBUG ] deploying mon to controller
[controller][DEBUG ] get remote short hostname
[controller][DEBUG ] remote hostname: controller
[controller][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[controller][DEBUG ] create the mon path if it does not exist
[controller][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-controller/done
[controller][DEBUG ] create a done file to avoid re-doing the mon deployment
[controller][DEBUG ] create the init path if it does not exist
[controller][DEBUG ] locating the `service` executable...
[controller][INFO ] Running command: initctl emit ceph-mon cluster=ceph id=controller
[controller][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.controller.asok mon_status
[controller][DEBUG ] ********************************************************************************
[controller][DEBUG ] status for monitor: mon.controller
[controller][DEBUG ] {
[controller][DEBUG ] "election_epoch": 0,
[controller][DEBUG ] "extra_probe_peers": [
[controller][DEBUG ] "192.168.128.101:6789/0"
[controller][DEBUG ] ],
[controller][DEBUG ] "monmap": {
[controller][DEBUG ] "created": "0.000000",
[controller][DEBUG ] "epoch": 1,
[controller][DEBUG ] "fsid": "8b2af1e6-92eb-4d74-9ca5-057522bb738f",
[controller][DEBUG ] "modified": "0.000000",
[controller][DEBUG ] "mons": [
[controller][DEBUG ] {
[controller][DEBUG ] "addr": "192.168.128.101:6789/0",
[controller][DEBUG ] "name": "compute",
[controller][DEBUG ] "rank": 0
[controller][DEBUG ] }
[controller][DEBUG ] ]
[controller][DEBUG ] },
[controller][DEBUG ] "name": "controller",
[controller][DEBUG ] "outside_quorum": [],
[controller][DEBUG ] "quorum": [],
[controller][DEBUG ] "rank": -1,
[controller][DEBUG ] "state": "probing",
[controller][DEBUG ] "sync_provider": []
[controller][DEBUG ] }
[controller][DEBUG ] ********************************************************************************
[controller][INFO ] monitor: mon.controller is currently at the state of probing
[controller][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.controller.asok mon_status
[controller][WARNIN] monitor controller does not exist in monmap
[ceph_deploy.mon][DEBUG ] detecting platform for host network ...
[network][DEBUG ] connected to host: network
[network][DEBUG ] detect platform information from remote host
[network][DEBUG ] detect machine type
[ceph_deploy.mon][INFO ] distro info: Ubuntu 14.04 trusty
[network][DEBUG ] determining if provided host has same hostname in remote
[network][DEBUG ] get remote short hostname
[network][DEBUG ] deploying mon to network
[network][DEBUG ] get remote short hostname
[network][DEBUG ] remote hostname: network
[network][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[network][DEBUG ] create the mon path if it does not exist
[network][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-network/done
[network][DEBUG ] create a done file to avoid re-doing the mon deployment
[network][DEBUG ] create the init path if it does not exist
[network][DEBUG ] locating the `service` executable...
[network][INFO ] Running command: initctl emit ceph-mon cluster=ceph id=network
[network][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.network.asok mon_status
[network][DEBUG ] ********************************************************************************
[network][DEBUG ] status for monitor: mon.network
[network][DEBUG ] {
[network][DEBUG ] "election_epoch": 1,
[network][DEBUG ] "extra_probe_peers": [
[network][DEBUG ] "192.168.128.101:6789/0"
[network][DEBUG ] ],
[network][DEBUG ] "monmap": {
[network][DEBUG ] "created": "0.000000",
[network][DEBUG ] "epoch": 3,
[network][DEBUG ] "fsid": "8b2af1e6-92eb-4d74-9ca5-057522bb738f",
[network][DEBUG ] "modified": "2014-11-28 16:18:49.267793",
[network][DEBUG ] "mons": [
[network][DEBUG ] {
[network][DEBUG ] "addr": "192.168.128.100:6789/0",
[network][DEBUG ] "name": "controller",
[network][DEBUG ] "rank": 0
[network][DEBUG ] },
[network][DEBUG ] {
[network][DEBUG ] "addr": "192.168.128.101:6789/0",
[network][DEBUG ] "name": "compute",
[network][DEBUG ] "rank": 1
[network][DEBUG ] },
[network][DEBUG ] {
[network][DEBUG ] "addr": "192.168.128.102:6789/0",
[network][DEBUG ] "name": "network",
[network][DEBUG ] "rank": 2
[network][DEBUG ] }
[network][DEBUG ] ]
[network][DEBUG ] },
[network][DEBUG ] "name": "network",
[network][DEBUG ] "outside_quorum": [],
[network][DEBUG ] "quorum": [],
[network][DEBUG ] "rank": 2,
[network][DEBUG ] "state": "electing",
[network][DEBUG ] "sync_provider": []
[network][DEBUG ] }
[network][DEBUG ] ********************************************************************************
[network][INFO ] monitor: mon.network is running
[network][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.network.asok mon_status
root@compute:/home/mengfei/my-cluster#
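Since the monitors use Paxos, the cluster stays writable only while a strict majority of them is up. A quick sketch of the required majority count (my own illustration, not from the walkthrough):

```shell
# Monitors needed for a Paxos quorum of N: floor(N/2) + 1
quorum_needed() { echo $(( $1 / 2 + 1 )); }

quorum_needed 3   # 2 of 3 monitors must be up
quorum_needed 5   # 3 of 5
```

With the 3 monitors deployed here, the cluster tolerates the loss of exactly one.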
Check the monitor quorum status (note: after monitors are added, Ceph synchronizes them and they form a quorum):
ceph mon stat
ceph mon_status
ceph mon dump
ceph quorum_status
root@compute:/home/mengfei/my-cluster# ceph mon stat
e3: 3 mons at {compute=192.168.128.101:6789/0,controller=192.168.128.100:6789/0,network=192.168.128.102:6789/0}, election epoch 6, quorum 0,1,2 controller,compute,network
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph mon_status
{"name":"controller","rank":0,"state":"leader","election_epoch":6,"quorum":[0,1,2],"outside_quorum":[],"extra_probe_peers":["192.168.128.101:6789\/0","192.168.128.102:6789\/0"],"sync_provider":[],"monmap":{"epoch":3,"fsid":"8b2af1e6-92eb-4d74-9ca5-057522bb738f","modified":"2014-11-28 16:18:49.267793","created":"0.000000","mons":[{"rank":0,"name":"controller","addr":"192.168.128.100:6789\/0"},{"rank":1,"name":"compute","addr":"192.168.128.101:6789\/0"},{"rank":2,"name":"network","addr":"192.168.128.102:6789\/0"}]}}
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph mon dump
dumped monmap epoch 3
epoch 3
fsid 8b2af1e6-92eb-4d74-9ca5-057522bb738f
last_changed 2014-11-28 16:18:49.267793
created 0.000000
0: 192.168.128.100:6789/0 mon.controller
1: 192.168.128.101:6789/0 mon.compute
2: 192.168.128.102:6789/0 mon.network
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph quorum_status
{"election_epoch":6,"quorum":[0,1,2],"quorum_names":["controller","compute","network"],"quorum_leader_name":"controller","monmap":{"epoch":3,"fsid":"8b2af1e6-92eb-4d74-9ca5-057522bb738f","modified":"2014-11-28 16:18:49.267793","created":"0.000000","mons":[{"rank":0,"name":"controller","addr":"192.168.128.100:6789\/0"},{"rank":1,"name":"compute","addr":"192.168.128.101:6789\/0"},{"rank":2,"name":"network","addr":"192.168.128.102:6789\/0"}]}}
root@compute:/home/mengfei/my-cluster#
(IV) Verifying the cluster OSDs and checking cluster health
ceph health    check health status
ceph auth list    check authentication status
ceph osd tree    check the OSD tree
ceph -s    check status
ceph -w    watch status in real time (same content as -s)
ceph osd dump    view the osd configuration
ceph osd rm    remove osd(s) <id> [<id>...]
ceph osd crush rm osd.0    remove an osd disk from the CRUSH map
ceph osd crush rm node1    remove an osd host node from the CRUSH map
The following commands change the number of object replicas and the minimum replica count (these can also be set in ceph.conf):
ceph osd pool set data size 3
ceph osd pool set metadata size 3
ceph osd pool set rbd size 3
ceph osd pool set data min_size 1
ceph osd pool set metadata min_size 1
ceph osd pool set rbd min_size 1
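With size set to 3, every object is stored three times, so usable capacity is roughly the raw capacity divided by the replica count. A back-of-the-envelope sketch (my own illustration) using this cluster's 3054 MB of raw space:

```shell
# Usable capacity under N-way replication ~= raw / N (integer MB here)
usable_mb() { echo $(( $1 / $2 )); }

usable_mb 3054 3   # the 3054 MB raw cluster holds ~1018 MB of data at size=3
```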
The following raises the allowed clock drift (by default this example reports `clock skew detected on mon.compute` in ceph -w; `ceph health detail` shows the details. After raising the value to 0.5, health returns to OK):
[mon]
mon_clock_drift_allowed = 0.5
The following commands set the weight values:
ceph osd crush set 0 1.0 host=controller
ceph osd crush set 1 1.0 host=network
ceph osd crush set 2 1.0 host=compute
Note: you will see the PG states go from active+clean to active with some degraded objects; once migration completes they return to active+clean. (Press Ctrl+C to exit.)
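The weight of 1.0 used below is fine for identical small test disks. By common convention (an assumption on my part, not stated in this walkthrough), a CRUSH weight is the OSD's capacity in TiB, so differently sized disks get proportional weights:

```shell
# CRUSH weight by the capacity-in-TiB convention (argument in GiB)
weight_tib() { awk -v g="$1" 'BEGIN { printf "%.2f\n", g / 1024 }'; }

weight_tib 1024   # 1 TiB disk  -> 1.00
weight_tib 512    # 512 GiB disk -> 0.50
```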
root@compute:/var/log/ceph# ceph osd crush set 0 1.0 host=controller
set item id 0 name 'osd.0' weight 1 at location {host=controller} to crush map
root@compute:/var/log/ceph# ceph osd crush set 1 1.0 host=network
set item id 1 name 'osd.1' weight 1 at location {host=network} to crush map
root@compute:/var/log/ceph# ceph osd crush set 2 1.0 host=compute
set item id 2 name 'osd.2' weight 1 at location {host=compute} to crush map
root@compute:/var/log/ceph#
root@compute:/home/mengfei/my-cluster# ceph osd tree    (weight still at the default of 0)
# id weight type name up/down reweight
-1 0 root default
-2 0 host controller
0 0 osd.0 up 1
-3 0 host network
1 0 osd.1 up 1
-4 0 host compute
2 0 osd.2 up 1
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph -s    (PGs were inactive+unclean; in this example they became healthy after changing the weights from the default 0 to 1)
cluster 8b2af1e6-92eb-4d74-9ca5-057522bb738f
health HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192 pgs stuck unclean
monmap e1: 1 mons at {compute=192.168.128.101:6789/0}, election epoch 1, quorum 0 compute
osdmap e16: 2 osds: 2 up, 2 in
pgmap v31: 192 pgs, 3 pools, 0 bytes data, 0 objects
266 MB used, 1770 MB / 2036 MB avail
192 creating+incomplete
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph osd tree    (weight changed to 1)
# id weight type name up/down reweight
-1 3 root default
-2 1 host controller
0 1 osd.0 up 1
-3 1 host network
1 1 osd.1 up 1
-4 1 host compute
2 1 osd.2 up 1
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph -s    (output after setting weight=1)
cluster 8b2af1e6-92eb-4d74-9ca5-057522bb738f
health HEALTH_WARN clock skew detected on mon.compute, mon.network    (clock skew warning, harmless; fixed by setting mon_clock_drift_allowed = 0.5 under [mon] in ceph.conf)
monmap e3: 3 mons at {compute=192.168.128.101:6789/0,controller=192.168.128.100:6789/0,network=192.168.128.102:6789/0}, election epoch 30, quorum 0,1,2 controller,compute,network
mdsmap e14: 1/1/1 up {0=compute=up:active}
osdmap e89: 3 osds: 3 up, 3 in
pgmap v351: 192 pgs, 3 pools, 1884 bytes data, 20 objects
406 MB used, 2648 MB / 3054 MB avail
192 active+clean
root@compute:/home/mengfei/my-cluster#
root@compute:/var/log/ceph# ceph osd dump
epoch 89
fsid 8b2af1e6-92eb-4d74-9ca5-057522bb738f
created 2014-11-27 16:22:54.085639
modified 2014-11-28 23:39:44.056533
flags
pool 0 'data' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 89 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 88 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 87 flags hashpspool stripe_width 0
max_osd 3
osd.0 up in weight 1 up_from 32 up_thru 82 down_at 31 last_clean_interval [15,29) 192.168.128.100:6800/2811 192.168.128.100:6801/2811 192.168.128.100:6802/2811 192.168.128.100:6803/2811 exists,up f4707c04-aeca-46fe-bf0e-f7e2d43d0524
osd.1 up in weight 1 up_from 33 up_thru 82 down_at 29 last_clean_interval [14,28) 192.168.128.102:6800/3105 192.168.128.102:6801/3105 192.168.128.102:6802/3105 192.168.128.102:6803/3105 exists,up c8b2811c-fb19-49c3-b630-374a4db7073e
osd.2 up in weight 1 up_from 35 up_thru 82 down_at 30 last_clean_interval [27,29) 192.168.128.101:6801/3173 192.168.128.101:6802/3173 192.168.128.101:6803/3173 192.168.128.101:6804/3173 exists,up 032998d3-03b5-458d-b32b-de48305e5b59
root@compute:/var/log/ceph#
(V) Storing and retrieving object data
Note: to operate on object data in the Ceph storage cluster, a Ceph client must:
1. Set an object name
2. Specify a data pool
The Ceph client fetches the latest cluster map and uses the CRUSH algorithm to compute how the object maps to a placement group (PG),
then computes how that PG maps dynamically onto a Ceph OSD daemon. To find an object's location, all you need is the object name and the pool name:
ceph osd map {poolname} {object-name}
1. Exercise: locate an object
As an exercise, first create an object. Use the rados put command, specifying the object name, the path to a test file holding the object data, and the pool name. For example:
Format: rados put {object-name} {file-path} --pool=data
rados put zhi-ceph zhi.txt --pool=data
2. To verify that the Ceph storage cluster has stored the object, run:
rados -p data ls
rados -p metadata ls
root@compute:/home/mengfei/my-cluster# rados -p data ls
zhi-ceph
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# rados -p metadata ls
609.00000000
mds0_sessionmap
608.00000000
601.00000000
602.00000000
mds0_inotable
1.00000000.inode
200.00000000
604.00000000
605.00000000
mds_anchortable
mds_snaptable
600.00000000
603.00000000
100.00000000
200.00000001
606.00000000
607.00000000
100.00000000.inode
1.00000000
root@compute:/home/mengfei/my-cluster#
3. Now the object's location can be identified:
Format: ceph osd map {pool-name} {object-name}
ceph osd map data zhi-ceph
ceph osd map metadata zhi-ceph
root@compute:/home/mengfei/my-cluster# ceph osd map data zhi-ceph
osdmap e89 pool 'data' (0) object 'zhi-ceph' -> pg 0.e67b1a3 (0.23) -> up ([1,2,0], p1) acting ([1,2,0], p1)
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph osd map metadata zhi-ceph
osdmap e89 pool 'metadata' (1) object 'zhi-ceph' -> pg 1.e67b1a3 (1.23) -> up ([0,1,2], p0) acting ([0,1,2], p0)
root@compute:/home/mengfei/my-cluster#
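The pg shown in the map output is just the object's hash modulo pg_num, printed in hex after the pool id. A quick check (my own illustration) of the `data` line above, which prints hash 0xe67b1a3 for zhi-ceph, with pg_num 64 for these pools:

```shell
# pg = pool_id . (object_hash mod pg_num), shown in hex
hash=0xe67b1a3   # hash of 'zhi-ceph' as printed in the map output above
pg_num=64
printf '0.%x\n' $(( hash % pg_num ))   # -> 0.23, matching "pg 0.e67b1a3 (0.23)"
```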
Ceph prints the object's location, for example:
osdmap e537 pool 'data' (0) object 'test-object-1' -> pg 0.d1743484 (0.4) -> up [1,0] acting [1,0]
4. To delete the test object, use the rados rm command:
rados rm zhi-ceph --pool=data
Note: as the cluster grows, object locations may change dynamically. A benefit of Ceph's dynamic rebalancing is that the migration happens automatically, with no manual intervention.
5. Create a pool
ceph osd pool create zhi-pool 128
ceph osd pool set zhi-pool min_size 1
ceph -w    watch the migration in real time
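The pg counts used in this walkthrough (64, 128) are ad hoc. A common sizing guideline (my assumption, not something the post states) is (OSDs × 100) / replicas, rounded up to the next power of two:

```shell
# Suggested pg_num: (osds * 100 / replicas), rounded up to a power of two
pg_suggest() {
  local target=$(( $1 * 100 / $2 )) p=1
  while [ "$p" -lt "$target" ]; do p=$(( p * 2 )); done
  echo "$p"
}

pg_suggest 3 3   # 3 OSDs, 3 replicas -> 128
```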
(VI) Block device quick start
Note: to follow this guide, first complete the object store quick start. Before working with Ceph block devices, make sure your Ceph storage cluster is in the active + clean state. Run this quick start on the admin node.
Note: Ceph block devices are also known as RBD, or RADOS block devices
1. Install Ceph
(1) Check the Linux kernel version
lsb_release -a
uname -r
(2) On the admin node, use ceph-deploy to install Ceph on your ceph-client node (note: it was installed earlier, so this step is skipped here)
ceph-deploy install network    (in this example, network serves as the client)
(3) On the admin node, use ceph-deploy to copy the Ceph configuration file and ceph.client.admin.keyring to your ceph-client
ceph-deploy admin network
2. Configure a block device
(1) On the ceph-client node, create a block device image
rbd create foo --size 4096 [-m {mon-IP}] [-k /path/to/ceph.client.admin.keyring]
ceph osd pool create rbd-pool 128
ceph osd pool set rbd-pool min_size 1
rbd create foo --size 512
rbd create bar --size 256 --pool rbd-pool
rbd create zhi --size 512 --pool rbd-pool
ceph osd pool delete rbd-pool rbd-pool --yes-i-really-really-mean-it
Note: to delete a pool you must give the pool name twice, plus the trailing confirmation flag
Verify and query block device information:
rbd ls    list block device images
rbd ls rbd-pool    list the block devices in a specific pool
rbd --image foo info    query information on a specific image
rbd --image bar -p rbd-pool info    query information on an image inside a pool
root@network:/home/mengfei/my-cluster# rbd ls
foo
root@network:/home/mengfei/my-cluster#
root@network:/home/mengfei/my-cluster# rbd ls rbd-pool
bar
root@network:/home/mengfei/my-cluster#
root@network:/home/mengfei# rbd showmapped
id pool image snap device
1 rbd foo - /dev/rbd1
2 rbd-pool zhi - /dev/rbd2
3 rbd-pool bar - /dev/rbd3
root@network:/home/mengfei#
root@network:/home/mengfei/my-cluster# rbd --image foo info
rbd image 'foo':
size 512 MB in 128 objects
order 22 (4096 kB objects)
block_name_prefix: rb.0.16cb.2ae8944a
format: 1
root@network:/home/mengfei/my-cluster#
root@network:/home/mengfei/my-cluster# rbd --image bar -p rbd-pool info
rbd image 'bar':
size 512 MB in 128 objects
order 22 (4096 kB objects)
block_name_prefix: rb.0.16b8.2ae8944a
format: 1
root@network:/home/mengfei/my-cluster#
Resizing a block device image
Ceph block device images are thin-provisioned: they consume no physical storage until you start writing data.
They do, however, have a maximum capacity set with the --size option. To increase (or decrease) the maximum size of a Ceph block device image, run:
rbd resize --image foo --size 1024    grow
rbd resize --image foo --allow-shrink --size 512    shrink
rbd resize --image zhi -p rbd-pool --size 1024
rbd resize --image zhi -p rbd-pool --allow-shrink --size 256
Deleting a block device image
rbd rm foo    delete a block device
rbd rm bar -p rbd-pool    delete a block device from a pool
(2) On the ceph-client node, load the rbd kernel module
modprobe rbd
(3) On the ceph-client node, map the image to a block device
rbd map foo --pool rbd --name client.admin [-m {mon-IP}] [-k /path/to/ceph.client.admin.keyring]
rbd map foo
rbd map bar --pool rbd-pool
rbd map zhi --pool rbd-pool    (this example works with zhi)
rbd showmapped    show mapped block devices
Unmapping a block device:
Format: rbd unmap /dev/rbd/{poolname}/{imagename}
rbd unmap /dev/rbd/rbd/foo
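To tear everything down at once, the device column of `rbd showmapped` can be fed back into `rbd unmap`. A small helper (a sketch of mine, relying on the column layout shown in the showmapped output above):

```shell
# Pull the device paths (5th column) out of `rbd showmapped` output,
# skipping the header row
mapped_devices() { awk 'NR > 1 { print $5 }'; }

# Usage on a live cluster (not run here):
#   rbd showmapped | mapped_devices | xargs -r -n1 rbd unmap
```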
(4) On the ceph-client node (network), create a filesystem on the block device
mkfs.ext4 -m0 /dev/rbd/rbd-pool/zhi    (this may take a few minutes)
root@network:/home/mengfei/my-cluster# rbd map zhi --pool rbd-pool
root@network:/home/mengfei/my-cluster# mkfs.ext4 -m0 /dev/rbd/rbd-pool/zhi
mke2fs 1.42.9 (4-Feb-2014)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=1024 blocks, Stripe width=1024 blocks
32768 inodes, 131072 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=134217728
4 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304
Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
root@network:/home/mengfei/my-cluster#
(5) Mount the filesystem on your ceph-client node
mkdir /mnt/ceph-block-device
mount /dev/rbd/rbd-pool/zhi /mnt/ceph-block-device
cd /mnt/ceph-block-device
Configure automatic mapping:
By default, after the block device is created there is an /etc/init.d/rbdmap script
root@network:/home/mengfei# vi /etc/ceph/rbdmap    (for some reason this file existed on the admin node compute but not on the client node network, so it was copied over from compute)
# RbdDevice Parameters
#poolname/imagename id=client,keyring=/etc/ceph/ceph.client.keyring
rbd-pool/zhi
#rbd-pool/bar
#rbd/foo
Note: if cephx is disabled, there is no need to configure the keyring.
With this in place, rbd block devices can be controlled manually and are mapped/unmapped automatically at boot and shutdown
Configure rbdmap (in this example /etc/init.d/rbdmap is generated automatically)
After creating an rbd block device and mapping it with rbd map, if it is not unmapped in time the system will hang on unmounting the rbd device at shutdown.
Configuring rbdmap is therefore required. First download the init script and enable it at boot:
$ sudo wget https://raw.github.com/ceph/ceph ... 06a/src/init-rbdmap -O /etc/init.d/rbdmap
$ sudo chmod +x /etc/init.d/rbdmap
$ sudo update-rc.d rbdmap defaults
root@network:/home/mengfei# update-rc.d rbdmap defaults
Adding system startup for /etc/init.d/rbdmap ...
/etc/rc0.d/K20rbdmap -> ../init.d/rbdmap
/etc/rc1.d/K20rbdmap -> ../init.d/rbdmap
/etc/rc6.d/K20rbdmap -> ../init.d/rbdmap
/etc/rc2.d/S20rbdmap -> ../init.d/rbdmap
/etc/rc3.d/S20rbdmap -> ../init.d/rbdmap
/etc/rc4.d/S20rbdmap -> ../init.d/rbdmap
/etc/rc5.d/S20rbdmap -> ../init.d/rbdmap
root@network:/home/mengfei#
(6) Verify the rbd information
ceph -w
rados -p rbd-pool ls
ceph osd map rbd-pool rbd
root@network:/home/mengfei/my-cluster# rados -p rbd-pool ls
rb.0.16df.2ae8944a.000000000041
rb.0.16df.2ae8944a.000000000042
rbd_directory
rb.0.16df.2ae8944a.000000000060
rb.0.16df.2ae8944a.000000000001
rb.0.16df.2ae8944a.000000000020
bar.rbd
zhi.rbd
rb.0.16df.2ae8944a.000000000002
rb.0.16df.2ae8944a.000000000040
rb.0.16df.2ae8944a.000000000043
rb.0.16df.2ae8944a.00000000007f
rb.0.16df.2ae8944a.000000000000
root@network:/home/mengfei/my-cluster#
root@network:/home/mengfei/my-cluster# ceph osd map rbd-pool rbd
osdmap e139 pool 'rbd-pool' (4) object 'rbd' -> pg 4.7a31dfd8 (4.58) -> up ([1,0,2], p1) acting ([1,0,2], p1)
root@network:/home/mengfei/my-cluster#
(VII) Ceph filesystem quick start
(1) Prerequisites
Install ceph-common:
apt-get install ceph-common
Note: make sure the Ceph storage cluster is running and in the active + clean state, and that at least one Ceph metadata server is running
ceph -s
root@compute:/var/lib/ceph/osd/ceph-osd2/current# ceph -s
cluster 8b2af1e6-92eb-4d74-9ca5-057522bb738f
health HEALTH_OK
monmap e3: 3 mons at {compute=192.168.128.101:6789/0,controller=192.168.128.100:6789/0,network=192.168.128.102:6789/0}, election epoch 72, quorum 0,1,2 controller,compute,network
mdsmap e34: 1/1/1 up {0=compute=up:active}
osdmap e139: 3 osds: 3 up, 3 in
pgmap v758: 448 pgs, 5 pools, 60758 kB data, 43 objects
584 MB used, 2470 MB / 3054 MB avail
448 active+clean
root@compute:/var/lib/ceph/osd/ceph-osd2/current#
(2) Create a filesystem
ceph osd pool create cephfs_data 64 (note: ceph osd dump shows the newly created pool name and its integer id; this pool's id is 5)
ceph osd pool create cephfs_metadata 64 (this pool's id is 6)
#ceph fs newfs mycephfs cephfs_metadata cephfs_data -- this command is wrong; use the one below instead
ceph osd dump
Create the new mds fs from those pools:
Format: ceph mds newfs <int[0-]> <int[0-]> {--yes-i-really-mean-it} : make new filesystem using pools <metadata> and <data>
ceph mds dump
ceph mds newfs 6 5 --yes-i-really-mean-it
root@compute:/home/mengfei/my-cluster# ceph mds newfs 6 5 --yes-i-really-mean-it
new fs with metadata pool 6 and data pool 5
root@compute:/home/mengfei/my-cluster#
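The two integers passed to ceph mds newfs (metadata pool id first, then data pool id) are read off the ceph osd dump output. A small helper can pull an id out of that output by pool name; the sample input below imitates the dump's "pool <id> '<name>' ..." line format for the pools created above, and on a live cluster you would pipe ceph osd dump in instead.

```shell
# Look up a pool's integer id by name in "ceph osd dump"-style output.
# Dump lines look like: pool 5 'cephfs_data' replicated size 3 ...
# The here-document imitates that format for the two pools above.
pool_id() {
    awk -v name="$1" -v q="'" '
        BEGIN { target = q name q }              # name as quoted in dump
        $1 == "pool" && $3 == target { print $2 }'
}

pool_id cephfs_metadata <<'EOF'
pool 5 'cephfs_data' replicated size 3 min_size 2
pool 6 'cephfs_metadata' replicated size 3 min_size 2
EOF
```

This prints 6 for cephfs_metadata and 5 for cephfs_data, matching the ids used in ceph mds newfs 6 5 above.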
ceph mds add_data_pool cephfs_data (this command adds an extra pool; note that the default pool 'data', once added, cannot be removed, so create fresh pools with the commands above instead)
root@compute:/home/mengfei/my-cluster# ceph mds dump (note: by default data_pools and metadata_pool are 0 and 1)
dumped mdsmap epoch 201
epoch 201
flags 0
created 2014-12-04 15:09:55.788256
modified 2014-12-04 15:09:57.898040
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 0
last_failure_osd_epoch 0
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap}
max_mds 1
in 0
up {0=7938}
failed
stopped
data_pools 0
metadata_pool 1
inline_data disabled
7938: 192.168.128.101:6809/29475 'compute' mds.0.34 up:active seq 2
7883: 192.168.128.100:6807/21690 'controller' mds.-1.0 up:standby seq 1500
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph mds newfs 6 5 --yes-i-really-mean-it
new fs with metadata pool 6 and data pool 5
root@compute:/home/mengfei/my-cluster#
root@compute:/home/mengfei/my-cluster# ceph mds dump
dumped mdsmap epoch 206
epoch 206
flags 0
created 2014-12-04 15:15:10.430339
modified 2014-12-04 15:15:13.558941
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 0
last_failure_osd_epoch 0
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap}
max_mds 1
in 0
up {0=7953}
failed
stopped
data_pools 5
metadata_pool 6
inline_data disabled
7953: 192.168.128.101:6810/29475 'compute' mds.0.35 up:active seq 2
7883: 192.168.128.100:6807/21690 'controller' mds.-1.0 up:standby seq 1579
root@compute:/home/mengfei/my-cluster#
(3) Create a secret key file
Ceph storage clusters run with authentication turned on by default. You should have a file containing the key. To obtain the key for a particular user, perform the following steps:
<1> Identify a user inside the keyring file. For example:
cat ceph.client.admin.keyring
root@compute:/home/mengfei/my-cluster# cat ceph.client.admin.keyring
[client.admin]
key = AQBe33ZUQBvWFBAApPAN9YAiqSFJQrTXv/TM1A==
root@compute:/home/mengfei/my-cluster#
<2> Copy the key of the user who will be mounting the Ceph FS filesystem. It looks like this:
[client.admin]
key = AQCj2YpRiAe6CxAA7/ETt7Hcl9IyxyYciVs47w==
<3> Open a text editor.
<4> Paste the key into an empty file. It should look like this:
AQCj2YpRiAe6CxAA7/ETt7Hcl9IyxyYciVs47w==
<5> Save the file with the user name as part of the file name (for example, /etc/ceph/admin.secret).
<6> Ensure the file permissions are appropriate for the user, but not visible to other users.
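Steps <1> through <6> can also be scripted. A minimal sketch: extract the key= value from the keyring with awk and write it to a secret file readable only by its owner (the extract_key helper name is ours; the paths in the usage comment are the ones from this example).

```shell
# Extract the "key = ..." value from a Ceph keyring into a secret
# file, then restrict its permissions (step <6>).
extract_key() {    # $1 = keyring file, $2 = secret file to create
    awk -F' = ' '$1 ~ /key/ { print $2 }' "$1" > "$2"
    chmod 600 "$2"                # owner-only: hidden from other users
}

# e.g., with the paths used in this example:
# extract_key /home/mengfei/my-cluster/ceph.client.admin.keyring /etc/ceph/admin.secret
```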
(4) Kernel driver
mkdir /mnt/mycephfs
Format: mount -t ceph {ip-address-of-monitor}:6789:/ /mnt/mycephfs
mount -t ceph {ip-address-of-monitor}:6789:/ /mnt/mycephfs
Ceph storage clusters use authentication by default. Specify a user name and the secretfile created in the "Create a secret key file" section. For example:
mount -t ceph 192.168.128.101:6789:/ /mnt/mycephfs -o name=admin,secretfile=/etc/ceph/admin.secret
Note: mount the Ceph FS filesystem on a client node, not on a monitor node.
On the client, add an entry to /etc/fstab so the filesystem mounts automatically at boot:
vi /etc/fstab
Format: {ipaddress}:{port}:/ /{mountpoint} {filesystem-name} [name=username,secret=secretkey|secretfile=/path/to/secretfile],[{mount.options}]
192.168.128.101:6789:/ /mnt/mycephfs ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2
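Filling in that format line is mechanical; the small sketch below assembles the fstab entry from its pieces, using the monitor address, mount point, and secret file from this example:

```shell
# Assemble the /etc/fstab entry from the format above. The values
# are the ones used in this example; append the printed line to
# /etc/fstab on the client.
MON=192.168.128.101:6789
MNT=/mnt/mycephfs
OPTS=name=admin,secretfile=/etc/ceph/admin.secret,noatime

printf '%s:/ %s ceph %s 0 2\n' "$MON" "$MNT" "$OPTS"
```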
(5) Ceph filesystem (FUSE) (note: this step did not succeed here; it failed with "ceph-fuse: command not found". The ceph-fuse binary ships in a separate package, so something like apt-get install ceph-fuse should provide it.)
Mount Ceph FS as a filesystem in user space (FUSE):
mkdir ~/mycephfs
Format: ceph-fuse -m {ip-address-of-monitor}:6789 ~/mycephfs
Ceph storage clusters use authentication by default. If the keyring is not in its default location (/etc/ceph), point to it with -k:
ceph-fuse -k ./ceph.client.admin.keyring -m 192.168.128.101:6789 ~/mycephfs