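This post was last edited by levycui on 2020-4-28 16:52
Guiding questions:
1. What do CM and CDH consist of?
2. How do you upgrade Cloudera Manager?
3. How do you back up the Cloudera Management Service and the Cloudera Manager Agent?
4. How do you upgrade the Cloudera Manager Server?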
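Overview
The author has recently been working through the details of a CDH major-version upgrade (CDH 5.16.2 -> CDH 6.3.3), covering every aspect of each component, so that anything unexpected during the upgrade can be handled in-house and the risk of the future production upgrade is reduced.
This series consists of four parts (CM/CDH 5.16.2 -> CM/CDH 6.3.3):
1. Cloudera Manager upgrade
2. CDH automatic upgrade
3. CDH manual upgrade
4. CM/CDH rollback
All articles are compiled from the official Cloudera documentation, with additions, and every step has been fully verified in a test environment, so they can serve as a practical reference. Because each reader's CM and CDH environment differs, you should still consult the official documentation as well.
This article explains and walks through the Cloudera Manager (CM) upgrade from CM 5.16.2 to CM 6.3.3, in preparation for the CDH upgrade (CDH 5.16.2 -> CDH 6.3.3).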
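Cloudera Manager Upgrade
Cloudera Manager can be upgraded without stopping the CDH component services. After the Cloudera Manager upgrade completes, however, a rolling restart of the CDH cluster is recommended.
Note also that a rolling upgrade from CDH 5 to CDH 6 is not supported, so services must be stopped when the CDH cluster itself is upgraded.
1. Collect CM information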
[mw_shl_code=shell,true]lsb_release -a[/mw_shl_code]
[mw_shl_code=shell,true]# cat /etc/cloudera-scm-server/db.properties
# Auto-generated by scm_prepare_database.sh on Mon Apr 9 16:24:14 CST 2018
#
# For information describing how to configure the Cloudera Manager Server
# to connect to databases, see the "Cloudera Manager Installation Guide."
#
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=127.0.0.1
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.setupType=EXTERNAL
com.cloudera.cmf.db.password=xxx[/mw_shl_code]
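The values in db.properties are needed again later for the database backup, so it can help to read them from a script rather than by eye. The following is a minimal sketch; `get_db_prop` is a hypothetical helper name, not part of any Cloudera tooling, and it assumes plain `key=value` lines like the ones shown above.

```shell
# Hypothetical helper: print the value of one key from a key=value properties file.
# Assumes keys are not indented and contain no "=" (true for the file shown above).
get_db_prop() {
    # Compare the text before the first "=" with the requested key,
    # then strip "key=" and print whatever follows (values may contain "=").
    awk -F= -v k="$2" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$1"
}

# Usage (path as shown in the post):
#   get_db_prop /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.name
```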
Cloudera Manager Version:
[mw_shl_code=shell,true]Version: Cloudera Enterprise 5.16.2 (#7 built by jenkins on 20190518-0557 git: fedcd738d6af67bc26077f7ad53b03ea9dafa2f0)[/mw_shl_code]
JDK:
[mw_shl_code=shell,true]Java Version: 1.8.0_144[/mw_shl_code]
- Database information for the Cloudera Management Service roles
The following information can be read from the Cloudera Management Service configuration:
[mw_shl_code=shell,true]# Activity Monitor Database Hostname
cm.dataflow.com
# Activity Monitor Database Name
am
# Activity Monitor Database Username
am
# Navigator Audit Server Database Hostname
cm.dataflow.com
# Navigator Metadata Server Database Name
navms
# Navigator Metadata Server Database Username
navms
# Reports Manager Database Name
rm
# Reports Manager Database Username
rm[/mw_shl_code]
- Make sure the Service Monitor, Host Monitor and Event Server roles are running normally
- Look up the Navigator Metadata Server storage directory
[mw_shl_code=shell,true]/var/lib/cloudera-scm-navigator[/mw_shl_code]
2. Pre-upgrade checks for Cloudera Manager 6.x
- Cloudera Enterprise 6 requirements and supported versions
https://docs.cloudera.com/docume ... tml#c6_requirements
https://docs.cloudera.com/docume ... tml#os_requirements
https://docs.cloudera.com/docume ... cdh_cm_supported_db
https://docs.cloudera.com/docume ... l#java_requirements
- Review the 5.x and 6 Release Notes and the Cloudera Security Bulletins
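3. Prepare to upgrade Cloudera Navigator
Cloudera Navigator is upgraded as part of the Cloudera Manager upgrade, and no additional steps are required. However, to make sure the Navigator metadata is upgraded efficiently and the upgraded version performs well, check that the Navigator Metadata Server is configured with an adequately sized Java heap.
Follow these steps to size the Java heap of the Navigator Metadata Server:
- Go to Clusters > Cloudera Management Service > Instances > Navigator Metadata Server > Log Files > Role Log File
- Search for "solr core nav_elements" to find the number of element documents:
Found 4459224 documents in solr core nav_elements
- Search for "solr core nav_relations" to find the number of relation documents:
Found 6694170 documents in solr core nav_relations
- Compute the required heap size from the two counts:
((num_nav_elements + num_nav_relations) * 200 bytes) + 2 GB
- Go to Clusters > Cloudera Management Service > Configuration
Set the Navigator Metadata Server Java heap size to the value computed above.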
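As a worked example of the formula above, the sketch below plugs in the two document counts from the log lines. `nav_heap_mib` is a hypothetical helper name, and treating "2 GB" as 2 GiB (2147483648 bytes) is an assumption; the result is rounded up to whole MiB.

```shell
# Hypothetical helper: heap size in MiB for the Navigator Metadata Server,
# following ((elements + relations) * 200 bytes) + 2 GB.
# Assumption: "2 GB" is taken as 2 GiB = 2147483648 bytes.
nav_heap_mib() {
    bytes=$(( ($1 + $2) * 200 + 2147483648 ))
    # Round up to the next whole MiB.
    echo $(( (bytes + 1048575) / 1048576 ))
}

# Usage with the counts found in the role log above:
#   nav_heap_mib 4459224 6694170    # prints 4176
```

With these counts the documents account for roughly 2.2 GB, so a heap of a little over 4 GiB would satisfy the formula.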
4. Back up the Cloudera Manager Agent
The most important part of any upgrade is the backup: back up, back up, and back up again. Otherwise, be prepared to make a run for it.
[mw_shl_code=shell,true]
# Create a directory to hold the backups
export CM_BACKUP_DIR="$(date +%F)-CM5.16"
echo "$CM_BACKUP_DIR"
mkdir -p "$CM_BACKUP_DIR"
# Back up the Agent directories and runtime state
# (the exclude pattern is quoted so the shell does not expand it)
sudo -E tar -cf "$CM_BACKUP_DIR/cloudera-scm-agent.tar" --exclude='*.sock' /etc/cloudera-scm-agent /etc/default/cloudera-scm-agent /var/run/cloudera-scm-agent /var/lib/cloudera-scm-agent
# Back up the system yum repository directory
sudo -E tar -cf "$CM_BACKUP_DIR/repository.tar" /etc/yum.repos.d[/mw_shl_code]
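A backup is only useful if it can be read back, so it may be worth listing each archive right after creating it. This is a minimal sketch under that assumption; `verify_tar_backup` is a hypothetical helper name.

```shell
# Hypothetical helper: confirm that a tar backup exists, is non-empty,
# and can be listed from start to end.
verify_tar_backup() {
    [ -s "$1" ] || { echo "missing or empty: $1" >&2; return 1; }
    tar -tf "$1" > /dev/null 2>&1 || { echo "unreadable: $1" >&2; return 1; }
    echo "ok: $1"
}

# Usage, with the variable defined in the backup commands above:
#   verify_tar_backup "$CM_BACKUP_DIR/cloudera-scm-agent.tar"
#   verify_tar_backup "$CM_BACKUP_DIR/repository.tar"
```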
5. Back up the Cloudera Management Service
Back up the directories that are actually configured in your environment.
[mw_shl_code=shell,true]# Service Monitor role
# The default is /var/lib/cloudera-service-monitor
sudo cp -rp /data/cloudera-service-monitor "/data/cloudera-service-monitor-$(date +%F)-CM5.16"
# Host Monitor role
# The default is /var/lib/cloudera-host-monitor
sudo cp -rp /data/cloudera-host-monitor "/data/cloudera-host-monitor-$(date +%F)-CM5.16"
# Event Server role
# The default is /var/lib/cloudera-scm-eventserver
sudo cp -rp /var/lib/cloudera-scm-eventserver "/data/cloudera-scm-eventserver-$(date +%F)-CM5.16"[/mw_shl_code]
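The monitor directories can be large, so before copying them it may be worth checking that the target file system has room. The sketch below compares `du` and `df` output; `enough_space_for_copy` is a hypothetical helper name, and the check is only approximate (it ignores data written while the copy runs).

```shell
# Hypothetical helper: succeed if the file system holding DEST_DIR has more
# free space (in KiB) than SRC currently occupies.
enough_space_for_copy() {
    need_kb=$(du -sk "$1" | awk '{ print $1 }')
    # df -P guarantees one line per file system; column 4 is the available space.
    free_kb=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -gt "$need_kb" ]
}

# Usage with the paths from the commands above:
#   enough_space_for_copy /data/cloudera-service-monitor /data || echo "not enough space"
```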
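6. Back up the Cloudera Navigator data
- Make sure a purge task has run recently to clean up stale and deleted entities
Go to Clusters > Cloudera Navigator, then select Administration > Purge Settings
If no purge has run recently, edit the Purge schedule and run it
Set the purge process options to clear as much of the data backlog as the system being upgraded can tolerate
https://docs.cloudera.com/docume ... era-navigator-purge
- Stop the Navigator Metadata Server
- Back up the Cloudera Navigator Solr storage directory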
[mw_shl_code=shell,true]sudo cp -rp /var/lib/cloudera-scm-navigator /var/lib/cloudera-scm-navigator-`date +%F`-CM5.16[/mw_shl_code]
Because Cloudera Navigator is upgraded as part of the Cloudera Manager upgrade, make sure the Navigator Metadata Server has an adequate amount of memory configured (see step 3).
7. Stop the Cloudera Manager Server and the Cloudera Management Service
- Log in to the Cloudera Manager portal and stop the Cloudera Management Service
- Log in to the Cloudera Manager Server node and stop the service
[mw_shl_code=shell,true]sudo systemctl stop cloudera-scm-server[/mw_shl_code]
8. Back up the Cloudera Manager Databases
- Back up the databases of the Cloudera Management Service roles:
Reports Manager
Navigator Audit Server
Navigator Metadata Server
Activity Monitor (Only used for MapReduce 1 monitoring).
[mw_shl_code=shell,true]# Back up am, rm, navms and nav
# The databases can also be dumped one at a time
mysqldump --databases am rm navms nav -u root -p > /data/cloudera_manager_database_all/cloudera_manager_database_all-backup-`date +%F`-CM5.16.sql[/mw_shl_code]
- Back up the Cloudera Manager Server database (scm, as listed in db.properties):
mysqldump --databases scm -u root -p > /data/cloudera_manager_database_all/scm-backup-`date +%F`-CM5.16.sql
With that, everything is backed up, just in case.
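Since one dump file holds several databases, a quick check that each expected database actually made it into the file can catch a mistyped name. The sketch below relies on the `-- Current Database:` comment that mysqldump writes for each database when run with `--databases` and with comments enabled (the default); `dump_has_database` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if DUMP_FILE contains a section for database DB.
# Assumes the dump was produced with --databases and without --skip-comments.
dump_has_database() {
    grep -q -- "^-- Current Database: \`$2\`" "$1"
}

# Usage: replace the placeholder with the dump file written above.
#   for db in am rm navms nav; do
#       dump_has_database /path/to/dump.sql "$db" || echo "missing: $db"
#   done
```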
9. Back up the Cloudera Manager Server
Log in to the Cloudera Manager Server node
[mw_shl_code=shell,true]export CM_BACKUP_DIR="`date +%F`-CM5.16"
echo $CM_BACKUP_DIR
mkdir -p $CM_BACKUP_DIR
sudo -E tar -cf $CM_BACKUP_DIR/cloudera-scm-server.tar /etc/cloudera-scm-server /etc/default/cloudera-scm-server
sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/yum.repos.d[/mw_shl_code]
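If the archives are later moved to another host for safekeeping, a checksum manifest makes it possible to confirm they arrived intact. This is a sketch using GNU `sha256sum`; the helper names and the `SHA256SUMS` file name are assumptions, not something the original procedure prescribes.

```shell
# Hypothetical helpers: record and later verify SHA-256 checksums
# for every .tar file in a backup directory.
write_backup_manifest() {
    ( cd "$1" && sha256sum -- *.tar > SHA256SUMS )
}

check_backup_manifest() {
    # Exits non-zero if any archive is missing or its checksum differs.
    ( cd "$1" && sha256sum -c SHA256SUMS > /dev/null )
}

# Usage, with the variable defined in the backup commands above:
#   write_backup_manifest "$CM_BACKUP_DIR"
#   check_backup_manifest "$CM_BACKUP_DIR" && echo "backups intact"
```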
10. Start the Cloudera Manager Server and the Cloudera Management Service (optional)
Skip this step if you are going to upgrade Cloudera Manager right away.
- Log in to the Cloudera Manager Server node
[mw_shl_code=shell,true]sudo systemctl start cloudera-scm-server[/mw_shl_code]
- Start the Cloudera Management Service
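11. Upgrade the Cloudera Manager Server
1. Download the installation packages
Download location (access requires the username and password generated from your license):
https://archive.cloudera.com/p/c ... -redhat7.tar.gz.md5
If you do not have a license, consider upgrading to version 6.3.2 instead.
Build a local repository: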
[mw_shl_code=shell,true]# Copy the rpm packages to /var/www/html/cm6
# ll
total 1380460
-rw-r--r-- 1 root root 10327216 Apr 23 13:39 cloudera-manager-agent-6.3.3-1842532.el7.x86_64.rpm
-rw-r--r-- 1 root root 1204027392 Apr 23 13:40 cloudera-manager-daemons-6.3.3-1842532.el7.x86_64.rpm
-rw-r--r-- 1 root root 12100 Apr 23 13:40 cloudera-manager-server-6.3.3-1842532.el7.x86_64.rpm
-rw-r--r-- 1 root root 10996 Apr 23 13:40 cloudera-manager-server-db-2-6.3.3-1842532.el7.x86_64.rpm
-rw-r--r-- 1 root root 14209892 Apr 23 13:40 enterprise-debuginfo-6.3.3-1842532.el7.x86_64.rpm
-rw-r--r-- 1 root root 184988341 Apr 23 13:40 oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
# Generate the repository metadata
createrepo .
# allkeys.asc
cp allkeys.asc /var/www/html/cm6/[/mw_shl_code]
Configure the local repo:
[mw_shl_code=shell,true]# cat /etc/yum.repos.d/cloudera-manager.repo
[cloudera-manager]
name = Cloudera Manager, Version 6.3.3
baseurl = http://yum.repo.dataflow.com/cm6.3.3/
gpgcheck = 0[/mw_shl_code]
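Before running yum against the new repo, it can be useful to confirm that the metadata generated by `createrepo` is reachable over HTTP. The sketch below only builds the URL of `repodata/repomd.xml` from a baseurl; `repo_metadata_url` is a hypothetical helper name, and the request itself is left as a usage comment.

```shell
# Hypothetical helper: print the URL of the metadata index that createrepo
# generates (repodata/repomd.xml) for a given repository baseurl.
repo_metadata_url() {
    # Drop one trailing slash, if present, to avoid "//" in the result.
    echo "${1%/}/repodata/repomd.xml"
}

# Usage: a HEAD request that fails on HTTP errors.
#   curl -fsSI "$(repo_metadata_url http://yum.repo.dataflow.com/cm6.3.3/)"
```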
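2. Install Oracle JDK 8
The JDK on every node of the existing cluster already meets the requirements, so this step is skipped.
3. Upgrade the Cloudera Manager Server
- Log in to the Cloudera Manager Admin Console and stop the Cloudera Management Service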
[mw_shl_code=shell,true]Clusters > Cloudera Management Service.
Actions > Stop.[/mw_shl_code]
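- Make sure every scheduled replication or snapshot job is disabled, and wait until all commands running in the Cloudera Manager Admin Console have finished.
- If any Hive Replication Schedules replicate data to the cloud, delete those replication schedules before continuing, and re-create them after the upgrade is complete.
- Log in to the Cloudera Manager Server node and stop the Cloudera Manager Server service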
[mw_shl_code=shell,true]sudo systemctl stop cloudera-scm-server[/mw_shl_code]
Then stop the Cloudera Manager Agent
[mw_shl_code=shell,true]sudo systemctl stop cloudera-scm-agent[/mw_shl_code]
- Upgrade the Cloudera Manager Server packages
[mw_shl_code=shell,true]sudo yum clean all
sudo yum upgrade cloudera-manager-server cloudera-manager-daemons cloudera-manager-agent[/mw_shl_code]
You may see the following prompt (the author did not encounter it on CentOS 7):
[mw_shl_code=shell,true]
Configuration file '/etc/cloudera-scm-agent/config.ini'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.[/mw_shl_code]
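You may receive a similar prompt for /etc/cloudera-scm-server/db.properties. Answer N to both prompts.
You may also be asked to accept a GPG key; answer y (the author did not encounter this either).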
[mw_shl_code=shell,true]Retrieving key from https://archive.cloudera.com/.../cm/RPM-GPG-KEY-cloudera
Importing GPG key ...
Userid : "Yum Maintainer <webmaster@cloudera.com>"
Fingerprint: ...
From : https://archive.cloudera.com/.../RPM-GPG-KEY-cloudera
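[/mw_shl_code]
If you customized /etc/cloudera-scm-agent/config.ini in the previous version, the customized file is saved with the .rpmsave extension. Merge your customizations back into /etc/cloudera-scm-agent/config.ini.
- Verify that the upgraded packages are installed
[mw_shl_code=shell,true]rpm -qa 'cloudera-manager-*'[/mw_shl_code]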
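Rather than reading the package list by eye, the version check can be scripted. The sketch below reads package names on standard input and fails if any of them does not carry the expected version; `all_packages_at_version` is a hypothetical helper name, and it assumes the `name-version-release` naming seen in the rpm file names earlier.

```shell
# Hypothetical helper: read package names on stdin and succeed only if
# there is at least one and every name contains "-<version>-".
all_packages_at_version() {
    awk -v v="-$1-" '
        index($0, v) == 0 { bad = 1; print "unexpected version: " $0 }
        END { exit (bad || NR == 0) }
    '
}

# Usage on the upgraded host:
#   rpm -qa 'cloudera-manager-*' | all_packages_at_version 6.3.3 && echo "all at 6.3.3"
```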
- Start the Cloudera Manager Agent service
[mw_shl_code=shell,true]
sudo systemctl daemon-reload
sudo systemctl start cloudera-scm-agent[/mw_shl_code]
- Start the Cloudera Manager Server service
[mw_shl_code=shell,true]sudo systemctl daemon-reload
sudo systemctl start cloudera-scm-server[/mw_shl_code]
- Log in to the Cloudera Manager Admin Console in a browser
[mw_shl_code=shell,true]http://cloudera_Manager_server_hostname:7180/cmf/upgrade[/mw_shl_code]
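Follow the prompts to start the actual upgrade of the Cloudera Manager Agents.
4. Upgrade the Cloudera Manager Agents
Using Cloudera Manager to upgrade the Agents is recommended.
For the detailed steps, simply follow the prompts. A few points to note:
- /usr/java/latest
- Find the hosts marked Cloudera Manager Agent not upgraded
- Click Upgrade Cloudera Manager Agent packages
- Select the Custom Repository option and enter the Repository URL
The URL is filled in automatically by default; enter it yourself only if you have set up your own local CM repository.
- Click Run Host Inspector to run the host checks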
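The server can take a while to come up after the upgrade, so the upgrade page may not respond immediately. One way to tell is to look for the Jetty startup message in the server log. The sketch below assumes the default log path and the message text "Started Jetty server"; both are assumptions to verify in your environment, and an older message from a previous start would also match. `cm_server_started` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the CM server log contains the Jetty
# startup message. Log path and message text are assumptions.
cm_server_started() {
    grep -q "Started Jetty server" "${1:-/var/log/cloudera-scm-server/cloudera-scm-server.log}"
}

# Usage: poll every 10 seconds until the message appears.
#   until cm_server_started; do sleep 10; done
```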
[mw_shl_code=shell,true]# The user 'kudu' is not part of group 'hive' on the following hosts:
usermod -a -G hive kudu[/mw_shl_code]
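- Start the Cloudera Management Service
Depending on configuration changes and similar conditions, some services may need to be restarted or some configurations redeployed.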
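For the Host Inspector warning shown above (user 'kudu' not in group 'hive'), the fix can be confirmed on each host after running `usermod`. The sketch below checks a list of group names such as the one printed by `id -nG`; `group_list_contains` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the first argument appears among the
# remaining whitespace-separated group names.
group_list_contains() {
    wanted="$1"
    shift
    # Unquoted $* splits on whitespace, so a single quoted list also works.
    for g in $*; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Usage on a cluster host, after the usermod command above:
#   group_list_contains hive "$(id -nG kudu)" && echo "kudu is in hive"
```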
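12. After the Cloudera Manager Server upgrade
1. Upgrade the Cloudera Navigator Encryption components
These are not deployed in the author's environment, so this step is skipped.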
- Cloudera Navigator Key Trustee Server
- Cloudera Navigator Key HSM
- Cloudera Navigator Key Trustee KMS
- Cloudera Navigator Encrypt.
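If you are still using Key Trustee Server 5.4 and are upgrading to Cloudera Manager 5.10 or later, you must upgrade Key Trustee Server to the latest version.
The other Cloudera Navigator components can be upgraded at any time; they do not have to be upgraded together with Cloudera Manager or CDH.
2. Perform the post-upgrade steps
- Start the Cloudera Management Service and adjust the configuration as prompted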
[mw_shl_code=shell,true]Log in to the Cloudera Manager Admin Console.
Select Clusters > Cloudera Management Service.
Select Actions > Start.[/mw_shl_code]
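- If Cloudera Manager reports stale configurations after the upgrade, restart the cluster services and redeploy the client configuration. This step is not required if you are going on to upgrade CDH.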
[mw_shl_code=shell,true]Home > Status, click Restart
Home > Status, click Deploy Client Configuration[/mw_shl_code]
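- If you disabled any backup or snapshot jobs before the upgrade, re-enable them now.
- If you deleted any Hive Replication Schedules before the upgrade, re-create them.
Cloudera Manager upgrade complete
At this point the Cloudera Manager upgrade is complete.
Summary
Upgrading Cloudera Manager has always been fairly simple and rarely goes wrong. One last reminder: take backups before you upgrade Cloudera Manager.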
Author: 平凡的世界, DataFlow范式
Source: https://mp.weixin.qq.com/s/C4_0TH26MXYSg87fjs9-JA