CDH 5.5 HDFS HA environment cannot automatically fail over the active NameNode

唐运 posted on 2015-12-17 19:34:48
hdfs-site.xml
[root@nn1 conf]# cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
     <name>dfs.namenode.name.dir</name>
     <value>file:///data/hdfs/name</value>
     <final>true</final>
  </property>
  <property>
     <name>dfs.namenode.shared.edits.dir</name>
     <value>qjournal://nn1.test.com:8485;nn2.test.com:8485;dn1.test.com:8485/testcluster</value>
  </property>
  <property>
     <name>dfs.journalnode.edits.dir</name>
     <value>/data/hdfs/jn</value>
  </property>
  <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:///data/hdfs/data</value>
     <final>true</final>
  </property>
  <property>
     <name>dfs.nameservices</name>
     <value>testcluster</value>
  </property>
  <property>
     <name>dfs.ha.namenodes.testcluster</name>
     <value>nn1,nn2</value>
  </property>
  <property>
     <name>dfs.namenode.rpc-address.testcluster.nn1</name>
     <value>nn1.test.com:8020</value>
  </property>
  <property>
     <name>dfs.namenode.rpc-address.testcluster.nn2</name>
     <value>nn2.test.com:8020</value>
  </property>
  <property>
     <name>dfs.namenode.http-address.testcluster.nn1</name>
     <value>nn1.test.com:50070</value>
  </property>
  <property>
     <name>dfs.namenode.http-address.testcluster.nn2</name>
     <value>nn2.test.com:50070</value>
  </property>
  <property>
     <name>dfs.client.failover.proxy.provider.testcluster</name>
     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
     <name>ha.zookeeper.quorum</name>
     <value>nn1.test.com:2181,nn2.test.com:2181,dn1.test.com:2181</value>
  </property>
  <property>
     <name>dfs.ha.fencing.methods</name>
     <value>sshfence</value>
  </property>
  <property>
     <name>dfs.ha.fencing.ssh.private-key-files</name>
     <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
     <name>dfs.ha.fencing.ssh.connect-timeout</name>
     <value>20000</value>
  </property>
  <property>
     <name>dfs.ha.automatic-failover.enabled</name>
     <value>true</value>
  </property>
  <property>
     <name>dfs.replication</name>
     <value>3</value>
     <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time</description>
  </property>
  <property>
     <name>dfs.namenode.handler.count</name>
     <value>60</value>
  </property>
  <property>
     <name>dfs.datanode.balance.bandwidthPerSec</name>
     <value>20971520</value>
     <final>true</final>
  </property>
  <property>
     <name>dfs.block.size</name>
     <value>134217728</value>
     <final>true</final>
  </property>
  <property>
     <name>dfs.webhdfs.enabled</name>
     <value>true</value>
  </property>
  <property>
     <name>dfs.datanode.max.xcievers</name>
     <value>8192</value>
  </property>
  <property>
     <name>dfs.permissions.superusergroup</name>
     <value>hadoop</value>
  </property>
  <property>
     <name>dfs.support.append</name>
     <value>true</value>
  </property>
</configuration>
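
With dfs.ha.automatic-failover.enabled set to true, the ZKFC daemons decide which NameNode is active. A minimal sketch for checking the current HA state from the command line (it assumes the NameNode IDs nn1/nn2 defined in dfs.ha.namenodes.testcluster above and a working client configuration):

# Ask each NameNode for its current HA state ("active" or "standby")
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# With automatic failover enabled, a manual failover request is coordinated through the ZKFCs
hdfs haadmin -failover nn1 nn2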


core-site.xml
<configuration>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://testcluster</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
    <description>Number of minutes between trash checkpoints. If zero, the trash feature is disabled.</description>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>nn1.test.com:2181,nn2.test.com:2181,dn1.test.com:2181</value>
  </property>
  
  <property>
    <name>ha.zookeeper.auth</name>
    <value>@/etc/hadoop/conf/zk-auth.txt</value>
  </property>
  <property>
    <name>ha.zookeeper.acl</name>
    <value>@/etc/hadoop/conf/zk-acl.txt</value>
  </property>
</configuration>
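
Automatic failover relies on a DFSZKFailoverController (ZKFC) running next to each NameNode and on the HA znode existing in the ZooKeeper ensemble listed in ha.zookeeper.quorum, so both are worth verifying. A rough sketch, assuming a CDH 5 package install (the zookeeper-client wrapper and the znode layout are defaults and may differ in your environment):

# A ZKFC process should be running on each NameNode host
jps | grep DFSZKFailoverController

# If the HA znode was never initialised, create it (run once, from one NameNode)
hdfs zkfc -formatZK

# The election znodes should exist under /hadoop-ha/<nameservice>
zookeeper-client -server nn1.test.com:2181 ls /hadoop-ha/testcluster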


Error log:

2015-12-17 19:19:21,750 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Edits file http://dn1.test.com:8480/getJour ... a-a7c3-eb3619d6591f, http://nn1.test.com:8480/getJour ... a-a7c3-eb3619d6591f of size 42 edits # 2 loaded in 0 seconds
2015-12-17 19:19:21,750 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Loaded 2 edits starting from txid 18
2015-12-17 19:19:51,050 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:19:52,736 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:19:53,437 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:19:54,958 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:19:55,580 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:19:56,482 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:19:57,326 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:19:57,881 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:15,214 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:16,172 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:17,148 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:18,077 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:40,209 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:41,200 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:42,107 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:42,821 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:43,333 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:43,825 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:45,001 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:57,173 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:58,163 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:58,999 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:20:59,753 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:21:16,159 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:21:17,312 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2015-12-17 19:21:21,766 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode nn2.test.com/192.168.10.22:8020
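
The "Operation category READ is not supported in state standby" warnings above only show that clients are still reaching a standby NameNode; they do not explain why the standby was not promoted. That decision is made by the ZKFC, so its log, and the sshfence step configured with /root/.ssh/id_rsa, are the usual places to look. A hedged sketch, assuming default CDH package log locations under /var/log/hadoop-hdfs (adjust the path if your logs live elsewhere):

# On each NameNode host, look for election or fencing failures in the ZKFC log
grep -iE "fenc|transitionToActive|exception" /var/log/hadoop-hdfs/hadoop-hdfs-zkfc-*.log | tail -n 50

# The sshfence method needs passwordless SSH between the NameNodes with the configured key
ssh -i /root/.ssh/id_rsa -o ConnectTimeout=20 root@nn2.test.com hostname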

Replies (3)

唐运 posted on 2015-12-17 19:35:28
Sometimes the failover works and sometimes it doesn't, and I don't know why. There are only warning messages, no error messages.

atsky123 posted on 2015-12-17 20:16:57, in reply to 唐运's comment above (2015-12-17 19:35):

The memory may not be sufficient; I'd suggest increasing it.
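
If memory pressure is the suspicion, it may help to confirm it before raising the heap. A minimal sketch; jstat ships with the JDK, and the pgrep pattern is just one way to find the NameNode process:

# Find the NameNode PID and sample heap/GC utilisation every 5 seconds
NN_PID=$(pgrep -f org.apache.hadoop.hdfs.server.namenode.NameNode)
jstat -gcutil "$NN_PID" 5000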

一战成名 posted on 2016-5-3 09:47:10
OP, I'd like to ask how you configured HDFS HA on CDH 5.5. Did you configure it through the CM management UI? Could you share a link to a configuration document? Thanks.