I have 5 nodes, with NameNode HA and HMaster HA deployed across two machines. After starting HDFS everything works fine, but once HBase is started it keeps failing. The RegionServer log shows it endlessly retrying a connection to HDFS. The cluster is on the 192.168.90.x subnet, so why is it connecting to cluster01/60.191.124.236:8020? I have confirmed that the hosts file contains nothing besides the IPs and hostnames of the 5 machines, and manually uploading a file to HDFS from the RegionServer works fine, so I really can't figure out where the problem is.

[hadoop@slave01 ~]$ hdfs dfs -put /var/log/boot.log /
[hadoop@slave01 ~]$ hdfs dfs -ls /
Found 2 items
-rw-r--r-- 2 hadoop supergroup 2053 2016-01-14 14:05 /boot.log
drwxr-xr-x - hadoop supergroup 0 2016-01-14 13:39 /hbase
The error is as follows:
2016-01-14 13:44:13,322 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 13 time(s); maxRetries=45
2016-01-14 13:44:33,342 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 14 time(s); maxRetries=45
2016-01-14 13:44:53,357 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 15 time(s); maxRetries=45
2016-01-14 13:45:13,377 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 16 time(s); maxRetries=45
2016-01-14 13:45:33,400 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 17 time(s); maxRetries=45
2016-01-14 13:45:53,421 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 18 time(s); maxRetries=45
2016-01-14 13:46:13,494 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 19 time(s); maxRetries=45
2016-01-14 13:46:33,515 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01/60.191.124.236:8020. Already tried 20 time(s); maxRetries=45
2016-01-14 13:47:03,545 WARN [regionserver/slave01/192.168.90.44:16020] ipc.Client: Address change detected. Old: cluster01/60.191.124.236:8020 New: cluster01:8020
2016-01-14 13:47:03,545 INFO [regionserver/slave01/192.168.90.44:16020] ipc.Client: Retrying connect to server: cluster01:8020. Already tried 0 time(s); maxRetries=45
2016-01-14 13:47:03,549 INFO [regionserver/slave01/192.168.90.44:16020] regionserver.HRegionServer: STOPPED: Failed initialization
2016-01-14 13:47:03,551 ERROR [regionserver/slave01/192.168.90.44:16020] regionserver.HRegionServer: Failed init
java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "slave01/192.168.90.44"; destination host is: "cluster01":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1415)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:707)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
at com.sun.proxy.$Proxy20.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1785)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1068)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1064)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1606)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1362)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:899)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Couldn't set up IO streams
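
As a quick check (these commands are only a suggestion, not part of the original post), it may help to see how the name cluster01 actually resolves on slave01, since 60.191.124.236 does not appear in the hosts file:

[hadoop@slave01 ~]$ getent hosts cluster01    # resolves via /etc/nsswitch.conf: hosts file first, then DNS
[hadoop@slave01 ~]$ nslookup cluster01        # queries DNS directly; 60.191.124.236 here would point to an external/ISP wildcard resolver
[hadoop@slave01 ~]$ cat /etc/resolv.conf      # shows which DNS servers are being consulted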
The hbase-site.xml configuration is as follows:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://cluster01/hbase</value>
</property>
<property>
<name>hbase.master</name>
<value>60000</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>slave01,slave02,slave03</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/data/zkdata</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/data/tmp/hbase/</value>
</property>
</configuration>
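
The core-site.xml is not shown here; for hbase.rootdir's hdfs://cluster01/hbase to be treated as the HA nameservice rather than a hostname called cluster01, the HDFS client configuration that HBase sees would typically contain something like the following (an assumed sketch matching the nameservice above, not copied from the actual cluster):

<configuration>
  <property>
    <!-- default filesystem: the logical HA nameservice, with no host:port -->
    <name>fs.defaultFS</name>
    <value>hdfs://cluster01</value>
  </property>
</configuration>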
The hdfs-site.xml configuration is as follows:
<configuration>
<property>
<name>dfs.nameservices</name>
<value>cluster01</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>128M</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/data/checkpoint</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster01</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster01.nn1</name>
<value>master01:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster01.nn2</name>
<value>master02:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster01.nn1</name>
<value>master01:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster01.nn2</name>
<value>master02:8020</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/dfs/data</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://slave01:8485;slave02:8485;slave03:8485/jt_journal</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/journal</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster01</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
</configuration>
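
One thing worth double-checking (this is a common HA deployment step, not something confirmed from the post): the hdfs-site.xml settings above, in particular dfs.nameservices, the dfs.namenode.rpc-address.* entries, and dfs.client.failover.proxy.provider.cluster01, also need to be visible on the HBase classpath. If the RegionServer cannot see them, the HDFS client treats cluster01 as an ordinary hostname and hands it to DNS, which would match the 60.191.124.236 address in the log. A typical way to expose them is to link the Hadoop client configs into the HBase conf directory (paths below are assumed and should be adjusted to the real install locations):

[hadoop@slave01 ~]$ ln -s /opt/hadoop/etc/hadoop/hdfs-site.xml /opt/hbase/conf/hdfs-site.xml
[hadoop@slave01 ~]$ ln -s /opt/hadoop/etc/hadoop/core-site.xml /opt/hbase/conf/core-site.xml
# Or, in hbase-env.sh, point HBASE_CLASSPATH at the Hadoop conf directory instead:
# export HBASE_CLASSPATH=/opt/hadoop/etc/hadoop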