When both NameNodes are started, the one in standby state switches to ACTIVE as expected. However, when I manually kill the active NN, the standby NN cannot switch to ACTIVE.
Log:
2015-03-25 12:41:33,841 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Disconnecting from master port 22
2015-03-25 12:41:33,841 WARN org.apache.hadoop.ha.SshFenceByTcpPort: Unable to connect to master as user root
com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:452)
    at org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
    at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
    at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:521)
    at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
    at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
    at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
    at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2015-03-25 12:41:33,842 WARN org.apache.hadoop.ha.NodeFencer: Fencing method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
2015-03-25 12:41:33,842 ERROR org.apache.hadoop.ha.NodeFencer: Unable to fence service by any configured method.
2015-03-25 12:41:33,843 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
java.lang.RuntimeException: Unable to fence NameNode at master/192.168.11.128:9000
    at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:522)
    at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
    at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
    at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
    at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2015-03-25 12:41:33,843 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
2015-03-25 12:41:33,855 INFO org.apache.zookeeper.ZooKeeper: Session: 0x34c4f2fe3a60003 closed
2015-03-25 12:41:34,857 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=master:2181,node01:2181,node02:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@2d098f1f
2015-03-25 12:41:34,862 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node01/192.168.11.129:2181. Will not attempt to authenticate using SASL (unknown error)
2015-03-25 12:41:34,863 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to node01/192.168.11.129:2181, initiating session
2015-03-25 12:41:34,868 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node01/192.168.11.129:2181, sessionid = 0x24c4f2fad420002, negotiated timeout = 5000
2015-03-25 12:41:34,870 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2015-03-25 12:41:34,874 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2015-03-25 12:41:34,877 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
2015-03-25 12:41:34,880 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a07636c757374657212036e6e311a066d617374657220a84628d33e
2015-03-25 12:41:34,883 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at master/192.168.11.128:9000
2015-03-25 12:41:35,892 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.11.128:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS)
2015-03-25 12:41:35,893 WARN org.apache.hadoop.ha.FailoverController: Unable to gracefully make NameNode at master/192.168.11.128:9000 standby (unable to connect)
java.net.ConnectException: Call From node01/192.168.11.129 to master:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1351)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy8.transitionToStandby(Unknown Source)
    at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.transitionToStandby(HAServiceProtocolClientSideTranslatorPB.java:112)
    at org.apache.hadoop.ha.FailoverController.tryGracefulFence(FailoverController.java:172)
    at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:503)
    at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
    at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
    at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
    at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
    at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
    at org.apache.hadoop.ipc.Client.call(Client.java:1318)
    ... 14 more
2015-03-25 12:41:35,894 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
2015-03-25 12:41:35,894 INFO org.apache.hadoop.ha.NodeFencer: Trying method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
2015-03-25 12:41:35,894 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connecting to master...
2015-03-25 12:41:35,894 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connecting to master port 22
2015-03-25 12:41:35,895 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connection established
2015-03-25 12:41:35,903 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Remote version string: SSH-2.0-OpenSSH_6.4
2015-03-25 12:41:35,903 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Local version string: SSH-2.0-JSCH-0.1.42
2015-03-25 12:41:35,903 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256
2015-03-25 12:41:35,907 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes256-ctr is not available.
2015-03-25 12:41:35,907 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes192-ctr is not available.
2015-03-25 12:41:35,907 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes256-cbc is not available.
2015-03-25 12:41:35,907 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes192-cbc is not available.
2015-03-25 12:41:35,907 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: arcfour256 is not available.
2015-03-25 12:41:35,908 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXINIT sent
2015-03-25 12:41:35,908 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXINIT received
2015-03-25 12:41:35,908 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server->client aes128-ctr hmac-md5 none
2015-03-25 12:41:35,908 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client->server aes128-ctr hmac-md5 none
2015-03-25 12:41:35,911 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXDH_INIT sent
2015-03-25 12:41:35,911 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: expecting SSH_MSG_KEXDH_REPLY
2015-03-25 12:41:35,917 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: ssh_rsa_verify: signature true
2015-03-25 12:41:35,918 WARN org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Permanently added 'master' (RSA) to the list of known hosts.
2015-03-25 12:41:35,918 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_NEWKEYS sent
2015-03-25 12:41:35,918 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_NEWKEYS received
2015-03-25 12:41:35,920 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_SERVICE_REQUEST sent
2015-03-25 12:41:35,920 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_SERVICE_ACCEPT received
2015-03-25 12:41:35,923 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Authentications that can continue: gssapi-with-mic,publickey,keyboard-interactive,password
2015-03-25 12:41:35,923 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Next authentication method: gssapi-with-mic
2015-03-25 12:41:35,929 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Authentications that can continue: publickey,keyboard-interactive,password
2015-03-25 12:41:35,929 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Next authentication method: publickey
2015-03-25 12:41:35,929 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Authentications that can continue: password
2015-03-25 12:41:35,929 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Next authentication method: password
2015-03-25 12:41:35,930 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Disconnecting from master port 22
2015-03-25 12:41:35,930 WARN org.apache.hadoop.ha.SshFenceByTcpPort: Unable to connect to master as user root
com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:452)
    at org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
    at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
    at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:521)
    at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
    at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
    at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
    at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2015-03-25 12:41:35,931 WARN org.apache.hadoop.ha.NodeFencer: Fencing method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
2015-03-25 12:41:35,931 ERROR org.apache.hadoop.ha.NodeFencer: Unable to fence service by any configured method.
2015-03-25 12:41:35,931 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
java.lang.RuntimeException: Unable to fence NameNode at master/192.168.11.128:9000
    at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:522)
    at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
    at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
    at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
    at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2015-03-25 12:41:35,932 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
2015-03-25 12:41:35,942 INFO org.apache.zookeeper.ZooKeeper: Session: 0x24c4f2fad420002 closed
2015-03-25 12:41:36,944 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=master:2181,node01:2181,node02:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@201cc181
2015-03-25 12:41:36,945 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node01/192.168.11.129:2181. Will not attempt to authenticate using SASL (unknown error)
2015-03-25 12:41:36,946 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to node01/192.168.11.129:2181, initiating session
2015-03-25 12:41:36,979 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node01/192.168.11.129:2181, sessionid = 0x24c4f2fad420003, negotiated timeout = 5000
2015-03-25 12:41:36,980 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2015-03-25 12:41:36,985 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2015-03-25 12:41:36,987 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
2015-03-25 12:41:36,990 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a07636c757374657212036e6e311a066d617374657220a84628d33e
2015-03-25 12:41:36,991 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at master/192.168.11.128:9000
2015-03-25 12:41:38,005 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.11.128:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS)
2015-03-25 12:41:38,006 WARN org.apache.hadoop.ha.FailoverController: Unable to gracefully make NameNode at master/192.168.11.128:9000 standby (unable to connect)
java.net.ConnectException: Call From node01/192.168.11.129 to master:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
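
To see which node the elector believes is the old active, the hex string after "Old node exists:" in the log can be decoded. The sketch below assumes the value is a flat protobuf message and walks its wire format by hand; the helper names `read_varint` and `decode_fields` are made up for this illustration, and the reading of the fields (nameservice ID, NameNode ID, hostname, RPC port, ZKFC port) is my assumption rather than something the log itself states.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def decode_fields(hex_string):
    """Return {field_number: value} for a flat protobuf message.

    Only wire types 0 (varint) and 2 (length-delimited, treated as
    UTF-8 text) are handled, which is enough for this log value.
    """
    buf = bytes.fromhex(hex_string)
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields[field_number] = value
    return fields


# Value copied from the "Old node exists:" log line.
OLD_NODE_HEX = "0a07636c757374657212036e6e311a066d617374657220a84628d33e"
print(decode_fields(OLD_NODE_HEX))
# {1: 'cluster', 2: 'nn1', 3: 'master', 4: 9000, 5: 8019}
```

Fields 3 and 4 match the "Should fence: NameNode at master/192.168.11.128:9000" line, so the standby is trying to fence the killed NameNode on master, and it is this fencing step (SSH to master as root) that fails with "Auth fail".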