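栎梓天冲 posted on 2015-6-9 21:16:06

HBase HMaster dies right after starting

I'm running Hadoop 2.5.0 with HBase 0.98. Hadoop 2.6 and a different version of HBase were installed on these machines before.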

2015-06-09 17:24:34,256 DEBUG master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitWAL/WALs%2FSlave4%2C60020%2C1433837704561-splitting%2FSlave4%252C60020%252C1433837704561.1433837711368.meta
2015-06-09 17:24:34,260 WARN master.SplitLogManager: returning success without actually splitting and deleting all the log files in path hdfs://Master1:9000/hbase/WALs/Slave4,60020,1433837704561-splitting
2015-06-09 17:24:34,260 INFO master.SplitLogManager: finished splitting (more than or equal to) 0 bytes in 1 log files in [hdfs://Master1:9000/hbase/WALs/Slave4,60020,1433837704561-splitting] in 4308ms
2015-06-09 17:24:34,360 INFO catalog.CatalogTracker: Failed verification of hbase:meta,,1 at address=Slave4,60020,1433837704561, exception=org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on Slave4,60020,1433841863913
      at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2695)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4139)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:3506)
      at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:20158)
      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
      at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
      at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
      at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
      at java.lang.Thread.run(Thread.java:745)
      
      
lave1,60020,1433841863879
2015-06-09 17:24:35,068 WARN master.SplitLogManager: Stopped while waiting for log splits to be completed
2015-06-09 17:24:35,068 WARN master.SplitLogManager: error while splitting logs in installed = 1 but only 0 done
2015-06-09 17:24:35,069 DEBUG master.DeadServer: Finished processing Slave4,60020,1433837704561
2015-06-09 17:24:35,069 DEBUG master.DeadServer: Finished processing Slave4,60020,1433837704561
2015-06-09 17:24:35,069 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for Slave4,60020,1433837704561, will retry
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:330)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in Task = installed = 1 done = 0 error = 0
      at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:378)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:415)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:389)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:287)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:203)
      ... 4 more
2015-06-09 17:24:35,069 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:185)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,102 WARN master.SplitLogManager: Stopped while waiting for log splits to be completed
2015-06-09 17:24:35,102 WARN master.SplitLogManager: error while splitting logs in installed = 1 but only 0 done
2015-06-09 17:24:35,102 DEBUG master.DeadServer: Finished processing Slave2,60020,1433837704549
2015-06-09 17:24:35,103 DEBUG master.DeadServer: Finished processing Slave2,60020,1433837704549
2015-06-09 17:24:35,103 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for Slave2,60020,1433837704549, will retry
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:330)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in Task = installed = 1 done = 0 error = 0
      at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:378)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:415)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:389)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:287)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:203)
      ... 4 more
2015-06-09 17:24:35,103 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:185)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,108 WARN master.SplitLogManager: Stopped while waiting for log splits to be completed
2015-06-09 17:24:35,108 WARN master.SplitLogManager: error while splitting logs in installed = 1 but only 0 done
2015-06-09 17:24:35,108 DEBUG master.DeadServer: Finished processing Slave3,60020,1433837704597
2015-06-09 17:24:35,108 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for Slave3,60020,1433837704597, will retry
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:330)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in Task = installed = 1 done = 0 error = 0
      at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:378)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:415)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:389)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:287)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:203)
      ... 4 more
2015-06-09 17:24:35,108 DEBUG master.DeadServer: Finished processing Slave3,60020,1433837704597
2015-06-09 17:24:35,109 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:185)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,147 WARN master.SplitLogManager: Interrupted while waiting for log splits to be completed
2015-06-09 17:24:35,147 WARN master.SplitLogManager: error while splitting logs in installed = 1 but only 0 done
2015-06-09 17:24:35,147 DEBUG master.DeadServer: Finished processing Slave1,60020,1433837704527
2015-06-09 17:24:35,148 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.util.concurrent.RejectedExecutionException: Task ServerShutdownHandler-Master1,60000,1433841863065-5 rejected from org.apache.hadoop.hbase.executor.ExecutorService$TrackingThreadPoolExecutor@63c6ec18[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 6]
      at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
      at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
      at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
      at org.apache.hadoop.hbase.executor.ExecutorService$Executor.submit(ExecutorService.java:224)
      at org.apache.hadoop.hbase.executor.ExecutorService.submit(ExecutorService.java:148)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:328)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,153 DEBUG catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@19609cfa
2015-06-09 17:24:35,153 INFO client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x34d9d6915c806e3
2015-06-09 17:24:35,159 INFO zookeeper.ZooKeeper: Session: 0x34d9d6915c806e3 closed
2015-06-09 17:24:35,159 INFO zookeeper.ClientCnxn: EventThread shut down
2015-06-09 17:24:35,179 INFO master.SplitLogManager$TimeoutMonitor: Master1,60000,1433841863065.splitLogManagerTimeoutMonitor exiting
2015-06-09 17:24:35,267 INFO zookeeper.ZooKeeper: Session: 0x24d9d6826f906ef closed
2015-06-09 17:24:35,267 INFO zookeeper.ClientCnxn: EventThread shut down
2015-06-09 17:24:35,267 INFO master.HMaster: HMaster main thread exiting
2015-06-09 17:24:35,267 ERROR master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
      at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:194)
      at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
      at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
      at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2794)

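bob007 posted on 2015-6-9 21:32:15

First check whether any DataNode has gone down, especially Slave4. Then check whether the disks are full.
The error occurred while splitting the logs.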
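
To see which region servers are affected before checking their DataNodes one by one, the master log can be summarized. The following is a minimal Python sketch, not an HBase tool: it only counts the "failed log splitting for ..." messages that appear in the log in the first post. The function name failed_split_servers and the regular expression are illustrative assumptions.

```python
import re
from collections import Counter

# HMaster logs this message when splitting a dead region server's WALs fails, e.g.
# "java.io.IOException: failed log splitting for Slave4,60020,1433837704561, will retry"
FAILED_SPLIT = re.compile(r"failed log splitting for ([^,\s]+),(\d+),(\d+)")


def failed_split_servers(log_text):
    """Count 'failed log splitting' messages per region server host name."""
    return Counter(m.group(1) for m in FAILED_SPLIT.finditer(log_text))


# Three lines copied from the log in the first post.
sample = """\
java.io.IOException: failed log splitting for Slave4,60020,1433837704561, will retry
java.io.IOException: failed log splitting for Slave2,60020,1433837704549, will retry
java.io.IOException: failed log splitting for Slave3,60020,1433837704597, will retry
"""

for host, count in sorted(failed_split_servers(sample).items()):
    print(f"{host}: {count} failed split(s)")
```

In the posted log these messages show up for Slave4, Slave2 and Slave3, so the problem does not look specific to Slave4.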

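zhangshuai posted on 2015-6-10 09:08:35

I'm getting the same error and still haven't found a fix. The cluster starts up normally, then the nodes die one by one after a few minutes. Any answers would be appreciated!!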

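栎梓天冲 posted on 2015-6-10 10:31:35

bob007 posted on 2015-6-9 21:32
First check whether any DataNode has gone down, especially Slave4. Then check whether the disks are full.
The error occurred while splitting the logs.

It's not the disks; disk usage is only 3%.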

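Alkaloid0515 posted on 2015-6-10 12:24:48

zhangshuai posted on 2015-6-10 09:08
I'm getting the same error and still haven't found a fix. The cluster starts up normally, then the nodes die one by one after a few minutes. Any answers would be appreciated!!

There are many possible causes. Check the permissions, whether the DataNodes are usable, and whether there are zombie processes.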

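langyahun posted on 2015-6-10 17:02:00

It may be that after you switched HBase versions, ZooKeeper still holds the settings left by the previous HBase version, and that causes a conflict.
1. Change to ZooKeeper's bin directory;
2. Run: sh zkCli.sh
3. Enter 'ls /'
4. Enter 'rmr /hbase'
5. Exit
Then restart HBase.
Hope this helps.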
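
The steps above can also be scripted. Below is a minimal Python sketch under several assumptions: zkCli.sh sits in the current directory, ZooKeeper listens on localhost:2181, and HBase uses its default parent znode /hbase. The helper names zk_cleanup_commands and run are made up for illustration. It defaults to a dry run that only prints the commands, because removing /hbase throws away whatever state HBase keeps in ZooKeeper; it is probably safest to stop HBase first.

```python
import subprocess


def zk_cleanup_commands(zkcli="./zkCli.sh", server="localhost:2181", znode="/hbase"):
    """Build the zkCli.sh invocations for steps 3 and 4 as argv lists."""
    base = ["sh", zkcli, "-server", server]
    return [
        base + ["ls", "/"],     # step 3: confirm that the znode is there
        base + ["rmr", znode],  # step 4: recursively delete HBase's znode
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


run(zk_cleanup_commands())  # dry run: prints the two commands, changes nothing
```

Calling run(zk_cleanup_commands(), dry_run=False) would actually execute them. Note that rmr is the command name in the ZooKeeper 3.4 CLI; newer releases may expect a different command.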

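栎梓天冲 posted on 2015-6-11 14:42:52

langyahun posted on 2015-6-10 17:02
It may be that after you switched HBase versions, ZooKeeper still holds the settings left by the previous HBase version, and that causes a conflict.
1. Change to ZooKeeper's ...

Thanks. After I rebooted the machines, everything worked.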

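栎梓天冲 posted on 2015-6-11 14:44:12

Alkaloid0515 posted on 2015-6-10 12:24
There are many possible causes. Check the permissions, whether the DataNodes are usable, and whether there are zombie processes.

I rebooted the machines, left them alone for a day, and it recovered on its own.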

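栎梓天冲 posted on 2015-6-11 14:46:52

zhangshuai posted on 2015-6-10 09:08
I'm getting the same error and still haven't found a fix. The cluster starts up normally, then the nodes die one by one after a few minutes. Any answers would be appreciated!!

Restart the cluster (shut down and reboot the servers) and leave it alone for a while. That's how mine got fixed, though I never really understood why.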

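flysky0802 posted on 2015-6-16 17:50:02

Reboot the servers?? If this were a production system with dozens of machines, you'd be done for!

HBase has SplitLog parameters that can be tuned; consider lengthening the timeout or turning off log writes!
Then run an HBase repair.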
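
The post does not name the parameter. One candidate is hbase.splitlog.manager.timeout, the split-log manager's timeout in milliseconds; that this is the setting meant here is an assumption. The Python sketch below just renders such a property in hbase-site.xml form so it can be pasted into the config. The helper name hbase_property and the value 600000 are illustrative, not a recommendation. The repair step presumably refers to the hbase hbck tool.

```python
import xml.etree.ElementTree as ET


def hbase_property(name, value):
    """Render one <property> element in hbase-site.xml form."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


# Assumed setting: split-log manager timeout in milliseconds.
# 600000 (10 minutes) is only an example value.
print(hbase_property("hbase.splitlog.manager.timeout", 600000))
```

The printed element belongs inside the <configuration> element of hbase-site.xml on the master, which has to be restarted to pick it up.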
