
HBase HMaster dies right after starting

My Hadoop is 2.5.0 and my HBase is 0.98. I previously had Hadoop 2.6 and a different version of HBase installed.

2015-06-09 17:24:34,256 DEBUG [main-EventThread] master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitWAL/WALs%2FSlave4%2C60020%2C1433837704561-splitting%2FSlave4%252C60020%252C1433837704561.1433837711368.meta
2015-06-09 17:24:34,260 WARN  [master:Master1:60000] master.SplitLogManager: returning success without actually splitting and deleting all the log files in path hdfs://Master1:9000/hbase/WALs/Slave4,60020,1433837704561-splitting
2015-06-09 17:24:34,260 INFO  [master:Master1:60000] master.SplitLogManager: finished splitting (more than or equal to) 0 bytes in 1 log files in [hdfs://Master1:9000/hbase/WALs/Slave4,60020,1433837704561-splitting] in 4308ms
2015-06-09 17:24:34,360 INFO  [master:Master1:60000] catalog.CatalogTracker: Failed verification of hbase:meta,,1 at address=Slave4,60020,1433837704561, exception=org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on Slave4,60020,1433841863913
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2695)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4139)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:3506)
        at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:20158)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Thread.java:745)
        
        
lave1,60020,1433841863879
2015-06-09 17:24:35,068 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-0] master.SplitLogManager: Stopped while waiting for log splits to be completed
2015-06-09 17:24:35,068 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-0] master.SplitLogManager: error while splitting logs in [hdfs://Master1:9000/hbase/WALs/Slave4,60020,1433837704561-splitting] installed = 1 but only 0 done
2015-06-09 17:24:35,069 DEBUG [MASTER_SERVER_OPERATIONS-Master1:60000-0] master.DeadServer: Finished processing Slave4,60020,1433837704561
2015-06-09 17:24:35,069 DEBUG [MASTER_SERVER_OPERATIONS-Master1:60000-4] master.DeadServer: Finished processing Slave4,60020,1433837704561
2015-06-09 17:24:35,069 ERROR [MASTER_SERVER_OPERATIONS-Master1:60000-0] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for Slave4,60020,1433837704561, will retry
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:330)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://Master1:9000/hbase/WALs/Slave4,60020,1433837704561-splitting] Task = installed = 1 done = 0 error = 0
        at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:378)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:415)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:389)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:287)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:203)
        ... 4 more
2015-06-09 17:24:35,069 ERROR [MASTER_SERVER_OPERATIONS-Master1:60000-4] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:185)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,102 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-1] master.SplitLogManager: Stopped while waiting for log splits to be completed
2015-06-09 17:24:35,102 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-1] master.SplitLogManager: error while splitting logs in [hdfs://Master1:9000/hbase/WALs/Slave2,60020,1433837704549-splitting] installed = 1 but only 0 done
2015-06-09 17:24:35,102 DEBUG [MASTER_SERVER_OPERATIONS-Master1:60000-1] master.DeadServer: Finished processing Slave2,60020,1433837704549
2015-06-09 17:24:35,103 DEBUG [MASTER_SERVER_OPERATIONS-Master1:60000-0] master.DeadServer: Finished processing Slave2,60020,1433837704549
2015-06-09 17:24:35,103 ERROR [MASTER_SERVER_OPERATIONS-Master1:60000-1] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for Slave2,60020,1433837704549, will retry
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:330)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://Master1:9000/hbase/WALs/Slave2,60020,1433837704549-splitting] Task = installed = 1 done = 0 error = 0
        at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:378)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:415)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:389)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:287)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:203)
        ... 4 more
2015-06-09 17:24:35,103 ERROR [MASTER_SERVER_OPERATIONS-Master1:60000-0] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:185)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,108 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-2] master.SplitLogManager: Stopped while waiting for log splits to be completed
2015-06-09 17:24:35,108 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-2] master.SplitLogManager: error while splitting logs in [hdfs://Master1:9000/hbase/WALs/Slave3,60020,1433837704597-splitting] installed = 1 but only 0 done
2015-06-09 17:24:35,108 DEBUG [MASTER_SERVER_OPERATIONS-Master1:60000-2] master.DeadServer: Finished processing Slave3,60020,1433837704597
2015-06-09 17:24:35,108 ERROR [MASTER_SERVER_OPERATIONS-Master1:60000-2] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for Slave3,60020,1433837704597, will retry
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:330)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://Master1:9000/hbase/WALs/Slave3,60020,1433837704597-splitting] Task = installed = 1 done = 0 error = 0
        at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:378)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:415)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:389)
        at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:287)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:203)
        ... 4 more
2015-06-09 17:24:35,108 DEBUG [MASTER_SERVER_OPERATIONS-Master1:60000-4] master.DeadServer: Finished processing Slave3,60020,1433837704597
2015-06-09 17:24:35,109 ERROR [MASTER_SERVER_OPERATIONS-Master1:60000-4] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:185)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,147 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-3] master.SplitLogManager: Interrupted while waiting for log splits to be completed
2015-06-09 17:24:35,147 WARN  [MASTER_SERVER_OPERATIONS-Master1:60000-3] master.SplitLogManager: error while splitting logs in [hdfs://Master1:9000/hbase/WALs/Slave1,60020,1433837704527-splitting] installed = 1 but only 0 done
2015-06-09 17:24:35,147 DEBUG [MASTER_SERVER_OPERATIONS-Master1:60000-3] master.DeadServer: Finished processing Slave1,60020,1433837704527
2015-06-09 17:24:35,148 ERROR [MASTER_SERVER_OPERATIONS-Master1:60000-3] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.util.concurrent.RejectedExecutionException: Task ServerShutdownHandler-Master1,60000,1433841863065-5 rejected from org.apache.hadoop.hbase.executor.ExecutorService$TrackingThreadPoolExecutor@63c6ec18[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 6]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
        at org.apache.hadoop.hbase.executor.ExecutorService$Executor.submit(ExecutorService.java:224)
        at org.apache.hadoop.hbase.executor.ExecutorService.submit(ExecutorService.java:148)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:328)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:210)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-06-09 17:24:35,153 DEBUG [master:Master1:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@19609cfa
2015-06-09 17:24:35,153 INFO  [master:Master1:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x34d9d6915c806e3
2015-06-09 17:24:35,159 INFO  [master:Master1:60000] zookeeper.ZooKeeper: Session: 0x34d9d6915c806e3 closed
2015-06-09 17:24:35,159 INFO  [master:Master1:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-06-09 17:24:35,179 INFO  [Master1,60000,1433841863065.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: Master1,60000,1433841863065.splitLogManagerTimeoutMonitor exiting
2015-06-09 17:24:35,267 INFO  [master:Master1:60000] zookeeper.ZooKeeper: Session: 0x24d9d6826f906ef closed
2015-06-09 17:24:35,267 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-06-09 17:24:35,267 INFO  [master:Master1:60000] master.HMaster: HMaster main thread exiting
2015-06-09 17:24:35,267 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:194)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2794)

11 comments

bob007 posted on 2015-6-9 21:32:15
First check whether any datanode has died, Slave4 in particular. Then check whether a disk is full.
The error occurred during the log split.
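
To see at a glance which region servers the master failed to split logs for, the master log can be scanned for the "failed log splitting for <host>,<port>,<startcode>" lines. Below is a minimal Python sketch, assuming the log format shown in the first post. The function name failed_split_servers is made up for illustration and is not part of HBase.

```python
import re
from collections import Counter

# Matches lines such as:
#   java.io.IOException: failed log splitting for Slave4,60020,1433837704561, will retry
FAILED_SPLIT = re.compile(
    r"failed log splitting for (?P<host>[^,\s]+),(?P<port>\d+),(?P<startcode>\d+)"
)


def failed_split_servers(log_text):
    """Count 'failed log splitting' errors per (host, port, startcode)."""
    counts = Counter()
    for match in FAILED_SPLIT.finditer(log_text):
        key = (match["host"], int(match["port"]), int(match["startcode"]))
        counts[key] += 1
    return counts
```

Reading the master log file into a string and passing it to failed_split_servers returns one counter entry per dead region server instance, which tells you which hosts' datanodes and WAL directories to inspect first.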

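zhangshuai posted on 2015-6-10 09:08:35
I have the same error and still have no solution. The cluster starts up normally, then the nodes die one by one after a few minutes. Does anyone have an answer?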

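栎梓天冲 posted on 2015-6-10 10:31:35
bob007 posted on 2015-6-9 21:32
First check whether any datanode has died, Slave4 in particular. Then check whether a disk is full.
The error occurred during the log split.

It's not the disk; usage is only 3%.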

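Alkaloid0515 posted on 2015-6-10 12:24:48
zhangshuai posted on 2015-6-10 09:08
I have the same error and still have no solution. The cluster starts up normally, then the nodes die one by one after a few minutes. Does anyone have an answer?

There are many possible causes. Check permissions, whether the datanodes are usable, and whether there are zombie processes.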

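langyahun posted on 2015-6-10 17:02:00
It may be that after the HBase version was changed, ZooKeeper still holds the settings from the previous HBase version, which causes a conflict.
1. Change to ZooKeeper's bin directory.
2. Run $ sh zkCli.sh
3. Enter 'ls /'
4. Enter 'rmr /hbase'
5. Quit.
Then restart HBase.
Hope this helps.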
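
The steps above can also be run non-interactively, since zkCli.sh accepts a single command on its command line. Here is a minimal Python sketch that only builds the two invocations. It assumes ZooKeeper 3.4.x (newer releases replace rmr with deleteall), a ZooKeeper server at localhost:2181, and the default HBase parent znode /hbase. The function name zk_cleanup_commands and the install path in the usage note are hypothetical. Stop HBase before running anything like this, because removing the znode erases all HBase state kept in ZooKeeper.

```python
def zk_cleanup_commands(zk_bin_dir, zk_server="localhost:2181", znode="/hbase"):
    """Build the zkCli.sh invocations: list the root, then remove the HBase znode.

    Nothing is executed here; the caller decides whether to run the commands.
    """
    zkcli = f"{zk_bin_dir.rstrip('/')}/zkCli.sh"
    return [
        [zkcli, "-server", zk_server, "ls", "/"],       # confirm /hbase exists
        [zkcli, "-server", zk_server, "rmr", znode],    # recursive delete (ZooKeeper 3.4.x)
    ]
```

For example, zk_cleanup_commands("/opt/zookeeper/bin") returns two argument lists that can be reviewed and then passed one at a time to subprocess.run(cmd, check=True).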

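栎梓天冲 posted on 2015-6-11 14:42:52
langyahun posted on 2015-6-10 17:02
It may be that after the HBase version was changed, ZooKeeper still holds the settings from the previous HBase version, which causes a conflict.
1. Change to ZooKeeper's ...

Thanks. Rebooting the machines fixed everything.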

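栎梓天冲 posted on 2015-6-11 14:44:12
Alkaloid0515 posted on 2015-6-10 12:24
There are many possible causes. Check permissions, whether the datanodes are usable, and whether there are zombie processes.

I rebooted the machines, left them for a day, and it recovered on its own.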

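栎梓天冲 posted on 2015-6-11 14:46:52
zhangshuai posted on 2015-6-10 09:08
I have the same error and still have no solution. The cluster starts up normally, then the nodes die one by one after a few minutes. Does anyone have an answer?

Restart the cluster (shut down and reboot the servers) and leave it alone for a while. That is how mine got fixed, and I don't really understand why.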

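flysky0802 posted on 2015-6-16 17:50:02
Reboot the servers?? With a few dozen machines in a production system, you would be finished!

HBase has SplitLog parameters that can be tuned: consider lengthening the timeout or turning off log writes.
Then run an HBase repair.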
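
The post does not name the parameter. One candidate, stated here as an assumption, is hbase.splitlog.manager.timeout (the split-log manager timeout in milliseconds) in hbase-site.xml. The Python sketch below renders such a property element. The helper name hbase_property and the value 600000 are illustrative only, not recommended settings.

```python
import xml.etree.ElementTree as ET


def hbase_property(name, value):
    """Render one <property> element in the form used by hbase-site.xml."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


# Assumed property name and an illustrative 10-minute timeout in milliseconds.
SPLITLOG_TIMEOUT_SNIPPET = hbase_property("hbase.splitlog.manager.timeout", 600000)
```

The resulting string goes inside the <configuration> element of hbase-site.xml on the master, and the master needs a restart to pick it up. Check the property against the documentation for your HBase version before relying on it.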