
NodeManager fails to start after a node is removed through the CM web UI

grinsky posted on 2016-6-15 14:52:58
After I removed a node through the CM web UI, the NodeManager can no longer start.
The NodeManager log is as follows:
[mw_shl_code=xml,true]2016-06-15 14:47:54,755 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2016-06-15 14:47:54,755 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2016-06-15 14:47:54,756 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2016-06-15 14:47:54,756 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:255)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
        at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
        at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
        at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.<init>(YarnSecurityTokenProtos.java:1815)
        at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.<init>(YarnSecurityTokenProtos.java:1779)
        at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:1939)
        at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:1934)
        at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
        at org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.parseFrom(YarnSecurityTokenProtos.java:2444)
        at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:172)
        at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:142)
        at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:271)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:298)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:254)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:237)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        ... 5 more
2016-06-15 14:47:54,759 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at hadoop4/172.16.0.6
************************************************************/[/mw_shl_code]
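
A trace this long is easier to read when only the deepest `Caused by` and the NodeManager's own frames are kept. The sketch below is a hypothetical helper (`summarize_trace` and its regexes are illustrative, not part of Hadoop or CM), run on an abbreviated excerpt of the log above. On that excerpt it suggests the protobuf parse error is raised while `ContainerManagerImpl.recover` is decoding a container token during startup.

```python
import re

# Abbreviated excerpt of the NodeManager log above (frames omitted for brevity).
LOG_EXCERPT = """\
2016-06-15 14:47:54,756 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
        at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
        at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:298)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:254)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:237)
"""

# "    at some.package.Class.method(File.java:123)"
FRAME_RE = re.compile(r"^\s+at\s+([\w.$<>]+)\(")
# "Caused by: some.package.Exception: message"
CAUSE_RE = re.compile(r"^Caused by:\s+([\w.$]+):\s*(.*)$")


def summarize_trace(text, package_prefix):
    """Return (root_cause, frames) for the last 'Caused by' section.

    root_cause is (exception_class, message) or None; frames lists the
    methods under that cause whose names start with package_prefix.
    """
    root_cause = None
    frames = []
    for line in text.splitlines():
        cause = CAUSE_RE.match(line)
        if cause:
            root_cause = (cause.group(1), cause.group(2))
            frames = []  # keep only frames under the deepest cause
            continue
        frame = FRAME_RE.match(line)
        if frame and root_cause and frame.group(1).startswith(package_prefix):
            frames.append(frame.group(1))
    return root_cause, frames


root_cause, frames = summarize_trace(
    LOG_EXCERPT, "org.apache.hadoop.yarn.server.nodemanager"
)
print(root_cause)
for name in frames:
    print("  ", name)
```

The first NodeManager frame under the root cause is `recoverContainer`, which is why the failure looks tied to recovered state rather than to the metrics shutdown messages printed just before it.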


ResourceManager log:
[mw_shl_code=xml,true]2016-06-15 14:16:05,451 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mapred/hadoop5@ATM.COM (auth:KERBEROS) cause:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1464339568023_0001' doesn't exist in RM.
2016-06-15 14:16:05,451 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 172.16.0.18:51185 Call#498 Retry#0
org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1464339568023_0001' doesn't exist in RM.
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324)
        at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
        at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
2016-06-15 14:16:05,458 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mapred/hadoop5@ATM.COM (auth:KERBEROS) cause:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1464590240778_0001' doesn't exist in RM.
2016-06-15 14:16:05,458 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 172.16.0.18:51185 Call#501 Retry#0
org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1464590240778_0001' doesn't exist in RM.
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324)
        at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
        at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
2016-06-15 14:25:37,974 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Release request cache is cleaned up
2016-06-15 14:35:59,342 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: RECEIVED SIGNAL 15: SIGTERM
2016-06-15 14:35:59,352 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@hadoop5:8088
2016-06-15 14:35:59,455 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2016-06-15 14:35:59,456 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-06-15 14:35:59,457 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
2016-06-15 14:35:59,456 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
2016-06-15 14:35:59,461 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
2016-06-15 14:35:59,459 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-06-15 14:35:59,460 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8033
2016-06-15 14:35:59,462 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ResourceManager metrics system...
2016-06-15 14:35:59,466 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system stopped.
2016-06-15 14:35:59,468 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system shutdown complete.
2016-06-15 14:35:59,468 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher is draining to stop, igonring any new events.
2016-06-15 14:35:59,470 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Delayed Deletion Thread Interrupted. Shutting it down
2016-06-15 14:35:59,470 WARN org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher: org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread interrupted. Returning.
2016-06-15 14:35:59,470 INFO org.apache.hadoop.ipc.Server: Stopping server on 8030
2016-06-15 14:35:59,471 INFO org.apache.hadoop.ipc.Server: Stopping server on 8031
2016-06-15 14:35:59,471 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-06-15 14:35:59,472 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8030
2016-06-15 14:35:59,530 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: NMLivelinessMonitor thread interrupted
2016-06-15 14:35:59,530 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-06-15 14:35:59,530 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8031
2016-06-15 14:35:59,535 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Returning, interrupted : java.lang.InterruptedException
2016-06-15 14:35:59,539 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Update thread interrupted. Exiting.
2016-06-15 14:36:00,546 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: Interrupted while waiting to reload alloc configuration
2016-06-15 14:36:00,547 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Continuous scheduling thread interrupted. Exiting.
java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:319)
2016-06-15 14:36:00,547 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2016-06-15 14:36:00,547 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: AMLivelinessMonitor thread interrupted
2016-06-15 14:36:00,547 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer thread interrupted
2016-06-15 14:36:00,547 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: AMLivelinessMonitor thread interrupted
2016-06-15 14:36:00,550 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state
2016-06-15 14:36:00,551 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at hadoop5/172.16.0.18
************************************************************/[/mw_shl_code]

3 comments

arsenduan posted on 2016-6-15 16:19:27
Was it working normally before this?
I suggest restarting it.

Joker posted on 2016-6-16 10:05:34
org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1464590240778_0001' doesn't exist in RM.

Were jobs still running before this? If those applications no longer exist, the RM has nothing to report on. Try a restart first.

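grinsky posted on 2016-6-20 13:19:47
I ended up deleting it and reinstalling (sob).
Deleting and reinstalling YARN alone did nothing. It looks as if CM only removes the service from the web management page and does not actually delete it from the system.
When I install it again, it is simply added back, with the same problems as before.
How can I truly reinstall a single service,
rather than reinstalling the whole of CDH + CM?
Otherwise this is far too much hassle.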
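
Since the NodeManager trace fails inside `ContainerManagerImpl.recover`, one thing that may be worth checking before another reinstall is whether recovery state is still on disk after the service was removed from CM. A minimal sketch, assuming the directory to inspect is the one set by `yarn.nodemanager.recovery.dir`; the path below is only a placeholder and `leftover_entries` is a hypothetical helper, not a CM or Hadoop tool.

```python
import os

# Placeholder: substitute the value of yarn.nodemanager.recovery.dir taken
# from the NodeManager's effective configuration on the affected host.
RECOVERY_DIR = "/path/to/nm-recovery-dir"


def leftover_entries(directory):
    """Return the sorted names found under `directory`, or [] if it is absent."""
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    entries = leftover_entries(RECOVERY_DIR)
    if entries:
        print("state left on disk under %s: %s" % (RECOVERY_DIR, entries))
    else:
        print("nothing found under %s" % RECOVERY_DIR)
```

The script only lists what is there; whether any leftover state is safe to move aside depends on the cluster and is not something this thread confirms.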