Kafka also shuts down for no apparent reason

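caiyifeng posted on 2015-4-23 10:00:12
I just posted a thread asking why Storm keeps shutting down for no apparent reason.
I have now found that Kafka does the same: it runs normally all day, but when I check the next morning it has always shut down.

The log shows the following: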
[2015-04-22 19:24:40,960] INFO [BrokerChangeListener on Controller 0]: Broker change listener fired for path /brokers/ids with children 0 (kafka.controller.ReplicaStateMachine$BrokerChangeListener)
[2015-04-22 19:24:40,966] INFO [BrokerChangeListener on Controller 0]: Newly added brokers: , deleted brokers: 2, all live brokers: 0 (kafka.controller.ReplicaStateMachine$BrokerChangeListener)
[2015-04-22 19:24:40,967] INFO [Controller-0-to-broker-2-send-thread], Shutting down (kafka.controller.RequestSendThread)
[2015-04-22 19:24:40,967] INFO [Controller-0-to-broker-2-send-thread], Stopped  (kafka.controller.RequestSendThread)
[2015-04-22 19:24:40,967] INFO [Controller-0-to-broker-2-send-thread], Shutdown completed (kafka.controller.RequestSendThread)
[2015-04-22 19:24:40,968] INFO [Controller 0]: Broker failure callback for 2 (kafka.controller.KafkaController)
[2015-04-22 19:24:40,968] INFO [Controller 0]: Removed ArrayBuffer() from list of shutting down brokers. (kafka.controller.KafkaController)
[2015-04-22 19:24:40,968] INFO [Partition state machine on Controller 0]: Invoking state change to OfflinePartition for partitions [poc-dataTopic,0] (kafka.controller.PartitionStateMachine)
[2015-04-22 19:24:40,976] DEBUG [OfflinePartitionLeaderSelector]: Some broker in ISR is alive for [poc-dataTopic,0]. Select 0 from ISR 0 to be the leader. (kafka.controller.OfflinePartitionLeaderSelector)
[2015-04-22 19:24:40,977] INFO [OfflinePartitionLeaderSelector]: Selected new leader and ISR {"leader":0,"leader_epoch":2,"isr":[0]} for offline partition [poc-dataTopic,0] (kafka.controller.OfflinePartitionLeaderSelector)
[2015-04-22 19:24:40,981] DEBUG [Partition state machine on Controller 0]: After leader election, leader cache is updated to Map([poc-dataTopic,1] -> (Leader:0,ISR:0,2,LeaderEpoch:1,ControllerEpoch:1), [poc-dataTopic,0] -> (Leader:0,ISR:0,LeaderEpoch:2,ControllerEpoch:1)) (kafka.controller.PartitionStateMachine)
[2015-04-22 19:24:40,982] INFO [Replica state machine on controller 0]: Invoking state change to OfflineReplica for replicas [Topic=poc-dataTopic,Partition=1,Replica=2],[Topic=poc-dataTopic,Partition=0,Replica=2] (kafka.controller.ReplicaStateMachine)
[2015-04-22 19:24:40,982] DEBUG [Controller 0]: Removing replica 2 from ISR 0,2 for partition [poc-dataTopic,1]. (kafka.controller.KafkaController)
[2015-04-22 19:24:40,988] INFO [Controller 0]: New leader and ISR for partition [poc-dataTopic,1] is {"leader":0,"leader_epoch":2,"isr":[0]} (kafka.controller.KafkaController)
[2015-04-22 19:24:40,989] DEBUG [Controller 0]: Removing replica 2 from ISR 0 for partition [poc-dataTopic,0]. (kafka.controller.KafkaController)
[2015-04-22 19:24:40,993] WARN [Controller 0]: Cannot remove replica 2 from ISR of partition [poc-dataTopic,0] since it is not in the ISR. Leader = 0 ; ISR = List(0) (kafka.controller.KafkaController)
[2015-04-22 19:24:40,994] DEBUG The stop replica request (delete = true) sent to broker 2 is  (kafka.controller.ControllerBrokerRequestBatch)
[2015-04-22 19:24:40,994] DEBUG The stop replica request (delete = false) sent to broker 2 is [Topic=poc-dataTopic,Partition=1,Replica=2],[Topic=poc-dataTopic,Partition=0,Replica=2] (kafka.controller.ControllerBrokerRequestBatch)
[2015-04-22 19:24:40,994] WARN [Channel manager on controller 0]: Not sending request Name: StopReplicaRequest; Version: 0; CorrelationId: 18; ClientId: ; DeletePartitions: false; ControllerId: 0; ControllerEpoch: 1; Partitions: [poc-dataTopic,1] to broker 2, since it is offline. (kafka.controller.ControllerChannelManager)
[2015-04-22 19:24:40,994] WARN [Channel manager on controller 0]: Not sending request Name: StopReplicaRequest; Version: 0; CorrelationId: 18; ClientId: ; DeletePartitions: false; ControllerId: 0; ControllerEpoch: 1; Partitions: [poc-dataTopic,0] to broker 2, since it is offline. (kafka.controller.ControllerChannelManager)
[2015-04-22 19:24:40,996] INFO [Controller-0-to-broker-0-send-thread], Shutting down (kafka.controller.RequestSendThread)
[2015-04-22 19:24:40,998] WARN [Controller-0-to-broker-0-send-thread], Controller 0 fails to send a request to broker id:0,host:master2.hadoopdomain,port:9092 (kafka.controller.RequestSendThread)
java.nio.channels.AsynchronousCloseException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
        at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:387)
        at kafka.utils.Utils$.read(Utils.scala:375)
        at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
        at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
        at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
        at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
        at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:146)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[2015-04-22 19:24:40,999] INFO [Controller-0-to-broker-0-send-thread], Stopped  (kafka.controller.RequestSendThread)
[2015-04-22 19:24:40,999] INFO [Controller-0-to-broker-0-send-thread], Shutdown completed (kafka.controller.RequestSendThread)
[2015-04-22 19:24:41,000] INFO [Controller 0]: Controller shutdown complete (kafka.controller.KafkaController)
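
The `BrokerChangeListener` line in the log above is the first sign of trouble: broker 2 dropped out of `/brokers/ids` and only broker 0 remained. When hunting for such events across a large `controller.log`, a small scanner can help. The following is a minimal sketch, not part of Kafka: the class name `BrokerChangeScan` and its methods are hypothetical, and the regular expression assumes the exact line format shown above, which may differ in other Kafka versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts broker ids from BrokerChangeListener log lines.
public class BrokerChangeScan {
    // Assumes the line format seen in the log above.
    private static final Pattern CHANGE = Pattern.compile(
        "Newly added brokers: (.*?), deleted brokers: (.*?), all live brokers: (.*?) \\(");

    // Parses an id list such as "0,2"; an empty field yields an empty list.
    public static List<Integer> parseIds(String field) {
        List<Integer> ids = new ArrayList<>();
        for (String part : field.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                ids.add(Integer.parseInt(trimmed));
            }
        }
        return ids;
    }

    // Returns the deleted broker ids, or an empty list if the line does not match.
    public static List<Integer> deletedBrokers(String line) {
        Matcher m = CHANGE.matcher(line);
        return m.find() ? parseIds(m.group(2)) : new ArrayList<>();
    }

    // Returns the live broker ids, or an empty list if the line does not match.
    public static List<Integer> liveBrokers(String line) {
        Matcher m = CHANGE.matcher(line);
        return m.find() ? parseIds(m.group(3)) : new ArrayList<>();
    }

    public static void main(String[] args) {
        String line = "[2015-04-22 19:24:40,966] INFO [BrokerChangeListener on Controller 0]: "
            + "Newly added brokers: , deleted brokers: 2, all live brokers: 0 "
            + "(kafka.controller.ReplicaStateMachine$BrokerChangeListener)";
        System.out.println("deleted=" + deletedBrokers(line) + " live=" + liveBrokers(line));
    }
}
```

Run against the line from this thread, it reports broker 2 as deleted and broker 0 as the only live broker, which gives a timestamp to compare against the logs of broker 2 itself and of ZooKeeper.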

5 comments
caiyifeng posted on 2015-4-23 10:06:28
I checked the ZooKeeper log for the same time window and found the errors below. I am not sure whether they are related:
2015-04-22 19:24:11,097 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.16:38668 which had sessionid 0x4ce05462480021
2015-04-22 19:24:11,100 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.15:45085 which had sessionid 0x4ce05462480020
2015-04-22 19:24:11,175 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.17:38168 which had sessionid 0x4ce0546248002c
2015-04-22 19:24:11,179 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.17:38166 which had sessionid 0x4ce05462480029
2015-04-22 19:24:11,185 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.16:38778 which had sessionid 0x4ce0546248002f
2015-04-22 19:24:11,195 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.16:38775 which had sessionid 0x4ce0546248002e
2015-04-22 19:24:12,393 [myid:0] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x4ce05462480030, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-04-22 19:24:12,395 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.15:45698 which had sessionid 0x4ce05462480030
2015-04-22 19:24:12,407 [myid:0] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x4ce0546248002d, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-04-22 19:24:12,408 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.16:38769 which had sessionid 0x4ce0546248002d
2015-04-22 19:24:12,408 [myid:0] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x4ce05462480032, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
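
Note that the ZooKeeper entries above are stamped 19:24:11 to 19:24:12, roughly 29 seconds before the Kafka controller entries at 19:24:40. One way to judge whether they are related is to list which client addresses lost their sessions and compare them with the broker hosts. The following is a minimal sketch under that assumption: the class name `ZkClosedSessionScan` is hypothetical, and the pattern is based only on the "Closed socket connection" lines shown above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects session id -> client IP from ZooKeeper server log lines.
public class ZkClosedSessionScan {
    // Assumes the "Closed socket connection" line format seen in the log above.
    private static final Pattern CLOSED = Pattern.compile(
        "Closed socket connection for client /([0-9.]+):(\\d+) which had sessionid (0x[0-9a-fA-F]+)");

    // Returns a map from session id to client IP, in log order.
    public static Map<String, String> closedSessions(Iterable<String> lines) {
        Map<String, String> sessions = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = CLOSED.matcher(line);
            if (m.find()) {
                sessions.put(m.group(3), m.group(1));
            }
        }
        return sessions;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2015-04-22 19:24:11,097 [myid:0] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.50.5.16:38668 which had sessionid 0x4ce05462480021",
            "2015-04-22 19:24:12,393 [myid:0] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@357] - caught end of stream exception");
        closedSessions(lines).forEach((id, ip) -> System.out.println(id + " -> " + ip));
    }
}
```

If the addresses that closed their sessions include the host running broker 2, the ZooKeeper log and the controller log are probably describing the same event from two sides.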

jixianqiuxue posted on 2015-4-23 12:11:15
caiyifeng posted on 2015-4-23 10:06
I checked the ZooKeeper log for the same time window and found the errors below. I am not sure whether they are related:
2015-04-22 19:24:11,097 [myid:0] - IN ...

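java.nio.channels.AsynchronousCloseException
A checked exception received by a thread when another thread closes the channel, or the part of the channel, on which it is blocked in an I/O operation.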
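
To make the exception concrete, here is a minimal sketch that reproduces it with the JDK alone. It is not Kafka code: the class name `AsyncCloseDemo` is hypothetical, and a `java.nio.channels.Pipe` stands in for the controller's socket. One thread blocks in `read()` on a channel that never receives data, and another thread closes that channel. The short sleep is a simplification to let the reader reach the blocking call first.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo: closing a channel while another thread is blocked reading from it.
public class AsyncCloseDemo {
    // Returns the exception the blocked reader received, or null if it saw none.
    public static Throwable closeWhileReading() throws Exception {
        Pipe pipe = Pipe.open();
        Pipe.SourceChannel source = pipe.source();
        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch aboutToRead = new CountDownLatch(1);

        Thread reader = new Thread(() -> {
            try {
                aboutToRead.countDown();
                // Blocks here: nothing is ever written to the pipe.
                source.read(ByteBuffer.allocate(16));
            } catch (Exception e) {
                seen.set(e);
            }
        });
        reader.start();

        aboutToRead.await();
        Thread.sleep(300);   // simplification: give the reader time to block in read()
        source.close();      // closing from this thread wakes the blocked reader
        reader.join();
        pipe.sink().close();
        return seen.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(closeWhileReading());
    }
}
```

In the Kafka log at the top of the thread, the WARN with this exception appears 2 ms after `[Controller-0-to-broker-0-send-thread], Shutting down`. That ordering is consistent with the exception being a side effect of the controller closing its own channel during shutdown, rather than the cause of the shutdown.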


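caiyifeng posted on 2015-4-24 09:42:14
jixianqiuxue posted on 2015-4-23 12:11
java.nio.channels.AsynchronousCloseException
A checked exception received by a thread blocked in an I/O operation on a channel ...

Do you know how to fix this? No jobs are running on the cluster, so nothing should be blocking.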

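howtodown posted on 2015-4-24 15:18:51
caiyifeng posted on 2015-4-24 09:42
Do you know how to fix this? No jobs are running on the cluster, so nothing should be blocking.

Something went wrong in the data exchange between client and server. To track it down you need to understand how ZooKeeper works.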

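howtodown posted on 2015-4-24 15:21:32
caiyifeng posted on 2015-4-24 09:42
Do you know how to fix this? No jobs are running on the cluster, so nothing should be blocking.

See this article for reference:

ZooKeeper Source Code Analysis, Part 2: Session Establishment (Zookeeper源码分析之二Session建立)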