HRegionServer crashes frequently
Last edited by ighack on 2019-5-21 09:17
I found the following in the regionserver log:
2019-05-20 21:10:46,802 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=moniser:2181,basappser2:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@3a03464
2019-05-20 21:10:46,826 INFO zookeeper.ClientCnxn: Opening socket connection to server moniser/192.168.0.238:2181. Will not attempt to authenticate using SASL (unknown error)
2019-05-20 21:10:46,831 INFO zookeeper.ClientCnxn: Socket connection established to moniser/192.168.0.238:2181, initiating session
2019-05-20 21:10:46,839 INFO zookeeper.ClientCnxn: Session establishment complete on server moniser/192.168.0.238:2181, sessionid = 0x16ac4a4aeec000c, negotiated timeout = 80000
In the .out log I found:
hbase-daemon.sh: line 226:7824 Killed nice -n $HBASE_NICENESS "$HBASE_HOME"/bin/hbase --config "${HBASE_CONF_DIR}" $command "$@" start >> ${HBASE_LOGOUT} 2>&1
And in the ZooKeeper log:
2019-05-20 21:10:47,175 - WARN - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x16ac4a4aeec000c, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:748)
2019-05-20 21:10:47,176 - INFO - Closed socket connection for client /192.168.0.238:26831 which had sessionid 0x16ac4a4aeec000c
2019-05-21 02:14:40,886 - INFO - Accepted socket connection from /127.0.0.1:15554
2019-05-21 02:14:42,886 - WARN - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:748)
2019-05-21 02:14:42,886 - INFO - Closed socket connection for client /127.0.0.1:15554 (no session established for client)
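ZooKeeper's tickTime=40000
You could set the timeout a bit longer:
zookeeper.session.timeout.ms=400000
Also, why are all the connections coming from the local machine? If that is always the case, the configuration may be wrong, which would mean a network-level problem such as a misconfigured hosts file, hostname, or IP address.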
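The log above requests sessionTimeout=30000 but reports "negotiated timeout = 80000". That is consistent with how a ZooKeeper server bounds session timeouts: unless minSessionTimeout and maxSessionTimeout are set in zoo.cfg, they default to 2 x tickTime and 20 x tickTime. A minimal Python sketch of that clamping follows; the function name is made up for illustration and is not part of any ZooKeeper API. Note also that in hbase-site.xml the property is named zookeeper.session.timeout (in milliseconds).

```python
def negotiated_timeout(requested_ms, tick_time_ms, min_ms=None, max_ms=None):
    """Clamp a requested session timeout the way a ZooKeeper server does.

    If minSessionTimeout / maxSessionTimeout are not configured, the server
    uses 2 * tickTime and 20 * tickTime as the bounds.
    """
    low = 2 * tick_time_ms if min_ms is None else min_ms
    high = 20 * tick_time_ms if max_ms is None else max_ms
    return max(low, min(requested_ms, high))


# Values from this thread: client asks for 30000 ms, tickTime is 40000 ms.
print(negotiated_timeout(30000, 40000))   # 80000, as in the log

# With a typical tickTime of 2000 ms, a 400000 ms request is capped.
print(negotiated_timeout(400000, 2000))   # 40000
```

In other words, with tickTime=40000 no session can be shorter than 80 seconds, and with a small tickTime a very long requested timeout only takes effect if maxSessionTimeout is raised as well.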
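Last edited by ighack on 2019-5-21 10:54
I only have two machines running HBase,
A and B.
ZooKeeper is installed on both of them.
The one that keeps crashing is A; B rarely crashes.
I haven't found any misspelled hostnames in the configuration,
and the hosts file is correct too.
The timeout is 80 seconds, which is already very long.
And HBase doesn't die right after it starts:
sometimes it runs for 2 days, sometimes for 1 day.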
ighack posted on 2019-5-21 10:47:
I only have two machines running HBase,
A and B.
ZooKeeper is installed on both of them.
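ZooKeeper relies on leader election, so either run it pseudo-distributed with all three instances configured on the local machine, or use three virtual machines.
Installing it on only two machines is very likely to cause problems, and the cause will be hard to track down.
Since you are just starting out, follow the standard path.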
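The reasoning behind "two is not enough" is majority arithmetic: a ZooKeeper ensemble stays available only while more than half of its servers can talk to each other. A short Python sketch of that calculation (the helper names are illustrative only):

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 5):
    print(f"servers={n} quorum={quorum_size(n)} "
          f"tolerated_failures={tolerated_failures(n)}")
```

For 2 servers the quorum is 2, so losing either one stops the ensemble; a 2-node ensemble tolerates no more failures than a single node does. With 3 servers the quorum is still 2, so one server can fail.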
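I'd like three machines too, but the company has no resources to give me.
This HBase is only used for Pinpoint monitoring; it is not a critical business component.
ighack posted on 2019-5-21 11:59:
I'd like three machines too, but the company has no resources to give me.
This HBase is only used for Pinpoint monitoring; it is not a critical business component.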
Then I recommend a pseudo-distributed setup.
A single IP address is enough:
server.1=192.168.1.201:2888:3888
server.2=192.168.1.201:2889:3889
server.3=192.168.1.201:2890:3890
Recommended reading:
ZooKeeper introduction, pseudo-distributed cluster installation, and usage
http://www.aboutyun.com/forum.php?mod=viewthread&tid=9097
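Besides the server.N lines, each instance on a shared host needs its own clientPort, its own dataDir, and a myid file in that dataDir containing its server number. The Python sketch below generates those files' contents for three instances. The directory layout, the client ports 2181 to 2183, and the tickTime/initLimit/syncLimit values are assumptions chosen for illustration, not settings taken from this thread.

```python
def pseudo_cluster_configs(host, base_dir, n=3, tick_time=2000):
    """Return {server_id: {"zoo.cfg": text, "myid": text}} for n instances
    that all run on the same host (pseudo-distributed mode)."""
    # Quorum/election ports follow the pattern from the post: 2888:3888, 2889:3889, ...
    servers = [f"server.{i}={host}:{2887 + i}:{3887 + i}" for i in range(1, n + 1)]
    configs = {}
    for i in range(1, n + 1):
        lines = [
            f"tickTime={tick_time}",                # assumed value
            "initLimit=10",                         # assumed value
            "syncLimit=5",                          # assumed value
            f"dataDir={base_dir}/zk{i}/data",       # hypothetical layout
            f"clientPort={2180 + i}",               # must differ per instance
        ] + servers
        configs[i] = {"zoo.cfg": "\n".join(lines) + "\n", "myid": f"{i}\n"}
    return configs


cfgs = pseudo_cluster_configs("192.168.1.201", "/data/zookeeper")
print(cfgs[1]["zoo.cfg"])
```

Each generated zoo.cfg would be written to its own configuration directory, and each myid written into the matching dataDir, before starting the three instances one at a time.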
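ZooKeeper's tickTime is the basic time unit and should not be set that large; anything under 10 seconds is plenty. Instead, increase HBase's ZooKeeper timeout and RPC timeout.
I looked into it, and it is mainly the ZooKeeper timeout.
On the HBase side I set it to 80000, the same as tickTime.
Last edited by ighack on 2019-6-5 09:31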
2019-06-04 18:34:20,376 INFO regionserver.HRegionServer: moniser,16020,1559617721856-MemstoreFlusherChore requesting flush of TraceV2,&\x00\x00\x00\x00\x00\x00\x00,1559114757691.8ba4163358f97beb059bbe066b15c6c5. because S has an old edit so flush to free WALs after random delay 194067ms
2019-06-04 18:34:20,377 INFO regionserver.HRegionServer: moniser,16020,1559617721856-MemstoreFlusherChore requesting flush of TraceV2,\xCC\x00\x00\x00\x00\x00\x00\x00,1559114757691.2e55b6f50195981b03373a114c155b4e. because S has an old edit so flush to free WALs after random delay 179269ms
2019-06-04 18:34:20,377 INFO regionserver.HRegionServer: moniser,16020,1559617721856-MemstoreFlusherChore requesting flush of ApplicationMapStatisticsSelf_Ver2,\x05\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00,1559114765629.233d41a8e943fd1e4ddb29b694b13dd0. because C has an old edit so flush to free WALs after random delay 242343ms
2019-06-04 18:34:20,828 INFO regionserver.HStore: Completed compaction of 3 (all) file(s) in S of TraceV2,\xFF\x00\x00\x00\x00\x00\x00\x00,1559114757691.aca60625a1273b36a87e5affac2fd09a. into b1c00fc5a642412db370dbef1d5acbaf(size=82.6 M), total size for store is 82.6 M. This selection was in queue for 0sec, and took 1sec to execute.
2019-06-04 18:34:20,828 INFO regionserver.CompactSplitThread: Completed compaction: Request = regionName=TraceV2,\xFF\x00\x00\x00\x00\x00\x00\x00,1559114757691.aca60625a1273b36a87e5affac2fd09a., storeName=S, fileCount=3, fileSize=82.7 M (81.0 M, 877.4 K, 819.6 K), priority=7, time=7788170363172924; duration=1sec
2019-06-04 18:34:29,089 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:host.name=moniser
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_131
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.8.0_131/jre
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/pinpoint/app/pinpoint/hbase-1.3.1/bin/../conf:/usr/java/jdk1.8.0_131/lib/tools.jar:
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:os.version=3.10.0-862.14.4.el7.x86_64
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:user.name=pinpoint
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:user.home=/pinpoint/app
2019-06-04 18:34:29,090 INFO zookeeper.ZooKeeper: Client environment:user.dir=/pinpoint/app/pinpoint/hbase-1.3.1
2019-06-04 18:34:29,092 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=moniser:2181,basappser2:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@3a03464
2019-06-04 18:34:29,113 INFO zookeeper.ClientCnxn: Opening socket connection to server moniser/192.168.0.238:2181. Will not attempt to authenticate using SASL (unknown error)
2019-06-04 18:34:29,118 INFO zookeeper.ClientCnxn: Socket connection established to moniser/192.168.0.238:2181,initiating session
2019-06-04 18:34:29,126 INFO zookeeper.ClientCnxn: Session establishment complete on server moniser/192.168.0.238:2181, sessionid =0x16b01d9f02d003c, negotiated timeout = 80000
I recently found logs like these again.
GC looks perfectly normal:
2019-06-04T18:33:02.284+0800: 26661.606: 1145281K->730766K(1554228K), 0.0064589 secs]
2019-06-04T18:33:09.949+0800: 26669.272: 1143066K->755126K(1554228K), 0.0066203 secs]
2019-06-04T18:33:13.576+0800: 26672.899: 1167478K->737788K(1554228K), 0.0144429 secs]
2019-06-04T18:33:24.612+0800: 26683.935: 1150140K->745860K(1554228K), 0.0058815 secs]
2019-06-04T18:33:30.032+0800: 26689.355: 1158212K->758564K(1554228K), 0.0063865 secs]
2019-06-04T18:33:33.507+0800: 26692.830: 1170877K->769817K(1554228K), 0.0049250 secs]
2019-06-04T18:33:37.921+0800: 26697.244: 1182169K->752898K(1554228K), 0.0080765 secs]
2019-06-04T18:33:49.647+0800: 26708.970: 1165250K->780087K(1554228K), 0.0068562 secs]
2019-06-04T18:33:55.591+0800: 26714.914: 1192439K->762115K(1554228K), 0.0061978 secs]
2019-06-04T18:34:05.792+0800: 26725.115: 1174467K->767322K(1554228K), 0.0072509 secs]
2019-06-04T18:34:16.262+0800: 26735.585: 1179674K->770432K(1554228K), 0.0050493 secs]
2019-06-04T18:34:17.823+0800: 26737.146: 1182773K->774634K(1554228K), 0.0071750 secs]
2019-06-04T18:34:20.759+0800: 26740.082: 1186986K->776495K(1554228K), 0.0084955 secs]
2019-06-04 18:34:29,118 - INFO - Accepted socket connection from /192.168.0.238:7785
2019-06-04 18:34:29,121 - INFO - Client attempting to establish new session at /192.168.0.238:7785
2019-06-04 18:34:29,124 - INFO - Established session 0x16b01d9f02d003c with negotiated timeout 80000 for client /192.168.0.238:7785
2019-06-04 18:34:29,457 - WARN - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x16b01d9f02d003c, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:748)
2019-06-04 18:34:29,458 - INFO - Closed socket connection for client /192.168.0.238:7785 which had sessionid 0x16b01d9f02d003c
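To back up the "GC looks normal" observation with numbers, the pause times in the GC log above can be extracted and compared with the 80-second negotiated session timeout. The Python sketch below does this for three lines copied from the post; the regular expression is written for exactly this log shape and is an assumption about the format, not a general GC log parser.

```python
import re

# Matches lines such as:
# 2019-06-04T18:33:13.576+0800: 26672.899: 1167478K->737788K(1554228K), 0.0144429 secs]
GC_LINE = re.compile(
    r"^(?P<stamp>\S+): (?P<uptime>[\d.]+): "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<pause>[\d.]+) secs\]"
)


def gc_pauses(lines):
    """Return (timestamp, pause_seconds) for every line that matches GC_LINE."""
    pauses = []
    for line in lines:
        m = GC_LINE.match(line.strip())
        if m:
            pauses.append((m.group("stamp"), float(m.group("pause"))))
    return pauses


sample = [
    "2019-06-04T18:33:02.284+0800: 26661.606: 1145281K->730766K(1554228K), 0.0064589 secs]",
    "2019-06-04T18:33:13.576+0800: 26672.899: 1167478K->737788K(1554228K), 0.0144429 secs]",
    "2019-06-04T18:34:20.759+0800: 26740.082: 1186986K->776495K(1554228K), 0.0084955 secs]",
]

pauses = gc_pauses(sample)
worst = max(pauses, key=lambda p: p[1])
session_timeout_s = 80.0  # negotiated timeout from the log, in seconds

print("longest pause:", worst)
print("below session timeout:", worst[1] < session_timeout_s)
```

The longest pause in the posted excerpt is about 14 ms, several orders of magnitude below the session timeout, so these pauses alone would not explain a lost ZooKeeper session.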