yunge2016 posted on 2017-7-2 18:51:17

While integrating Flume with Kafka and HDFS, I get the error: hdfs.HDFSEventSink: HDFS IO error

Error log:
17/07/02 18:43:11 INFO avro.ReliableSpoolingFileEventReader: Last read took us just up to a file boundary. Rolling to the next file, if there is one.
17/07/02 18:43:11 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /usr/hadoop3data/baijia_article.txt to /usr/hadoop3data/baijia_article.txt.COMPLETED
17/07/02 18:43:11 INFO avro.ReliableSpoolingFileEventReader: Last read took us just up to a file boundary. Rolling to the next file, if there is one.
17/07/02 18:43:11 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /usr/hadoop3data/testdata.txt to /usr/hadoop3data/testdata.txt.COMPLETED
17/07/02 18:43:11 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
17/07/02 18:43:18 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391653.tmp
17/07/02 18:43:28 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391653.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
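
The "Callable timed out after 10000 ms" in the stack trace matches the HDFS sink's `hdfs.callTimeout` property, whose default is 10000 ms; the sink gives up on the HDFS open/append call after that long. A sketch of raising it (agent and sink names assumed to match the config shown later in this thread):

```properties
# hdfs.callTimeout bounds each HDFS open/write/flush/close call;
# the default 10000 ms is what appears in the stack trace above.
# Raising it tolerates a slow NameNode/DataNode, but it does not
# fix an unreachable cluster - verify connectivity separately.
flumetohdfs_agent.sinks.hdfs_sink.hdfs.callTimeout = 60000
```

If the timeout persists even with a large value, the usual cause is that the Flume host cannot actually reach the NameNode or DataNodes (firewall, wrong hostname resolution), not the timeout setting itself.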


nextuser posted on 2017-7-2 19:08:07

Try increasing the roll thresholds:
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollSize = 1024000
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollCount = 100
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollInterval = 36000

yunge2016 posted on 2017-7-2 19:54:19

My config file doesn't have those settings, and it is still reporting the HDFS IO error:
17/07/02 18:44:05 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391656.tmp
17/07/02 18:44:15 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391656.tmp

nextuser posted on 2017-7-3 08:28:00

yunge2016 posted on 2017-7-2 19:54
My config file doesn't have those settings, and it is still reporting the HDFS IO error:
17/07/02 18:44:05 INFO hdfs.BucketWriter: Cre ...

OP, please look at this more carefully; the config below should be yours:
# The sink can be defined as follows.
flumetohdfs_agent.sinks.hdfs_sink.type = hdfs
#flumetohdfs_agent.sinks.hdfs_sink.filePrefix = %{host}
flumetohdfs_agent.sinks.hdfs_sink.hdfs.path = hdfs://192.168.137.3:8020/data/ds=%Y%m%d
## roll every hour (after gz)
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollSize = 0
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollCount = 0
flumetohdfs_agent.sinks.hdfs_sink.hdfs.rollInterval = 3600
flumetohdfs_agent.sinks.hdfs_sink.hdfs.threadsPoolSize = 300
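
Two hedged observations on this config, beyond the roll settings: the `%Y%m%d` escapes in `hdfs.path` require a `timestamp` header on every event, which events read from Kafka may not carry, and `threadsPoolSize = 300` is far above the default of 10. A sketch of the additions:

```properties
# %Y%m%d in hdfs.path is resolved from the event's "timestamp"
# header; if upstream events lack one, fall back to the sink
# host's clock (the default for this flag is false).
flumetohdfs_agent.sinks.hdfs_sink.hdfs.useLocalTimeStamp = true
# The default HDFS I/O thread pool is 10; 300 threads rarely
# helps and can overload a small cluster.
flumetohdfs_agent.sinks.hdfs_sink.hdfs.threadsPoolSize = 10
```

Neither of these explains the 10-second timeout by itself, but both are worth correcting while debugging it.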
