I'm using Flume to collect data: each file is about 10 MB, roughly 14 GB per hour in total.
The source is spooldir, writing straight into HDFS.
Problem 1:
SpoolDirectorySource keeps warning that the channel is full. Raising the source batchSize from 100 to 1000 made no difference. The learning section here says the source and sink batchSize should match, and mine do.
A small side question: as I understand it, maxBackoff is how long the source waits before retrying once the channel is full. I set it to 3000 but it doesn't seem to take effect; the retry delay just steps up through 250, 500, 1000.
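For context, this is how I understand the batch/transaction sizing is supposed to line up. A minimal sketch, assuming a larger channel capacity than I currently run (the 50000 below is only an illustrative number, not something I've tested):
[mw_shl_code=shell,true]# Source and sink hand off events in batches of the same size
z2.sources.src1.batchSize = 1000
z2.sinks.sk1.hdfs.batchSize = 1000
# Each channel transaction must hold at least one full batch
z2.channels.ch1.transactionCapacity = 1000
# Total capacity should leave headroom above transactionCapacity so the
# source doesn't see "channel is full" every time the sink stalls
# (50000 is an example value, not my current setting)
z2.channels.ch1.capacity = 50000[/mw_shl_code]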
Problem 2:
This may be what's causing problem 1: the HDFS sink reports an IO error (stack trace below). I've seen earlier posts fix it by raising the 10 s timeout to 40 s, but that feels like treating the symptom rather than the cause. The "Error while trying to hflushOrSync!" that precedes it only appears some of the time.
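The "Callable timed out after 10000 ms" line matches the HDFS sink's default hdfs.callTimeout of 10000 ms. If I were to try that workaround, I believe the change would look like this (40000 is just the value from those posts; I haven't confirmed it actually helps):
[mw_shl_code=shell,true]# Allow HDFS open/write/flush/close calls more time before the sink
# gives up and logs "HDFS IO error" (default is 10000 ms)
z2.sinks.sk1.hdfs.callTimeout = 40000[/mw_shl_code]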
Impact:
Loading data into HDFS through Flume is extremely slow: at most 10 GB per hour, while the same 10 GB takes under 10 minutes with hdfs put. HDFS IO (viewed in Cloudera Manager) is nowhere near its peak, iostat shows disk IO was low at the time, and there are no bad filesystem blocks, so I'm out of ideas.
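For reference, the put baseline was nothing more elaborate than something along these lines (the exact command is from memory):
[mw_shl_code=shell,true]# Rough baseline: copying the same batch of files with plain hdfs put
# finishes in well under 10 minutes, versus about an hour through Flume
time hdfs dfs -put /cup/d0/flume/traffica/vlr/* /metadata/external/traffica/vlr/[/mw_shl_code]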
My configuration:
[mw_shl_code=shell,true]
# conf: A single-node Flume configuration
# Name the components on this agent
z2.sources = src1
z2.sinks = sk1
z2.channels = ch1
#
# # Describe/configure the source
z2.sources.src1.type = spooldir
z2.sources.src1.spoolDir = /cup/d0/flume/traffica/vlr
z2.sources.src1.ignorePattern = ^(.)*\\.tmp$
z2.sources.src1.deletePolicy = immediate
z2.sources.src1.maxBackoff = 3000
z2.sources.src1.batchSize = 1000
#
# # Describe the sink
z2.sinks.sk1.type = hdfs
z2.sinks.sk1.hdfs.path = /metadata/external/traffica/vlr
z2.sinks.sk1.hdfs.filePrefix = vlr
z2.sinks.sk1.hdfs.rollInterval = 3600
z2.sinks.sk1.hdfs.rollSize = 128000000
z2.sinks.sk1.hdfs.rollCount = 0
z2.sinks.sk1.hdfs.batchSize = 1000
z2.sinks.sk1.hdfs.threadsPoolSize = 1
z2.sinks.sk1.hdfs.fileType = SequenceFile
# z2.sinks.sk1.hdfs.codeC = gzip
#
#
# # Use a channel which buffers events in memory
z2.channels.ch1.type = memory
z2.channels.ch1.capacity = 10000
z2.channels.ch1.transactionCapacity = 1000
# z2.channels.ch1.byteCapacity = 1000000000
#
# # Bind the source and sink to the channel
z2.sources.src1.channels = ch1
z2.sinks.sk1.channel = ch1[/mw_shl_code]
[mw_shl_code=applescript,true]16/03/07 14:41:43 WARN source.SpoolDirectorySource: The channel is full, and cannot write data now. The source will try again after 250 milliseconds
16/03/07 14:41:44 INFO avro.ReliableSpoolingFileEventReader: Last read was never committed - resetting mark position.
16/03/07 14:41:47 WARN source.SpoolDirectorySource: The channel is full, and cannot write data now. The source will try again after 500 milliseconds
16/03/07 14:41:47 INFO avro.ReliableSpoolingFileEventReader: Last read was never committed - resetting mark position.
16/03/07 14:41:50 WARN source.SpoolDirectorySource: The channel is full, and cannot write data now. The source will try again after 1000 milliseconds
16/03/07 14:41:50 ERROR hdfs.AbstractHDFSWriter: Error while trying to hflushOrSync!
16/03/07 14:41:50 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: /metadata/external/traffica/vlr/vlr.1457332875794.tmp
at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:693)
at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:459)
at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:421)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:573)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:418)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
at java.util.concurrent.FutureTask.get(FutureTask.java:201)
at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:686)
... 7 more[/mw_shl_code]