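yunge2016 posted on 2017-7-2 17:20:58

In this directory I can now see the topic's data files (.log and .index), and the data is visible when I open them. What I want to ask is how to get Flume to collect data from Kafka into HDFS. I'm not sure where the problem is right now; do you have a complete configuration file? I'm getting all kinds of errors. I've specified the HDFS address, and now the error says the host was not found in the configuration file. Do I need to specify a host at startup?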

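nextuser posted on 2017-7-2 19:00:29

yunge2016 posted on 2017-7-2 17:20
In this directory I can now see the topic's data files (.log and .index), and the data is visible when I open them. What I want to ask is how to get Flume to collect ka ...

Please post the errors.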

yunge2016 posted on 2017-7-2 19:52:22

Error when Flume collects the Kafka data:
17/07/02 18:43:11 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /usr/hadoop3data/testdata.txt to /usr/hadoop3data/testdata.txt.COMPLETED
17/07/02 18:43:11 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
17/07/02 18:43:18 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391653.tmp
17/07/02 18:43:28 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391653.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
17/07/02 18:43:33 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391654.tmp
17/07/02 18:43:43 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391654.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
17/07/02 18:43:48 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391655.tmp
17/07/02 18:43:58 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391655.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
17/07/02 18:44:05 INFO hdfs.BucketWriter: Creating hdfs://192.168.137.3:8020/data/2017-07-02.1499035391656.tmp
17/07/02 18:44:15 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://192.168.137.3:8020/data/2017-07-02.1499035391656.tmp
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:682)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:232)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:504)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:406)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:675)
        ... 6 more
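
The `Callable timed out after 10000 ms` message in the trace above matches the default value of the HDFS sink's `hdfs.callTimeout` property (10000 ms), which limits how long open, write, flush, and close calls may take. A minimal sketch of raising it is shown below. The agent name `a2` and sink name `k1` are placeholders, not taken from the poster's file, and the value 60000 is only an illustration. A longer timeout helps only if HDFS is slow to respond. If the NameNode at `192.168.137.3:8020` is unreachable or the port is wrong, the address needs to be corrected instead.

```properties
# Sink-only fragment. "a2" (agent) and "k1" (sink) are placeholder names.
a2.sinks.k1.type = hdfs
# Path taken from the log above; it should match the cluster's fs.defaultFS.
a2.sinks.k1.hdfs.path = hdfs://192.168.137.3:8020/data
# Default is 10000 ms; raised here purely as an example value.
a2.sinks.k1.hdfs.callTimeout = 60000
```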

yunge2016 posted on 2017-7-2 20:37:02

After starting Flume it reports an error: agent a2 was not found in the configuration file.
17/07/02 20:08:53 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
17/07/02 20:08:53 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/opt/modules/flume/conf/final-hdfs.conf
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: LogAgent
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Processing:HDFS
17/07/02 20:08:53 WARN conf.FlumeConfiguration: Configuration empty for: LogAgent.Removed.
17/07/02 20:08:53 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents:
17/07/02 20:08:53 WARN node.AbstractConfigurationProvider: No configuration found for this host:a2
17/07/02 20:08:53 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{} channels:{} }

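nextuser posted on 2017-7-3 08:31:19

yunge2016 posted on 2017-7-2 20:37
After starting Flume it reports an error: agent a2 was not found in the configuration file.
17/07/02 20:08:53 INFO node.PollingPropertiesFileConfigu ...

There are too many errors. I suggest first understanding what the basic configuration options mean; otherwise you will never finish patching the holes.
Is a2 a host or an agent name?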

yunge2016 posted on 2017-7-3 10:15:18

a2 is the agent name. What is the host? It isn't written in the configuration file.

nextuser posted on 2017-7-7 08:56:15

This post was last edited by nextuser on 2017-7-7 08:57

yunge2016 posted on 2017-7-3 10:15
a2 is the agent name. What is the host? It isn't written in the configuration file.

The whole file seems to define flumetohdfs_agent, not a2:
WARN node.AbstractConfigurationProvider: No configuration found for this host:a2
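
The mismatch described above can be illustrated with a minimal sketch of a Kafka-to-HDFS agent. Every property key starts with the agent name, and that prefix has to be the same string passed to `flume-ng agent --name`; the "host" in the warning is simply that name. This is not the poster's actual file. The component names `r1`, `c1`, `k1`, the broker address `kafka-host:9092`, the topic `test-topic`, and the consumer group are hypothetical. The Kafka source keys assume Flume 1.7 or later (Flume 1.6 uses `zookeeperConnect` and `topic` instead). The HDFS path and the configuration file path are taken from the logs earlier in the thread.

```properties
# Hypothetical sketch. Start it with a matching agent name, for example:
#   flume-ng agent --conf /opt/modules/flume/conf \
#     --conf-file /opt/modules/flume/conf/final-hdfs.conf --name a2
a2.sources = r1
a2.channels = c1
a2.sinks = k1

# Kafka source (Flume 1.7+ property names; broker and topic are placeholders)
a2.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a2.sources.r1.kafka.bootstrap.servers = kafka-host:9092
a2.sources.r1.kafka.topics = test-topic
a2.sources.r1.kafka.consumer.group.id = flume-hdfs
a2.sources.r1.channels = c1

# In-memory channel between source and sink
a2.channels.c1.type = memory
a2.channels.c1.capacity = 10000
a2.channels.c1.transactionCapacity = 1000

# HDFS sink writing plain text files with a date prefix
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://192.168.137.3:8020/data
a2.sinks.k1.hdfs.filePrefix = %Y-%m-%d
# Use the agent's clock for %Y-%m-%d instead of requiring a timestamp header
a2.sinks.k1.hdfs.useLocalTimeStamp = true
a2.sinks.k1.hdfs.fileType = DataStream
a2.sinks.k1.hdfs.writeFormat = Text
a2.sinks.k1.channel = c1
```

If the file keeps a different prefix such as `flumetohdfs_agent`, the alternative is to leave the file unchanged and pass that name to `--name` instead of `a2`.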

liuyou2036 posted on 2020-7-16 14:25:55

It would be best to add comments.