I tried setting this up; the environment is as follows:
Ubuntu 14.04
export JAVA_HOME=/home/ngou/jdk/jdk1.7.0_71
export HBASE_HOME=/home/ngou/apache/hbase-0.94.23
export MAVEN_HOME=/home/ngou/apache/apache-maven-3.2.3
export HADOOP_HOME=/home/ngou/apache/hadoop-2.2.0
export HIVE_HOME=/home/ngou/apache/apache-hive-0.14.0-bin
1. Start HBase and create the table ngou in the hbase shell
create 'ngou', 'cf'
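To confirm the table is in place before wiring up Flume, you can check it from the same hbase shell (optional sanity check):

```
list
describe 'ngou'
```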
2. Flume configuration file: create a new hbase.conf under conf
agent.sources = b1
agent.channels = memoryChannel
agent.sinks = k1
agent.sources.b1.type = exec
agent.sources.b1.command = tail -F /home/ngou/flume_data/data.txt
agent.sources.b1.checkperiodic = 1000
agent.sources.b1.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.keep-alive = 30
agent.channels.memoryChannel.capacity = 10000
agent.channels.memoryChannel.transactionCapacity = 10000
agent.sinks.k1.type = hbase
agent.sinks.k1.table = ngou
agent.sinks.k1.columnFamily = cf
agent.sinks.k1.column = col1
agent.sinks.k1.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
agent.sinks.k1.channel = memoryChannel
3. Start Flume
./bin/flume-ng agent -n agent -c conf -f ./conf/hbase.conf -Dflume.root.logger=DEBUG,console
4. Simulate log generation
ngou@ubuntu:~/flume_data$ cat datarun.sh
#!/bin/bash
for i in {1..1000}; do
echo $i;
echo $i >> ./data.txt;
sleep 0.1;
done
5. Check the data in the hbase shell
scan 'ngou'
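Once the generator script has finished, a quick sanity check is to compare the row count with the number of lines the script appended (the exact count can differ depending on when tail -F started reading):

```
count 'ngou'
```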
-----------------------
The steps above collect data from a log file. The configuration below pulls data from the network instead; define the source, channel, and sink:
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
Then, in another shell, run telnet localhost 44444; type some text and press Enter, and the content is sent to Flume and printed on the logger console.
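The telnet interaction can also be scripted. Below is a minimal sketch in Python that sends one event to the netcat source and reads the acknowledgement; it assumes the a1 agent above has been started (analogously to step 3, with -n a1) and is listening on localhost:44444. By default the netcat source acknowledges each event with "OK".

```python
import socket

def send_line(host, port, line):
    """Send one newline-terminated event to a TCP listener
    (here, the Flume netcat source) and return its raw reply."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(line.encode("utf-8") + b"\n")
        return s.recv(64)

if __name__ == "__main__":
    # Assumes the a1 agent from the config above is running locally.
    print(send_line("localhost", 44444, "hello flume"))
```

Each call opens a fresh connection, which is fine for a smoke test; for sustained traffic you would keep one connection open and send many lines.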