My map and reduce code:
public static class MyMapper extends TableMapper<Text, IntWritable> {
    private final IntWritable one = new IntWritable(1);

    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        for (KeyValue kv : value.list()) {
            context.write(new Text(kv.getKey()), one);
        }
    }
}

public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Note: this static map lives only in the JVM of the reduce task that writes to it.
    private static Map<String, Integer> countMap = new HashMap<>();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int i = 0;
        for (IntWritable val : values) {
            i += val.get();
        }
        countMap.put(key.toString(), i);
    }
}
My original idea was to put the reduce results into the countMap object instead of writing them back to HDFS, and return that countMap directly to the caller. But on reflection: each machine running a reduce task puts its results into a map held in that machine's own memory. Once all the reduce tasks have finished, how would the countMap objects sitting in each machine's memory be gathered together and returned to the caller? Can Hadoop do this? If not, is there another way to achieve it?
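One common pattern for this (not a Hadoop built-in that merges per-JVM statics, which indeed does not exist) is to let the job write its output to HDFS as usual and have the driver read the reducer output files back and merge them into a single in-memory map after the job completes. Below is a minimal pure-Java sketch of the merging step only, with no Hadoop dependency; it assumes TextOutputFormat-style `key<TAB>count` lines, and the class and method names (`MergeReducerOutputs`, `merge`) are made up for illustration:

```java
import java.util.*;

public class MergeReducerOutputs {

    // Merge TextOutputFormat-style "key<TAB>count" lines into one map.
    // In a real driver, these lines would come from reading every
    // part-r-NNNNN file in the job's output directory via the HDFS
    // FileSystem API once job.waitForCompletion(true) returns.
    public static void merge(List<String> lines, Map<String, Integer> into) {
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            int count = Integer.parseInt(parts[1]);
            // Sum counts in case the same key appears in more than one file.
            into.merge(parts[0], count, Integer::sum);
        }
    }

    public static void main(String[] args) {
        // Pretend these came from two different reducers' output files;
        // with the default hash partitioner each key lands in exactly one.
        List<String> part0 = Arrays.asList("rowA\t3", "rowC\t5");
        List<String> part1 = Arrays.asList("rowB\t2");

        Map<String, Integer> countMap = new TreeMap<>();
        merge(part0, countMap);
        merge(part1, countMap);
        System.out.println(countMap); // {rowA=3, rowB=2, rowC=5}
    }
}
```

For small result sets, Hadoop Counters are another option: each reducer increments a counter per key group and the driver reads them from the completed `Job` object, though counters are limited in number and are meant for metrics, not bulk data.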