Question 1: What is the difference between the three APIs setOutputFormatClass, setMapOutputKeyClass, and setMapOutputValueClass? What does each one do, and can all three be set at the same time?
Question 2: In Hadoop 2.7.1, the Javadoc for the addMapper API of org.apache.hadoop.mapreduce.lib.chain.ChainMapper reads:

public static void addMapper(Job job, Class<? extends Mapper> klass, Class<?> inputKeyClass, Class<?> inputValueClass, Class<?> outputKeyClass, Class<?> outputValueClass, Configuration mapperConf) throws IOException

Adds a Mapper class to the chain mapper. The key and values are passed from one element of the chain to the next, by value. For the added Mapper the configuration given for it, mapperConf, have precedence over the job's Configuration. This precedence is in effect when the task is running. IMPORTANT: There is no need to specify the output key/value classes for the ChainMapper, this is done by the addMapper for the last mapper in the chain

Does that last sentence mean that, for the ChainMapper class, there is no need to set the output format?
Thanks!!