在Hadoop的reducer类中,有3个主要的函数,分别是:setup,clearup,reduce。代码如下:
- /**
- * Called once at the start of the task.
- */
- protected void setup(Context context
- ) throws IOException, InterruptedException {
- // NOTHING
- }
复制代码
-
- /**
- * This method is called once for each key. Most applications will define
- * their reduce class by overriding this method. The default implementation
- * is an identity function.
- */
- @SuppressWarnings("unchecked")
- protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context
- ) throws IOException, InterruptedException {
- for(VALUEIN value: values) {
- context.write((KEYOUT) key, (VALUEOUT) value);
- }
- }
复制代码
- /**
- * Called once at the end of the task.
- */
- protected void cleanup(Context context
- ) throws IOException, InterruptedException {
- // NOTHING
- }
复制代码
在用户的应用程序中调用到reducer时,会直接调用reducer里面的run函数,其代码如下: - /*
- * control how the reduce task works.
- */
- @SuppressWarnings("unchecked")
- public void run(Context context) throws IOException, InterruptedException {
- setup(context);
- while (context.nextKey()) {
- reduce(context.getCurrentKey(), context.getValues(), context);
- // If a back up store is used, reset it
- ((ReduceContext.ValueIterator)
- (context.getValues().iterator())).resetBackupStore();
- }
- cleanup(context);
- }
- }
复制代码
由上面的代码,我们可以了解到,当调用到reduce时,通常会先执行一个setup函数,最后会执行一个cleanup函数。而默认情况下,这两个函数的内容都是nothing。因此,当reduce不符合应用要求时,可以试着通过增加setup和cleanup的内容来满足应用的需求。
加入qq群(号码:39327136),讨论云技术,获取最新资讯资源等
|