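I ran SecondarySort from the examples package and the output was correct, so the Hadoop framework itself is fine and the problem is in my program. The details are below:
Case 1: GroupingComparatorClass not set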
- Job userActivityJob = Job.getInstance(userActivityConf);
- userActivityJob.setJarByClass(UserActivity.class);
- userActivityJob.setMapperClass(UserActivityMapper.class);
-
- userActivityJob.setPartitionerClass(UserActivityPartitioner.class);
- //userActivityJob.setGroupingComparatorClass(UserActivityGroupingComparator.class);
-
- userActivityJob.setReducerClass(UserActivityReducer.class);
- userActivityJob.setOutputKeyClass(UserActivityKeyWritable.class);
- userActivityJob.setOutputValueClass(UserActivityValueWritable.class);
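Result (the "****" lines are written at the start and end of each reduce call, to mark the boundaries of one reduce):
The first two fields are the input key; everything after them is the input value.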
- 1 414030933742050 **********************
- 1 414030933742050 4.1.0.2ctch1 1 414030933742050 MainActivity 1411378349000 1411378368000 1
- 1 414030933742050 **********************
- 1 414030933742050 **********************
- 1 414030933742050 4.1.0.2ctch1 1 414030933742050 AppDetailActivty 1411378368000 1411378379000 1
- 1 414030933742050 **********************
- 1 455020096740852 **********************
- 1 455020096740852 4.1.0.2ctch1 1 455020096740852 MainActivity 1413546169000 1413546170000 1
- 1 455020096740852 **********************
- 1 455020096898038 **********************
- 1 455020096898038 4.6.0.3 1 455020096898038 MainActivity 1408844760000 1408844786000 1
- 1 455020096898038 **********************
- 1 455020096898038 **********************
- 1 455020096898038 4.6.0.3 1 455020096898038 MainActivity 1408844788000 1408844792000 1
- 1 455020096898038 **********************
Case 2: GroupingComparatorClass set
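- public static class UserActivityGroupingComparator implements RawComparator<UserActivityKeyWritable> {
-
- @Override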
- public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
- return WritableComparator.compareBytes(b1, s1, Integer.SIZE / 8,b2, s2, Integer.SIZE / 8);
- }
-
- /**
- * Group by (APP_ID + IMSI)
- */
- @Override
- public int compare(UserActivityKeyWritable o1,UserActivityKeyWritable o2) {
-
- int i = o1.getAppId().compareTo(o2.getAppId());
- if (i != 0) {
- return i;
- } else {
- i = o1.getClientImsi().compareTo(o2.getClientImsi());
- if (i != 0) {
- return i;
- } else {
- return 0;
- }
- }
-
- }
-
- }
Result:
- 1 414030933742050 **********************
- 1 414030933742050 4.1.0.2ctch1 1 414030933742050 MainActivity 1411349549000 1411349568000 1
- 1 414030933742050 4.1.0.2ctch1 1 414030933742050 AppDetailActivty 1411349568000 1411349579000 1
- 1 414030933742050 4.1.0.2ctch1 1 455020096740852 MainActivity 1413517369000 1413517370000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408815960000 1408815986000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408815988000 1408815992000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 WebViewActivity 1408815992000 1408815994000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408815994000 1408816024000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408816089000 1408816090000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408898411000 1408898428000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408920621000 1408920623000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 AppDetailActivty 1408920623000 1408920638000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408920638000 1408920804000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408920886000 1408920925000 1
- 1 414030933742050 4.6.0.3 1 455020096898038 MainActivity 1408920927000 1408920929000 1
Case 3: GroupingComparatorClass set, with the comparator class modified
- public static class UserActivityGroupingComparator implements RawComparator<UserActivityKeyWritable> {
-
- @Override
- public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
- return WritableComparator.compareBytes(b1, s1, Integer.SIZE / 8,b2, s2, Integer.SIZE / 8);
- }
-
- /**
- * Group by (APP_ID + IMSI)
- */
- @Override
- public int compare(UserActivityKeyWritable o1,UserActivityKeyWritable o2) {
- return -1 ;
- }
-
- }
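Result:
Whether the object-level compare returns 0, 1, or -1, the result is identical to Case 2.
What is going on here? Is something misconfigured? I have spent two days looking and have not found it.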
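One thing that may be worth checking, offered only as a sketch under assumptions: as far as I understand, the reduce side calls the grouping comparator through its raw `compare(byte[], int, int, byte[], int, int)` method, not through the object-level `compare`. That would explain why changing the object-level return value has no effect. The raw method in the post compares only the first `Integer.SIZE / 8` = 4 bytes of each key. The standalone program below does not use Hadoop at all; it assumes a hypothetical key layout (a 4-byte appId followed by the IMSI as UTF-8 bytes, which may not match how `UserActivityKeyWritable` really serializes itself) and shows that under that layout two keys with the same appId but different IMSI compare as equal. The class and method names are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class GroupingCompareSketch {

    // ASSUMPTION: hypothetical key layout, 4-byte big-endian appId followed by IMSI as UTF-8.
    static byte[] serialize(int appId, String imsi) {
        byte[] imsiBytes = imsi.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + imsiBytes.length).putInt(appId).put(imsiBytes).array();
    }

    // Unsigned lexicographic comparison of two byte ranges.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return l1 - l2;
    }

    // Mirrors the raw compare in the post: only the first Integer.SIZE / 8 bytes are examined.
    static int compareFirstFourBytes(byte[] k1, byte[] k2) {
        return compareBytes(k1, 0, Integer.SIZE / 8, k2, 0, Integer.SIZE / 8);
    }

    // Compares the whole serialized key, so the IMSI bytes take part as well.
    static int compareWholeKey(byte[] k1, byte[] k2) {
        return compareBytes(k1, 0, k1.length, k2, 0, k2.length);
    }

    public static void main(String[] args) {
        byte[] a = serialize(1, "414030933742050");
        byte[] b = serialize(1, "455020096740852");
        // Same appId: the 4-byte comparison treats the keys as one group.
        System.out.println("first 4 bytes equal: " + (compareFirstFourBytes(a, b) == 0));
        // Whole-key comparison tells the two IMSI values apart.
        System.out.println("whole key differs:   " + (compareWholeKey(a, b) != 0));
    }
}
```

If this is indeed the cause, the raw compare would need to cover the bytes of both fields. Alternatively, the comparator could extend `WritableComparator` and pass `true` for `createInstances` to its constructor, in which case the inherited raw compare deserializes both keys and delegates to the object-level compare.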