
MapReduce Beginner Case Study (2): Sorting Data with MapReduce


pig2

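Posted: 2014-3-3 20:29

Post summary:

Last edited by pig2 on 2014-3-3 20:32. Questions to keep in mind while reading this post: 1. How much do you know about MapReduce? 2. Does working through sorting give you a new understanding of MapReduce? ------------------------------------------------------------ ...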

yoki posted on 2014-10-31 19:13:44

Why is there no output file at all in the output folder when I run this example?

zongcm posted on 2017-8-5 11:02:34

After running the program, sort_out is always empty. Any advice would be appreciated!

changgh posted on 2016-10-18 10:51:58

Thanks to the OP for sharing.

entropy posted on 2016-9-12 19:24:04

The type conversion on line 47 is failing: data.set(Integer.parseInt(line));
It keeps throwing an error and I don't know how to fix it.

java.lang.Exception: java.lang.NumberFormatException: For input string: ""
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:592)
at java.lang.Integer.parseInt(Integer.java:615)
at com.test.sort$Map.map(sort.java:45)
at com.test.sort$Map.map(sort.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

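Follow-up (2016-9-13 10:24):
Solved. The error came from empty input lines, i.e. "".equals(line) was true for some records. The test data must not have blank lines between rows; after deleting the blank lines the job ran correctly.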
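
An alternative to cleaning the input by hand is to have the mapper skip blank lines before calling Integer.parseInt. The sketch below is plain Java with no Hadoop dependency so it can be run on its own; the class name SafeIntParse and the helper parseRecord are hypothetical, not part of the original tutorial. Inside a real map() the same check would sit before data.set(...), presumably followed by context.write only when a value is present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;

class SafeIntParse {
    // Hypothetical helper: returns an empty result for null, blank,
    // or non-numeric lines instead of throwing NumberFormatException.
    static OptionalInt parseRecord(String line) {
        if (line == null) {
            return OptionalInt.empty();
        }
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return OptionalInt.empty(); // the case that broke the job above
        }
        try {
            return OptionalInt.of(Integer.parseInt(trimmed));
        } catch (NumberFormatException e) {
            return OptionalInt.empty(); // skip malformed records as well
        }
    }

    public static void main(String[] args) {
        // Sample input with a blank line and a malformed line.
        String[] input = {"2", "", "  32 ", "abc", "654"};
        List<Integer> parsed = new ArrayList<>();
        for (String line : input) {
            parseRecord(line).ifPresent(v -> parsed.add(v));
        }
        System.out.println(parsed); // prints [2, 32, 654]
    }
}
```

Silently dropping bad records is a design choice; a production job might count them (for example with a Hadoop counter) rather than ignore them.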

jiadianyan posted on 2016-7-13 13:55:06

I followed the example, but my result looks a bit off:
15 1
16 2
17 3
18 4
19 5
20 6
21 7
22 8
23 9
24 10
25 11
26 12
27 13
28 14

tang posted on 2015-3-7 14:57:54

Reply #8 makes a very good point; the OP's example is too simplistic.

536528395 posted on 2015-2-4 18:10:59

// Implement the reduce function
      public void reduce(IntWritable key,Iterable<IntWritable> values,Context context)
            throws IOException,InterruptedException{
         // Write the key once per value so duplicates are preserved
         for(IntWritable val:values){
            context.write(linenum, key);
            linenum = new IntWritable(linenum.get()+1);
         }
      }
From the way the OP wrote this, does it mean the data is already sorted by key by the time it reaches reduce?
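
For reference: yes, within a single reduce task Hadoop sorts the map output by key before calling reduce, so the keys arrive in ascending order and the reducer only has to number them. The following plain-Java sketch imitates that behavior without Hadoop; the class ShuffleSortSketch and the method rank are hypothetical names, and a TreeMap stands in for the framework's sort-and-group step. With more than one reducer the order only holds inside each reducer's output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ShuffleSortSketch {
    // Imitates sort-and-group followed by the reduce loop quoted above:
    // each output pair is {linenum, key}, one pair per occurrence of the key.
    static List<int[]> rank(int[] mapOutputKeys) {
        // TreeMap keeps keys in ascending order: key -> number of occurrences.
        TreeMap<Integer, Integer> grouped = new TreeMap<>();
        for (int k : mapOutputKeys) {
            grouped.merge(k, 1, Integer::sum);
        }
        List<int[]> out = new ArrayList<>();
        int linenum = 1;
        for (Map.Entry<Integer, Integer> e : grouped.entrySet()) {
            // Same role as "for (IntWritable val : values)" in the reducer.
            for (int i = 0; i < e.getValue(); i++) {
                out.add(new int[] {linenum++, e.getKey()});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] keys = {32, 2, 654, 32, 15};
        for (int[] pair : rank(keys)) {
            System.out.println(pair[0] + " " + pair[1]);
        }
        // prints:
        // 1 2
        // 2 15
        // 3 32
        // 4 32
        // 5 654
    }
}
```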

test15 posted on 2015-1-6 16:38:07

Nice, bookmarked.

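evababy posted on 2014-11-12 01:24:34

Thanks, I learned something. That said, the OP's example is probably not a total sort in the true sense.
Total sort is arguably one of the hardest algorithms in distributed processing; it has to account for two factors: multiple input files, and whether a file gets split.
Reply #8 has a point, but the total-order sort approach in the design patterns uses two jobs: the first does the partitioning and the second does the sorting. It also uses setPartitionerClass, and the reducer is only responsible for output.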
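
The partitioning idea behind a total sort can be sketched as range partitioning: every key sent to partition i is smaller than every key sent to partition i+1, so concatenating the reducers' sorted outputs in order gives a globally sorted result. The plain-Java sketch below is only an illustration under assumed split points (100 and 1000 are made up); RangePartitionSketch and partitionFor are hypothetical names. In an actual job the logic would live in a Partitioner registered with setPartitionerClass, and Hadoop also ships TotalOrderPartitioner with InputSampler for choosing split points from sampled data.

```java
import java.util.Arrays;

class RangePartitionSketch {
    // splitPoints must be sorted ascending and distinct.
    // Keys <= splitPoints[0] go to partition 0, keys in
    // (splitPoints[i-1], splitPoints[i]] go to partition i,
    // and keys above the last split point go to the final partition.
    static int partitionFor(int key, int[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // A negative result encodes the insertion point as -(point) - 1.
        return pos >= 0 ? pos : -(pos + 1);
    }

    public static void main(String[] args) {
        int[] splitPoints = {100, 1000}; // assumed boundaries, 3 partitions
        int[] keys = {654, 2, 5000, 100, 32, 1001};
        for (int key : keys) {
            System.out.println(key + " -> partition " + partitionFor(key, splitPoints));
        }
        // prints:
        // 654 -> partition 1
        // 2 -> partition 0
        // 5000 -> partition 2
        // 100 -> partition 0
        // 32 -> partition 0
        // 1001 -> partition 2
    }
}
```

Fixed boundaries like these balance the load only if the key distribution is known in advance, which is presumably why the two-job approach described above spends a first pass on partitioning.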
