hadoop : job failed

I'm working through an example from the book Hadoop in Action (《hadoop实战》), and it fails every time I run it.
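import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class CitationHistogram extends Configured implements Tool {

    public static class MapClass extends MapReduceBase
            implements Mapper<Text, Text, IntWritable, IntWritable> {
        private final static IntWritable uno = new IntWritable(1);
        private IntWritable citationCount = new IntWritable();

        @Override
        public void map(Text key, Text value,
                OutputCollector<IntWritable, IntWritable> output, Reporter reporter)
                throws IOException {

            // The value must be a plain integer, otherwise parseInt throws
            citationCount.set(Integer.parseInt(value.toString()));
            output.collect(citationCount, uno);
        }

    }

    public static class Reduce extends MapReduceBase
            implements Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
        @Override
        public void reduce(IntWritable key, Iterator<IntWritable> values,
                OutputCollector<IntWritable, IntWritable> output, Reporter reporter)
                throws IOException {

            int count = 0;
            while (values.hasNext()) {
                count += values.next().get();
            }
            output.collect(key, new IntWritable(count));

        }
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        JobConf job = new JobConf(conf, CitationHistogram.class);

        Path in = new Path(args[0]);
        Path out = new Path(args[1]);
        FileInputFormat.setInputPaths(job, in);
        FileOutputFormat.setOutputPath(job, out);

        job.setJobName("CitationHistogram");
        job.setMapperClass(MapClass.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormat(KeyValueTextInputFormat.class);
        job.setOutputFormat(TextOutputFormat.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);

        JobClient.runJob(job);

        return 0;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new CitationHistogram(), args);
        System.exit(res);
    }
}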
The error message is as follows:

s060403072 · posted on 2013-10-16 13:39:38

The error log you posted is incomplete.

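qq864680621 · posted on 2013-10-16 13:40:33

citationCount.set(Integer.parseInt(value.toString())); Does the data in your file contain strings that cannot be converted to an int? Judging from the log, the failure happens during the String-to-int conversion.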
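One way to check this suggestion is to guard the conversion. The sketch below is plain Java with no Hadoop classes, so it runs standalone; the class name `SafeCitationParse` and the helper `parseCount` are hypothetical names introduced only for illustration. It assumes that a bad record should simply be skipped. Note that with KeyValueTextInputFormat a line containing no tab produces an empty value, and `Integer.parseInt("")` throws `NumberFormatException`, so a wrongly delimited input file is one plausible cause.

```java
import java.util.OptionalInt;

public class SafeCitationParse {

    // Hypothetical helper: returns the parsed count, or an empty
    // OptionalInt when the text is not a valid integer.
    public static OptionalInt parseCount(String raw) {
        if (raw == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        // Made-up sample values: two valid counts, an empty value
        // (a line without a tab), a quoted header field, and a value
        // that still contains a space.
        String[] samples = {"2", " 3 ", "", "\"CITED\"", "1 2"};
        for (String s : samples) {
            OptionalInt parsed = parseCount(s);
            System.out.println("[" + s + "] -> "
                    + (parsed.isPresent() ? String.valueOf(parsed.getAsInt()) : "skipped"));
        }
    }
}
```

Inside the mapper, the same check could decide whether to call output.collect for a record or skip it.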

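ruishenh · posted on 2013-10-16 13:41:32

citationCount.set(Integer.parseInt(value.toString())); This statement is the problem: value is a string.
At this point your key is the character offset and value is the text of a whole line, so it cannot be converted to an int.
You also need to split it with a StringTokenizer in the code:
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
    // convert each token of value into the key here, then emit 1 as its count
}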
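The loop in this reply can be filled in roughly as follows. This is a minimal sketch in plain Java (no Hadoop classes), assuming the value really does hold a whole whitespace-separated line as the reply supposes; `TokenizeLine` and `parseTokens` are hypothetical names. With KeyValueTextInputFormat the line is already split at the first tab, so this step may not be needed for the job above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeLine {

    // Splits a line on whitespace (the default StringTokenizer
    // delimiters) and parses every token as an int.
    // Throws NumberFormatException if a token is not an integer.
    public static List<Integer> parseTokens(String line) {
        List<Integer> tokens = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            tokens.add(Integer.parseInt(itr.nextToken()));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // In a mapper, each parsed token would become an output key
        // emitted together with a count of 1.
        for (int token : parseTokens("1    2")) {
            System.out.println(token + "\t1");
        }
    }
}
```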

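ruishenh · posted on 2013-10-16 13:42:28

Your data is the problem... it clearly cannot be converted.
I rewrote the job from the code you posted and it runs without any problem.
This is the data I used: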
[hbase@23260-oozie conf]$ cat test;
1    2
2    3
3    2
It runs fine for me under Eclipse. The modified code:
@Override
public int run(String[] args) throws Exception {
    args = new String[]{"/user/pig/test", "/user/pig/tmp/test"};
    Configuration conf = getConf();
    System.setProperty("path.separator", ":");
    conf.set("mapred.job.tracker", "hadoop-master.xxx.com:8021");
    conf.set("fs.default.name", "hdfs://hadoop-master.xxx.com:8020/");
    conf.set("mapred.job.queue.name", "erpmerge");
    String temJars = "file:///D:/jd/project/emulate/target/testTrivial.jar";
    conf.set("tmpjars", temJars);
    JobConf job = new JobConf(conf, CitationHistogram.class);
    Path in = new Path(args[0]);
    Path out = new Path(args[1]);
    FileInputFormat.setInputPaths(job, in);
    FileOutputFormat.setOutputPath(job, out);

    job.setJobName("CitationHistogram");
    job.setMapperClass(MapClass.class);
    job.setReducerClass(Reduce.class);

    job.setInputFormat(KeyValueTextInputFormat.class);
    job.setOutputFormat(TextOutputFormat.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(IntWritable.class);

    JobClient.runJob(job);

    return 0;
}
Output of the run:
- JAAS Configuration already set up for Hadoop, not re-installing.
- No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
- Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- Snappy native library not loaded
- Total input paths to process : 1
- Running job: job_201308011137_85807
-  map 0% reduce 0%
-  map 50% reduce 0%
-  map 100% reduce 0%
-  map 100% reduce 33%
-  map 100% reduce 100%
- Job complete: job_201308011137_85807
- Counters: 28
- Job Counters
-    Launched reduce tasks=1
-    SLOTS_MILLIS_MAPS=11533
-    Total time spent by all reduces waiting after reserving slots (ms)=0
-    Total time spent by all maps waiting after reserving slots (ms)=0
-    Rack-local map tasks=1
-    Launched map tasks=2
-    Data-local map tasks=1
-    SLOTS_MILLIS_REDUCES=8584
- FileSystemCounters
-    FILE_BYTES_READ=36
-    HDFS_BYTES_READ=215
-    FILE_BYTES_WRITTEN=169541
-    HDFS_BYTES_WRITTEN=8
- Map-Reduce Framework
-    Map input records=3
-    Reduce shuffle bytes=42
-    Spilled Records=6
-    Map output bytes=24
-    CPU time spent (ms)=6930
-    Total committed heap usage (bytes)=733347840
-    Map input bytes=12
-    Combine input records=0
-    SPLIT_RAW_BYTES=196
-    Reduce input records=3
-    Reduce input groups=2
-    Combine output records=0
-    Physical memory (bytes) snapshot=508116992
-    Reduce output records=2
-    Virtual memory (bytes) snapshot=4768018432
-    Map output records=3
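To see what the job computes on the three-line test file, the map and reduce steps can be imitated in plain Java without Hadoop. This is only a sketch: `HistogramSketch` and `histogram` are hypothetical names, and it assumes the columns in the test file are separated by a single tab, which is the default separator for KeyValueTextInputFormat. It yields two groups, which agrees with the "Reduce input groups=2" and "Reduce output records=2" counters above.

```java
import java.util.Map;
import java.util.TreeMap;

public class HistogramSketch {

    // Imitates the job: the text after the first tab is the citation
    // count (map step), and occurrences of each count are summed
    // (reduce step). Lines without a tab are skipped.
    public static Map<Integer, Integer> histogram(String[] lines) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            int sep = line.indexOf('\t');
            if (sep < 0) {
                continue;
            }
            int citationCount = Integer.parseInt(line.substring(sep + 1).trim());
            counts.merge(citationCount, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The test data from the post, assumed to be tab-separated.
        String[] lines = {"1\t2", "2\t3", "3\t2"};
        System.out.println(histogram(lines)); // prints {2=2, 3=1}
    }
}
```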
