I've seen similar errors asked about online, but none of those threads were ever resolved, and repeated Google searches turned up no fix, so I'm asking here. Any help would be much appreciated!
Environment: hadoop-2.5.1 + hbase-0.98 + nutch-2.3.1 + solr-4.10.4
Crawling works fine and completes (the data is visible when querying HBase), but building the index fails with the following error:
2016-08-02 00:22:48,533 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 104857600
2016-08-02 00:22:48,533 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 26214396; length = 6553600
2016-08-02 00:22:48,681 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2016-08-02 00:22:48,929 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2016-08-02 00:22:48,968 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2016-08-02 00:22:49,101 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:443)
at org.apache.hadoop.io.Text.set(Text.java:198)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecordReader.nextKeyValue(SolrDeleteDuplicates.java:234)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2016-08-02 00:22:49,130 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2016-08-02 00:22:49,257 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2016-08-02 00:22:49,258 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2016-08-02 00:22:49,260 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.

So I went to the corresponding line in the source and wrapped it in a try/catch, like this:
  @Override
  public boolean nextKeyValue() throws IOException, InterruptedException {
    if (currentDoc >= numDocs) {
      return false;
    }
    SolrDocument doc = solrDocs.get(currentDoc);
    String digest = (String) doc.getFieldValue(SolrConstants.DIGEST_FIELD);
    // try/catch added here to swallow the NPE thrown when digest is null
    try {
      text.set(digest);
    } catch (Exception e) {
      System.out.println("**********************************");
    }
    record.readSolrDocument(doc);
    currentDoc++;
    return true;
  }
};

After rebuilding and starting the indexing job again, it still fails with a NullPointerException, now as follows:
16/08/02 01:50:43 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: starting...
16/08/02 01:50:43 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: Solr url: http://192.168.159.120:8983/solr/
16/08/02 01:50:48 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/08/02 01:51:27 INFO mapreduce.JobSubmitter: number of splits:2
16/08/02 01:51:28 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/08/02 01:51:28 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/08/02 01:51:28 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/08/02 01:51:28 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
16/08/02 01:51:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470034090238_0032
16/08/02 01:51:48 INFO impl.YarnClientImpl: Submitted application application_1470034090238_0032
16/08/02 01:51:51 INFO mapreduce.Job: The url to track the job: http://CentOS641:8088/proxy/application_1470034090238_0032/
16/08/02 01:51:51 INFO mapreduce.Job: Running job: job_1470034090238_0032
16/08/02 01:54:10 INFO mapreduce.Job: Job job_1470034090238_0032 running in uber mode : false
16/08/02 01:54:10 INFO mapreduce.Job: map 0% reduce 0%
16/08/02 01:55:34 INFO mapreduce.Job: map 50% reduce 0%
16/08/02 01:55:34 INFO mapreduce.Job: Task Id : attempt_1470034090238_0032_m_000000_0, Status : FAILED
Error: java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:443)
at org.apache.hadoop.io.Text.encode(Text.java:424)
at org.apache.hadoop.io.Text.writeString(Text.java:473)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecord.write(SolrDeleteDuplicates.java:140)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1134)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Container killed by the ApplicationMaster.
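Comparing the two stack traces: the first NPE came from text.set(digest) in SolrRecordReader.nextKeyValue, but the second one comes from Text.writeString inside SolrRecord.write (SolrDeleteDuplicates.java:140), i.e. the record is being serialized with a null field. So my guess is that some documents in Solr simply have no digest field, and catching the exception around text.set only hides the first symptom while the record's fields are still null when the map output is written. What I'm considering trying next is only a sketch and untested (it reuses the same fields that appear in the method above): skip such documents entirely instead of swallowing the exception:

  @Override
  public boolean nextKeyValue() throws IOException, InterruptedException {
    // Loop so that documents without a digest can be skipped rather than
    // emitted with null fields (which would fail again in SolrRecord.write).
    while (currentDoc < numDocs) {
      SolrDocument doc = solrDocs.get(currentDoc);
      currentDoc++;

      String digest = (String) doc.getFieldValue(SolrConstants.DIGEST_FIELD);
      if (digest == null) {
        // Assumption: these digest-less documents are what triggers the NPE.
        continue;
      }

      text.set(digest);
      record.readSolrDocument(doc);
      return true;
    }
    return false;
  }

Even if that stops the crash, it only works around the symptom; the real question is probably why some documents were indexed without a digest field in the first place, which is what I'd like help understanding.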
Thanks in advance for any pointers!