Quoting roant (2015-1-18 12:19):

I also ran into jobs sitting in the pending state and never making progress. After several days of fiddling I finally solved it, so let me share my fix. I searched a lot online during those days and found no reliable answer. The cause: I had set yarn.nodemanager.resource.memory-mb to 1024 MB, i.e. each node only offers 1024 MB of memory to containers, but the wordcount job I ran requested more memory than that, so my job stayed stuck in the pending state. If you have configured yarn.nodemanager.resource.memory-mb, raise the value, or just start from the default and tune it later as needed. Hope this helps anyone else stuck on this problem~~
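For reference, this is roughly what that setting looks like in yarn-site.xml (a minimal sketch; the 8192 value is just an illustration, size it to your node's actual RAM):

    <!-- yarn-site.xml: total memory (MB) one NodeManager offers to containers.
         If this is smaller than what a single container requests,
         the application never gets scheduled and stays pending. -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>8192</value>
    </property>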
A hundred likes for this...... awesome!
Learned something here, the mod really knows his stuff.
Hi, moderator! I've run into a similar problem; the error is below. master/192.168.101.128 (my NameNode) shows up in the error for no obvious reason, and none of the DataNodes are down. Is something misconfigured in my HDFS or YARN config files?

16/10/10 23:24:39 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.101.128:8032
16/10/10 23:24:39 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/10/10 23:24:40 INFO input.FileInputFormat: Total input paths to process : 1
16/10/10 23:24:41 INFO mapreduce.JobSubmitter: number of splits:1
16/10/10 23:24:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1476096585439_0007
16/10/10 23:24:41 INFO impl.YarnClientImpl: Submitted application application_1476096585439_0007
16/10/10 23:24:41 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1476096585439_0007/
16/10/10 23:24:41 INFO mapreduce.Job: Running job: job_1476096585439_0007
16/10/10 23:25:32 INFO mapreduce.Job: Job job_1476096585439_0007 running in uber mode : false
16/10/10 23:25:32 INFO mapreduce.Job: map 0% reduce 0%
16/10/10 23:25:59 INFO mapreduce.Job: map 100% reduce 0%
16/10/10 23:25:59 INFO mapreduce.Job: Task Id : attempt_1476096585439_0007_m_000000_0, Status : FAILED
Error: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1564366186-192.168.101.128-1464249004321:blk_1073741877_1054 file=/StJoinTest/input/file1.txt
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:945)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:604)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:143)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:183)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/10/10 23:26:00 INFO mapreduce.Job: map 0% reduce 0%
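One way to check whether that block is genuinely missing on the HDFS side (my suggestion, not from the original post; the path is taken from the error above) is to run fsck against the input file:

    # Report the file's blocks, their replica locations, and any missing/corrupt blocks
    hdfs fsck /StJoinTest/input/file1.txt -files -blocks -locations
    # Cluster-wide summary, including live DataNodes and under-replicated blocks
    hdfs dfsadmin -report

If fsck reports the block as MISSING, the data is gone from every DataNode even though the daemons are up, and the file has to be re-uploaded.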
Quoting wfqwang82 (2016-4-20 16:02):

Can't tell much from this output alone; go look at the logs. Also check whether the file permissions belong to the account the job runs as.
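For instance (my sketch, not from the reply; the path is the input from the post being answered), to compare the file's owner with the submitting account:

    # Show owner/group/permissions of the job input on HDFS
    hdfs dfs -ls /profile
    # Which local account is actually submitting the job?
    whoami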
Help please, everyone. I built a cluster with hadoop 2.6. Cluster plan (hostname / IP / installed software / running processes):

1. Master *.1 jdk, hadoop: NameNode, DFSZKFailoverController (zkfc)
2. Slave1 *.2 jdk, hadoop: NameNode, DFSZKFailoverController (zkfc)
3. Slave2 *.3 jdk, hadoop: ResourceManager
4. Slave3 *.4 jdk, hadoop: ResourceManager
5. Slave4 *.5 jdk, hadoop, zookeeper: DataNode, NodeManager, JournalNode, QuorumPeerMain
6. Slave5 *.6 jdk, hadoop, zookeeper: DataNode, NodeManager, JournalNode, QuorumPeerMain
7. Slave6 *.7 jdk, hadoop, zookeeper: DataNode, NodeManager, JournalNode, QuorumPeerMain

Running the bundled example fails as below, and I can't find the cause:

[root@Master sbin]# hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /profile /out
16/04/20 15:48:13 INFO input.FileInputFormat: Total input paths to process : 1
16/04/20 15:48:13 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library from the embedded binaries
16/04/20 15:48:13 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev f761f10a818fa286fbad0466c3279da6739a01ff]
16/04/20 15:48:13 INFO mapreduce.JobSubmitter: number of splits:1
16/04/20 15:48:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461138383033_0001
16/04/20 15:48:13 INFO impl.YarnClientImpl: Submitted application application_1461138383033_0001
16/04/20 15:48:13 INFO mapreduce.Job: The url to track the job: http://Slave2:8088/proxy/application_1461138383033_0001/
16/04/20 15:48:13 INFO mapreduce.Job: Running job: job_1461138383033_0001
16/04/20 15:48:18 INFO mapreduce.Job: Job job_1461138383033_0001 running in uber mode : false
16/04/20 15:48:18 INFO mapreduce.Job: map 0% reduce 0%
16/04/20 15:48:18 INFO mapreduce.Job: Job job_1461138383033_0001 failed with state FAILED due to: Application application_1461138383033_0001 failed 2 times due to AM Container for appattempt_1461138383033_0001_000002 exited with exitCode: 1
For more detailed output, check application tracking page: http://Slave2:8088/proxy/application_1461138383033_0001/ Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1461138383033_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
16/04/20 15:48:19 INFO mapreduce.Job: Counters: 0
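An AM container exiting with code 1 usually hides the real exception in the container's own log rather than in the client output. A sketch of how to pull it (the application id is the one from the output above; this assumes yarn.log-aggregation-enable is true, otherwise look at the NodeManager's local container log directory):

    # Fetch the aggregated logs of the failed ApplicationMaster container
    yarn logs -applicationId application_1461138383033_0001
    # Or browse stderr of each attempt via the tracking URL:
    #   http://Slave2:8088/proxy/application_1461138383033_0001/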
Quoting arsenduan (2015-11-4 18:53):
EMP-PC is my computer's name, and 192.168.1.1 is the virtual machine's gateway.

EMP-PC/192.168.1.1 is the problem. The gateway won't work; it should be your actual IP address.
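If the Windows client is resolving the cluster hostname badly, one common fix (my sketch, not from this thread; 192.168.1.100 is a placeholder for the VM's real address) is to pin the mapping in the client's hosts file:

    # C:\Windows\System32\drivers\etc\hosts on the Windows machine running the client
    # Map the NameNode's hostname to the VM's actual IP, not the gateway
    192.168.1.100   TestDemo-1

The fs.defaultFS / mapred job addresses in the client code should then use that same hostname.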
Quoting hebina (2015-11-4 18:37):
EMP-PC/192.168.1.1 to TestDemo-1:9001

Why is EMP-PC/192.168.1.1 the machine making the connection, i.e. Windows??!!!!! What is EMP-PC/192.168.1.1?
2015-11-04 18:36:46,837 ERROR [org.apache.hadoop.security.UserGroupInformation] - PriviledgedActionException as:EMP (auth:SIMPLE) cause:java.net.ConnectException: Call From EMP-PC/192.168.1.1 to TestDemo-1:9001 failed on connection exception: java.net.ConnectException: Connection refused: no further information; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Exception in thread "main" java.net.ConnectException: Call From EMP-PC/192.168.1.1 to TestDemo-1:9001 failed on connection exception: java.net.ConnectException: Connection refused: no further information; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1351)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy7.getFileInfo(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy7.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:456)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
    at org.dragon.hadoop.hdfs.util.worldcount.main(worldcount.java:67)
Caused by: java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
    at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
    at org.apache.hadoop.ipc.Client.call(Client.java:1318)
    ... 28 more

Still the same as before!! What do I do?
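"Connection refused" means nothing is reachable at TestDemo-1:9001 from the client's point of view. A quick way to narrow it down (my suggestion; run the first two on the Hadoop VM, the last from the Windows client):

    # On the VM: is the NameNode process up, and which address is port 9001 bound to?
    jps
    netstat -anp | grep 9001
    # From Windows: can the port be reached over the network at all?
    telnet TestDemo-1 9001

If netstat shows the port bound to 127.0.0.1 instead of the VM's LAN address, remote clients can never connect; fs.defaultFS should use a hostname/IP that resolves to the VM's real interface on both sides.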
Then try restarting the VM and Hadoop?
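For completeness, restarting HDFS and YARN on Hadoop 2.x looks roughly like this (a sketch; assumes the standard sbin scripts are on the PATH and is run on the master node):

    # Stop and restart the HDFS and YARN daemons
    stop-yarn.sh && stop-dfs.sh
    start-dfs.sh && start-yarn.sh
    # Verify the expected daemons came back
    jps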