
Hive SQL on Hadoop YARN fails: job.xml does not exist

hletian posted on 2014-11-3 13:40:12
yarn-hadoop-resourcemanager-redhat6-master1.log
2014-11-03 13:22:16,703 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1414987379617_0006
2014-11-03 13:22:16,703 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1414987379617_0006_000002 is done. finalState=FAILED
2014-11-03 13:22:16,703 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1414987379617_0006 failed 2 times due to AM Container for appattempt_1414987379617_0006_000002 exited with  exitCode: -1000 due to: File file:/data/app/hadoop-yarn/staging/hadoop/.staging/job_1414987379617_0006/job.xml does not exist
.Failing this attempt.. Failing the application.
2014-11-03 13:22:16,703 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1414987379617_0006 requests cleared
2014-11-03 13:22:16,703 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1414987379617_0006 user: hadoop queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2014-11-03 13:22:16,703 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1414987379617_0006 State change from FINAL_SAVING to FAILED
2014-11-03 13:22:16,705 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1414987379617_0006 user: hadoop leaf-queue of parent: root #applications: 0
2014-11-03 13:22:16,705 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop        OPERATION=Application Finished - Failed        TARGET=RMAppManager        RESULT=FAILURE        DESCRIPTION=App failed with state: FAILED        PERMISSIONS=Application application_1414987379617_0006 failed 2 times due to AM Container for appattempt_1414987379617_0006_000002 exited with  exitCode: -1000 due to: File file:/data/app/hadoop-yarn/staging/hadoop/.staging/job_1414987379617_0006/job.xml does not exist
.Failing this attempt.. Failing the application.        APPID=application_1414987379617_0006



At first I thought this was a permissions problem: Hive was installed under the hive user and Hadoop under the hadoop user, each in its own group. The original error was similar, except the path was /tmp/hadoop-yarn/staging/hadoop/.staging/. After I changed yarn.app.mapreduce.am.staging-dir in the Hive configuration to /data/app/hadoop-yarn/, the error message changed to the one in the log above.
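For reference, the staging-dir change described above is normally made in mapred-site.xml (yarn.app.mapreduce.am.staging-dir is a real MapReduce property; its default is /tmp/hadoop-yarn/staging). The value below is the path used in this thread, not a recommendation:

```xml
<!-- mapred-site.xml: staging directory where the MR ApplicationMaster
     places per-job files such as job.xml. Path taken from this thread. -->
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/data/app/hadoop-yarn</value>
</property>
```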

Investigation suggested it really was permission-related, so I moved Hive under the same user as Hadoop, but the problem persists. The odd part is that, as the hadoop user, I can open the very file the log claims does not exist: /data/app/hadoop-yarn/staging/hadoop/.staging/job_1414987379617_0006/job.xml
This has been blocking me for a week now. Has anyone run into this, and how did you solve it? Thanks!


Error reported on the Hive command line:
(screenshot: QQ截图20141103133900.png)



11 replies

bioger_hit replied on 2014-11-3 14:02:32
I think you are on the right track. Even under the same user, different ways of accessing a file can carry different effective permissions.
Please post a screenshot of the permissions.



howtodown replied on 2014-11-3 14:03:26
If you're not sure, set the permissions on everything involved to 777, especially the file that can't be found.
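The suggestion above amounts to a recursive chmod on the staging tree. The real path from the log is /data/app/hadoop-yarn/staging; the sketch below runs on a scratch directory instead, so it is safe to try as-is:

```shell
# Recursively open up permissions, as suggested above.
# On a real cluster you would point this at the staging directory
# from the log (/data/app/hadoop-yarn/staging); here we use a
# throwaway directory for demonstration.
STAGING_DIR=$(mktemp -d)
chmod -R 777 "$STAGING_DIR"
stat -c '%a' "$STAGING_DIR"   # shows the resulting mode
```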


hletian replied on 2014-11-3 14:20:52
howtodown replied on 2014-11-3 14:03:
If you're not sure, set the permissions on everything involved to 777, especially the file that can't be found.

I tried that; it didn't work. I did find the cause, though: HADOOP_CONF_DIR in hive-env.sh was misconfigured. But now there is a new error:
Diagnostic Messages for this Task:
Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
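For anyone hitting the same job.xml error: the fix described above comes down to pointing HADOOP_CONF_DIR in hive-env.sh at the directory that actually contains the cluster's *-site.xml files. The exact path below is an assumption for illustration, built from the hadoop-2.4.0 install root visible in the later container logs:

```shell
# hive-env.sh: HADOOP_CONF_DIR must point at the directory holding
# core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml.
# The path below is an example, not the poster's confirmed layout.
export HADOOP_CONF_DIR=/data/app/hadoop/hadoop-2.4.0/etc/hadoop
```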


sstutu replied on 2014-11-3 14:28:08
Check this configuration:


<property>
    <name>yarn.application.classpath</name>
    <value>
        $HADOOP_CONF_DIR,
        $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
        $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
        $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
        $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
    </value>
</property>





hletian replied on 2014-11-3 14:35:46
bioger_hit replied on 2014-11-3 14:02:
I think you are on the right track. Even under the same user, different ways of accessing a file can carry different effective permissions.
Please post a screenshot of the permissions.
...

I tried that; it didn't work. I did find the cause, though: HADOOP_CONF_DIR in hive-env.sh was misconfigured. But a new error appeared (the same container-launch ExitCodeException stack trace as in my reply above).


hletian replied on 2014-11-3 14:36:50
Running an INSERT OVERWRITE now hits a virtual-memory limit. The previous error is resolved, but a new one has shown up; this is getting painful:

Task with the most failures(4):
-----
Task ID:
  task_1414987379617_0012_m_000002

URL:
  http://redhat6-master1:8088/taskdetails.jsp?jobid=job_1414987379617_0012&tipid=task_1414987379617_0012_m_000002
-----
Diagnostic Messages for this Task:
Container [pid=437,containerID=container_1414987379617_0012_01_000025] is running beyond virtual memory limits. Current usage: 24.9 MB of 1 GB physical memory used; 6.5 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1414987379617_0012_01_000025 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 559 437 437 437 (java) 22 8 6900985856 6066 /usr/java/jdk1.6.0_26/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN 1024 -Djava.io.tmpdir=/data/app/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1414987379617_0012/container_1414987379617_0012_01_000025/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/data/app/hadoop/hadoop-2.4.0/logs/userlogs/application_1414987379617_0012/container_1414987379617_0012_01_000025 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.1.194.187 58718 attempt_1414987379617_0012_m_000002_3 25
        |- 437 53997 437 437 (bash) 3 5 108642304 299 /bin/bash -c /usr/java/jdk1.6.0_26/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  1024 -Djava.io.tmpdir=/data/app/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1414987379617_0012/container_1414987379617_0012_01_000025/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/data/app/hadoop/hadoop-2.4.0/logs/userlogs/application_1414987379617_0012/container_1414987379617_0012_01_000025 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.1.194.187 58718 attempt_1414987379617_0012_m_000002_3 25 1>/data/app/hadoop/hadoop-2.4.0/logs/userlogs/application_1414987379617_0012/container_1414987379617_0012_01_000025/stdout 2>/data/app/hadoop/hadoop-2.4.0/logs/userlogs/application_1414987379617_0012/container_1414987379617_0012_01_000025/stderr  

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
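The "running beyond virtual memory limits" kill above (6.5 GB virtual used against a 2.1 GB cap, i.e. 1 GB physical × the default vmem-pmem ratio of 2.1) is commonly addressed in yarn-site.xml by raising the ratio or disabling the virtual-memory check. Whether either is the right fix here depends on the JVM heap settings, so treat this as a sketch:

```xml
<!-- yarn-site.xml: allow more virtual memory per unit of physical memory
     (default ratio 2.1, which the 6.5 GB usage above exceeds)... -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>8</value>
</property>
<!-- ...or disable the NodeManager's virtual-memory check entirely. -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```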

hletian replied on 2014-11-3 16:38:15
sstutu replied on 2014-11-3 14:28:
Check this configuration:

I added to yarn-site.xml:
<property>
    <name>yarn.application.classpath</name>
    <value>     
    $HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,
    $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
    $YARN_HOME/share/hadoop/yarn/*,$YARN_HOME/share/hadoop/yarn/lib/*,
    $YARN_HOME/share/hadoop/mapreduce/*,$YARN_HOME/share/hadoop/mapreduce/lib/*
    </value>
  </property>

and to mapred-site.xml:
<property>
                <name>mapreduce.application.classpath</name>
                <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,
    $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
    $YARN_HOME/share/hadoop/yarn/*,$YARN_HOME/share/hadoop/yarn/lib/*,
    $YARN_HOME/share/hadoop/mapreduce/*,$YARN_HOME/share/hadoop/mapreduce/lib/*</value>
        </property>


It made no difference; I still get the same error.

sstutu replied on 2014-11-3 16:48:30
hletian replied on 2014-11-3 16:38:
I added to yarn-site.xml:

    yarn.application.classpath
Adding yarn.application.classpath in yarn-site.xml alone is enough; make sure the value contains no spaces or line breaks. After changing it, it's best to restart the cluster.
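Following that advice, the whole value would be written on a single line with no embedded whitespace. A sketch, reusing the variable names from the earlier posts (adjust to your install):

```xml
<!-- yarn-site.xml: same classpath as above, kept on one line with no
     spaces or line breaks between entries. -->
<property>
  <name>yarn.application.classpath</name>
  <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$YARN_HOME/share/hadoop/yarn/*,$YARN_HOME/share/hadoop/yarn/lib/*,$YARN_HOME/share/hadoop/mapreduce/*,$YARN_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
```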

howtodown replied on 2014-11-3 16:51:28
hletian replied on 2014-11-3 16:38:
I added to yarn-site.xml:

    yarn.application.classpath

Screenshots of everything would help; a picture is worth a thousand words. Without one we can't tell whether your change is correct.

Replace the directories below with your actual directories:

$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,
    $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
    $YARN_HOME/share/hadoop/yarn/*,$YARN_HOME/share/hadoop/yarn/lib/*,
    $YARN_HOME/share/hadoop/mapreduce/*,$YARN_HOME/share/hadoop/mapreduce/lib/*
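As an illustration of substituting concrete paths, and assuming the /data/app/hadoop/hadoop-2.4.0 install root visible in the container logs earlier in this thread (substitute your own):

```xml
<!-- yarn-site.xml: the same classpath with the environment variables
     expanded to a concrete install root. The root is taken from this
     thread's container logs; it is an assumption, not a confirmed fix. -->
<property>
  <name>yarn.application.classpath</name>
  <value>/data/app/hadoop/hadoop-2.4.0/etc/hadoop,/data/app/hadoop/hadoop-2.4.0/share/hadoop/common/*,/data/app/hadoop/hadoop-2.4.0/share/hadoop/common/lib/*,/data/app/hadoop/hadoop-2.4.0/share/hadoop/hdfs/*,/data/app/hadoop/hadoop-2.4.0/share/hadoop/hdfs/lib/*,/data/app/hadoop/hadoop-2.4.0/share/hadoop/yarn/*,/data/app/hadoop/hadoop-2.4.0/share/hadoop/yarn/lib/*,/data/app/hadoop/hadoop-2.4.0/share/hadoop/mapreduce/*,/data/app/hadoop/hadoop-2.4.0/share/hadoop/mapreduce/lib/*</value>
</property>
```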


