This post was last edited by Sempre on 2015-11-11 17:54
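I want to use Oozie to monitor a local shell script. The script should do the following: download output data from HDFS to the local filesystem, then do some processing locally.
I followed this article: http://ju.outofmemory.cn/entry/31131
My workflow.xml is as follows: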
<workflow-app xmlns="uri:oozie:workflow:0.4" name="oozie-shell">
    <start to="bc-shell"/>
    <action name='bc-shell'>
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>test.sh</exec>
            <file>/user/root/myproject/apps/lib/test.sh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
My job.properties is as follows:
nameNode=hdfs://mycluster
jobTracker=hadoop1:8050
queueName=default
projectRoot=myproject
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${projectRoot}/apps
My shell script is as follows:
#!/bin/sh
hadoop fs -get /user/root/myproject/output-data/keywords/part-r-00000 keywords
It fails at run time with this error:
Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
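The message above only reports that the script exited with a non-zero status, not why. A shell action runs the script inside a YARN container on whichever node the cluster picks, so it may help to log where it ran and what the command printed to stderr. A minimal diagnostic sketch follows; the helper name `run_logged` is hypothetical and is not part of Oozie or Hadoop:

```shell
#!/bin/sh
# Diagnostic sketch (hypothetical helper, not an Oozie or Hadoop API).
# Prints the node, user, and working directory, then runs the given command
# with stderr merged into stdout so error text lands in the container's stdout log.
run_logged() {
    echo "host=$(uname -n) user=$(id -un) cwd=$(pwd)"
    "$@" 2>&1
    status=$?
    # Record the command's exit status and pass it back to the caller.
    echo "exit_status=$status"
    return "$status"
}
```

Inside test.sh this could wrap the existing command, for example `run_logged hadoop fs -get /user/root/myproject/output-data/keywords/part-r-00000 keywords`. The printed `cwd` shows which directory the relative target `keywords` resolves to on that node.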
The hadoop-yarn-nodemanager log is as follows:
2015-11-11 17:13:56,457 INFO ipc.Server (Server.java:saslProcess(1311)) - Auth successful for appattempt_1447222078870_0003_000001 (auth:SIMPLE)
2015-11-11 17:13:56,463 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(804)) - Start request for container_1447222078870_0003_01_000002 by user root
2015-11-11 17:13:56,463 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(852)) - Creating a new application reference for app application_1447222078870_0003
2015-11-11 17:13:56,463 INFO nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=root IP=10.133.16.185 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1447222078870_0003 CONTAINERID=container_1447222078870_0003_01_000002
2015-11-11 17:13:56,464 INFO application.Application (ApplicationImpl.java:handle(464)) - Application application_1447222078870_0003 transitioned from NEW to INITING
2015-11-11 17:13:56,464 INFO application.Application (ApplicationImpl.java:transition(304)) - Adding container_1447222078870_0003_01_000002 to application application_1447222078870_0003
2015-11-11 17:13:56,468 WARN logaggregation.LogAggregationService (LogAggregationService.java:verifyAndCreateRemoteLogDir(195)) - Remote Root Log Dir [/app-logs] already exist, but with incorrect permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have problems with multiple users.
2015-11-11 17:13:56,469 WARN logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:<init>(182)) - rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished.
2015-11-11 17:13:56,486 INFO application.Application (ApplicationImpl.java:handle(464)) - Application application_1447222078870_0003 transitioned from INITING to RUNNING
2015-11-11 17:13:56,487 INFO container.Container (ContainerImpl.java:handle(1074)) - Container container_1447222078870_0003_01_000002 transitioned from NEW to LOCALIZING
2015-11-11 17:13:56,487 INFO containermanager.AuxServices (AuxServices.java:handle(196)) - Got event CONTAINER_INIT for appId application_1447222078870_0003
2015-11-11 17:13:56,487 INFO containermanager.AuxServices (AuxServices.java:handle(196)) - Got event APPLICATION_INIT for appId application_1447222078870_0003
2015-11-11 17:13:56,487 INFO containermanager.AuxServices (AuxServices.java:handle(200)) - Got APPLICATION_INIT for service mapreduce_shuffle
2015-11-11 17:13:56,487 INFO mapred.ShuffleHandler (ShuffleHandler.java:addJobToken(560)) - Added token for job_1447222078870_0003
2015-11-11 17:13:56,488 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://mycluster/user/root/.staging/job_1447222078870_0003/job.xml transitioned from INIT to DOWNLOADING
2015-11-11 17:13:56,488 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:handle(696)) - Created localizer for container_1447222078870_0003_01_000002
2015-11-11 17:13:56,530 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:writeCredentials(1161)) - Writing credentials to the nmPrivate file /hadoop/yarn/local/nmPrivate/container_1447222078870_0003_01_000002.tokens. Credentials list:
2015-11-11 17:13:56,549 INFO nodemanager.CompositeContainerExecutor (CompositeContainerExecutor.java:getContainerExecutor(166)) - Launch container container_1447222078870_0003_01_000002 using org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
2015-11-11 17:13:59,529 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://mycluster/user/root/.staging/job_1447222078870_0003/job.xml(->/hadoop/yarn/local/usercache/root/appcache/application_1447222078870_0003/filecache/10/job.xml) transitioned from DOWNLOADING to LOCALIZED
2015-11-11 17:13:59,530 INFO container.Container (ContainerImpl.java:handle(1074)) - Container container_1447222078870_0003_01_000002 transitioned from LOCALIZING to LOCALIZED
2015-11-11 17:13:59,530 INFO nodemanager.CompositeContainerExecutor (CompositeContainerExecutor.java:getContainerExecutor(166)) - Launch container container_1447222078870_0003_01_000002 using org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
2015-11-11 17:13:59,552 INFO container.Container (ContainerImpl.java:handle(1074)) - Container container_1447222078870_0003_01_000002 transitioned from LOCALIZED to RUNNING
2015-11-11 17:14:01,301 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(371)) - Starting resource-monitoring for container_1447222078870_0003_01_000002
2015-11-11 17:14:01,302 INFO nodemanager.CompositeContainerExecutor (CompositeContainerExecutor.java:getContainerExecutor(166)) - Launch container container_1447222078870_0003_01_000002 using org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
2015-11-11 17:14:01,333 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(459)) - Memory usage of ProcessTree 17464 for container-id container_1447222078870_0003_01_000002: 110.3 MB of 1 GB physical memory used; 1.4 GB of 2.1 GB virtual memory used
2015-11-11 17:14:04,163 INFO ipc.Server (Server.java:saslProcess(1311)) - Auth successful for appattempt_1447222078870_0003_000001 (auth:SIMPLE)
2015-11-11 17:14:04,188 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:stopContainerInternal(962)) - Stopping container with container Id: container_1447222078870_0003_01_000002
2015-11-11 17:14:04,188 INFO nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=root IP=10.133.16.185 OPERATION=Stop Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1447222078870_0003 CONTAINERID=container_1447222078870_0003_01_000002
2015-11-11 17:14:04,188 INFO container.Container (ContainerImpl.java:handle(1074)) - Container container_1447222078870_0003_01_000002 transitioned from RUNNING to KILLING
2015-11-11 17:14:04,188 INFO launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(370)) - Cleaning up container container_1447222078870_0003_01_000002
2015-11-11 17:14:04,193 WARN nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:launchContainer(306)) - Exit code from container container_1447222078870_0003_01_000002 is : 143
2015-11-11 17:14:04,378 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(459)) - Memory usage of ProcessTree 17464 for container-id container_1447222078870_0003_01_000002: 0B of 1 GB physical memory used; 0B of 2.1 GB virtual memory used
2015-11-11 17:14:04,484 INFO container.Container (ContainerImpl.java:handle(1074)) - Container container_1447222078870_0003_01_000002 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
2015-11-11 17:14:04,485 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(406)) - Deleting absolute path : /hadoop/yarn/local/usercache/root/appcache/application_1447222078870_0003/container_1447222078870_0003_01_000002
2015-11-11 17:14:04,486 INFO nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=root OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1447222078870_0003 CONTAINERID=container_1447222078870_0003_01_000002
2015-11-11 17:14:04,486 INFO container.Container (ContainerImpl.java:handle(1074)) - Container container_1447222078870_0003_01_000002 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE
2015-11-11 17:14:04,486 INFO application.Application (ApplicationImpl.java:transition(347)) - Removing container_1447222078870_0003_01_000002 from application application_1447222078870_0003
2015-11-11 17:14:04,486 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:startContainerLogAggregation(488)) - Considering container container_1447222078870_0003_01_000002 for log-aggregation
2015-11-11 17:14:04,486 INFO containermanager.AuxServices (AuxServices.java:handle(196)) - Got event CONTAINER_STOP for appId application_1447222078870_0003
2015-11-11 17:14:07,378 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(385)) - Stopping resource-monitoring for container_1447222078870_0003_01_000002
2015-11-11 17:14:12,202 INFO application.Application (ApplicationImpl.java:handle(464)) - Application application_1447222078870_0003 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2015-11-11 17:14:12,203 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(406)) - Deleting absolute path : /hadoop/yarn/local/usercache/root/appcache/application_1447222078870_0003
2015-11-11 17:14:12,203 INFO containermanager.AuxServices (AuxServices.java:handle(196)) - Got event APPLICATION_STOP for appId application_1447222078870_0003
2015-11-11 17:14:12,203 INFO application.Application (ApplicationImpl.java:handle(464)) - Application application_1447222078870_0003 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2015-11-11 17:14:12,203 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(496)) - Application just finished : application_1447222078870_0003
2015-11-11 17:14:12,224 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:doContainerLogAggregation(525)) - Uploading logs for container container_1447222078870_0003_01_000002. Current good log dirs are /hadoop/yarn/log
2015-11-11 17:14:12,226 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(411)) - Deleting path : /hadoop/yarn/log/application_1447222078870_0003/container_1447222078870_0003_01_000002/syslog
2015-11-11 17:14:12,227 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(411)) - Deleting path : /hadoop/yarn/log/application_1447222078870_0003/container_1447222078870_0003_01_000002/stdout
2015-11-11 17:14:12,227 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(411)) - Deleting path : /hadoop/yarn/log/application_1447222078870_0003/container_1447222078870_0003_01_000002/stderr
2015-11-11 17:14:12,265 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(411)) - Deleting path : /hadoop/yarn/log/application_1447222078870_0003
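The NodeManager log above covers only the container lifecycle. Exit code 143 is 128 + 15, meaning the container was stopped with SIGTERM after the stop request, so it does not by itself explain the failure. The script's own stdout and stderr were uploaded by log aggregation and should be retrievable by application ID. A small sketch for deriving that ID from a container ID, assuming the `container_<timestamp>_<app>_<attempt>_<seq>` format seen in this log; the function name `app_id_from_container` is hypothetical:

```shell
#!/bin/sh
# Sketch: derive the YARN application ID from a container ID.
# Assumes the ID format seen in the log above:
#   container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>
# The function name is hypothetical, not part of any Hadoop tool.
app_id_from_container() {
    echo "$1" | sed 's/^container_\([0-9][0-9]*\)_\([0-9][0-9]*\)_.*$/application_\1_\2/'
}
```

With the container from this log, the aggregated logs could then be fetched with `yarn logs -applicationId "$(app_id_from_container container_1447222078870_0003_01_000002)"`, run as the user who submitted the job.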
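Can anyone tell me what is causing this? What is Oozie's shell action meant for? Can it monitor a local shell script?
Additional note (2015-11-12 11:24):
Can Oozie's shell action and java action be used to monitor local tasks, or can they only monitor tasks on HDFS? I need an answer urgently!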