hapjin posted on 2016-5-16 14:53:48

In Oozie, a job in the SUSPENDED state can be brought back into execution with resume

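For example, I submitted a job, and it was suspended because of a problem in the underlying HDFS: the NameNode was in safe mode, so the action could not delete its temporary directory.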

2016-05-16 11:12:52,724 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER USER GROUP[-] TOKEN[] APP JOB ACTION Error starting action . ErrorType , ErrorCode , Message [JA009: Cannot delete /user/xxx/oozie-oozi/0000007-160516095026479-oozie-oozi-W/mr-node--map-reduce.tmp. Name node is in safe mode.
The reported blocks 2955 needs additional 3 blocks to reach the threshold 0.9990 of total blocks 2960.
The number of live datanodes 2 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1416)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4056)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4014)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3998)

--------------------------------------------------------
Once the HDFS problem has been fixed (that is, the NameNode has left safe mode), you can run:
oozie job -resume 0000007-160516095026479-oozie-oozi-W
to resume the job.
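
The two steps above (wait for HDFS to leave safe mode, then resume) can be chained in a small script. This is only a sketch, under these assumptions: `hdfs` and `oozie` are on the PATH, the Oozie server URL is supplied through the OOZIE_URL environment variable (the command above omits -oozie, so something like that must already be in place), and `resume_after_safemode` is a hypothetical helper name, not part of Oozie.

```shell
#!/bin/sh
# Sketch: block until the NameNode leaves safe mode, then resume a suspended job.
# resume_after_safemode is a hypothetical helper; it assumes OOZIE_URL is exported.

resume_after_safemode() {
    job_id="$1"

    # "hdfs dfsadmin -safemode wait" returns once safe mode is off.
    hdfs dfsadmin -safemode wait || return 1

    # Resume the suspended job, then print its details to confirm the new status.
    oozie job -resume "$job_id" || return 1
    oozie job -info "$job_id"
}

# Usage:
# resume_after_safemode 0000007-160516095026479-oozie-oozi-W
```

If the NameNode stays in safe mode because blocks are really missing, waiting will not help; in that case the HDFS problem has to be fixed first, as described above.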

hapjin posted on 2016-5-16 14:56:16

Suspending a Workflow, Coordinator or Bundle Job
Example:
$ oozie job -oozie http://localhost:11000/oozie -suspend 14-20090525161321-oozie-joe
The suspend option suspends a workflow job in RUNNING status. After the command is executed the workflow job will be in SUSPENDED status.

Resuming a Workflow, Coordinator or Bundle Job
Example:
$ oozie job -oozie http://localhost:11000/oozie -resume 14-20090525161321-oozie-joe
The resume option resumes a workflow job in SUSPENDED status.
After the command is executed the workflow job will be in RUNNING status.
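
To check that a suspend or resume actually changed the status, the job can be queried with -info. The sketch below makes two assumptions: the -info output contains a line of the form "Status        : SUSPENDED" (as it does on the versions I have seen), and `job_status` is a hypothetical helper name. The URL falls back to the one used in the examples above when OOZIE_URL is not set.

```shell
#!/bin/sh
# Sketch: print only the status of an Oozie job.
# job_status is a hypothetical helper; the "Status : ..." line format is an assumption.

job_status() {
    oozie job -oozie "${OOZIE_URL:-http://localhost:11000/oozie}" -info "$1" |
        awk -F': ' '/^Status/ { print $2; exit }'
}

# Usage:
# oozie job -oozie http://localhost:11000/oozie -suspend 14-20090525161321-oozie-joe
# job_status 14-20090525161321-oozie-joe
```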