poptang4 posted on 2013-10-25 10:42:08

DataNode stuck in Decommission In Progress while being removed

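I ran into a problem while removing a node.
Judging by the block counts, all blocks on the DataNode being removed have already been moved to other DataNodes,
but that DataNode stays in the Decommission In Progress state.
From the logs:
The DataNode keeps running DataBlockScanner; its log looks like this: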
2012-09-24 16:47:51,301 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_3921915826611683588_8027. Its ok since it not in datanode dataset anymore.
2012-09-24 16:51:32,742 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_-902553692494195260_8082. Its ok since it not in datanode dataset anymore.
2012-09-24 16:56:06,284 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_9180843115150344245_25940. Its ok since it not in datanode dataset anymore.
2012-09-24 16:56:49,371 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_-5002984851235119528_31752. Its ok since it not in datanode dataset anymore.
2012-09-24 16:57:10,412 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_2897706191202607999_15164. Its ok since it not in datanode dataset anymore.
2012-09-24 16:57:24,440 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_3569549920590190118_27137. Its ok since it not in datanode dataset anymore.
2012-09-24 16:58:45,601 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_9070131173655120747_33437. Its ok since it not in datanode dataset anymore.
2012-09-24 16:59:23,681 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_-8158061507642883210_23376. Its ok since it not in datanode dataset anymore.
2012-09-24 16:59:32,698 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_1569989466833547348_21624. Its ok since it not in datanode dataset anymore.
2012-09-24 16:59:36,707 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_5530553709255233999_32763. Its ok since it not in datanode dataset anymore.
2012-09-24 16:59:38,711 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_7165837478027158747_24918. Its ok since it not in datanode dataset anymore.
The NameNode log looks like this (this message keeps appearing, and I wonder whether it is the cause; 10.10.10.150 is the DataNode being removed):
2012-09-24 17:04:39,082 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Block: blk_7215589835729902944_1601, Expected Replicas: 10, live replicas: 3, corrupt replicas: 0, decommissioned replicas: 1, excess replicas: 0, Is Open File: false, Datanodes having this block: 10.10.10.158:50010 10.10.10.151:50010 10.10.10.150:50010 10.10.10.23:50010 , Current Datanode: 10.10.10.150:50010, Is current datanode decommissioning: true

sq331335144 posted on 2013-10-25 10:42:08

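I think I have roughly found the cause.
Did you notice that Expected Replicas is 10 rather than the configured 3? Why 10? This goes back to the MR JobClient.
When JobClient submits an MR job, it uploads the job's jar, and it sets a replication factor when uploading the jar to HDFS.
That replication factor comes from the configuration and defaults to 10.
You can verify it in HDFS after submitting an MR job (provided the job has not finished yet).
After the job finishes, the uploaded jar is deleted (a guess, since no jar remains in HDFS for jobs that succeeded).
If the job fails, its jar is not deleted, which leaves blocks that are required to have 10 replicas.
We can run hadoop fsck / to check the health of the whole file system.
If it prints something like
/data/hadoopdata/tmp/mapred/staging/root/.staging/job_201209061129_2679/job.jar:Under replicated blk_-8552132555561444140_34127. Target Replicas is 10 but found 4 replica(s).
then the block does not have enough replicas (I only have 4 DataNodes here).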
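To pick those files out of a long fsck report, something like the following might help. It is only a sketch: it assumes each under-replicated line has the same shape as the sample above, and the class name FsckUnderReplicated and the method unsatisfiablePaths are made up for illustration (they are not part of Hadoop). It returns the paths whose target replica count is larger than the number of DataNodes you have, i.e. files whose replication target can never be met.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: scans saved "hadoop fsck /" output.
class FsckUnderReplicated {
    // Assumed line shape, taken from the sample in this thread:
    // <path>:Under replicated <block>. Target Replicas is N but found M replica(s).
    private static final Pattern LINE = Pattern.compile(
        "^(.+?):\\s*Under replicated (blk_-?\\d+_\\d+)\\. "
        + "Target Replicas is (\\d+) but found (\\d+) replica\\(s\\)\\.");

    // Returns paths whose target replica count exceeds the number of DataNodes.
    static List<String> unsatisfiablePaths(List<String> fsckLines, int dataNodes) {
        List<String> paths = new ArrayList<>();
        for (String line : fsckLines) {
            Matcher m = LINE.matcher(line.trim());
            if (!m.find()) {
                continue; // not an "Under replicated" line
            }
            int target = Integer.parseInt(m.group(3));
            String path = m.group(1);
            if (target > dataNodes && !paths.contains(path)) {
                paths.add(path);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "/data/hadoopdata/tmp/mapred/staging/root/.staging/job_201209061129_2679/job.jar:"
            + "Under replicated blk_-8552132555561444140_34127. "
            + "Target Replicas is 10 but found 4 replica(s).");
        // With 4 DataNodes, a target of 10 replicas cannot be satisfied.
        for (String path : unsatisfiablePaths(lines, 4)) {
            System.out.println(path);
        }
    }
}
```

The paths it prints are the candidates to look at (for example leftover job.jar files under .staging).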
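OK, that explains why "Expected Replicas: 10" shows up.
When we remove a DataNode, the data all appears to have been moved away (judging by the block count), but the node's state does not change. This is because the NameNode sees that these blocks do not have enough replicas and considers the data incomplete, so it never lets the DataNode finish decommissioning. (I only skimmed the code, so much of this is guesswork; please point out anything that is wrong.)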
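The guess above can be written down as a toy model. This is not the NameNode code, just a sketch of the reasoning: the class DecommissionModel, the type BlockInfo, and the method canFinishDecommission are all hypothetical names, and the rule "every block must have at least its expected number of live replicas" is the assumption being illustrated.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the hypothesis in this post; NOT the actual NameNode logic.
class DecommissionModel {
    // One block stored on the decommissioning node (hypothetical type).
    static final class BlockInfo {
        final int expectedReplicas;
        final int liveReplicas;

        BlockInfo(int expectedReplicas, int liveReplicas) {
            this.expectedReplicas = expectedReplicas;
            this.liveReplicas = liveReplicas;
        }
    }

    // Assumed rule: the node may leave "Decommission In Progress" only when
    // no block has fewer live replicas than it expects.
    static boolean canFinishDecommission(List<BlockInfo> blocks) {
        for (BlockInfo b : blocks) {
            if (b.liveReplicas < b.expectedReplicas) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<BlockInfo> blocks = Arrays.asList(
            new BlockInfo(3, 3),    // ordinary file, fully replicated
            new BlockInfo(10, 3));  // leftover job.jar: wants 10, only 3 live
        System.out.println(canFinishDecommission(blocks)); // prints "false"
    }
}
```

Under this model, a single block that expects 10 replicas on a cluster with only 3 other live DataNodes is enough to keep the node in Decommission In Progress indefinitely, which matches the NameNode log line quoted in the question.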
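Solution (untested; I will verify it at work tomorrow):
Delete the jars of the failed MR jobs.
Of course, I think these MR jars serve no purpose anyway, so you could also just stop this DataNode directly.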

oChengZi1234 posted on 2013-10-25 10:42:08

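If you are feeling bold, you can skip the decommission command and simply take the node offline. The system will automatically check whether any replicas are missing and then re-replicate that data (this process is asynchronous).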

dgxl posted on 2013-10-25 10:42:08

We thought of that approach too, but would only use it as a last resort.
In the end, deleting those useless MR jars fixed it.

llike90 posted on 2013-10-25 10:42:08

private static FSDataOutputStream createFile(FileSystem fs, Path splitFile,
      Configuration job) throws IOException {
    FSDataOutputStream out = FileSystem.create(fs, splitFile,
      new FsPermission(JobSubmissionFiles.JOB_FILE_PERMISSION));
    // Replication factor for the submitted file; falls back to 10 when
    // mapred.submit.replication is not set in the job configuration.
    int replication = job.getInt("mapred.submit.replication", 10);
    fs.setReplication(splitFile, (short)replication);
    writeSplitHeader(out);
    return out;
}

It is the 10 in the code above. I don't know why it is set to 10. Is there a story behind it? Was the author writing this on a 1000-node cluster and just casually typed 10?