
The hadoop fsck command explained

Posted by lxs_huntingjob on 2013-10-25 10:45:06 (5 replies, 16866 views)
Last edited by pig2 on 2014-10-17 17:46


hadoop  fsck
Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
        <path>              check the files under this path for integrity
        -move               move corrupted files to /lost+found
        -delete             delete corrupted files
        -openforwrite       print files currently opened for write
        -files              print the name of each file being checked
        -blocks             print a block report (use together with -files)
        -locations          print the location of every block (use together with -files)
        -racks              print the network topology of the block locations (use together with -files)
hadoop  fsck /
This command checks the health of the entire file system. Note that it does not actively re-replicate blocks with missing replicas; that is handled asynchronously by a dedicated thread in the NameNode.
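When run over a large tree, the interesting lines are buried in the dot-per-file progress output. As a sketch (the report text below is an invented sample, not real cluster output), the flagged files can be filtered out of a saved report with standard tools:

```shell
# Sketch: pull the paths of files that fsck flagged from saved output.
# The sample report below is fabricated for illustration only.
report='/user/a/file1:  Replica placement policy is violated for blk_1. Block should be additionally replicated on 1 more rack(s).
/user/b/file2: CORRUPT block blk_2'
printf '%s\n' "$report" \
  | grep -E 'CORRupt|CORRUPT|Replica placement policy is violated' \
  | cut -d: -f1
```

In real use you would pipe `hadoop fsck / | ...` instead of the saved `$report` variable.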

....................................................................................................
.................................
/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30:  Replica placement policy is violated for blk_7596595208988121840_5377589. Block should be additionally replicated on 1 more rack(s).
....................................................
/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000310/input/PAY.QQ.COM.2009-08-13-20.30:  Replica placement policy is violated for blk_8146588794511444453_5379501. Block should be additionally replicated on 1 more rack(s).
...............
....................................................................................................
....................................................................................................
.........................................................................................Status: HEALTHY

Total size:    5042961147529 B (Total open files size: 1610612736 B)
Total dirs:    723
Total files:   128089 (Files currently being written: 2)
Total blocks (validated):      171417 (avg. block size 29419259 B) (Total open file blocks (not validated): 24)
Minimally replicated blocks:   171417 (100.0 %)
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       0 (0.0 %)
Mis-replicated blocks:         476 (0.2776854 %)
Default replication factor:    3    (the default replication factor is 3)
Average block replication:     3.000146
Corrupt blocks:                0    (number of corrupt blocks is 0)
Missing replicas:              0 (0.0 %)
Number of data-nodes:          107
Number of racks:               4
The filesystem under path '/' is HEALTHY
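The percentages in the summary are simple ratios; for instance, the mis-replicated figure above is 476 out of 171417 validated blocks. A quick sanity check with the numbers copied from the report:

```shell
# Recompute the mis-replicated percentage from the summary above:
# 476 mis-replicated blocks out of 171417 validated blocks.
misrep=476
total=171417
awk -v m="$misrep" -v t="$total" 'BEGIN { printf "%.7f %%\n", m * 100 / t }'
# prints 0.2776854 %
```

This matches the "(0.2776854 %)" fsck printed above.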
hadoop  fsck /user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30 -files -blocks -locations  -racks
This prints detailed information for every block of the file, including the rack of each DataNode that holds a replica.

/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30 74110492 bytes, 2 block(s):  Replica placement policy is violated for blk_7596595208988121840_5377589. Block should be additionally replicated on 1 more rack(s).
Although this block has three replicas, they all sit in the same rack; one replica should be placed on a different rack. See the previous section on the replica placement policy for details.
0. blk_-4839761191731553520_5377588 len=67108864 repl=3 [/lg/dminterface0/172.16.236.158:50010, /lg/dminterface1/172.16.218.108:50010, /lg/dminterface1/172.16.236.36:50010]

1. blk_7596595208988121840_5377589 len=7001628 repl=3  [/lg/dminterface2/172.16.236.51:50010, /lg/dminterface2/172.16.218.217:50010, /lg/dminterface2/172.16.218.200:50010]

DataNode information for the three replicas: all of them are under /lg/dminterface2.
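A minimal sketch of the check fsck is applying here (the replica list is copied from the violating block above; the rack is everything before the trailing /ip:port component):

```shell
# Count the distinct racks holding replicas of the violating block above.
# Each entry has the form rack-path/ip:port, so dirname yields the rack.
replicas="/lg/dminterface2/172.16.236.51:50010 /lg/dminterface2/172.16.218.217:50010 /lg/dminterface2/172.16.218.200:50010"
racks=$(for r in $replicas; do dirname "$r"; done | sort -u | wc -l | tr -d ' ')
echo "distinct racks: $racks"
# With all three replicas under /lg/dminterface2 this reports 1 rack,
# which is why the placement policy is flagged as violated.
```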

Status: HEALTHY
Total size:    74110492 B
Total dirs:    0
Total files:   1
Total blocks (validated):      2 (avg. block size 37055246 B)
Minimally replicated blocks:   2 (100.0 %)
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       0 (0.0 %)
Mis-replicated blocks:         1 (50.0 %)
Default replication factor:    3
Average block replication:     3.0
Corrupt blocks:                0
Missing replicas:              0 (0.0 %)
Number of data-nodes:          107
Number of racks:               4
The filesystem under path '/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30' is HEALTHY
This article comes from a CSDN blog; please credit the source when reposting: http://blog.csdn.net/fiberlijun/archive/2009/11/18/4825772.aspx

Comments (5)

sq331335144 posted on 2013-10-25 10:45:06
I'd like to ask: which methods in the source code do these shell commands actually call, and where can I see that?

skaterxu posted on 2013-10-25 10:45:06
It should be possible, but you would have to start reading from the hadoop.sh script; it's probably quite tedious.

yunjisuanxue posted on 2013-10-25 10:45:06
Reply to #3 hroger
    Which package is it in, exactly? Please be specific, thanks.

kaif22 posted on 2013-10-25 10:45:06
Reply to hroger
    Which package is it in, exactly? Please be specific, thanks.
appletreer posted on 2009-12-3 14:17



# figure out which class to run
if [ "$COMMAND" = "namenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
elif [ "$COMMAND" = "secondarynamenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_SECONDARYNAMENODE_OPTS"
elif [ "$COMMAND" = "datanode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_DATANODE_OPTS"
elif [ "$COMMAND" = "fs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfsadmin" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "mradmin" ] ; then
  CLASS=org.apache.hadoop.mapred.tools.MRAdmin
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "fsck" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSck
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "balancer" ] ; then
  CLASS=org.apache.hadoop.hdfs.server.balancer.Balancer
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_BALANCER_OPTS"
elif [ "$COMMAND" = "jobtracker" ] ; then
  CLASS=org.apache.hadoop.mapred.JobTracker
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOBTRACKER_OPTS"
elif [ "$COMMAND" = "tasktracker" ] ; then
  CLASS=org.apache.hadoop.mapred.TaskTracker
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_TASKTRACKER_OPTS"
elif [ "$COMMAND" = "job" ] ; then
  CLASS=org.apache.hadoop.mapred.JobClient
elif [ "$COMMAND" = "queue" ] ; then
  CLASS=org.apache.hadoop.mapred.JobQueueClient
elif [ "$COMMAND" = "pipes" ] ; then
  CLASS=org.apache.hadoop.mapred.pipes.Submitter
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "version" ] ; then
  CLASS=org.apache.hadoop.util.VersionInfo
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "jar" ] ; then
  CLASS=org.apache.hadoop.util.RunJar
elif [ "$COMMAND" = "distcp" ] ; then
  CLASS=org.apache.hadoop.tools.DistCp
  CLASSPATH=${CLASSPATH}:${TOOL_PATH}
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "daemonlog" ] ; then
  CLASS=org.apache.hadoop.log.LogLevel
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "archive" ] ; then
  CLASS=org.apache.hadoop.tools.HadoopArchives
  CLASSPATH=${CLASSPATH}:${TOOL_PATH}
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "sampler" ] ; then
  CLASS=org.apache.hadoop.mapred.lib.InputSampler
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
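To the question above about which source class each shell command invokes: the excerpt shows the mapping is done in the hadoop launcher script itself, and fsck resolves to org.apache.hadoop.hdfs.tools.DFSck. A stripped-down sketch of that dispatch (class names copied from the excerpt; the final exec line is paraphrased from the 0.20-era script, so treat it as approximate):

```shell
# Minimal reproduction of the dispatch above for two subcommands.
COMMAND=fsck
if [ "$COMMAND" = "fsck" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSck
elif [ "$COMMAND" = "fs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
fi
echo "$CLASS"
# prints org.apache.hadoop.hdfs.tools.DFSck
# The real script then finishes with roughly:
#   exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"
```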

einhep posted on 2013-10-25 10:45:06
# figure out which class to run
if [ "$COMMAND" = "namenode" ] ; then
  CLASS='org.apache.ha ...
若冰 posted on 2009-12-11 15:13

My head is spinning.