1. Benchmarking HDFS with TestDFSIO

Hadoop ships with several benchmark programs, packaged in the test JAR. Among them, TestDFSIO measures the I/O performance of HDFS. Since most hardware failures on a new system are hard-disk failures, running an I/O-intensive benchmark is a good way to burn in the cluster. TestDFSIO uses a MapReduce job as a convenient way to read or write files in parallel: each file is read or written in its own map task, and the output of each map is used to collect statistics on the file it just processed. The statistics are accumulated in the reduce phase to produce a summary. The following command writes 10 files of 1000 MB each:

[root@slave1 hadoop-0.20.2]# hadoop jar hadoop-0.20.2-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000

The benchmark's results are written to the console and also recorded in a local file:

[hadoop@hadoop-namenode hadoop]$ cat TestDFSIO_results.log
----- TestDFSIO ----- : write
Date & time: Tue Jan 18 19:04:37 CST 2011
Number of files: 10
Total MBytes processed: 10000
Throughput mb/sec: 45.45867806164197
Average IO rate mb/sec: 46.181983947753906
IO rate std deviation: 5.620244800553667
Test exec time sec: 94.833

When you have finished benchmarking, you can delete all the generated files from HDFS with the -clean argument:

[root@slave1 hadoop-0.20.2]# hadoop jar hadoop-0.20.2-test.jar TestDFSIO -clean

2. Testing MapReduce with a sort

Hadoop comes with a MapReduce program that does a partial sort of its input. This is very useful for exercising the whole MapReduce system, because the entire input dataset is transferred through the shuffle to the reducers. There are three steps: generate some random data, run the sort, and validate the results; the sort and validation commands are shown after the RandomWriter output below.

First, generate some random data using RandomWriter. It runs a MapReduce job with 10 maps per node, and each map generates approximately 1 GB of random binary data, with keys and values of various sizes. (Here the job ran 20 maps and wrote about 21.5 GB in total, as the counters below show.)

[hadoop@hadoop-namenode hadoop]$ bin/hadoop jar hadoop-0.20.2-examples.jar randomwriter random-data
Running 20 maps.
Job started: Tue Jan 18 19:05:21 CST 2011
11/01/18 19:05:22 INFO mapred.JobClient: Running job: job_201101181725_0009
11/01/18 19:05:23 INFO mapred.JobClient: map 0% reduce 0%
11/01/18 19:06:17 INFO mapred.JobClient: map 5% reduce 0%
11/01/18 19:06:21 INFO mapred.JobClient: map 10% reduce 0%
11/01/18 19:06:23 INFO mapred.JobClient: map 15% reduce 0%
11/01/18 19:06:24 INFO mapred.JobClient: map 20% reduce 0%
11/01/18 19:07:06 INFO mapred.JobClient: map 25% reduce 0%
11/01/18 19:07:09 INFO mapred.JobClient: map 35% reduce 0%
11/01/18 19:07:21 INFO mapred.JobClient: map 40% reduce 0%
11/01/18 19:07:57 INFO mapred.JobClient: map 45% reduce 0%
11/01/18 19:08:00 INFO mapred.JobClient: map 55% reduce 0%
11/01/18 19:08:09 INFO mapred.JobClient: map 60% reduce 0%
11/01/18 19:08:45 INFO mapred.JobClient: map 65% reduce 0%
11/01/18 19:08:51 INFO mapred.JobClient: map 70% reduce 0%
11/01/18 19:08:54 INFO mapred.JobClient: map 80% reduce 0%
11/01/18 19:09:31 INFO mapred.JobClient: map 85% reduce 0%
11/01/18 19:09:40 INFO mapred.JobClient: map 95% reduce 0%
11/01/18 19:09:43 INFO mapred.JobClient: map 100% reduce 0%
11/01/18 19:09:45 INFO mapred.JobClient: Job complete: job_201101181725_0009
11/01/18 19:09:45 INFO mapred.JobClient: Counters: 8
11/01/18 19:09:45 INFO mapred.JobClient:   Job Counters
11/01/18 19:09:45 INFO mapred.JobClient:     Launched map tasks=22
11/01/18 19:09:45 INFO mapred.JobClient:   org.apache.hadoop.examples.RandomWriter$Counters
11/01/18 19:09:45 INFO mapred.JobClient:     BYTES_WRITTEN=21474942228
11/01/18 19:09:45 INFO mapred.JobClient:     RECORDS_WRITTEN=2044390
11/01/18 19:09:45 INFO mapred.JobClient:   FileSystemCounters
11/01/18 19:09:45 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=21545680248
11/01/18 19:09:45 INFO mapred.JobClient:   Map-Reduce Framework
11/01/18 19:09:45 INFO mapred.JobClient:     Map input records=20
11/01/18 19:09:45 INFO mapred.JobClient:     Spilled Records=0
11/01/18 19:09:45 INFO mapred.JobClient:     Map input bytes=0
11/01/18 19:09:45 INFO mapred.JobClient:     Map output records=2044390
Job ended: Tue Jan 18 19:09:45 CST 2011
The job took 263 seconds.
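Next, run the sort. The post stops after the data generation step, so here is a minimal sketch of step two, assuming the same 0.20.2 JAR names used above and an arbitrarily chosen output directory named sorted-data. The sort program in the examples JAR uses identity map and reduce functions and relies on the shuffle to do the actual sorting:

[hadoop@hadoop-namenode hadoop]$ bin/hadoop jar hadoop-0.20.2-examples.jar sort random-data sorted-data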
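Finally, validate that the data in sorted-data is in fact sorted. The test JAR includes a SortValidator program for this; again a sketch assuming the paths above:

[hadoop@hadoop-namenode hadoop]$ bin/hadoop jar hadoop-0.20.2-test.jar testmapredsort -sortInput random-data -sortOutput sorted-data

SortValidator performs a series of checks on the unsorted and sorted data to confirm that the output is sorted and contains the same records as the input, and reports success or failure on the console.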
desehawk replied on 2014-12-29 23:53: This thread should help you: hadoop2 (2.2) cluster benchmarking