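Questions this guide answers:
1. Which methods can be used to configure rack awareness?
2. How is it configured with a Python script?
3. How is it configured with TableMapping (a mapping table file)?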
-- Method 1: using a Python script:
-- The Hadoop cluster consists of the following four machines (one namenode and three datanodes; Hadoop version 2.4.1):
192.168.117.203 funshion-hadoop203 -- namenode
192.168.117.148 funshion-hadoop148 -- datanode /rack1
192.168.117.32 funshion-hadoop32 -- datanode /rack2
192.168.117.62 funshion-hadoop62 -- datanode /rack2
-----------------------------------------------------------------------------------------------------
-- Step 01).
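Create the file /usr/local/hadoop/etc/hadoop/RackAware.py with the following content (one line each for every datanode's IP address and hostname)
-- (Remember to check where your python is installed; mine is /usr/local/bin/python):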
[hadoop@funshion-hadoop203 hadoop]$ vi /usr/local/hadoop/etc/hadoop/RackAware.py
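#!/usr/local/bin/python
# -*- coding: UTF-8 -*-
import sys
# Map each datanode's IP address and hostname to its rack.
rack = {
    "192.168.117.148": "rack1",
    "funshion-hadoop148": "rack1",
    "192.168.117.32": "rack2",
    "funshion-hadoop32": "rack2",
    "192.168.117.62": "rack2",
    "funshion-hadoop62": "rack2",
}
if __name__ == "__main__":
    # Hadoop may pass several hosts in one call (up to net.topology.script.number.args)
    # and expects one rack per argument; unknown hosts fall back to /rack0.
    print(" ".join("/" + rack.get(host, "rack0") for host in sys.argv[1:]))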
-----------------------------------------------------------------------------------------------------
-- Step 02).
On every node (namenode and datanodes), add the following two lines to the .bash_profile of the user that Hadoop is installed under (the hadoop user in my case):
[hadoop@funshion-hadoop203 hadoop]$ vi ~/.bash_profile
export PYTHONPATH=$PYTHONPATH:/usr/local/hadoop/etc/hadoop
export PATH=$PATH:$PYTHONPATH
-- Then source the file so that the environment variable changes take effect:
[hadoop@funshion-hadoop203 hadoop]$ source ~/.bash_profile
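-- Verifying that the environment variables took effect is simple: call the RackAware.py script from any directory, as follows (the script needs execute permission, e.g. chmod +x, to be run this way):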
[hadoop@funshion-hadoop203 hadoop]$ cd ~
[hadoop@funshion-hadoop203 ~]$ RackAware.py funshion-hadoop148
/rack1
-- As shown above, if no error such as "command not found" is returned, the setup is OK.
-----------------------------------------------------------------------------------------------------
-- Step 03). Add the following two properties to core-site.xml (on Hadoop 1.x the property names are topology.script.file.name and topology.script.number.args):
[hadoop@funshion-hadoop203 hadoop]$ vi /usr/local/hadoop/etc/hadoop/core-site.xml
<property>
<name>net.topology.script.file.name</name>
<value>/usr/local/hadoop/etc/hadoop/RackAware.py</value>
</property>
<property>
<name>net.topology.script.number.args</name>
<value>4</value>
</property>
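-- A rough illustration of what net.topology.script.number.args means: Hadoop can hand several hosts to the topology script in a single invocation, at most that many at a time, and expects one rack per host in the output.
-- The sketch below only simulates that batching in plain Python. resolve_batch, resolve_all and the host "unknown-host" are hypothetical names used for illustration; they are not Hadoop APIs.

```python
# Hypothetical simulation of how hosts are batched before the topology
# script is called; this is not Hadoop code.
RACK = {
    "192.168.117.148": "rack1", "funshion-hadoop148": "rack1",
    "192.168.117.32": "rack2", "funshion-hadoop32": "rack2",
    "192.168.117.62": "rack2", "funshion-hadoop62": "rack2",
}


def resolve_batch(hosts):
    """Mimic one script invocation: return one rack path per argument."""
    return ["/" + RACK.get(host, "rack0") for host in hosts]


def resolve_all(hosts, max_args=4):
    """Split hosts into batches of at most max_args and resolve each batch."""
    racks = []
    for start in range(0, len(hosts), max_args):
        racks.extend(resolve_batch(hosts[start:start + max_args]))
    return racks


if __name__ == "__main__":
    hosts = ["funshion-hadoop148", "192.168.117.32", "funshion-hadoop62",
             "192.168.117.148", "unknown-host"]
    for host, rack in zip(hosts, resolve_all(hosts)):
        print(host, rack)
```

-- With max_args=4 the five hosts above are resolved in two batches (four hosts, then one). A script that answers only for its first argument would return too few racks for the first batch.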
-----------------------------------------------------------------------------------------------------
-- Finally, restart the Hadoop cluster; the result should look similar to the following (on Hadoop 1.x, use the hadoop dfsadmin -report command):
[hadoop@funshion-hadoop203 sbin]$ hdfs dfsadmin -report
14/07/22 14:27:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 65778005278720 (59.82 TB)
Present Capacity: 63274784411996 (57.55 TB)
DFS Remaining: 62549150711808 (56.89 TB)
DFS Used: 725633700188 (675.80 GB)
DFS Used%: 1.15%
Under replicated blocks: 1929
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)
Live datanodes:
Name: 192.168.117.62:50010 (funshion-hadoop62)
Hostname: funshion-hadoop62
Rack: /rack2
Decommission Status : Normal
Configured Capacity: 21281119354880 (19.36 TB)
DFS Used: 229177624307 (213.44 GB)
Non DFS Used: 997284997389 (928.79 GB)
DFS Remaining: 20054656733184 (18.24 TB)
DFS Used%: 1.08%
DFS Remaining%: 94.24%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Tue Jul 22 14:27:17 CST 2014
Name: 192.168.117.148:50010 (funshion-hadoop148)
Hostname: funshion-hadoop148
Rack: /rack1
Decommission Status : Normal
Configured Capacity: 23215766568960 (21.11 TB)
DFS Used: 251706528175 (234.42 GB)
Non DFS Used: 790752472657 (736.45 GB)
DFS Remaining: 22173307568128 (20.17 TB)
DFS Used%: 1.08%
DFS Remaining%: 95.51%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Tue Jul 22 14:27:14 CST 2014
Name: 192.168.117.32:50010 (funshion-hadoop32)
Hostname: funshion-hadoop32
Rack: /rack2
Decommission Status : Normal
Configured Capacity: 21281119354880 (19.36 TB)
DFS Used: 244540665856 (227.75 GB)
Non DFS Used: 715183140864 (666.07 GB)
DFS Remaining: 20321395548160 (18.48 TB)
DFS Used%: 1.15%
DFS Remaining%: 95.49%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Tue Jul 22 14:27:17 CST 2014
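-- On a larger cluster, reading the Rack: lines by eye becomes tedious. The following is a minimal sketch, assuming the report keeps the "Hostname:" / "Rack:" line layout shown above, that collects the hostname-to-rack pairs from the report text.
-- racks_from_report and SAMPLE_REPORT are hypothetical names; the sample is a shortened copy of the output above.

```python
# Shortened copy of the dfsadmin report shown above (illustration only).
SAMPLE_REPORT = """\
Name: 192.168.117.62:50010 (funshion-hadoop62)
Hostname: funshion-hadoop62
Rack: /rack2
Decommission Status : Normal
Name: 192.168.117.148:50010 (funshion-hadoop148)
Hostname: funshion-hadoop148
Rack: /rack1
Decommission Status : Normal
Name: 192.168.117.32:50010 (funshion-hadoop32)
Hostname: funshion-hadoop32
Rack: /rack2
"""


def racks_from_report(text):
    """Return {hostname: rack} built from 'Hostname:' / 'Rack:' line pairs."""
    racks = {}
    hostname = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Hostname:"):
            hostname = line.split(":", 1)[1].strip()
        elif line.startswith("Rack:") and hostname is not None:
            racks[hostname] = line.split(":", 1)[1].strip()
            hostname = None
    return racks


if __name__ == "__main__":
    for host, rack in sorted(racks_from_report(SAMPLE_REPORT).items()):
        print(host, rack)
```

-- To check a live cluster, the same function could be given the saved output of hdfs dfsadmin -report instead of SAMPLE_REPORT.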
-- Method 2: using TableMapping (a mapping table file):
-- Step 01).
Add the following two properties to core-site.xml (Hadoop 2.x only; the net.topology.table.file.name parameter appears to have been added in 2.x)
<property>
<name>net.topology.node.switch.mapping.impl</name>
<value>org.apache.hadoop.net.TableMapping</value>
<description>
The implementation of DNSToSwitchMapping to use. The default implementation invokes a script specified in net.topology.script.file.name to resolve node names.
Here it is set to org.apache.hadoop.net.TableMapping, which instead reads the table file specified in net.topology.table.file.name.
</description>
</property>
<property>
<name>net.topology.table.file.name</name>
<value>/usr/local/hadoop/etc/hadoop/topology.csv</value>
<description>
The file name for a topology file, which is used when the net.topology.node.switch.mapping.impl property is set to org.apache.hadoop.net.TableMapping.
The file format is a two column text file, with columns separated by whitespace. The first column is a DNS or IP address and the second column specifies the rack where the address maps.
If no entry corresponding to a host in the cluster is found, then /default-rack is assumed.
</description>
</property>
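-- The file format described in the property above (two whitespace-separated columns, with /default-rack for hosts that have no entry) can be mimicked with the sketch below.
-- It only illustrates the lookup rule and is not the TableMapping implementation. load_table, lookup and SAMPLE_TABLE are hypothetical names; the sample rows are the ones used for topology.csv in Step 02, and skipping lines that do not have exactly two columns is this sketch's own choice.

```python
DEFAULT_RACK = "/default-rack"

# Same rows as the topology.csv file created in Step 02.
SAMPLE_TABLE = """\
funshion-hadoop148 /rack11
192.168.117.148 /rack11
funshion-hadoop32 /rack12
192.168.117.32 /rack12
funshion-hadoop62 /rack13
192.168.117.62 /rack13
"""


def load_table(text):
    """Parse 'host rack' lines; lines without exactly two columns are skipped."""
    table = {}
    for line in text.splitlines():
        columns = line.split()
        if len(columns) == 2:
            table[columns[0]] = columns[1]
    return table


def lookup(table, host):
    """Return the rack for host, or the default rack if it has no entry."""
    return table.get(host, DEFAULT_RACK)


if __name__ == "__main__":
    table = load_table(SAMPLE_TABLE)
    for host in ["funshion-hadoop148", "192.168.117.62", "funshion-hadoop203"]:
        print(host, lookup(table, host))
```

-- funshion-hadoop203 (the namenode) has no row in the table, so the lookup falls back to /default-rack.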
-----------------------------------------------------------------------------------------------------
-- Step 02).
Create the file /usr/local/hadoop/etc/hadoop/topology.csv with the following content (one line each for every datanode's IP address and hostname)
[hadoop@funshion-hadoop203 hadoop]$ vi /usr/local/hadoop/etc/hadoop/topology.csv
funshion-hadoop148 /rack11
192.168.117.148 /rack11
funshion-hadoop32 /rack12
192.168.117.32 /rack12
funshion-hadoop62 /rack13
192.168.117.62 /rack13
-- Save the file, copy it to every node, then run the following command to verify that the configuration works:
[hadoop@funshion-hadoop203 sbin]$ hdfs dfsadmin -report
14/07/23 11:48:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 65778005278720 (59.82 TB)
Present Capacity: 63267951845098 (57.54 TB)
DFS Remaining: 62507149918208 (56.85 TB)
DFS Used: 760801926890 (708.55 GB)
DFS Used%: 1.20%
Under replicated blocks: 91
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)
Live datanodes:
Name: 192.168.117.62:50010 (funshion-hadoop62)
Hostname: funshion-hadoop62
Rack: /rack13
Decommission Status : Normal
Configured Capacity: 21281119354880 (19.36 TB)
DFS Used: 129386176628 (120.50 GB)
Non DFS Used: 1003920744332 (934.97 GB)
DFS Remaining: 20147812433920 (18.32 TB)
DFS Used%: 0.61%
DFS Remaining%: 94.67%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Wed Jul 23 11:48:50 CST 2014
Name: 192.168.117.148:50010 (funshion-hadoop148)
Hostname: funshion-hadoop148
Rack: /rack11
Decommission Status : Normal
Configured Capacity: 23215766568960 (21.11 TB)
DFS Used: 380012317309 (353.91 GB)
Non DFS Used: 790889186691 (736.57 GB)
DFS Remaining: 22044865064960 (20.05 TB)
DFS Used%: 1.64%
DFS Remaining%: 94.96%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Wed Jul 23 11:48:50 CST 2014
Name: 192.168.117.32:50010 (funshion-hadoop32)
Hostname: funshion-hadoop32
Rack: /rack12
Decommission Status : Normal
Configured Capacity: 21281119354880 (19.36 TB)
DFS Used: 251403432953 (234.14 GB)
Non DFS Used: 715243502599 (666.12 GB)
DFS Remaining: 20314472419328 (18.48 TB)
DFS Used%: 1.18%
DFS Remaining%: 95.46%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Wed Jul 23 11:48:50 CST 2014