1. The distributed cluster has two slave hosts. Each NodeManager node is configured with 2 GB of memory and 1 CPU core.
2. yarn-site.xml is configured as follows:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>jackielee1.hadoop.com</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- ###########################NodeManager Resource############################## -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>1</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>640800</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>4096</value>
</property>
</configuration>
3. mapred-site.xml is configured as follows:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>jackielee3.hadoop.com:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>jackielee3.hadoop.com:19888</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>1024</value>
</property>
</configuration>
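4. I checked that the relevant processes on every node started normally, but when I run the jar program, the job stays at the "Running job" stage and never makes progress.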
16/12/05 14:52:51 INFO client.RMProxy: Connecting to ResourceManager at jackielee1.hadoop.com/192.168.20.5:8032
16/12/05 14:52:52 INFO input.FileInputFormat: Total input paths to process : 2
16/12/05 14:52:52 INFO mapreduce.JobSubmitter: number of splits:2
16/12/05 14:52:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1480917988140_0004
16/12/05 14:52:53 INFO impl.YarnClientImpl: Submitted application application_1480917988140_0004
16/12/05 14:52:53 INFO mapreduce.Job: The url to track the job: http://jackielee1.hadoop.com:808 ... 1480917988140_0004/
16/12/05 14:52:53 INFO mapreduce.Job: Running job: job_1480917988140_0004
5. The log file yarn-jackie-resourcemanager-jackielee1.hadoop.com.log contains the following warnings:
2016-12-05 14:56:46,636 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Node : jackielee2.hadoop.com:36934 does not have sufficient resource for request : {Priority: 0, Capability: <memory:2048, vCores:1>, # Containers: 1, Location: *, Relax Locality: true} node total capability : <memory:1024, vCores:1>
2016-12-05 14:56:47,182 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Node : jackielee3.hadoop.com:49362 does not have sufficient resource for request : {Priority: 0, Capability: <memory:2048, vCores:1>, # Containers: 1, Location: *, Relax Locality: true} node total capability : <memory:1024, vCores:1>
2016-12-05 14:56:47,638 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Node : jackielee2.hadoop.com:36934 does not have sufficient resource for request : {Priority: 0, Capability: <memory:2048, vCores:1>, # Containers: 1, Location: *, Relax Locality: true} node total capability : <memory:1024, vCores:1>
2016-12-05 14:56:48,184 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Node : jackielee3.hadoop.com:49362 does not have sufficient resource for request : {Priority: 0, Capability: <memory:2048, vCores:1>, # Containers: 1, Location: *, Relax Locality: true} node total capability : <memory:1024, vCores:1>
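Judging from the warnings, the cause appears to be insufficient node resources, but the problem persists even after I raised the memory from 1 GB to 2 GB. Could someone please help me solve this? Thanks!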
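One possible reading of the warnings, offered as a sketch rather than a confirmed diagnosis: the log reports node total capability <memory:1024, vCores:1>, which matches yarn.nodemanager.resource.memory-mb = 1024 in the yarn-site.xml above, so YARN still offers only 1 GB per node regardless of how much physical memory the host has. The 2048 MB request is probably the MapReduce ApplicationMaster container. Assuming yarn.app.mapreduce.am.resource.mb is left at its default of 1536 MB, the scheduler rounds it up to the next multiple of yarn.scheduler.minimum-allocation-mb (1024), which gives 2048. Under that assumption, a change along these lines in yarn-site.xml on each NodeManager might let the request fit. The value 2048 is illustrative and assumes the host can really spare that much memory for containers:

```xml
<configuration>
<!-- Assumption: the 2048 MB request is the MapReduce ApplicationMaster container. -->
<!-- Raise the memory YARN may hand out on this node so a 2048 MB container fits. -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
</configuration>
```

The NodeManagers would need to be restarted to pick up the new value. An alternative under the same assumption is to set yarn.app.mapreduce.am.resource.mb to 1024 in mapred-site.xml so the ApplicationMaster fits on a 1024 MB node; in that case each node can hold only one 1024 MB container at a time.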