YARN ResourceManager exits abnormally

Views: 15618 | Replies: 6 | Favorites: 0
Posted: 2018-7-23 13:42

Summary:

[Problem description]: The YARN ResourceManager exits abnormally. No jobs are running at all; everything has stopped. Even after a restart (whether of the whole Hadoop stack or just YARN on its own), the ResourceManager on one of the nodes still exits abnormally, and a yellow GC-duration warning appears from time to time. ...

Replies

wwwyibin518 posted on 2018-7-23 19:19:04
Quoting 634320089, 2018-7-23 17:22:
50 MB definitely can't carry it. Make it 1 GB and it should be fine. There is no hard rule here; size it according to your cluster.

Ha, problem solved. You hit the nail on the head. Thanks for the help.

"Java Heap Size of ResourceManager"
I still need a deeper understanding of what this parameter does, how it affects the YARN ResourceManager, and how YARN actually works.
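
(For readers who hit the same thing outside CDH: a minimal sketch of the equivalent knob in a plain Apache Hadoop 2.x install, set in yarn-env.sh. The 1024 MB value is illustrative; the thread's only guidance is "roughly 1 GB, sized to your cluster".)

    # yarn-env.sh -- heap for the ResourceManager daemon, in MB (illustrative value)
    export YARN_RESOURCEMANAGER_HEAPSIZE=1024
    # or pass the JVM flags directly to the RM process:
    export YARN_RESOURCEMANAGER_OPTS="-Xms1g -Xmx1g"

Restart the ResourceManager afterwards for the new heap to take effect.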
634320089 posted on 2018-7-23 17:22:54

50 MB definitely can't carry it. Make it 1 GB and it should be fine. There is no hard rule here; size it according to your cluster.
634320089 posted on 2018-7-23 17:00:10
Quoting wwwyibin518, 2018-7-23 16:33:
The cluster has 4 nodes.
Each node has 16 CPU cores and 16 GB of RAM.

If you are on CDH, search the configuration page for "Java Heap Size of ResourceManager". Everything you posted is a parameter for how the NodeManagers run jobs; none of it has anything to do with this error.
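
(To confirm it really is the RM heap before changing anything, checking the ResourceManager log is usually enough. The path below is only the usual location on a CM-managed cluster, an assumption; adjust to your install.)

    # look for heap exhaustion in the ResourceManager log (path assumed; adjust as needed)
    grep -iE "OutOfMemoryError|GC overhead limit" /var/log/hadoop-yarn/*RESOURCEMANAGER*.log*

A java.lang.OutOfMemoryError: Java heap space (or "GC overhead limit exceeded") here lines up with the yellow GC warnings described in the opening post.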
wwwyibin518 posted on 2018-7-23 16:33:35
Last edited by wwwyibin518 on 2018-7-23 16:35
Quoting 634320089, 2018-7-23 16:20:
From this error, the ResourceManager runs out of memory as soon as it starts. How much memory is allocated to the ResourceManager? Can you post a screenshot of the configuration?

The cluster has 4 nodes.
Each node has 16 CPU cores and 16 GB of RAM.

These are all the YARN memory-related parameters I know of. I'm new to this, and even after tuning them to the values below the situation is unchanged.
yarn.scheduler.minimum-allocation-mb 1G
yarn.scheduler.maximum-allocation-mb 8G
yarn.nodemanager.resource.memory-mb 8G
yarn.scheduler.increment-allocation-mb 512M
mapreduce.map.memory.mb 1G
mapreduce.reduce.memory.mb 2G
mapreduce.map.java.opts 768M
mapreduce.reduce.java.opts 1536M
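
(Side note on how the task-level numbers above relate, using the mapreduce.job.heap.memory-mb.ratio of 0.8 visible in the mapred-site.xml below: the java.opts heap must fit inside the container size, with headroom left for non-heap JVM memory.)

    mapreduce.map.memory.mb = 1024 MB    # container size that YARN enforces
    mapreduce.map.java.opts = -Xmx768m   # JVM heap inside that container
    headroom = 1024 - 768   = 256 MB     # metaspace, thread stacks, native buffers

None of these settings size the ResourceManager daemon itself, which is exactly the point made in the reply above.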


Contents of yarn-site.xml:
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>yarn.acl.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.admin.acl</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>qoe02:2181,qoe04:2181,qoe03:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.client.failover-sleep-base-ms</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.client.failover-sleep-max-ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarnRM</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm76</name>
    <value>qoe01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm76</name>
    <value>qoe01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm76</name>
    <value>qoe01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm76</name>
    <value>qoe01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm76</name>
    <value>qoe01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm76</name>
    <value>qoe01:8090</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm107</name>
    <value>qoe02:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm107</name>
    <value>qoe02:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm107</name>
    <value>qoe02:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm107</name>
    <value>qoe02:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm107</name>
    <value>qoe02:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm107</name>
    <value>qoe02:8090</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm76,rm107</value>
  </property>
  <property>
    <name>yarn.resourcemanager.client.thread-count</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.client.thread-count</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.increment-allocation-mb</name>
    <value>512</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>16</value>
  </property>
  <property>
    <name>yarn.scheduler.increment-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>16</value>
  </property>
  <property>
    <name>yarn.resourcemanager.amliveliness-monitor.interval-ms</name>
    <value>1000</value>
  </property>
  <property>
    <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
    <value>600000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.am.max-attempts</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.container.liveness-monitor.interval-ms</name>
    <value>600000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
    <value>1000</value>
  </property>
  <property>
    <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
    <value>600000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.client.thread-count</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_CLIENT_CONF_DIR,$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.scheduler.fair.user-as-default-queue</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.scheduler.fair.preemption</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.scheduler.fair.sizebasedweight</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.scheduler.fair.assignmultiple</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.max-completed-applications</name>
    <value>10000</value>
  </property>
</configuration>

Contents of mapred-site.xml:
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>mapreduce.job.split.metainfo.maxsize</name>
    <value>10000000</value>
  </property>
  <property>
    <name>mapreduce.job.counters.max</name>
    <value>120</value>
  </property>
  <property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.output.fileoutputformat.compress.type</name>
    <value>BLOCK</value>
  </property>
  <property>
    <name>mapreduce.output.fileoutputformat.compress.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>zlib.compress.level</name>
    <value>DEFAULT_COMPRESSION</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>64</value>
  </property>
  <property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.8</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>10</value>
  </property>
  <property>
    <name>mapreduce.task.timeout</name>
    <value>600000</value>
  </property>
  <property>
    <name>mapreduce.client.submit.file.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.job.reduces</name>
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>256</value>
  </property>
  <property>
    <name>mapreduce.map.speculative</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.reduce.speculative</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.8</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>qoe01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>qoe01:19888</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.https.address</name>
    <value>qoe01:19890</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.admin.address</name>
    <value>qoe01:10033</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
  <property>
    <name>mapreduce.am.max-attempts</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.cpu-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Djava.net.preferIPv4Stack=true -Xmx805306368</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Djava.net.preferIPv4Stack=true -Xmx1610612736</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.admin.user.env</name>
    <value>LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>mapreduce.job.heap.memory-mb.ratio</name>
    <value>0.8</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$MR2_CLASSPATH</value>
  </property>
  <property>
    <name>mapreduce.admin.user.env</name>
    <value>LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH</value>
  </property>
  <property>
    <name>mapreduce.shuffle.max.connections</name>
    <value>80</value>
  </property>
</configuration>
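
(A quick decode of the byte-valued -Xmx flags Cloudera Manager generated above, with 1 MiB = 1048576 bytes:)

    -Xmx825955249   ~  788 MiB   # yarn.app.mapreduce.am.command-opts
    -Xmx805306368   =  768 MiB   # mapreduce.map.java.opts, i.e. the 768M listed earlier
    -Xmx1610612736  = 1536 MiB   # mapreduce.reduce.java.opts, i.e. 1536M

All of these bound the per-task and AM JVMs launched inside containers, not the ResourceManager daemon.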



634320089 posted on 2018-7-23 16:20:00
From this error, the ResourceManager runs out of memory as soon as it starts. How much memory is allocated to the ResourceManager? Can you post a screenshot of the configuration?