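wwwyibin518 posted on 2018-7-23 13:42:07

YARN ResourceManager exits abnormally

[Description]: The YARN ResourceManager exits abnormally. No jobs run at all; everything has stopped. Even after a restart (whether of the whole Hadoop cluster or of YARN alone), the ResourceManager on one of the nodes still exits abnormally, and occasionally I see a yellow warning about sustained GC. Probably an out-of-memory problem? I have no idea where to look for the cause. I have already adjusted YARN's memory settings.
This has been bothering me for a week. Thanks in advance, everyone.
I'm using CDH 5.7.2.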


[stdout.log output]:

2018年 07月 20日 星期五 14:20:06 CST
JAVA_HOME=/usr/java/jdk1.8.0_131
using /usr/java/jdk1.8.0_131 as JAVA_HOME
using 5 as CDH_VERSION
using /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn as CDH_YARN_HOME
using /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-mapreduce as CDH_MR2_HOME
using /opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER as CONF_DIR
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="/opt/cloudera-manager/cm-5.7.2/lib64/cmf/service/common/killparent.sh"
#   Executing /bin/sh -c "/opt/cloudera-manager/cm-5.7.2/lib64/cmf/service/common/killparent.sh"...




[stderr.log output]:
2018年 07月 20日 星期五 14:20:06 CST
+ source_parcel_environment
+ '[' '!' -z /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/meta/cdh_env.sh ']'
+ OLD_IFS='   
'
+ IFS=:
+ SCRIPT_ARRAY=($SCM_DEFINES_SCRIPTS)
+ DIRNAME_ARRAY=($PARCEL_DIRNAMES)
+ IFS='   
'
+ COUNT=1
++ seq 1 1
+ for i in '`seq 1 $COUNT`'
+ SCRIPT=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/meta/cdh_env.sh
+ PARCEL_DIRNAME=CDH-5.7.2-1.cdh5.7.2.p0.18
+ . /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/meta/cdh_env.sh
++ CDH_DIRNAME=CDH-5.7.2-1.cdh5.7.2.p0.18
++ export CDH_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ CDH_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ export CDH_MR1_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-0.20-mapreduce
++ CDH_MR1_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-0.20-mapreduce
++ export CDH_HDFS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-hdfs
++ CDH_HDFS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-hdfs
++ export CDH_HTTPFS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-httpfs
++ CDH_HTTPFS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-httpfs
++ export CDH_MR2_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-mapreduce
++ CDH_MR2_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-mapreduce
++ export CDH_YARN_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn
++ CDH_YARN_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn
++ export CDH_HBASE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hbase
++ CDH_HBASE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hbase
++ export CDH_ZOOKEEPER_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/zookeeper
++ CDH_ZOOKEEPER_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/zookeeper
++ export CDH_HIVE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hive
++ CDH_HIVE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hive
++ export CDH_HUE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hue
++ CDH_HUE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hue
++ export CDH_OOZIE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/oozie
++ CDH_OOZIE_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/oozie
++ export CDH_HUE_PLUGINS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ CDH_HUE_PLUGINS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ export CDH_FLUME_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/flume-ng
++ CDH_FLUME_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/flume-ng
++ export CDH_PIG_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/pig
++ CDH_PIG_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/pig
++ export CDH_HCAT_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hive-hcatalog
++ CDH_HCAT_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hive-hcatalog
++ export CDH_SQOOP2_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/sqoop2
++ CDH_SQOOP2_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/sqoop2
++ export CDH_LLAMA_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/llama
++ CDH_LLAMA_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/llama
++ export CDH_SENTRY_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/sentry
++ CDH_SENTRY_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/sentry
++ export TOMCAT_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-tomcat
++ TOMCAT_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-tomcat
++ export JSVC_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-utils
++ JSVC_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-utils
++ export CDH_HADOOP_BIN=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop/bin/hadoop
++ CDH_HADOOP_BIN=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop/bin/hadoop
++ export CDH_IMPALA_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/impala
++ CDH_IMPALA_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/impala
++ export CDH_SOLR_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/solr
++ CDH_SOLR_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/solr
++ export CDH_HBASE_INDEXER_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hbase-solr
++ CDH_HBASE_INDEXER_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hbase-solr
++ export SEARCH_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/search
++ SEARCH_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/search
++ export CDH_SPARK_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/spark
++ CDH_SPARK_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/spark
++ export WEBHCAT_DEFAULT_XML=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/etc/hive-webhcat/conf.dist/webhcat-default.xml
++ WEBHCAT_DEFAULT_XML=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/etc/hive-webhcat/conf.dist/webhcat-default.xml
++ export CDH_KMS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-kms
++ CDH_KMS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-kms
++ export CDH_PARQUET_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/parquet
++ CDH_PARQUET_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/parquet
++ export CDH_AVRO_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/avro
++ CDH_AVRO_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/avro
+ locate_cdh_java_home
+ '[' -z '' ']'
+ '[' -z /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-utils ']'
+ local BIGTOP_DETECT_JAVAHOME=
+ for candidate in '"${JSVC_HOME}"' '"${JSVC_HOME}/.."' '"/usr/lib/bigtop-utils"' '"/usr/libexec"'
+ '[' -e /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-utils/bigtop-detect-javahome ']'
+ BIGTOP_DETECT_JAVAHOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-utils/bigtop-detect-javahome
+ break
+ '[' -z /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-utils/bigtop-detect-javahome ']'
+ . /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/bigtop-utils/bigtop-detect-javahome
++ BIGTOP_DEFAULTS_DIR=/etc/default
++ '[' -n /etc/default -a -r /etc/default/bigtop-utils ']'
++ JAVA6_HOME_CANDIDATES=('/usr/lib/j2sdk1.6-sun' '/usr/lib/jvm/java-6-sun' '/usr/lib/jvm/java-1.6.0-sun-1.6.0' '/usr/lib/jvm/j2sdk1.6-oracle' '/usr/lib/jvm/j2sdk1.6-oracle/jre' '/usr/java/jdk1.6' '/usr/java/jre1.6')
++ OPENJAVA6_HOME_CANDIDATES=('/usr/lib/jvm/java-1.6.0-openjdk' '/usr/lib/jvm/jre-1.6.0-openjdk')
++ JAVA7_HOME_CANDIDATES=('/usr/java/jdk1.7' '/usr/java/jre1.7' '/usr/lib/jvm/j2sdk1.7-oracle' '/usr/lib/jvm/j2sdk1.7-oracle/jre' '/usr/lib/jvm/java-7-oracle')
++ OPENJAVA7_HOME_CANDIDATES=('/usr/lib/jvm/java-1.7.0-openjdk' '/usr/lib/jvm/java-7-openjdk')
++ JAVA8_HOME_CANDIDATES=('/usr/java/jdk1.8' '/usr/java/jre1.8' '/usr/lib/jvm/j2sdk1.8-oracle' '/usr/lib/jvm/j2sdk1.8-oracle/jre' '/usr/lib/jvm/java-8-oracle')
++ OPENJAVA8_HOME_CANDIDATES=('/usr/lib/jvm/java-1.8.0-openjdk' '/usr/lib/jvm/java-8-openjdk')
++ MISCJAVA_HOME_CANDIDATES=('/Library/Java/Home' '/usr/java/default' '/usr/lib/jvm/default-java' '/usr/lib/jvm/java-openjdk' '/usr/lib/jvm/jre-openjdk')
++ case ${BIGTOP_JAVA_MAJOR} in
++ JAVA_HOME_CANDIDATES=(${JAVA7_HOME_CANDIDATES[@]} ${JAVA8_HOME_CANDIDATES[@]} ${MISCJAVA_HOME_CANDIDATES[@]} ${OPENJAVA7_HOME_CANDIDATES[@]} ${OPENJAVA8_HOME_CANDIDATES[@]})
++ '[' -z '' ']'
++ for candidate_regex in '${JAVA_HOME_CANDIDATES[@]}'
+++ ls -rvd '/usr/java/jdk1.7*'
++ for candidate_regex in '${JAVA_HOME_CANDIDATES[@]}'
+++ ls -rvd '/usr/java/jre1.7*'
++ for candidate_regex in '${JAVA_HOME_CANDIDATES[@]}'
+++ ls -rvd '/usr/lib/jvm/j2sdk1.7-oracle*'
++ for candidate_regex in '${JAVA_HOME_CANDIDATES[@]}'
+++ ls -rvd '/usr/lib/jvm/j2sdk1.7-oracle/jre*'
++ for candidate_regex in '${JAVA_HOME_CANDIDATES[@]}'
+++ ls -rvd '/usr/lib/jvm/java-7-oracle*'
++ for candidate_regex in '${JAVA_HOME_CANDIDATES[@]}'
+++ ls -rvd /usr/java/jdk1.8.0_131
++ for candidate in '`ls -rvd ${candidate_regex}* 2>/dev/null`'
++ '[' -e /usr/java/jdk1.8.0_131/bin/java ']'
++ export JAVA_HOME=/usr/java/jdk1.8.0_131
++ JAVA_HOME=/usr/java/jdk1.8.0_131
++ break 2
+ verify_java_home
+ '[' -z /usr/java/jdk1.8.0_131 ']'
+ echo JAVA_HOME=/usr/java/jdk1.8.0_131
+ . /opt/cloudera-manager/cm-5.7.2/lib64/cmf/service/common/cdh-default-hadoop
++ [[ -z 5 ]]
++ '[' 5 = 3 ']'
++ '[' 5 = -3 ']'
++ '[' 5 -ge 4 ']'
++ export HADOOP_HOME_WARN_SUPPRESS=true
++ HADOOP_HOME_WARN_SUPPRESS=true
++ export HADOOP_PREFIX=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ HADOOP_PREFIX=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ export HADOOP_LIBEXEC_DIR=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop/libexec
++ HADOOP_LIBEXEC_DIR=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop/libexec
++ export HADOOP_CONF_DIR=/opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER
++ HADOOP_CONF_DIR=/opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER
++ export HADOOP_COMMON_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ HADOOP_COMMON_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop
++ export HADOOP_HDFS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-hdfs
++ HADOOP_HDFS_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-hdfs
++ export HADOOP_MAPRED_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-mapreduce
++ HADOOP_MAPRED_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-mapreduce
++ '[' 5 = 4 ']'
++ '[' 5 = 5 ']'
++ export HADOOP_YARN_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn
++ HADOOP_YARN_HOME=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn
+ echo 'using /usr/java/jdk1.8.0_131 as JAVA_HOME'
+ echo 'using 5 as CDH_VERSION'
+ echo 'using /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn as CDH_YARN_HOME'
+ echo 'using /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-mapreduce as CDH_MR2_HOME'
+ echo 'using /opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER as CONF_DIR'
+ export YARN_CONF_DIR=/opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER
+ YARN_CONF_DIR=/opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER
+ YARN=/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn/bin/yarn
+ set_hadoop_classpath
+ set_classpath_in_var HADOOP_CLASSPATH
+ '[' -z HADOOP_CLASSPATH ']'
+ [[ -n /opt/cloudera-manager/cm-5.7.2/share/cmf ]]
++ find /opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins -maxdepth 1 -name '*.jar'
++ tr '\n' :
+ ADD_TO_CP=/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/tt-instrumentation-5.7.2.jar:/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/event-publish-5.7.2-shaded.jar:
+ [[ -n '' ]]
+ eval 'OLD_VALUE=$HADOOP_CLASSPATH'
++ OLD_VALUE=
+ NEW_VALUE=/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/tt-instrumentation-5.7.2.jar:/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/event-publish-5.7.2-shaded.jar:
+ export HADOOP_CLASSPATH=/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/tt-instrumentation-5.7.2.jar:/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/event-publish-5.7.2-shaded.jar
+ HADOOP_CLASSPATH=/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/tt-instrumentation-5.7.2.jar:/opt/cloudera-manager/cm-5.7.2/share/cmf/lib/plugins/event-publish-5.7.2-shaded.jar
+ set -x
+ replace_conf_dir
+ find /opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER -type f '!' -path '/opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER/logs/*' '!' -name '*.log' '!' -name '*.keytab' '!' -name '*jceks' -exec perl -pi -e 's#{{CMF_CONF_DIR}}#/opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER#g' '{}' ';'
Can't open /opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER/supervisor.conf: 权限不够.
+ perl -pi -e 's#{{CGROUP_GROUP_CPU}}##g' /opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER/yarn-site.xml
++ replace_pid
++ echo
++ sed 's#{{PID}}#15589#g'
+ export YARN_NODEMANAGER_OPTS=
+ YARN_NODEMANAGER_OPTS=
++ replace_pid -Xms52428800 -Xmx52428800 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dhadoop.event.appender=,EventCatcher -XX:OnOutOfMemoryError=/opt/cloudera-manager/cm-5.7.2/lib64/cmf/service/common/killparent.sh
++ sed 's#{{PID}}#15589#g'
++ echo -Xms52428800 -Xmx52428800 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dhadoop.event.appender=,EventCatcher -XX:OnOutOfMemoryError=/opt/cloudera-manager/cm-5.7.2/lib64/cmf/service/common/killparent.sh
+ export 'YARN_RESOURCEMANAGER_OPTS=-Xms52428800 -Xmx52428800 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dhadoop.event.appender=,EventCatcher -XX:OnOutOfMemoryError=/opt/cloudera-manager/cm-5.7.2/lib64/cmf/service/common/killparent.sh'
+ YARN_RESOURCEMANAGER_OPTS='-Xms52428800 -Xmx52428800 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dhadoop.event.appender=,EventCatcher -XX:OnOutOfMemoryError=/opt/cloudera-manager/cm-5.7.2/lib64/cmf/service/common/killparent.sh'
++ replace_pid
++ echo
++ sed 's#{{PID}}#15589#g'
+ export HADOOP_JOB_HISTORYSERVER_OPTS=
+ HADOOP_JOB_HISTORYSERVER_OPTS=
+ make_scripts_executable
+ find /opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent/process/871-yarn-RESOURCEMANAGER -regex '.*\.\(py\|sh\)$' -exec chmod u+x '{}' ';'
+ export 'YARN_OPTS=-Djava.net.preferIPv4Stack=true '
+ YARN_OPTS='-Djava.net.preferIPv4Stack=true '
+ acquire_kerberos_tgt yarn.keytab
+ '[' -z yarn.keytab ']'
+ '[' -n '' ']'
+ '[' resourcemanager = RefreshQueuesAndNodes ']'
+ '[' resourcemanager = historyserver ']'
+ '[' resourcemanager = application-diagnostic-data ']'
+ '[' resourcemanager = install-mr-framework ']'
+ '[' resourcemanager = nodemanager ']'
+ exec /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/lib/hadoop-yarn/bin/yarn resourcemanager
七月 20, 2018 2:20:29 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:30 下午 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
信息: Registering org.apache.hadoop.yarn.server.resourcemanager.webapp.JAXBContextResolver as a provider class
七月 20, 2018 2:20:30 下午 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
信息: Registering org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices as a root resource class
七月 20, 2018 2:20:30 下午 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
信息: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
七月 20, 2018 2:20:30 下午 com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
信息: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
七月 20, 2018 2:20:30 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:30 下午 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
信息: Binding org.apache.hadoop.yarn.server.resourcemanager.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
七月 20, 2018 2:20:32 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:33 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:35 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:35 下午 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
信息: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
七月 20, 2018 2:20:36 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:37 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:39 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:41 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:41 下午 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
警告: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
七月 20, 2018 2:20:41 下午 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
信息: Binding org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices to GuiceManagedComponentProvider with the scope "Singleton"
+ grep -q OnOutOfMemoryError /proc/15589/cmdline
+ RET=0
+ '[' 0 -eq 0 ']'
+ TARGET=15589
++ date
+ echo $'2018\345\271\264' $'07\346\234\210' $'20\346\227\245' $'\346\230\237\346\234\237\344\272\224' 14:20:48 CST
+ kill -9 15589













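634320089 posted on 2018-7-23 16:20:00

From this error, it looks like the ResourceManager runs out of memory right at startup. How much memory is allocated to the ResourceManager? Could you post a screenshot of the configuration?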

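wwwyibin518 posted on 2018-7-23 16:33:35

Last edited by wwwyibin518 on 2018-7-23 16:35

634320089 posted on 2018-7-23 16:20
From this error, it looks like the ResourceManager runs out of memory right at startup. How much memory is allocated to the ResourceManager? Could you post a screenshot of the configuration?
The cluster has 4 nodes in total.
Each node has 16 CPU cores and 16 GB of RAM.

These are the YARN memory-related parameters I know of. I'm new to this, and even after adjusting them to the values below, the problem persists.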
yarn.scheduler.minimum-allocation-mb 1G
yarn.scheduler.maximum-allocation-mb 8G
yarn.nodemanager.resource.memory-mb 8G
yarn.scheduler.increment-allocation-mb 512M
mapreduce.map.memory.mb 1G
mapreduce.reduce.memory.mb 2G
mapreduce.map.java.opts 768M
mapreduce.reduce.java.opts 1536M


[yarn-site.xml] contents
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
    <name>yarn.acl.enable</name>
    <value>true</value>
</property>
<property>
    <name>yarn.admin.acl</name>
    <value>*</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>qoe02:2181,qoe04:2181,qoe03:2181</value>
</property>
<property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
    <name>yarn.client.failover-sleep-base-ms</name>
    <value>100</value>
</property>
<property>
    <name>yarn.client.failover-sleep-max-ms</name>
    <value>2000</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarnRM</value>
</property>
<property>
    <name>yarn.resourcemanager.address.rm76</name>
    <value>qoe01:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address.rm76</name>
    <value>qoe01:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address.rm76</name>
    <value>qoe01:8031</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address.rm76</name>
    <value>qoe01:8033</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm76</name>
    <value>qoe01:8088</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.https.address.rm76</name>
    <value>qoe01:8090</value>
</property>
<property>
    <name>yarn.resourcemanager.address.rm107</name>
    <value>qoe02:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address.rm107</name>
    <value>qoe02:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address.rm107</name>
    <value>qoe02:8031</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address.rm107</name>
    <value>qoe02:8033</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm107</name>
    <value>qoe02:8088</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.https.address.rm107</name>
    <value>qoe02:8090</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm76,rm107</value>
</property>
<property>
    <name>yarn.resourcemanager.client.thread-count</name>
    <value>50</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>50</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.client.thread-count</name>
    <value>1</value>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>
<property>
    <name>yarn.scheduler.increment-allocation-mb</name>
    <value>512</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>16</value>
</property>
<property>
    <name>yarn.scheduler.increment-allocation-vcores</name>
    <value>1</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>16</value>
</property>
<property>
    <name>yarn.resourcemanager.amliveliness-monitor.interval-ms</name>
    <value>1000</value>
</property>
<property>
    <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
    <value>600000</value>
</property>
<property>
    <name>yarn.resourcemanager.am.max-attempts</name>
    <value>2</value>
</property>
<property>
    <name>yarn.resourcemanager.container.liveness-monitor.interval-ms</name>
    <value>600000</value>
</property>
<property>
    <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
    <value>1000</value>
</property>
<property>
    <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
    <value>600000</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.client.thread-count</name>
    <value>50</value>
</property>
<property>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_CLIENT_CONF_DIR,$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
    <name>yarn.scheduler.fair.user-as-default-queue</name>
    <value>true</value>
</property>
<property>
    <name>yarn.scheduler.fair.preemption</name>
    <value>false</value>
</property>
<property>
    <name>yarn.scheduler.fair.sizebasedweight</name>
    <value>false</value>
</property>
<property>
    <name>yarn.scheduler.fair.assignmultiple</name>
    <value>false</value>
</property>
<property>
    <name>yarn.resourcemanager.max-completed-applications</name>
    <value>10000</value>
</property>
</configuration>





[mapred-site.xml] contents
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
    <name>mapreduce.job.split.metainfo.maxsize</name>
    <value>10000000</value>
</property>
<property>
    <name>mapreduce.job.counters.max</name>
    <value>120</value>
</property>
<property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>false</value>
</property>
<property>
    <name>mapreduce.output.fileoutputformat.compress.type</name>
    <value>BLOCK</value>
</property>
<property>
    <name>mapreduce.output.fileoutputformat.compress.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>
<property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
</property>
<property>
    <name>zlib.compress.level</name>
    <value>DEFAULT_COMPRESSION</value>
</property>
<property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>64</value>
</property>
<property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.8</value>
</property>
<property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>10</value>
</property>
<property>
    <name>mapreduce.task.timeout</name>
    <value>600000</value>
</property>
<property>
    <name>mapreduce.client.submit.file.replication</name>
    <value>2</value>
</property>
<property>
    <name>mapreduce.job.reduces</name>
    <value>2</value>
</property>
<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>256</value>
</property>
<property>
    <name>mapreduce.map.speculative</name>
    <value>false</value>
</property>
<property>
    <name>mapreduce.reduce.speculative</name>
    <value>false</value>
</property>
<property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.8</value>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>qoe01:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>qoe01:19888</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.https.address</name>
    <value>qoe01:19890</value>
</property>
<property>
    <name>mapreduce.jobhistory.admin.address</name>
    <value>qoe01:10033</value>
</property>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
</property>
<property>
    <name>mapreduce.am.max-attempts</name>
    <value>2</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.resource.cpu-vcores</name>
    <value>1</value>
</property>
<property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>false</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
</property>
<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Djava.net.preferIPv4Stack=true -Xmx805306368</value>
</property>
<property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Djava.net.preferIPv4Stack=true -Xmx1610612736</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.admin.user.env</name>
    <value>LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH</value>
</property>
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
</property>
<property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>2</value>
</property>
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>2048</value>
</property>
<property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>1</value>
</property>
<property>
    <name>mapreduce.job.heap.memory-mb.ratio</name>
    <value>0.8</value>
</property>
<property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$MR2_CLASSPATH</value>
</property>
<property>
    <name>mapreduce.admin.user.env</name>
    <value>LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH</value>
</property>
<property>
    <name>mapreduce.shuffle.max.connections</name>
    <value>80</value>
</property>
</configuration>
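
As a rough cross-check of the job-side values, the sketch below parses a short excerpt of the mapred-site.xml above and compares each task's -Xmx heap with its container size. The helper names load_properties and heap_mib are made up for illustration; only Python's standard library is used. It covers the task containers only, not the ResourceManager process itself.

```python
import re
import xml.etree.ElementTree as ET

# Excerpt of the mapred-site.xml shown above (only the four properties used here).
MAPRED_SITE = """<configuration>
<property><name>mapreduce.map.java.opts</name><value>-Djava.net.preferIPv4Stack=true -Xmx805306368</value></property>
<property><name>mapreduce.reduce.java.opts</name><value>-Djava.net.preferIPv4Stack=true -Xmx1610612736</value></property>
<property><name>mapreduce.map.memory.mb</name><value>1024</value></property>
<property><name>mapreduce.reduce.memory.mb</name><value>2048</value></property>
</configuration>"""


def load_properties(xml_text):
    """Hypothetical helper: turn a Hadoop *-site.xml document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def heap_mib(java_opts):
    """Heap size in MiB from a plain-byte -Xmx flag (k/m/g suffixes are not handled)."""
    return int(re.search(r"-Xmx(\d+)", java_opts).group(1)) / 2**20


props = load_properties(MAPRED_SITE)
for task in ("map", "reduce"):
    heap = heap_mib(props[f"mapreduce.{task}.java.opts"])
    container = int(props[f"mapreduce.{task}.memory.mb"])
    print(f"{task}: heap {heap:.0f} MiB of {container} MiB container ({heap / container:.0%})")
```

With the values posted, both the map and reduce heaps come out at 75% of their container size, which matches the 768M / 1G and 1536M / 2G figures listed earlier.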



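634320089 posted on 2018-7-23 17:00:10

wwwyibin518 posted on 2018-7-23 16:33
The cluster has 4 nodes in total.
Each node has 16 CPU cores and 16 GB of RAM.



Since you're using CDH, search the Cloudera Manager configuration page for "ResourceManager 的 Java 堆栈大小" (Java Heap Size of ResourceManager). Everything you posted is a parameter for how job containers are sized and scheduled on the NodeManagers; none of it is related to this error.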

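wwwyibin518 posted on 2018-7-23 17:18:29

634320089 posted on 2018-7-23 17:00
Since you're using CDH, search the page for "ResourceManager 的 Java 堆栈大小" (Java Heap Size of ResourceManager). Everything you posted is a NodeManager job-scheduling ...
Uh... 50 MB.


What would be a suitable value, and what rule should it be based on?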



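634320089 posted on 2018-7-23 17:22:54

wwwyibin518 posted on 2018-7-23 17:18
Uh... 50 MB.

50 MB definitely can't carry the load. Try 1 GB; that should be fine. There's no fixed rule for this; set it according to the size of your cluster.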
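
The 50 MB figure lines up with the -Xms52428800 -Xmx52428800 flags in the YARN_RESOURCEMANAGER_OPTS line of the stderr.log trace earlier in the thread. A minimal Python sketch of that conversion follows; the helper name max_heap_bytes is made up for illustration and is not part of Hadoop or Cloudera Manager.

```python
import re

# JVM options copied from the YARN_RESOURCEMANAGER_OPTS line in stderr.log
# (trailing flags omitted for brevity).
RM_OPTS = (
    "-Xms52428800 -Xmx52428800 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC "
    "-XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled"
)

# Multipliers for the optional k/m/g suffix that -Xmx accepts.
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def max_heap_bytes(jvm_opts):
    """Hypothetical helper: return the -Xmx value in bytes, or None if absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)(?=\s|$)", jvm_opts)
    if match is None:
        return None
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


heap = max_heap_bytes(RM_OPTS)
print(f"ResourceManager max heap: {heap / 1024**2:.0f} MiB")
```

Cloudera Manager passed the value as a raw byte count here, so 52428800 bytes works out to exactly 50 MiB.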

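wwwyibin518 posted on 2018-7-23 19:19:04

634320089 posted on 2018-7-23 17:22
50 MB definitely can't carry the load. Try 1 GB; that should be fine. There's no fixed rule for this; set it according to the size of your cluster.

Ha, problem solved. You hit the nail on the head. Thanks for the help!

ResourceManager 的 Java 堆栈大小 (Java Heap Size of ResourceManager)
I still need to dig deeper into what this parameter does, how it affects the YARN ResourceManager, and how YARN executes jobs.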
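
One way to confirm that a new heap setting actually reached the running ResourceManager is to look at the process's command line, the same /proc/<pid>/cmdline file that the killparent.sh trace greps in stderr.log. The sketch below assumes Linux; the function names and the sample bytes are hypothetical, with the sample standing in for a process restarted with a 1 GiB heap.

```python
from pathlib import Path
from typing import List


def jvm_heap_flags(raw_cmdline: bytes) -> List[str]:
    """Split a NUL-separated /proc/<pid>/cmdline blob and keep the -Xms/-Xmx flags."""
    args = [a.decode("utf-8", "replace") for a in raw_cmdline.split(b"\0") if a]
    return [a for a in args if a.startswith(("-Xms", "-Xmx"))]


def heap_flags_for_pid(pid: int) -> List[str]:
    """Read the flags of a live process (Linux only; needs read access to /proc/<pid>)."""
    return jvm_heap_flags(Path(f"/proc/{pid}/cmdline").read_bytes())


# Hypothetical sample, not taken from the thread: a JVM started with a 1 GiB heap.
sample = b"java\0-Xms1073741824\0-Xmx1073741824\0-XX:+UseConcMarkSweepGC\0"
print(jvm_heap_flags(sample))
```

On a real host, heap_flags_for_pid would be called with the ResourceManager's PID (15589 in the trace above, though the PID changes on every restart).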