Hadoop memory error
Hadoop: incorrect memory settings
[mw_shl_code=bash,true]11/09/06 09:20:25 WARN mapred.JobClient: Error reading task outputhttp://server4:50060/tasklog?plaintext=true&taskid=attempt_201109060853_0005_r_000008_0&filter=stdout
11/09/06 09:20:25 WARN mapred.JobClient: Error reading task outputhttp://server4:50060/tasklog?plaintext=true&taskid=attempt_201109060853_0005_r_000008_0&filter=stderr
11/09/06 09:20:34 INFO mapred.JobClient: Task Id : attempt_201109060853_0005_m_000009_1, Status : FAILED
java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)[/mw_shl_code]
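I got this error while running a MapReduce job. Some articles I found suggested deleting all the files under userlogs.
From an English-language post on the subject: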
Just an FYI, found the solution to this problem.
Apparently, it's an OS limit on the number of sub-directories that can be created in another directory. In this case, we had 31998 sub-directories under hadoop/userlogs/, so any new tasks would fail in Job Setup.
From the unix command line, mkdir fails as well:
$ mkdir hadoop/userlogs/testdir
mkdir: cannot create directory `hadoop/userlogs/testdir': Too many links
Difficult to track down because the Hadoop error message gives no hint whatsoever. And normally, you'd look in the userlog itself for more info, but in this case the userlog couldn't be created.
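In my case, however, running mkdir test under userlogs succeeded, and deleting everything under userlogs did not help: the job still failed.
So I looked at the files under userlogs: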
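The directory-count check described in the quoted post can also be scripted instead of probing with mkdir. The following is only a sketch: the function names are made up for illustration, and the 31998 figure is simply the number reported in that post, so the real limit on a given filesystem may differ.

```python
import os

# Figure reported in the quoted post; treat it as an assumption, since the
# actual limit depends on the filesystem in use.
SUBDIR_LIMIT = 31998


def count_subdirs(path):
    """Return the number of immediate sub-directories of `path`."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def near_subdir_limit(path, limit=SUBDIR_LIMIT, margin=0.9):
    """True when `path` holds at least `margin * limit` sub-directories."""
    return count_subdirs(path) >= limit * margin
```

Calling `near_subdir_limit("hadoop/userlogs")` on a TaskTracker node would then indicate whether the userlogs directory is approaching the quoted limit; the path is the one used in the post and may differ on other installations.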
[suse@server6 userlogs]$ cd attempt_201109060853_0005_m_000009_2/
[suse@server6 attempt_201109060853_0005_m_000009_2]$ cat stdout
Error occurred during initialization of VM
Incompatible minimum and maximum heap sizes specified
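When many attempts fail, opening each stdout file by hand gets tedious. Here is a minimal sketch that scans a userlogs directory for attempts whose JVM never started. It assumes the layout seen in the session above (one directory per attempt, each containing a stdout file); other Hadoop versions may nest these differently, and the function name is hypothetical.

```python
import os

# Marker line printed by the JVM, as seen in the stdout file above.
VM_INIT_ERROR = "Error occurred during initialization of VM"


def find_vm_init_failures(userlogs_dir):
    """Map attempt directory name -> stdout text for attempts whose JVM
    failed during initialization."""
    failures = {}
    for name in sorted(os.listdir(userlogs_dir)):
        stdout_path = os.path.join(userlogs_dir, name, "stdout")
        if not os.path.isfile(stdout_path):
            continue  # not an attempt directory, or no stdout was written
        with open(stdout_path, errors="replace") as handle:
            text = handle.read()
        if VM_INIT_ERROR in text:
            failures[name] = text.strip()
    return failures
```

Printing the returned dictionary shows, for each failed attempt, the JVM's own explanation, which is more informative than the "nonzero status of 1" message reported by the JobClient.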
The cause turned out to be an invalid JVM memory setting. The corrected configuration:
[mw_shl_code=xml,true] <property>
<name>mapred.child.java.opts</name>
<value>-Xmx1024m -Xms1024m -Xmn192m -XX:+UseConcMarkSweepGC -XX:CMSFullGCsBeforeCompaction=5 -XX:+UseParNewGC -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:$HADOOP_HOME/logs/gc.log</value>
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>[/mw_shl_code]
What I had originally written was:
[mw_shl_code=xml,true] <property>
<name>mapred.child.java.opts</name>
<value>-Xmx512m -Xms1024m -Xmn192m -XX:+UseConcMarkSweepGC -XX:CMSFullGCsBeforeCompaction=5 -XX:+UseParNewGC -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=31 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:$HADOOP_HOME/logs/gc.log</value>
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>[/mw_shl_code]
I had mistyped it: -Xmx (512m) was smaller than -Xms (1024m), so the JVM refused to start.
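A mistake like this can be caught before the job is submitted. The sketch below parses a java opts string such as the value of mapred.child.java.opts and reports when the initial heap exceeds the maximum heap. The helper names are invented for illustration, and the check covers only this one condition; it is not a full validation of JVM options.

```python
import re

# Multipliers for the size suffixes accepted by -Xms/-Xmx style flags.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a JVM size such as '512m' or '192k' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text)
    if match is None:
        raise ValueError("unrecognised JVM size: %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def heap_flags(opts):
    """Extract the -Xms and -Xmx values (in bytes) from a java opts string."""
    found = {}
    for token in opts.split():
        for flag in ("-Xms", "-Xmx"):
            if token.startswith(flag):
                found[flag] = parse_size(token[len(flag):])
    return found


def check_heap_flags(opts):
    """Return a list of problems; an empty list means none were detected."""
    flags = heap_flags(opts)
    problems = []
    xms, xmx = flags.get("-Xms"), flags.get("-Xmx")
    if xms is not None and xmx is not None and xms > xmx:
        problems.append("-Xms is larger than -Xmx")
    return problems
```

Passing the original value (`-Xmx512m -Xms1024m ...`) yields one reported problem, while the corrected value (`-Xmx1024m -Xms1024m ...`) yields an empty list.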
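Here is what these parameters mean:
-Xss 20000k
With this setting, each additional thread costs the JVM roughly 20 MB of memory. A value of about 128k is said to be best, and the default appears to be 512k, though it varies by platform and JVM version.
-Xmx: the maximum size of the Java heap. The default is 1/4 of physical memory; the best value depends on how much physical memory the machine has and on what else is using memory on it.
-Xms: the initial size of the Java heap. For a server-side JVM it is best to set -Xms and -Xmx to the same value; on development and test machines the default can be left as is.
-Xmn: the size of the young generation of the Java heap. If you are not familiar with it, keep the default.
-Xss: the stack size of each thread. If you are not familiar with it, keep the default.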
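To see why a large -Xss value matters, the arithmetic can be written out. This is a rough sketch only: it multiplies the per-thread stack size by a thread count, which gives an upper bound on the space set aside for stacks, and the operating system may commit less than that in practice. The thread count of 100 is an arbitrary illustration.

```python
def stack_reservation_bytes(thread_count, xss_kib):
    """Upper bound on space set aside for thread stacks:
    one stack of `xss_kib` KiB per thread."""
    return thread_count * xss_kib * 1024


def to_mib(byte_count):
    """Convert a byte count to MiB."""
    return byte_count / (1024 * 1024)


# 100 threads with -Xss20000k versus -Xss128k
large = to_mib(stack_reservation_bytes(100, 20000))  # 1953.125 MiB
small = to_mib(stack_reservation_bytes(100, 128))    # 12.5 MiB
```

Under these assumptions, the 20000k setting sets aside close to 2 GiB for 100 threads, compared with 12.5 MiB at 128k.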