Hi all,
I've set up an experimental Spark + Hadoop environment:
1) OS
CentOS 6.5
2) Servers
192.168.1.101 hadoop-cluster-master
192.168.1.102 hadoop-cluster-slave1
192.168.1.103 hadoop-cluster-slave2
All three servers have identical hardware: Dell 720s with 100 GB of RAM each.
3) Hadoop + Spark stack
All components are community-edition binary tarballs (install layout sketched after the list):
hbase-1.2.1-bin.tar.gz
hadoop-2.6.4.tar.gz
jdk-8u92-linux-x64.gz
spark-1.6.1-bin-hadoop2.6.tgz
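These are unpacked under /home/hadoop and reached through the short paths used below; a sketch of the layout (the symlinks and extracted directory names are my assumption, not copied from the machines):

cd /home/hadoop
tar xzf hadoop-2.6.4.tar.gz           && ln -s hadoop-2.6.4              hadoop
tar xzf spark-1.6.1-bin-hadoop2.6.tgz && ln -s spark-1.6.1-bin-hadoop2.6 spark
tar xzf hbase-1.2.1-bin.tar.gz        && ln -s hbase-1.2.1               hbase
tar xzf jdk-8u92-linux-x64.gz         && ln -s jdk1.8.0_92               jdk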
4) Environment variables (shell export sketch follows the list)
PRESTO_HOME=/home/hadoop/presto
SPARK_HOME=/home/hadoop/spark
HADOOP_HOME=/home/hadoop/hadoop
SQOOP_HOME=/home/hadoop/sqoop
YARN_HOME=/home/hadoop/hadoop
ZOO_KEEPER_HOME=/home/hadoop/zookeeper
HBASE_HOME=/home/hadoop/hbase
FLUME_HOME=/home/hadoop/flume
HADOOP_HDFS_HOME=/home/hadoop/hadoop
HIVE_HOME=/home/hadoop/hive
HADOOP_COMMON_HOME=/home/hadoop/hadoop
JAVA_HOME=/home/hadoop/jdk
HADOOP_MAPRED_HOME=/home/hadoop/hadoop
OOZIE_HOME=/home/hadoop/oozie
SCALA_HOME=/home/hadoop/scala
HADOOP_CONF_DIR=/home/hadoop/hadoop/etc/hadoop
YARN_CONF_DIR=/home/hadoop/hadoop/etc/hadoop
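A sketch of how these are exported in the shell (putting them in the hadoop user's ~/.bashrc and the PATH line are my assumptions about the wiring, not copied from the boxes):

export JAVA_HOME=/home/hadoop/jdk
export HADOOP_HOME=/home/hadoop/hadoop
export SPARK_HOME=/home/hadoop/spark
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
# assumed PATH wiring so hadoop/yarn/spark-submit resolve on the command line
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$PATH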
5) Spark runs in YARN mode; jps on the three servers shows the following processes:
hadoop-cluster-master:
7872 NameNode
27122 RunJar
26035 HRegionServer
17604 ResourceManager
25861 HMaster
8071 SecondaryNameNode
30601 Jps
17018 HistoryServer
31082 PrestoServer
11135 QuorumPeerMain
hadoop-cluster-slave1:
19360 NodeManager
23650 Jps
6291 HRegionServer
9942 PrestoServer
6088 QuorumPeerMain
4731 DataNode
hadoop-cluster-slave2:
6722 HRegionServer
11881 PrestoServer
5004 DataNode
27901 NodeManager
6573 QuorumPeerMain
32430 Jps
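(The listings above are per-host jps output; something like the loop below collects them in one shot, assuming passwordless ssh between the nodes:)

for h in hadoop-cluster-master hadoop-cluster-slave1 hadoop-cluster-slave2; do
  echo "== $h =="   # host header
  ssh "$h" jps      # list the JVM processes on that host
done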
Submitting jobs with spark-submit now fails in the following two ways:
1)spark-submit --master yarn /home/hadoop/spark/examples/src/main/python/pi.py
INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/05/26 17:19:16 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
16/05/26 17:19:16 INFO spark.SparkContext: SparkContext already stopped.
Traceback (most recent call last):
File "/home/hadoop/spark/examples/src/main/python/pi.py", line 30, in <module>
sc = SparkContext(appName="PythonPi")
File "/home/hadoop/spark/python/lib/pyspark.zip/pyspark/context.py", line 115, in __init__
File "/home/hadoop/spark/python/lib/pyspark.zip/pyspark/context.py", line 172, in _do_init
File "/home/hadoop/spark/python/lib/pyspark.zip/pyspark/context.py", line 235, in _initialize_context
File "/home/hadoop/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
File "/home/hadoop/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
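The client side only shows this NullPointerException in SparkContext; the ApplicationMaster's own container logs may say more. They can be pulled from YARN like this (the application id below is a placeholder; the real one is visible in the ResourceManager UI at http://hadoop-cluster-master:8088, and yarn.log-aggregation-enable must be on for the command to return anything):

# placeholder id - substitute the failed application's real id
APP_ID=application_XXXXXXXXXXXXX_XXXX
yarn logs -applicationId "$APP_ID"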
2)spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 4g --executor-cores 1 --queue thequeue /home/hadoop/spark/lib/spark-examples*.jar 10
16/05/26 17:20:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/26 17:20:24 INFO client.RMProxy: Connecting to ResourceManager at hadoop-cluster-master/192.168.1.101:8080
16/05/26 17:20:25 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
16/05/26 17:20:25 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/05/26 17:20:25 INFO yarn.Client: Will allocate AM container, with 4505 MB memory including 409 MB overhead
16/05/26 17:20:25 INFO yarn.Client: Setting up container launch context for our AM
16/05/26 17:20:25 INFO yarn.Client: Setting up the launch environment for our AM container
16/05/26 17:20:25 INFO yarn.Client: Preparing resources for our AM container
16/05/26 17:20:25 INFO yarn.Client: Uploading resource file:/home/hadoop/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar -> hdfs://hadoop-cluster-master:9000/user/hcxtuser/.sparkStaging/application_1464235518179_0018/spark-assembly-1.6.1-hadoop2.6.0.jar
16/05/26 17:20:41 INFO yarn.Client: Uploading resource file:/home/hadoop/spark/lib/spark-examples-1.6.1-hadoop2.6.0.jar -> hdfs://hadoop-cluster-master:9000/user/hcxtuser/.sparkStaging/application_1464235518179_0018/spark-examples-1.6.1-hadoop2.6.0.jar
16/05/26 17:20:52 INFO yarn.Client: Uploading resource file:/tmp/spark-29b6164b-3927-4f38-b2e0-2bfd94e7e1e1/__spark_conf__3733414854292196573.zip -> hdfs://hadoop-cluster-master:9000/user/hcxtuser/.sparkStaging/application_1464235518179_0018/__spark_conf__3733414854292196573.zip
16/05/26 17:20:52 INFO spark.SecurityManager: Changing view acls to: hcxtuser
16/05/26 17:20:52 INFO spark.SecurityManager: Changing modify acls to: hcxtuser
16/05/26 17:20:52 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hcxtuser); users with modify permissions: Set(hcxtuser)
16/05/26 17:20:52 INFO yarn.Client: Submitting application 18 to ResourceManager
16/05/26 17:20:52 INFO impl.YarnClientImpl: Submitted application application_1464235518179_0018
16/05/26 17:20:53 INFO yarn.Client: Application report for application_1464235518179_0018 (state: FAILED)
16/05/26 17:20:53 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1464235518179_0018 submitted by user hcxtuser to unknown queue: thequeue
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: thequeue
start time: 1464254452389
final status: FAILED
tracking URL: http://hadoop-cluster-master:8088/proxy/application_1464235518179_0018/
user: hcxtuser
16/05/26 17:20:53 INFO yarn.Client: Deleting staging directory .sparkStaging/application_1464235518179_0018
Exception in thread "main" org.apache.spark.SparkException: Application application_1464235518179_0018 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/05/26 17:20:53 INFO util.ShutdownHookManager: Shutdown hook called
16/05/26 17:20:53 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-29b6164b-3927-4f38-b2e0-2bfd94e7e1e1
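For this second case the diagnostics line is explicit: the job went to "unknown queue: thequeue" (the --queue thequeue flag is copied from the Spark-on-YARN docs example, but no such queue is defined on my scheduler). I assume resubmitting against the default queue, or defining thequeue in the scheduler config (capacity-scheduler.xml for the CapacityScheduler), sidesteps this particular failure:

spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster \
  --driver-memory 4g --executor-memory 4g --executor-cores 1 --queue default \
  /home/hadoop/spark/lib/spark-examples*.jar 10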
Could anyone help me analyze what is causing these failures? Thanks.