Problem running Spark on YARN

smfswxj posted on 2017-11-13 10:16:25
./pyspark --master=yarn
/usr/local/lib/python3.4/site-packages/IPython/core/history.py:226: UserWarning: IPython History requires SQLite, your history will not be saved
  warn("IPython History requires SQLite, your history will not be saved")
Python 3.4.5 (default, Nov 12 2017, 09:10:34)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
Warning: Ignoring non-spark config property: export=PYSPARK_DRIVER_PYTHON=ipython
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark-2.1.0-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/avro/avro-tools-1.7.6-cdh5.11.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/11/13 10:01:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/13 10:01:39 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:333)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:236)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
17/11/13 10:01:39 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
17/11/13 10:01:39 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
17/11/13 10:01:39 WARN spark.SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor).  This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:236)
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
py4j.GatewayConnection.run(GatewayConnection.java:214)
java.lang.Thread.run(Thread.java:748)
17/11/13 10:01:39 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:333)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:236)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
17/11/13 10:01:39 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
17/11/13 10:01:39 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
[TerminalIPythonApp] WARNING | Unknown error in handling PYTHONSTARTUP file /opt/spark2/python/pyspark/shell.py:
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
/opt/spark2/python/pyspark/shell.py in <module>()
     42     SparkContext._jvm.org.apache.hadoop.hive.conf.HiveConf()
---> 43     spark = SparkSession.builder\
     44         .enableHiveSupport()\

/opt/spark2/python/pyspark/sql/session.py in getOrCreate(self)
    168                         sparkConf.set(key, value)
--> 169                     sc = SparkContext.getOrCreate(sparkConf)
    170                     # This SparkContext may be an existing one.

/opt/spark2/python/pyspark/context.py in getOrCreate(cls, conf)
    306             if SparkContext._active_spark_context is None:
--> 307                 SparkContext(conf=conf or SparkConf())
    308             return SparkContext._active_spark_context

/opt/spark2/python/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    117             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
--> 118                           conf, jsc, profiler_cls)
    119         except:

/opt/spark2/python/pyspark/context.py in _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, jsc, profiler_cls)
    178         # Create the Java SparkContext through Py4J
--> 179         self._jsc = jsc or self._initialize_context(self._conf._jconf)
    180         # Reset the SparkConf to the one actually used by the SparkContext in JVM.

/opt/spark2/python/pyspark/context.py in _initialize_context(self, jconf)
    245         """
--> 246         return self._jvm.JavaSparkContext(jconf)
    247

/opt/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1400         return_value = get_return_value(
-> 1401             answer, self._gateway_client, None, self._fqn)
   1402

/opt/spark2/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    318                     "An error occurred while calling {0}{1}{2}.\n".
--> 319                     format(target_id, ".", name), value)
    320             else:

Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:333)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:236)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)


During handling of the above exception, another exception occurred:

Py4JJavaError                             Traceback (most recent call last)
/usr/local/lib/python3.4/site-packages/IPython/core/shellapp.py in _exec_file(self, fname, shell_futures)
    321                                                  self.shell.user_ns,
    322                                                  shell_futures=shell_futures,
--> 323                                                  raise_exceptions=True)
    324         finally:
    325             sys.argv = save_argv

/usr/local/lib/python3.4/site-packages/IPython/core/interactiveshell.py in safe_execfile(self, fname, exit_ignore, raise_exceptions, shell_futures, *where)
   2489                 py3compat.execfile(
   2490                     fname, glob, loc,
-> 2491                     self.compile if shell_futures else None)
   2492             except SystemExit as status:
   2493                 # If the call was made with 0 or None exit status (sys.exit(0)

/usr/local/lib/python3.4/site-packages/IPython/utils/py3compat.py in execfile(fname, glob, loc, compiler)
    184         with open(fname, 'rb') as f:
    185             compiler = compiler or compile
--> 186             exec(compiler(f.read(), fname, 'exec'), glob, loc)
    187
    188     # Refactor print statements in doctests.

/opt/spark2/python/pyspark/shell.py in <module>()
     45         .getOrCreate()
     46 except py4j.protocol.Py4JError:
---> 47     spark = SparkSession.builder.getOrCreate()
     48 except TypeError:
     49     spark = SparkSession.builder.getOrCreate()

/opt/spark2/python/pyspark/sql/session.py in getOrCreate(self)
    167                     for key, value in self._options.items():
    168                         sparkConf.set(key, value)
--> 169                     sc = SparkContext.getOrCreate(sparkConf)
    170                     # This SparkContext may be an existing one.
    171                     for key, value in self._options.items():

/opt/spark2/python/pyspark/context.py in getOrCreate(cls, conf)
    305         with SparkContext._lock:
    306             if SparkContext._active_spark_context is None:
--> 307                 SparkContext(conf=conf or SparkConf())
    308             return SparkContext._active_spark_context
    309

/opt/spark2/python/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    116         try:
    117             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
--> 118                           conf, jsc, profiler_cls)
    119         except:
    120             # If an error occurs, clean up in order to allow future SparkContext creation:

/opt/spark2/python/pyspark/context.py in _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, jsc, profiler_cls)
    177
    178         # Create the Java SparkContext through Py4J
--> 179         self._jsc = jsc or self._initialize_context(self._conf._jconf)
    180         # Reset the SparkConf to the one actually used by the SparkContext in JVM.
    181         self._conf = SparkConf(_jconf=self._jsc.sc().conf())

/opt/spark2/python/pyspark/context.py in _initialize_context(self, jconf)
    244         Initialize SparkContext in function to allow subclass specific initialization
    245         """
--> 246         return self._jvm.JavaSparkContext(jconf)
    247
    248     @classmethod

/opt/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1399         answer = self._gateway_client.send_command(command)
   1400         return_value = get_return_value(
-> 1401             answer, self._gateway_client, None, self._fqn)
   1402
   1403         for temp_arg in temp_args:

/opt/spark2/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    317                 raise Py4JJavaError(
    318                     "An error occurred while calling {0}{1}{2}.\n".
--> 319                     format(target_id, ".", name), value)
    320             else:
    321                 raise Py4JError(

Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:333)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:236)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)


In [1]: sc.master
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-1-b336792d302d> in <module>()
----> 1 sc.master

NameError: name 'sc' is not defined

In [2]:

2 comments

smfswxj posted on 2017-11-13 10:48:03
This only requires raising the value of yarn.nodemanager.resource.memory-mb.
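
Before raising anything, it may help to see what the two limits named in the error message are currently set to. A minimal sketch, assuming the values live in yarn-site.xml in the usual Hadoop property layout; the XML text below and the helper name read_memory_settings are hypothetical, and the 1024 values are placeholders rather than this cluster's confirmed settings:

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml content. On a real cluster, read the file from
# the Hadoop configuration directory instead of using this inline sample.
YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>1024</value>
  </property>
</configuration>
"""

# The two properties named in the IllegalArgumentException message.
WANTED = {
    "yarn.nodemanager.resource.memory-mb",
    "yarn.scheduler.maximum-allocation-mb",
}


def read_memory_settings(xml_text):
    """Return {property name: value in MB} for the two memory limits."""
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name in WANTED:
            settings[name] = int(prop.findtext("value"))
    return settings


settings = read_memory_settings(YARN_SITE)
for name, value in sorted(settings.items()):
    print(name, "=", value, "MB")
```

A property that is missing from the file falls back to the YARN default, so an absent key here does not mean the limit is unset.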

langke93 posted on 2017-11-13 10:54:36
Quoting smfswxj, posted on 2017-11-13 10:48:
This only requires raising the value of yarn.nodemanager.resource.memory-mb.

'yarn.scheduler.maximum-allocation-mb'  'yarn.nodemanager.resource.memory-mb'

You can try both. Each one needs to be at least 1024+384 MB (1408 MB).
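
The 1024+384 figure in the error can be reproduced with a little arithmetic. A minimal sketch, assuming the Spark 2.x default overhead rule on YARN (the larger of 10% of executor memory and 384 MB) and the default spark.executor.memory of 1g; the function names are made up for illustration:

```python
# Assumed Spark 2.x defaults for the YARN executor memory overhead:
# max(10% of executor memory, 384 MB).
OVERHEAD_FACTOR = 0.10
OVERHEAD_MIN_MB = 384


def required_container_mb(executor_memory_mb):
    """Executor memory plus overhead, i.e. the size requested from YARN."""
    overhead = max(int(executor_memory_mb * OVERHEAD_FACTOR), OVERHEAD_MIN_MB)
    return executor_memory_mb + overhead


def fits(executor_memory_mb, max_threshold_mb):
    """True when the request does not exceed the cluster's max threshold."""
    return required_container_mb(executor_memory_mb) <= max_threshold_mb


required = required_container_mb(1024)
print(required)           # 1408
print(fits(1024, 1024))   # False: the situation in the log above
print(fits(1024, 2048))   # True: a limit of 2048 MB is enough
```

With the limits at 1024 MB the request of 1408 MB is rejected, which matches the exception in the log; any limit of 1408 MB or more lets the default 1g executor through.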