VM options = -Dspark.master=spark://master:7077 (replace "master" with the hostname of your cluster's master node)
Program arguments = the absolute path of the input file on the local machine (or an hdfs://... URI, or another supported path)
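For reference, here is a minimal sketch of the kind of job these two settings assume: `spark.master` is picked up from the JVM system property set in VM options (the default `SparkConf` reads `spark.*` system properties), and the input path arrives as the first program argument. The object name and app name are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical minimal job: the master URL comes from -Dspark.master,
// the input path from the first program argument.
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount") // master inherited from the VM option
    val sc = new SparkContext(conf)
    val counts = sc.textFile(args(0))
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.take(10).foreach(println)
    sc.stop()
  }
}
```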
Launching the Spark job directly from IDEA with the parameters above fails with the following exception:
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 77, 192.168.1.194): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
```

(see https://issues.apache.org/jira/browse/SPARK-9219)
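The underlying problem, as discussed in the linked JIRA, is that nothing ships the application's compiled classes to the cluster when the driver runs inside IDEA, so the executors deserialize the RDD lineage with a classloader that cannot see them. One commonly cited workaround, sketched below as an assumption rather than the approach this post takes, is to build the application jar first and register it on the `SparkConf` so the executors can fetch it:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical workaround (not the approach taken below): build the
// application jar first, then tell Spark to ship it to the executors.
object WordCountWithJar {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setJars(Seq("/path/to/your-app.jar")) // placeholder path; executors download this jar
    val sc = new SparkContext(conf)
    // ... job logic as before ...
    sc.stop()
  }
}
```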
For that reason, this post switches to packaging the application and submitting it with spark-submit instead.
① Click File - Project Structure - Artifacts - JAR - From modules with dependencies..., then select the corresponding module and main class.
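Jumping ahead for orientation: once the artifact has been built in the remaining steps, the eventual submission would look roughly like the command below. The class name, jar path, and input path are placeholders for your own values.

```
spark-submit \
  --master spark://master:7077 \
  --class com.example.WordCount \
  /path/to/your-app.jar \
  hdfs://master:9000/path/to/input.txt
```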