Code:
[mw_shl_code=scala,true]import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object wordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordcount").setMaster("spark://master:7077")
    val sc = new SparkContext(conf)
    // Read the input file from HDFS
    val file = sc.textFile("hdfs://master:9000/data")
    // Split each line into words, map each word to (word, 1), then sum the counts per word
    val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    // Trigger the job and bring the results back to the driver
    count.collect()
  }
}[/mw_shl_code]
The error is as follows:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 4, 192.168.86.132, executor 1): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
The error is raised at count.collect().
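For reference, this particular ClassCastException (List$SerializationProxy being assigned to RDD's dependencies_ field) is commonly seen when the executors cannot load the application's own classes, for example when the job is launched directly from an IDE against a standalone master without shipping the packaged jar. Below is a minimal sketch of one common workaround, passing the application jar via SparkConf.setJars; the jar path is only a placeholder and would need to point to the actual packaged jar of this project:

[mw_shl_code=scala,true]import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object wordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("wordcount")
      .setMaster("spark://master:7077")
      // Assumption: ship the packaged application jar to the executors so that
      // the driver and the executors load the same classes; the path is a placeholder.
      .setJars(Seq("/path/to/wordcount.jar"))
    val sc = new SparkContext(conf)
    val file = sc.textFile("hdfs://master:9000/data")
    val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    count.collect()
  }
}[/mw_shl_code]

Alternatively, packaging the project into a jar and launching it with spark-submit --master spark://master:7077 lets Spark distribute the application classes itself, so neither setMaster nor setJars has to be hard-coded in the program.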