不会飞的小凯凯 posted on 2015-12-18 19:00:45

Setting Up a Spark Development Environment on Windows (1.3.0)




Questions this post addresses:
1. How do you install Spark on Windows?
2. How do you install the IDE?
3. How do you configure the JDK and the Scala SDK?



Versions

Spark: 1.3.0
Scala: 2.10.6

Software installation

1. Install the JDK. Manually set the JAVA_HOME environment variable and add the JDK's bin directory to the Path environment variable (see the quick check after this list).
2. Install Scala for Windows via the .msi package. The installer automatically sets the SCALA_HOME environment variable and adds Scala's bin directory to Path. Download: http://www.scala-lang.org
3. Download the Spark package and simply unpack it.
4. Install IntelliJ IDEA. Download: http://www.jetbrains.com/idea/; the free Community edition is sufficient.
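Once the environment variables are set, a quick throwaway check (my own addition, not part of the original post) can confirm they are visible; reopen the console or restart IntelliJ after changing them:

object EnvCheck {
  def main(args: Array[String]) {
    // Print the variables this guide relies on; "<not set>" means the
    // variable is not visible to this process.
    for (name <- Seq("JAVA_HOME", "SCALA_HOME"))
      println(name + " = " + sys.env.getOrElse(name, "<not set>"))
  }
}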
Configuration
[*] Install the Scala plugin for IntelliJ.
[*] Configure the Java SDK and Scala SDK
Create a new project and tick the Scala project type. Create a new SDK: choose JDK and point it at the JDK installation directory; then create a new Scala SDK and point it at the Scala installation directory.

[*] Configure Global Libraries
"File" -> "Project Structure..." -> "Global Libraries": select "scala-sdk-2.10.6" (the version installed above) and click "+" to add every jar from Scala's lib directory that is not yet included.

[*] Configure Libraries
"File" -> "Project Structure..." -> "Libraries": add a Spark SDK. For the library, select the spark-assembly-1.3.0-hadoop2.4.0.jar file from Spark's lib directory.

[*] Configure the project's source directory
"File" -> "Project Structure..." -> "Modules" -> "Sources": create the src\main\scala directory, mark it as the source root, and unmark src as a source root.

[*] Write the code
Create the package net.cheyo and an object SparkPi with the following code:
package net.cheyo
import scala.math.random

import org.apache.spark._

/** Computes an approximation to pi */
object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
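    // Sample roughly n points uniformly in the square [-1, 1] x [-1, 1]; the fraction
    // that lands inside the unit circle tends to pi/4, so 4 * count / n approximates pi.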
    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x*x + y*y < 1) 1 else 0
      }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}





[*] Build the project
"Build" -> "Make Project"

[*] Run the program
"Run" -> "Edit Configurations...", add a new Application, set Main class to "net.cheyo.SparkPi" and VM options to "-Dspark.master=local". (The master can also be set directly in code; see the sketch after this list.)

[*] Package into a jar
Go to "File" -> "Project Structure" -> "Artifacts", click "+" -> "Jar" -> "From Modules with dependencies", select the Main Class and click OK, then choose the output location for the jar in the dialog and click OK. When configuring the artifact, remove the dependency jars from the Output Layout so that only the project's own module remains. Finally, select "Build" -> "Build Artifacts" to generate the jar.
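As an alternative to the -Dspark.master=local VM option, the master can be set on the SparkConf itself while developing in the IDE. A minimal sketch (SparkPiLocal is just an illustrative name; local[*] uses all local cores and should be removed before submitting to a real cluster):

import org.apache.spark.{SparkConf, SparkContext}

object SparkPiLocal {
  def main(args: Array[String]) {
    // setMaster("local[*]") runs the driver and executors in-process,
    // so no VM option or cluster is needed while developing.
    val conf = new SparkConf().setAppName("Spark Pi").setMaster("local[*]")
    val spark = new SparkContext(conf)
    // ... same Monte Carlo computation as SparkPi above ...
    spark.stop()
  }
}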
Common problems

Runtime error caused by a mismatched Scala version
[*] Symptom
The following error occurs at runtime:
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
    at akka.actor.ActorCell$.<init>(ActorCell.scala:336)
    at akka.actor.ActorCell$.<clinit>(ActorCell.scala)
    at akka.actor.RootActorPath.$div(ActorPath.scala:159)
    at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:464)
    at akka.remote.RemoteActorRefProvider.<init>(RemoteActorRefProvider.scala:124)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$2.apply(DynamicAccess.scala:78)
    at scala.util.Try$.apply(Try.scala:191)
    at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:73)
    at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
    at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
    at scala.util.Success.flatMap(Try.scala:230)
    at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:84)
    at akka.actor.ActorSystemImpl.liftedTree1$1(ActorSystem.scala:584)
    at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:577)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
    at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:122)
    at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:55)
    at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
    at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1832)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:166)
    at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1823)
    at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:57)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:223)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:163)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:267)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:270)
    at SparkPi$.main(SparkPi.scala:12)
    at SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

[*] Cause
The installed Scala version is incompatible with this Spark build. Although the versions were believed to match, Spark 1.3.0 is built against Scala 2.10, so running with a Scala 2.11 SDK produces this binary-incompatibility error.

[*] Solution
Switch Scala to the version that matches the Spark build. For Spark 1.3.0, replace Scala 2.11 with Scala 2.10.
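For reference, if the project were managed with sbt instead of adding the assembly jar by hand (which is not what this post does), the version pairing could be pinned in build.sbt; a minimal sketch:

name := "SparkPi"

version := "1.0"

scalaVersion := "2.10.6"

// "%%" appends the Scala binary version (_2.10) to the artifact name,
// so the Spark dependency always matches scalaVersion above.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"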


bighammer posted on 2015-12-19 10:36:59

First reply. Learned a lot, thanks!