
Spark's REST Service

nettman, posted 2014-8-28 15:55:14
Guide questions:
1. How do you download spark-jobserver?
2. How do you start the jobserver?
3. How do you submit a job over REST?






A REST service in the form of a Job Server

https://github.com/ooyala/spark-jobserver

I. Installing and starting the jobserver

1. git clone the spark-jobserver repository

2. cd into the directory and run sbt

  For sbt installation, see http://www.scala-sbt.org/release ... l-Installation.html

3. Run re-start to launch the jobserver locally

  re-start --- -XX:PermSize=512M


Web UI: localhost:8090

II. Uploading a jar

1. Upload the example jar

  sbt job-server-tests/package

  curl --data-binary @job-server-tests/target/job-server-tests-0.3.1.jar localhost:8090/jars/test

  curl localhost:8090/jars/

2. Submit a job

  curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
  # Response:
  {
    "status": "STARTED",
    "result": {
      "jobId": "d56eadea-5a73-46f4-803e-418e6f5b990f",
      "context": "f6a1c9b7-spark.jobserver.WordCountExample"
    }
  }

3. Check the result

  curl localhost:8090/jobs/d56eadea-5a73-46f4-803e-418e6f5b990f
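The submit-then-poll flow above can also be scripted. A minimal Python sketch, assuming the response shapes shown in this post (the helper names are hypothetical, not part of the jobserver API):

```python
import json
import urllib.request

BASE = "http://localhost:8090"  # assumption: jobserver on its default port

def submit_url(app_name, class_path):
    # Builds the POST URL used in the curl examples above.
    return f"{BASE}/jobs?appName={app_name}&classPath={class_path}"

def job_id_of(response_body):
    # Extracts the jobId from the "STARTED" JSON response.
    return json.loads(response_body)["result"]["jobId"]

def check_status(job_id):
    # One status poll; repeat until the job is no longer running.
    with urllib.request.urlopen(f"{BASE}/jobs/{job_id}") as resp:
        return json.loads(resp.read())

# Parsing the sample response shown above (no server needed):
sample = ('{"status": "STARTED", "result": {'
          '"jobId": "d56eadea-5a73-46f4-803e-418e6f5b990f", '
          '"context": "f6a1c9b7-spark.jobserver.WordCountExample"}}')
print(job_id_of(sample))  # d56eadea-5a73-46f4-803e-418e6f5b990f
```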

III. Pre-starting a context

  curl -d "" 'localhost:8090/contexts/test-context?num-cpu-cores=4&mem-per-node=512m'

(1) Via the config file

  ~/app/spark-jobserver/config# vim local.conf.template

(2) Via URL parameters

  POST    /contexts/my-new-context?num-cpu-cores=10
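Contexts declared in the server's config file are created at startup, which saves the per-request creation cost. A sketch of the HOCON shape (the section and key names here are assumptions based on the 0.3.x template; check them against your local.conf.template):

```
# local.conf (sketch): contexts created when the server starts
spark {
  master = "local[4]"
  contexts {
    test-context {
      num-cpu-cores = 4
      memory-per-node = 512m
    }
  }
}
```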

IV. Deployment

1. Copy config/local.sh.template to <environment>.sh and set the parameters in it.

  ~/app/spark-jobserver/config# cp local.sh.template ~/app/spark-jobserver-src/bin/config/cfg.sh
  ~/app/spark-jobserver-src/bin# ./server_deploy.sh cfg

cfg.sh:

  DEPLOY_HOSTS="192.168.2.215"
  APP_USER=root
  APP_GROUP=root
  INSTALL_DIR=/root/app/spark-job-server
  LOG_DIR=/var/log/job-server
  SPARK_HOME=/root/app/spark-1.0.0-bin-hadoop2

2. Run bin/server_deploy.sh to package the server and push it, together with its configuration, to the target directory on the specified host.
  If you hit "Error during sbt execution: java.lang.OutOfMemoryError: PermGen space", raise MaxPermSize in the sbt launcher script:

  ~/app/spark-jobserver-src/bin# cat /bin/sbt
  SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=512m"
  java $SBT_OPTS -jar `dirname $0`/sbt-launch.jar "$@"
  ~/app/spark-jobserver-src/bin# vim /bin/sbt

3. Go to the target directory on the server and run server_start.sh.
  The deploy step may leave files missing: copy config/local.conf.template into the server's spark-job-server directory and rename it to local.conf; also copy cfg.sh into the spark-job-server directory and rename it to settings.sh.

V. Creating a JobServer project

(1) Create an sbt project

   Create a directory JobServer and a build.sbt with the following contents:

  name := "job server demo"

  version := "1.0"

  scalaVersion := "2.10.4"

  resolvers += "Ooyala Bintray" at "http://dl.bintray.com/ooyala/maven"

  libraryDependencies += "ooyala.cnd" % "job-server" % "0.3.1" % "provided"

  libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.0.0"

  Add the Eclipse plugin to sbt:

  C:\Users\Administrator\.sbt\0.13\plugins\plugins.sbt
  # Create the file if it does not exist, then add:
  addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.5.0")
  # Then run:
  sbt update
  sbt eclipse

(2) Extend SparkJob

    Extend SparkJob and override validate and runJob. The core word-count logic:

  val str = "a a a b b c"
  val rdd = sc.parallelize(str.split(" ").toSeq)
  val rsList = rdd.map((_,1)).reduceByKey(_+_).map(x => (x._2,x._1)).sortByKey(false).collect
  rsList(0)._2
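For reference, the same logic in plain Python (a sketch; the Spark version distributes the map/reduceByKey steps across the cluster, while this runs locally):

```python
from collections import Counter

def most_frequent_word(s):
    # Count words, sort by count descending, return the top word --
    # mirroring the map / reduceByKey / sortByKey chain above.
    counts = Counter(s.split(" "))
    return counts.most_common(1)[0][0]

print(most_frequent_word("a a a b b c"))  # a
```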

Full code:
  import com.typesafe.config.{Config, ConfigFactory}
  import org.apache.spark._
  import org.apache.spark.SparkContext._
  import scala.util.Try
  import spark.jobserver.SparkJob
  import spark.jobserver.SparkJobValidation
  import spark.jobserver.SparkJobValid
  import spark.jobserver.SparkJobInvalid

  object WordCount extends SparkJob {
    def main(args: Array[String]) {
      val sc = new SparkContext("local[4]", "WordCountExample")
      val config = ConfigFactory.parseString("")
      val results = runJob(sc, config)
      println("Result is " + results)
    }

    override def validate(sc: SparkContext, config: Config): SparkJobValidation = {
      Try(config.getString("input.string"))
        .map(x => SparkJobValid)
        .getOrElse(SparkJobInvalid("No input.string config param"))
    }

    override def runJob(sc: SparkContext, config: Config): Any = {
      val dd = sc.parallelize(config.getString("input.string").split(" ").toSeq)
      val rsList = dd.map((_,1)).reduceByKey(_+_).map(x => (x._2,x._1)).sortByKey(false).collect
      rsList(0)._2
    }
  }

Package the jar:

  d:\scalaWorkplace\JobServer>sbt package
  [info] Loading global plugins from C:\Users\Administrator\.sbt\0.13\plugins
  [info] Set current project to job server demo (in build file:/D:/scalaWorkplace/JobServer/)
  [info] Compiling 1 Scala source to D:\scalaWorkplace\JobServer\target\scala-2.10\classes...
  [info] 'compiler-interface' not yet compiled for Scala 2.10.1. Compiling...
  [info]   Compilation completed in 38.446 s
  [info] Packaging D:\scalaWorkplace\JobServer\target\scala-2.10\job-server-demo_2.10-1.0.jar ...
  [info] Done packaging.
  [success] Total time: 45 s, completed 2014-7-12 16:59:06

(3) Upload the jar

Upload the packaged jar to the server the same way the example jar was uploaded earlier (here under the app name "example").

(4) Test

  curl -i -d "input.string=a a a b b c" 'localhost:8090/jobs?appName=example&classPath=com.persia.spark.WordCount'
  HTTP/1.1 202 Accepted
  Server: spray-can/1.2.0
  Date: Sat, 12 Jul 2014 09:22:26 GMT
  Content-Type: application/json; charset=UTF-8
  Content-Length: 150
  {
    "status": "STARTED",
    "result": {
      "jobId": "b5b2e80f-1992-471f-8a5d-44c08c3a9731",
      "context": "6bd9aa29-com.persia.spark.WordCount"
    }
  }

Check the result:

  curl localhost:8090/jobs/b5b2e80f-1992-471f-8a5d-44c08c3a9731

If this reports a method-not-found error, modify Dependencies.scala in the jobserver source:

  root@caibosi:~/app/spark-jobserver/project# ls
  Assembly.scala  Build.scala  Dependencies.scala  build.properties  plugins.sbt  project  target
  root@caibosi:~/app/spark-jobserver/project# vim Dependencies.scala

  lazy val sparkDeps = Seq(
      "org.apache.spark" %% "spark-core" % "1.0.0" exclude("io.netty", "netty-all"),
      // Force netty version.  This avoids some Spark netty dependency problem.
      "io.netty" % "netty" % "3.6.6.Final"
    )









