Guide questions:
1. How do you install Hadoop?
2. How do you install ZooKeeper?
3. How do you install Spark?
4. How do you verify that the installation is correct?
1. Cluster plan

Hostname  IP address   Installed software       Running processes
Master    10.0.0.170   JDK Scala Hadoop Spark   HistoryServer QuorumPeerMain SecondaryNameNode NameNode Master ResourceManager
Worker1   10.0.0.171   JDK Scala Hadoop Spark   QuorumPeerMain NodeManager DataNode Worker
Worker2   10.0.0.172   JDK Scala Hadoop Spark   QuorumPeerMain NodeManager DataNode Worker
Worker3   10.0.0.173   JDK Scala Hadoop Spark   NodeManager DataNode Worker
Worker4   10.0.0.174   JDK Scala Hadoop Spark   NodeManager DataNode Worker
2. Install the ZooKeeper cluster

Upload ZooKeeper.
We upload with rz, which requires lrzsz to be installed first (see the example below).
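If lrzsz is not yet installed, it can be pulled from the distribution's package manager; for example, assuming an Ubuntu/Debian system (the exact package command depends on your distribution):
root@master:~# apt-get install -y lrzsz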
In the rz dialog, select the zookeeper archive, click Add, then OK.
Upload complete.
Extract:
root@master:/tools# tar -zxf zookeeper-3.4.6.tar.gz -C /usr/local/zookeeper/
Edit the configuration file:
root@master:/tools# cd /usr/local/zookeeper/zookeeper-3.4.6/conf/
root@master:/usr/local/zookeeper/zookeeper-3.4.6/conf# ll
total 20
drwxr-xr-x  2 citic citic 4096 Feb 20  2014 ./
drwxr-xr-x 10 citic citic 4096 Feb 20  2014 ../
-rw-rw-r--  1 citic citic  535 Feb 20  2014 configuration.xsl
-rw-rw-r--  1 citic citic 2161 Feb 20  2014 log4j.properties
-rw-rw-r--  1 citic citic  922 Feb 20  2014 zoo_sample.cfg
root@master:/usr/local/zookeeper/zookeeper-3.4.6/conf# mv zoo_sample.cfg zoo.cfg
At line 12 of zoo.cfg (replacing the default dataDir), add the following:
dataDir=/usr/local/zookeeper/zookeeper-3.4.6/data
dataLogDir=/usr/local/zookeeper/zookeeper-3.4.6/logs
server.0=Master:2888:3888
server.1=Worker1:2888:3888
server.2=Worker2:2888:3888
Create the data and logs directories, then write this node's id into myid. The id must match the N in the server.N lines of zoo.cfg: Master is 0, Worker1 is 1, Worker2 is 2.
root@master:/usr/local/zookeeper/zookeeper-3.4.6/conf# mkdir -p /usr/local/zookeeper/zookeeper-3.4.6/data
root@master:/usr/local/zookeeper/zookeeper-3.4.6/conf# mkdir -p /usr/local/zookeeper/zookeeper-3.4.6/logs
root@master:/usr/local/zookeeper/zookeeper-3.4.6/conf# echo 0 > /usr/local/zookeeper/zookeeper-3.4.6/data/myid
root@master:/usr/local/zookeeper/zookeeper-3.4.6/conf# cat /usr/local/zookeeper/zookeeper-3.4.6/data/myid
0
Set up passwordless SSH login between the nodes.
scp the zookeeper directory to Worker1 and Worker2:
root@master:/usr/local/zookeeper/zookeeper-3.4.6/conf# cd ~
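The copy command itself is not shown in the post; a minimal sketch, assuming the passwordless SSH configured above and that /usr/local/zookeeper already exists on the workers:
root@master:~# scp -r /usr/local/zookeeper/zookeeper-3.4.6 root@Worker1:/usr/local/zookeeper/
root@master:~# scp -r /usr/local/zookeeper/zookeeper-3.4.6 root@Worker2:/usr/local/zookeeper/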
On Worker1, run:
root@work1:~# echo 1 > /usr/local/zookeeper/zookeeper-3.4.6/data/myid
On Worker2, run:
root@work2:~# echo 2 > /usr/local/zookeeper/zookeeper-3.4.6/data/myid
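Alternatively, both myid files can be written from Master over SSH (a sketch, again relying on the passwordless login set up earlier):
root@master:~# ssh root@Worker1 "echo 1 > /usr/local/zookeeper/zookeeper-3.4.6/data/myid"
root@master:~# ssh root@Worker2 "echo 2 > /usr/local/zookeeper/zookeeper-3.4.6/data/myid"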
Start the ZooKeeper cluster. ### Note: follow the steps below exactly.
2.1 Start ZooKeeper on Master, Worker1 and Worker2.
Master:
root@master:~# cd /usr/local/zookeeper/zookeeper-3.4.6/bin
root@master:/usr/local/zookeeper/zookeeper-3.4.6/bin# ./zkServer.sh start
Worker1:
root@work1:~# cd /usr/local/zookeeper/zookeeper-3.4.6/bin
root@work1:/usr/local/zookeeper/zookeeper-3.4.6/bin# ./zkServer.sh start
Worker2:
root@work2:~# cd /usr/local/zookeeper/zookeeper-3.4.6/bin
root@work2:/usr/local/zookeeper/zookeeper-3.4.6/bin# ./zkServer.sh start
# Check the status: one leader, two followers.
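The status can be checked on each of the three nodes; the output includes a "Mode: leader" or "Mode: follower" line, and exactly one node should report leader:
root@master:/usr/local/zookeeper/zookeeper-3.4.6/bin# ./zkServer.sh status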
3. Install and configure the Hadoop cluster

3.1 Hadoop cluster plan:
NameNode         Master
ResourceManager  Master
QuorumPeerMain   Master Worker1 Worker2
DataNode         Worker1 Worker2 Worker3 Worker4
NodeManager      Worker1 Worker2 Worker3 Worker4
3.2 Extract:
root@Master:/tools# tar -zxf hadoop-2.6.0.tar.gz -C /usr/local/hadoop/
root@Master:/tools# cd /usr/local/hadoop/hadoop-2.6.0/etc/hadoop/
3.3 Edit the configuration files
3.3.1 Edit hadoop-env.sh
root@Master:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# vim hadoop-env.sh
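The post does not show the edit itself; the usual change here is to point JAVA_HOME at the JDK. A sketch, assuming the same JDK path that spark-env.sh uses later in this post:
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60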
3.3.2 Edit core-site.xml
root@Master:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# vim core-site.xml
Add the following:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
  </property>
  <property>
    <name>hadoop.native.lib</name>
    <!-- value assumed: true is Hadoop's default for hadoop.native.lib -->
    <value>true</value>
  </property>
</configuration>
3.3.3 Edit hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>Master:50090</value>
    <description>The secondary namenode http server address and port.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///usr/local/hadoop/hadoop-2.6.0/dfs/namesecondary</value>
    <description>Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy.</description>
  </property>
</configuration>
3.3.4 Edit yarn-site.xml
root@Master:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# vim yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
3.3.5 Edit mapred-site.xml
root@Master:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# mv mapred-site.xml.template mapred-site.xml
root@Master:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
3.3.6 Edit slaves (the slaves file lists the worker nodes)
root@Master:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# vim slaves
Worker1
Worker2
Worker3
Worker4
3.3.7 Distribute to all servers in one step. The loop body was missing from the post; a minimal sketch, assuming the workers are reachable by hostname and /usr/local/hadoop exists on each:
#!/bin/sh
# copy the configured Hadoop tree to every worker
for i in 1 2 3 4
do
  scp -r /usr/local/hadoop/hadoop-2.6.0 root@Worker$i:/usr/local/hadoop/
done
4. Start Hadoop

4.1 Start the ZooKeeper cluster first, then format HDFS:
root@Master:/usr/local/hadoop/hadoop-2.6.0/bin# ./hdfs namenode -format
Seeing "successful" in the output means the format succeeded.
4.2 Start HDFS (run on Master only):
root@Master:/usr/local/hadoop/hadoop-2.6.0/bin# cd ../sbin/
root@Master:/usr/local/hadoop/hadoop-2.6.0/sbin# ./start-dfs.sh
4.3 Start YARN:
root@Master:/usr/local/hadoop/hadoop-2.6.0/sbin# ./start-yarn.sh
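At this point the daemons can be checked with jps on each node. Per the cluster plan in section 1, Master should now show NameNode, SecondaryNameNode, ResourceManager and QuorumPeerMain, and each worker should show DataNode and NodeManager (plus QuorumPeerMain on Worker1 and Worker2):
root@Master:/usr/local/hadoop/hadoop-2.6.0/sbin# jps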
5. Install the Spark cluster

5.1 Extract:
root@Master:/tools# tar -zxf spark-1.6.0-bin-hadoop2.6.tgz -C /usr/local/spark/
root@Master:/tools# cd /usr/local/spark/spark-1.6.0-bin-hadoop2.6/conf/
5.2 Edit the configuration files
5.2.1 Edit spark-env.sh
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/conf# mv spark-env.sh.template spark-env.sh
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/conf# vim spark-env.sh
Add the following at the end:
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60
export SCALA_HOME=/usr/local/scala/scala-2.10.4
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.6.0/etc/hadoop
export SPARK_MASTER_IP=Master
#export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=Master:2181,Worker1:2181,Worker2:2181 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_WORKER_MEMORY=1g
export SPARK_EXECUTOR_MEMORY=1g
export SPARK_DRIVER_MEMORY=1G
export SPARK_WORKER_CORES=2
5.2.2 Edit slaves
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/conf# mv slaves.template slaves
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/conf# vim slaves
Worker1
Worker2
Worker3
Worker4
5.2.3 Edit spark-defaults.conf
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/conf# mv spark-defaults.conf.template spark-defaults.conf
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/conf# vim spark-defaults.conf
spark.executor.extraJavaOptions   -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.eventLog.enabled            true
spark.eventLog.dir                hdfs://Master:9000/historyServerforSpark
spark.yarn.historyServer.address  Master:18080
spark.history.fs.logDirectory     hdfs://Master:9000/historyServerforSpark
5.6 Distribute to all servers in one step. As with Hadoop, the loop body was missing; a minimal sketch under the same assumptions:
#!/bin/sh
# copy the configured Spark tree to every worker
for i in 1 2 3 4
do
  scp -r /usr/local/spark/spark-1.6.0-bin-hadoop2.6 root@Worker$i:/usr/local/spark/
done
5.7 Create the /historyServerforSpark directory on HDFS (spark.eventLog.dir points at it, so it must exist before the history server starts).
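For example:
root@Master:/usr/local/hadoop/hadoop-2.6.0/bin# ./hdfs dfs -mkdir /historyServerforSpark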
5.8 Start the Spark cluster:
root@Master:~# cd /usr/local/spark/spark-1.6.0-bin-hadoop2.6/sbin/
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/sbin# ./start-all.sh
5.9 Start the history server:
root@Master:/usr/local/spark/spark-1.6.0-bin-hadoop2.6/sbin# ./start-history-server.sh
5.10 Verify
Check the processes running on Master and on Worker1 through Worker4 against the cluster plan in section 1.
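A quick check: run jps on every node; Master should now additionally show Master and HistoryServer, and each worker a Worker process. The web UI ports below are assumptions based on Spark's standalone defaults and the spark-defaults.conf above:
root@Master:~# jps
# then browse to http://Master:8080 (Spark master UI) and http://Master:18080 (history server)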
Cluster setup complete.