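Building Hadoop 2.4.0 on Linux: Practice and Summary
Reading guide: 1. What software must be installed before compiling the source?
2. How should the environment variables be set after installation?
3. Why should JDK 1.8 not be used?
4. What does mvn package -Pdist -DskipTests -Dtar do?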
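1. Introduction
The Hadoop 2.4.0 source directory contains a file named BUILDING.txt that explains how to compile the source code on Linux and Windows. This article largely follows the instructions in BUILDING.txt and distills the essential steps.
The first build requires Internet access. The Hadoop build has a great many dependencies, so make sure the machine can reach the Internet; otherwise it is hard to resolve every build problem one by one. Builds after the first do not need to download the dependencies again.
If the machine cannot get online, see the guide on how to get online under the three virtual machine network modes.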
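2. Installing the dependencies
Before compiling the Hadoop 2.4.0 source, install the following dependencies:
1) JDK 1.6 or later (this article uses JDK 1.7; do not install JDK 1.8, which is incompatible with Hadoop 2.4.0 and causes many errors when compiling the source)
2) Maven 3.0 or later
3) ProtocolBuffer 2.5.0
4) CMake 2.6 or later
5) Findbugs 1.3.9, optional (not installed for the build in this article)
After installation, set the environment variables by editing either /etc/profile or ~/.profile and adding the following: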
export JAVA_HOME=/root/jdk
export CLASSPATH=$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
export CMAKE_HOME=/root/cmake
export PATH=$CMAKE_HOME/bin:$PATH
export PROTOC_HOME=/root/protobuf
export PATH=$PROTOC_HOME/bin:$PATH
export MAVEN_HOME=/root/maven
export PATH=$MAVEN_HOME/bin:$PATH
This article installs everything as the root user under /root, but in practice a non-root user and a directory other than /root work just as well.
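As a quick sanity check after editing the profile, the following sketch reports which of the required commands are still missing from PATH. The function name missing_tools is hypothetical (it does not come from BUILDING.txt), and the command names java, mvn, protoc and cmake are assumed to be the usual executables of the tools listed above.

```shell
#!/bin/sh
# Print every command from the argument list that cannot be found on PATH.
# Returns 0 when all commands are found, 1 otherwise.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}

# Typical use after reloading the profile (for example `. /etc/profile`):
#   missing_tools java mvn protoc cmake || echo "fix PATH before building"
```

Because each export line above prepends a bin directory to PATH, a name printed here usually points to a wrong *_HOME value or to a profile that has not been reloaded in the current shell.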
2.1. Installing ProtocolBuffer
The standard automake build-and-install procedure:
1) cd /root
2) tar xzf protobuf-2.5.0.tar.gz
3) cd protobuf-2.5.0
4) ./configure --prefix=/root/protobuf
5) make
6) make install
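Since the dependency list names ProtocolBuffer 2.5.0 exactly, it can be worth confirming the installed version before starting the build. A minimal sketch follows; the helper name is_protoc_250 is hypothetical, and it relies on protoc --version printing a line of the form "libprotoc 2.5.0".

```shell
#!/bin/sh
# Succeed only if the given `protoc --version` output names version 2.5.0.
# protoc prints a single line such as "libprotoc 2.5.0".
is_protoc_250() {
    [ "$1" = "libprotoc 2.5.0" ]
}

# Typical use, once $PROTOC_HOME/bin is on PATH:
#   is_protoc_250 "$(protoc --version)" || echo "unexpected protoc version"
```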
2.2. Installing CMake
1) cd /root
2) tar xzf cmake-2.8.12.2.tar.gz
3) cd cmake-2.8.12.2
4) ./bootstrap --prefix=/root/cmake
5) make
6) make install
2.3. Installing JDK
1) cd /root
2) tar xzf jdk-7u55-linux-x64.gz
3) ln -s jdk1.7.0_55 jdk
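Because the build fails with JDK 1.8, it may help to check which JDK is actually picked up from PATH. The sketch below parses the first line of java -version output and accepts only 1.6.x and 1.7.x; the function names jdk_version and jdk_ok_for_hadoop_240 are hypothetical, and the accepted range simply mirrors the requirement stated in section 2.

```shell
#!/bin/sh
# Extract the quoted version from the first line of `java -version` output,
# e.g. 'java version "1.7.0_55"' gives 1.7.0_55.
jdk_version() {
    printf '%s\n' "$1" | sed -n '1s/.*"\(.*\)".*/\1/p'
}

# Succeed for 1.6.x and 1.7.x, fail for anything else (including 1.8.x).
jdk_ok_for_hadoop_240() {
    case "$1" in
        1.6.*|1.7.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use (java -version writes to stderr, hence 2>&1):
#   v=$(jdk_version "$(java -version 2>&1)")
#   jdk_ok_for_hadoop_240 "$v" || echo "JDK $v is not suitable"
```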
2.4. Installing Maven
1) cd /root
2) tar xzf apache-maven-3.0.5-bin.tar.gz
3) ln -s apache-maven-3.0.5 maven
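3. Compiling the Hadoop source code
Once the preparation above is complete, start the build by running mvn package -Pdist -DskipTests -Dtar from the top-level Hadoop source directory. Be sure not to use JDK 1.8.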
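The build command can be wrapped in a small function with its options annotated. This is only a sketch: the function name hadoop_dist_build and the DRY_RUN variable are hypothetical, and the remarks on -Pdist and -Dtar reflect how BUILDING.txt and the build log in this article describe them rather than an authoritative reference.

```shell
#!/bin/sh
# Run (or, with DRY_RUN=1, only print) the distribution build command.
#   package      Maven lifecycle phase: compile and package every module
#   -Pdist       activate the Maven profile named "dist" (binary distribution)
#   -DskipTests  set the skipTests property so unit tests are not run
#   -Dtar        set the tar property so a .tar.gz of the distribution is made
hadoop_dist_build() {
    set -- mvn package -Pdist -DskipTests -Dtar
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Typical use from the top-level source directory:
#   cd /root/hadoop-2.4.0-src && hadoop_dist_build
```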
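After a successful build, the jar files are placed in the target subdirectories; run the find command from the Hadoop source directory to locate each target subdirectory.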
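One possible form of that search is sketched below. The function name list_target_jars is hypothetical; it lists every .jar file found anywhere beneath a directory named target.

```shell
#!/bin/sh
# List every jar located somewhere under a directory named "target"
# below the given source root (default: current directory).
list_target_jars() {
    find "${1:-.}" -type f -path '*/target/*.jar' | sort
}

# Typical use:
#   list_target_jars /root/hadoop-2.4.0-src
```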
A successful build also produces the Hadoop binary package hadoop-2.4.0.tar.gz in the hadoop-dist/target subdirectory of the source tree:
main:
$ tar cf hadoop-2.4.0.tar hadoop-2.4.0
$ gzip -f hadoop-2.4.0.tar
Hadoop dist tar available at: /root/hadoop-2.4.0-src/hadoop-dist/target/hadoop-2.4.0.tar.gz
Executed tasks
--- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-dist ---
Building jar: /root/hadoop-2.4.0-src/hadoop-dist/target/hadoop-dist-2.4.0-javadoc.jar
------------------------------------------------------------------------
Reactor Summary:
Apache Hadoop Main ................................ SUCCESS
Apache Hadoop Project POM ......................... SUCCESS
Apache Hadoop Annotations ......................... SUCCESS
Apache Hadoop Assemblies .......................... SUCCESS
Apache Hadoop Project Dist POM .................... SUCCESS
Apache Hadoop Maven Plugins ....................... SUCCESS
Apache Hadoop MiniKDC ............................. SUCCESS
Apache Hadoop Auth ................................ SUCCESS
Apache Hadoop Auth Examples ....................... SUCCESS
Apache Hadoop Common .............................. SUCCESS
Apache Hadoop NFS ................................. SUCCESS
Apache Hadoop Common Project ...................... SUCCESS
Apache Hadoop HDFS ................................ SUCCESS
Apache Hadoop HttpFS .............................. SUCCESS
Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS
Apache Hadoop HDFS-NFS ............................ SUCCESS
Apache Hadoop HDFS Project ........................ SUCCESS
hadoop-yarn ....................................... SUCCESS
hadoop-yarn-api ................................... SUCCESS
hadoop-yarn-common ................................ SUCCESS
hadoop-yarn-server ................................ SUCCESS
hadoop-yarn-server-common ......................... SUCCESS
hadoop-yarn-server-nodemanager .................... SUCCESS
hadoop-yarn-server-web-proxy ...................... SUCCESS
hadoop-yarn-server-applicationhistoryservice ...... SUCCESS
hadoop-yarn-server-resourcemanager ................ SUCCESS
hadoop-yarn-server-tests .......................... SUCCESS
hadoop-yarn-client ................................ SUCCESS
hadoop-yarn-applications .......................... SUCCESS
hadoop-yarn-applications-distributedshell ......... SUCCESS
hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS
hadoop-yarn-site .................................. SUCCESS
hadoop-yarn-project ............................... SUCCESS
hadoop-mapreduce-client ........................... SUCCESS
hadoop-mapreduce-client-core ...................... SUCCESS
hadoop-mapreduce-client-common .................... SUCCESS
hadoop-mapreduce-client-shuffle ................... SUCCESS
hadoop-mapreduce-client-app ....................... SUCCESS
hadoop-mapreduce-client-hs ........................ SUCCESS
hadoop-mapreduce-client-jobclient ................. SUCCESS
hadoop-mapreduce-client-hs-plugins ................ SUCCESS
Apache Hadoop MapReduce Examples .................. SUCCESS
hadoop-mapreduce .................................. SUCCESS
Apache Hadoop MapReduce Streaming ................. SUCCESS
Apache Hadoop Distributed Copy .................... SUCCESS
Apache Hadoop Archives ............................ SUCCESS
Apache Hadoop Rumen ............................... SUCCESS
Apache Hadoop Gridmix ............................. SUCCESS
Apache Hadoop Data Join ........................... SUCCESS
Apache Hadoop Extras .............................. SUCCESS
Apache Hadoop Pipes ............................... SUCCESS
Apache Hadoop OpenStack support ................... SUCCESS
Apache Hadoop Client .............................. SUCCESS
Apache Hadoop Mini-Cluster ........................ SUCCESS
Apache Hadoop Scheduler Load Simulator ............ SUCCESS
Apache Hadoop Tools Dist .......................... SUCCESS
Apache Hadoop Tools ............................... SUCCESS
Apache Hadoop Distribution ........................ SUCCESS
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time: 21:57.892s
Finished at: Mon Apr 21 14:33:22 CST 2014
Final Memory: 88M/243M
------------------------------------------------------------------------
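Once the build reports success, the generated tarball can be inspected without unpacking it. The sketch below prints the distinct top-level entries of a gzip-compressed tarball; the function name tarball_roots is hypothetical. Judging from the tar command in the log above, the Hadoop package would be expected to contain a single top-level directory, hadoop-2.4.0.

```shell
#!/bin/sh
# Print the distinct top-level entries of a gzip-compressed tarball.
tarball_roots() {
    tar tzf "$1" | cut -d/ -f1 | sort -u
}

# Typical use:
#   tarball_roots /root/hadoop-2.4.0-src/hadoop-dist/target/hadoop-2.4.0.tar.gz
```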
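Appendix 1: Build environment
The whole process was carried out on a 64-bit Alibaba Cloud host with a single 2.30 GHz core and 1 GB of memory: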
# uname -a
Linux AY140408105805619186Z 2.6.18-308.el5 #1 SMP Tue Feb 21 20:06:06 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/redhat-release
CentOS release 5.8 (Final)
Appendix 2: Version information
Name              Version    Package                         Notes
Maven             3.0.5      apache-maven-3.0.5-bin.tar.gz   Version 3.2.1 may cause problems
CMake             2.8.12.2   cmake-2.8.12.2.tar.gz
JDK               1.7.0      jdk-7u55-linux-x64.gz           JDK 1.8.0 cannot be used
Protocol Buffers  2.5.0      protobuf-2.5.0.tar.gz
Hadoop            2.4.0      hadoop-2.4.0-src.tar.gz
Appendix 3: Common errors
1) unexpected end tag: </ul>
Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:2.8.1:jar (module-javadocs) on project hadoop-annotations: MavenReportException: Error while creating archive:
Exit code: 1 - /root/hadoop-2.4.0-src/hadoop-common-project/hadoop-annotations/src/main/java/org/apache/hadoop/classification/InterfaceStability.java:27: error: unexpected end tag: </ul>
* </ul>
^
Command line was: /root/jdk1.8.0/jre/../bin/javadoc @options @packages
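The cause is a problem in the Javadoc comments of InterfaceStability.java: the javadoc tool shipped with JDK 1.8 rejects the unmatched </ul> end tag.
Solution: switch to JDK 1.7. Building with JDK 1.8 triggers the error above. Deleting the </ul> line fixes this particular error, but similar errors follow later in the build, so do not use JDK 1.8 to compile Hadoop 2.4.0.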