pig2 posted on 2014-6-23 19:04:19

From scratch: how to obtain the Hadoop 2.4 source code and attach it to your project in Eclipse

Guiding questions:
1. How do you obtain the complete Hadoop source code from the official src package?
2. What steps let you view the implementation of a particular Hadoop class or method?
3. What role does Maven play in this process?





If we want to do real development work, studying the source code helps a great deal. Without understanding the underlying principles, Hadoop is a black box, and when problems come up we have no thread to pull on. So this post covers two things:
I. How to obtain the source code
II. How to attach the source code in Eclipse

I. How to obtain the source code

1. Download the Hadoop source (Maven) package

(1) Download from the official site
First download the source package hadoop-2.4.0-src.tar.gz from the official site.
Official download address

If you are not sure how to download from the official site, see: Beginner's guide: the Hadoop official site, how to download the various Hadoop (2.4) releases, and how to view the Hadoop API

(2) Download from the network drive
You can also download it from the network drive:
http://pan.baidu.com/s/1kToPuGB

2. Obtain the source code through Maven
There are two ways to get at the source: from the command line, or from within Eclipse. This post mainly covers the command-line approach.

Getting the source from the command line:
1. Unpack the archive




The errors below came up while unpacking. They are caused by the Windows limit of 260 characters on the total path length, and every failing entry is a pre-built .class file under a target\ directory, which we do not need for reading the source, so ignore them and keep going.
1      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-applicationhistoryservice\target\classes\org\apache\hadoop\yarn\server\applicationhistoryservice\ApplicationHistoryClientService$ApplicationHSClientProtocolHandler.class:
The total length of the path and file name cannot exceed 260 characters
The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
2      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-applicationhistoryservice\target\classes\org\apache\hadoop\yarn\server\applicationhistoryservice\timeline\LeveldbTimelineStore$LockMap$CountingReentrantLock.class: The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
3      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-applicationhistoryservice\target\test-classes\org\apache\hadoop\yarn\server\applicationhistoryservice\webapp\TestAHSWebApp$MockApplicationHistoryManagerImpl.class: The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
4      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\target\test-classes\org\apache\hadoop\yarn\server\resourcemanager\monitor\capacity\TestProportionalCapacityPreemptionPolicy$IsPreemptionRequestFor.class:
The total length of the path and file name cannot exceed 260 characters
The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
5      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\target\test-classes\org\apache\hadoop\yarn\server\resourcemanager\recovery\TestFSRMStateStore$TestFSRMStateStoreTester$TestFileSystemRMStore.class: The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
6      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\target\test-classes\org\apache\hadoop\yarn\server\resourcemanager\recovery\TestZKRMStateStore$TestZKRMStateStoreTester$TestZKRMStateStoreInternal.class:
The total length of the path and file name cannot exceed 260 characters
The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
7      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\target\test-classes\org\apache\hadoop\yarn\server\resourcemanager\recovery\TestZKRMStateStoreZKClientConnections$TestZKClient$TestForwardingWatcher.class:
The total length of the path and file name cannot exceed 260 characters
The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
8      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\target\test-classes\org\apache\hadoop\yarn\server\resourcemanager\recovery\TestZKRMStateStoreZKClientConnections$TestZKClient$TestZKRMStateStore.class:
The total length of the path and file name cannot exceed 260 characters
The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip
9      : Cannot create file: D:\hadoop2\hadoop-2.4.0-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\target\test-classes\org\apache\hadoop\yarn\server\resourcemanager\rmapp\attempt\TestRMAppAttemptTransitions$TestApplicationAttemptEventDispatcher.class:
The total length of the path and file name cannot exceed 260 characters
The system cannot find the path specified.      D:\hadoop2\hadoop-2.4.0-src.zip


2. Obtain the source code through Maven

One thing to note: before using Maven you need the JDK and protoc installed. If you have not installed them yet, see "How to install Maven and protoc on Windows 7".
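Before running any Maven commands it is worth a quick sanity check that the JDK, Maven and protoc are all visible from the command prompt. A minimal check looks like this (the exact version strings depend on your installs; building Hadoop 2.4 expects protobuf 2.5.0, and the replies later in this thread point at JDK 1.7):

D:\>java -version
D:\>mvn -version
D:\>protoc --version

If any of the three prints an error instead of a version, fix that installation (usually a PATH problem) before continuing.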

(1) Go into hadoop-2.4.0-src\hadoop-maven-plugins and run mvn install
D:\hadoop2\hadoop-2.4.0-src\hadoop-maven-plugins>mvn install

The following output is displayed:

Scanning for projects...

Some problems were encountered while building the effective model for org.apache.hadoop:hadoop-maven-plugins:maven-plugin:2.4.0
'build.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.apache.maven.plugins:maven-enforcer-plugin @ org.apache.hadoop:hadoop-project:2.4.0, D:\hadoop2\hadoop-2.4.0-src\hadoop-project\pom.xml, line 1015, column 15

It is highly recommended to fix these problems because they threaten the stability of your build.

For this reason, future Maven versions might no longer support building such malformed projects.


Using the builder org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder with a thread count of 1

------------------------------------------------------------------------
Building Apache Hadoop Maven Plugins 2.4.0
------------------------------------------------------------------------

--- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-maven-plugins ---
Executing tasks

main:
Executed tasks

--- maven-plugin-plugin:3.0:descriptor (default-descriptor) @ hadoop-maven-plugins ---
Using 'UTF-8' encoding to read mojo metadata.
Applying mojo extractor for language: java-annotations
Mojo extractor for language: java-annotations found 2 mojo descriptors.
Applying mojo extractor for language: java
Mojo extractor for language: java found 0 mojo descriptors.
Applying mojo extractor for language: bsh
Mojo extractor for language: bsh found 0 mojo descriptors.

--- maven-resources-plugin:2.2:resources (default-resources) @ hadoop-maven-plugins ---
Using default encoding to copy filtered resources.

--- maven-compiler-plugin:2.5.1:compile (default-compile) @ hadoop-maven-plugins ---
Nothing to compile - all classes are up to date

--- maven-plugin-plugin:3.0:descriptor (mojo-descriptor) @ hadoop-maven-plugins ---
Using 'UTF-8' encoding to read mojo metadata.
Applying mojo extractor for language: java-annotations
Mojo extractor for language: java-annotations found 2 mojo descriptors.
Applying mojo extractor for language: java
Mojo extractor for language: java found 0 mojo descriptors.
Applying mojo extractor for language: bsh
Mojo extractor for language: bsh found 0 mojo descriptors.

--- maven-resources-plugin:2.2:testResources (default-testResources) @ hadoop-maven-plugins ---
Using default encoding to copy filtered resources.

--- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ hadoop-maven-plugins ---
No sources to compile

--- maven-surefire-plugin:2.16:test (default-test) @ hadoop-maven-plugins ---
No tests to run.

--- maven-jar-plugin:2.3.1:jar (default-jar) @ hadoop-maven-plugins ---
Building jar: D:\hadoop2\hadoop-2.4.0-src\hadoop-maven-plugins\target\hadoop-maven-plugins-2.4.0.jar

--- maven-plugin-plugin:3.0:addPluginArtifactMetadata (default-addPluginArtifactMetadata) @ hadoop-maven-plugins ---

--- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hadoop-maven-plugins ---

--- maven-install-plugin:2.3.1:install (default-install) @ hadoop-maven-plugins ---
Installing D:\hadoop2\hadoop-2.4.0-src\hadoop-maven-plugins\target\hadoop-maven-plugins-2.4.0.jar to C:\Users\hyj\.m2\repository\org\apache\hadoop\hadoop-maven-plugins\2.4.0\hadoop-maven-plugins-2.4.0.jar
Installing D:\hadoop2\hadoop-2.4.0-src\hadoop-maven-plugins\pom.xml to C:\Users\hyj\.m2\repository\org\apache\hadoop\hadoop-maven-plugins\2.4.0\hadoop-maven-plugins-2.4.0.pom
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time: 4.891 s
Finished at: 2014-06-23T14:47:33+08:00
Final Memory: 21M/347M
------------------------------------------------------------------------
Partial screenshots are shown below:








(2) Run

mvn eclipse:eclipse -DskipTests - note that this time the command is run from hadoop_home itself; in my case that is D:\hadoop2\hadoop-2.4.0-src

Part of the output is shown below:

------------------------------------------------------------------------
Reactor Summary:

Apache Hadoop Main ................................ SUCCESS
Apache Hadoop Project POM ......................... SUCCESS
Apache Hadoop Annotations ......................... SUCCESS
Apache Hadoop Project Dist POM .................... SUCCESS
Apache Hadoop Assemblies .......................... SUCCESS
Apache Hadoop Maven Plugins ....................... SUCCESS
Apache Hadoop MiniKDC ............................. SUCCESS
Apache Hadoop Auth ................................ SUCCESS
Apache Hadoop Auth Examples ....................... SUCCESS
Apache Hadoop Common .............................. SUCCESS
Apache Hadoop NFS ................................. SUCCESS
Apache Hadoop Common Project ...................... SUCCESS
Apache Hadoop HDFS ................................ SUCCESS
Apache Hadoop HttpFS .............................. SUCCESS
Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS
Apache Hadoop HDFS-NFS ............................ SUCCESS
Apache Hadoop HDFS Project ........................ SUCCESS
hadoop-yarn ....................................... SUCCESS
hadoop-yarn-api ................................... SUCCESS
hadoop-yarn-common ................................ SUCCESS
hadoop-yarn-server ................................ SUCCESS
hadoop-yarn-server-common ......................... SUCCESS
hadoop-yarn-server-nodemanager .................... SUCCESS
hadoop-yarn-server-web-proxy ...................... SUCCESS
hadoop-yarn-server-applicationhistoryservice ...... SUCCESS
hadoop-yarn-server-resourcemanager ................ SUCCESS
hadoop-yarn-server-tests .......................... SUCCESS
hadoop-yarn-client ................................ SUCCESS
hadoop-yarn-applications .......................... SUCCESS
hadoop-yarn-applications-distributedshell ......... SUCCESS
hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS
hadoop-yarn-site .................................. SUCCESS
hadoop-yarn-project ............................... SUCCESS
hadoop-mapreduce-client ........................... SUCCESS
hadoop-mapreduce-client-core ...................... SUCCESS
hadoop-mapreduce-client-common .................... SUCCESS
hadoop-mapreduce-client-shuffle ................... SUCCESS
hadoop-mapreduce-client-app ....................... SUCCESS
hadoop-mapreduce-client-hs ........................ SUCCESS
hadoop-mapreduce-client-jobclient ................. SUCCESS
hadoop-mapreduce-client-hs-plugins ................ SUCCESS
Apache Hadoop MapReduce Examples .................. SUCCESS
hadoop-mapreduce .................................. SUCCESS
Apache Hadoop MapReduce Streaming ................. SUCCESS
Apache Hadoop Distributed Copy .................... SUCCESS
Apache Hadoop Archives ............................ SUCCESS
Apache Hadoop Rumen ............................... SUCCESS
Apache Hadoop Gridmix ............................. SUCCESS
Apache Hadoop Data Join ........................... SUCCESS
Apache Hadoop Extras .............................. SUCCESS
Apache Hadoop Pipes ............................... SUCCESS
Apache Hadoop OpenStack support ................... SUCCESS
Apache Hadoop Client .............................. SUCCESS
Apache Hadoop Mini-Cluster ........................ SUCCESS
Apache Hadoop Scheduler Load Simulator ............ SUCCESS
Apache Hadoop Tools Dist .......................... SUCCESS
Apache Hadoop Tools ............................... SUCCESS
Apache Hadoop Distribution ........................ SUCCESS
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time: 31.234 s
Finished at: 2014-06-23T14:55:08+08:00
Final Memory: 84M/759M
------------------------------------------------------------------------
At this point the source has been fully pulled down, and you will notice that the files in the source tree have grown noticeably in size.
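To recap, the whole command-line part of this section boils down to two steps (the paths follow the example above, so adjust them to wherever you unpacked the source):

cd D:\hadoop2\hadoop-2.4.0-src\hadoop-maven-plugins
mvn install

cd D:\hadoop2\hadoop-2.4.0-src
mvn eclipse:eclipse -DskipTests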






3. Attach the source in Eclipse

Suppose we have the following example program.

As shown in the figure below, the files have been packaged up:


There are two files: MaxTemperature.zip is a MapReduce example, and mockito-core-1.8.5.jar is a jar that the example depends on.
(Note that the MapReduce example was written against 2.2, but that does not affect attaching the source; the point is simply to show how the attachment is done.)
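For readers who do not have the archive handy: MaxTemperature is the classic weather-data MapReduce example, so the mapper inside the zip typically looks something like the sketch below. The class name, record offsets and overall shape here are illustrative assumptions about the zip's contents, not a copy of it.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Emits (year, air temperature) pairs from fixed-width weather records.
public class MaxTemperatureMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);   // year field of the record
        String temp = line.substring(87, 92);   // signed temperature field
        // Integer.parseInt on older JDKs rejects a leading '+', so strip it first.
        int airTemperature = Integer.parseInt(temp.startsWith("+") ? temp.substring(1) : temp);
        context.write(new Text(year), new IntWritable(airTemperature));
    }
}

A matching reducer keeps the maximum value per year, and a small driver class wires the two together with Job. Once the Hadoop source is attached as described below, you can step from code like this straight into classes such as Mapper and Job.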
After unzipping, import the project into Eclipse.
(If you are not familiar with importing projects, see "A zero-basics guide to importing an Eclipse project".)




After importing, you will see a lot of red underlines. These are all missing jar references; below we resolve these errors.
I. Resolve the missing jars
(1) Add mockito-core-1.8.5.jar

(2) Add the jar files from the compiled Hadoop 2.4 package. These files live under the share\hadoop folder of hadoop_home; in my case that is D:\hadoop2\hadoop-2.4.0\share\hadoop.
Find the jars in there, for example the jars inside each lib folder as well as the jars sitting next to them, and add them all to the build path (the typical directory layout is sketched just below).
If you are not sure how to add jars to the build path, see "A summary of Hadoop development approaches and an operation guide".
(Note that what we add here is the compiled binary package; download the compiled 64-bit 2.4.0 package from
link: http://pan.baidu.com/s/1c0vPjG0 password: xj6l)
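For orientation, in a stock Hadoop 2.4.0 binary distribution the jars are usually spread over directories like these under share\hadoop (this is the standard layout of the release; double-check against your own extracted copy):

share\hadoop\common      plus share\hadoop\common\lib
share\hadoop\hdfs        plus share\hadoop\hdfs\lib
share\hadoop\yarn        plus share\hadoop\yarn\lib
share\hadoop\mapreduce   plus share\hadoop\mapreduce\lib

Adding the jars from these folders to the build path is normally enough to clear the red underlines for a simple MapReduce project.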

For more package downloads, see "Hadoop family, Storm, Spark, Linux, Flume and other jars and installation packages for download".








II. Attach the source code
1. Once the jars are added there are no more errors, as shown in the figure below



2. Source not found

When we want to see how a class or method is implemented and use Open Call Hierarchy, the source file cannot be found.








3.Attach Source



Fill in the three fields above in order; after selecting the archive, click OK and the setup is complete.

Note: hadoop-2.2.0-src.zip here is the source we downloaded through Maven above, then compressed; remember it must be compressed as a zip archive.


4. Verify the attachment by viewing the source

We repeat the operation above and use Open Call Hierarchy




and we see the following:






Then double-click the main class in the figure above, i.e. the part in red, and we see the following:




Question:
Careful readers will notice a question here: what we are looking at is a .class file, not a .java file. Could it differ from the corresponding .java file?
It is in fact the same: Eclipse simply displays the attached .java source in place of the compiled class. Interested readers can verify this for themselves.


Next article:
How to view and read the Hadoop 2.4 source code in Eclipse


Victor-Shy posted on 2014-10-31 15:11:32

Phew, following the OP's guide it finally works. That said, 2.x really is quite a bit more of a hassle than 1.x.

gogaobin posted on 2014-8-7 14:33:15

------------------------------------------------------------------------
Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.4.0:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: 'protoc --version' did not return a version ->

To see the full stack trace of the errors, re-run Maven with the -e switch.
Re-run Maven using the -X switch to enable full debug logging.

For more information about the errors and possible solutions, please read the following articles:
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

After correcting the problems, you can resume the build with the command

   mvn <goals> -rf :hadoop-common

What is causing this error? protoc is already installed correctly.

sunny62520 posted on 2014-6-23 23:35:01

Thanks for sharing, I will go through it when I get a chance.

pig2 posted on 2014-6-23 23:51:36




Reactor Summary:

Apache Hadoop Main ................................ SUCCESS
Apache Hadoop Project POM ......................... SUCCESS
Apache Hadoop Annotations ......................... SUCCESS
Apache Hadoop Project Dist POM .................... SUCCESS
Apache Hadoop Assemblies .......................... SUCCESS
Apache Hadoop Maven Plugins ....................... SUCCESS
Apache Hadoop Auth ................................ FAILURE
Apache Hadoop Auth Examples ....................... SKIPPED
Apache Hadoop Common .............................. SKIPPED
Apache Hadoop NFS ................................. SKIPPED
Apache Hadoop Common Project ...................... SKIPPED
Apache Hadoop HDFS ................................ SKIPPED
Apache Hadoop HttpFS .............................. SKIPPED
Apache Hadoop HDFS BookKeeper Journal ............. SKIPPED
Apache Hadoop HDFS-NFS ............................ SKIPPED
Apache Hadoop HDFS Project ........................ SKIPPED
hadoop-yarn ....................................... SKIPPED
hadoop-yarn-api ................................... SKIPPED
hadoop-yarn-common ................................ SKIPPED
hadoop-yarn-server ................................ SKIPPED
hadoop-yarn-server-common ......................... SKIPPED
hadoop-yarn-server-nodemanager .................... SKIPPED
hadoop-yarn-server-web-proxy ...................... SKIPPED
hadoop-yarn-server-resourcemanager ................ SKIPPED
hadoop-yarn-server-tests .......................... SKIPPED
hadoop-yarn-client ................................ SKIPPED
hadoop-yarn-applications .......................... SKIPPED
hadoop-yarn-applications-distributedshell ......... SKIPPED
hadoop-mapreduce-client ........................... SKIPPED
hadoop-mapreduce-client-core ...................... SKIPPED
hadoop-yarn-applications-unmanaged-am-launcher .... SKIPPED
hadoop-yarn-site .................................. SKIPPED
hadoop-yarn-project ............................... SKIPPED
hadoop-mapreduce-client-common .................... SKIPPED
hadoop-mapreduce-client-shuffle ................... SKIPPED
hadoop-mapreduce-client-app ....................... SKIPPED
hadoop-mapreduce-client-hs ........................ SKIPPED
hadoop-mapreduce-client-jobclient ................. SKIPPED
hadoop-mapreduce-client-hs-plugins ................ SKIPPED
Apache Hadoop MapReduce Examples .................. SKIPPED
hadoop-mapreduce .................................. SKIPPED
Apache Hadoop MapReduce Streaming ................. SKIPPED
Apache Hadoop Distributed Copy .................... SKIPPED
Apache Hadoop Archives ............................ SKIPPED
Apache Hadoop Rumen ............................... SKIPPED
Apache Hadoop Gridmix ............................. SKIPPED
Apache Hadoop Data Join ........................... SKIPPED
Apache Hadoop Extras .............................. SKIPPED
Apache Hadoop Pipes ............................... SKIPPED
Apache Hadoop Tools Dist .......................... SKIPPED
Apache Hadoop Tools ............................... SKIPPED
Apache Hadoop Distribution ........................ SKIPPED
Apache Hadoop Client .............................. SKIPPED
Apache Hadoop Mini-Cluster ........................ SKIPPED
------------------------------------------------------------------------
BUILD FAILURE
------------------------------------------------------------------------
Total time: 24.476 s
Finished at: 2014-06-23T08:30:58+08:00
Final Memory: 38M/305M
------------------------------------------------------------------------
Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-auth: Compilation failure: Compilation failure:
D:\hadoop2\hadoop-2.2.0-src\hadoop-common-project\hadoop-auth\src\test\java\org\apache\hadoop\security\authentication\client\AuthenticatorTestCase.java:
error: cannot access AbstractLifeCycle
class file for org.mortbay.component.AbstractLifeCycle not found
D:\hadoop2\hadoop-2.2.0-src\hadoop-common-project\hadoop-auth\src\test\java\org\apache\hadoop\security\authentication\client\AuthenticatorTestCase.java:
error: cannot access LifeCycle
class file for org.mortbay.component.LifeCycle not found
D:\hadoop2\hadoop-2.2.0-src\hadoop-common-project\hadoop-auth\src\test\java\org\apache\hadoop\security\authentication\client\AuthenticatorTestCase.java:
error: cannot find symbol
  symbol:   method start()
  location: variable server of type Server
D:\hadoop2\hadoop-2.2.0-src\hadoop-common-project\hadoop-auth\src\test\java\org\apache\hadoop\security\authentication\client\AuthenticatorTestCase.java:
error: cannot find symbol
->

To see the full stack trace of the errors, re-run Maven with the -e switch.
Re-run Maven using the -X switch to enable full debug logging.

For more information about the errors and possible solutions, please read the following articles:
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

After correcting the problems, you can resume the build with the command

   mvn <goals> -rf :hadoop-auth
The cause of this error: the commands were run from the wrong directory.
The correct way to run them:
D:\hadoop2\hadoop-2.4.0-src\hadoop-maven-plugins>mvn install

mvn eclipse:eclipse -DskipTests (run from the D:\hadoop2\hadoop-2.4.0-src directory)


xjl456852 posted on 2014-8-7 18:23:24

Great post, working through it now. Building the source took me more than two hours.

admin posted on 2014-8-8 21:22:08

gogaobin posted on 2014-8-7 14:33
------------------------------------------------------------------------
Failed to execute ...

'protoc --version' did not return a version ->

Even though you installed it, it is still not returning a version, so there is still something wrong with your installation.
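A quick way to check whether protoc is actually usable from the same shell that runs Maven (2.5.0 is the protobuf version Hadoop 2.4 expects; the where command is a standard Windows tool, shown here only as a diagnostic suggestion):

C:\>protoc --version
libprotoc 2.5.0

C:\>where protoc

If protoc --version prints nothing or errors out, or where protoc finds no protoc.exe, then the protoc directory is not on the PATH of that command prompt; fix the PATH and open a new prompt before re-running Maven.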

轩辕依梦Q posted on 2014-9-23 10:10:06

gogaobin posted on 2014-8-7 14:33
------------------------------------------------------------------------
Failed to execute ...

That link is rather slow to access. Link: http://pan.baidu.com/s/1i3klLqd password: a3kd

轩辕依梦Q posted on 2014-9-23 10:11:16

gogaobin posted on 2014-8-7 14:33
------------------------------------------------------------------------
Failed to execute ...

Has your problem been solved?

轩辕依梦Q posted on 2014-9-23 10:14:27

admin posted on 2014-8-8 21:22
'protoc --version' did not return a version ->

Even though you installed it correctly, it still isn't returning a version, ...

Hello: I am on 64-bit Windows 7 and followed this post: http://www.aboutyun.com/forum.php?mod=viewthread&tid=8212&highlight=protoc, and I get the same error, 'protoc --version' did not return a version ->. How can I confirm that protoc is installed correctly? I never had to touch a pom.xml during the installation.

pig2 posted on 2014-10-14 11:08:15

gogaobin posted on 2014-8-7 14:33
------------------------------------------------------------------------
Failed to execute ...

As above, if it still does not return a version correctly, pay attention to version issues: use JDK 1.7.