Building a Hadoop Project with Maven

levycui, posted 2015-7-15 15:15:10
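I. Introduction to Maven
Apache Maven is a project management and build automation tool for Java, provided by the Apache Software Foundation. Built around the concept of a Project Object Model (POM), Maven manages a project's build, reporting, and documentation steps from one central piece of information. It was once a subproject of the Jakarta project and is now an independent Apache project.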

My environment:
Windows 7
Hadoop 1.2.1
Maven 3.3.3
Eclipse Indigo
JDK 1.7.0_79

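II. Installing Maven (Windows)
Download Maven: http://maven.apache.org/download.cgi

Download the latest xxx-bin.zip file and extract it on Windows to D:\toolkit\maven3

Add the maven/bin directory to the PATH environment variable: Environment Variables - System variables - D:\toolkit\maven3\bin
The JAVA_HOME environment variable must also be set.

Then open a command prompt and type mvn to see the command's output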

C:\Users\Administrator>mvn
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.154 s
[INFO] Finished at: 2015-07-15T09:43:17+08:00
[INFO] Final Memory: 4M/15M
[INFO] ------------------------------------------------------------------------
[ERROR] No goals have been specified for this build. You must specify a valid lifecycle phase or a goal in the format <plugin-prefix>:<goal> or <plugin-group-id>:<plugin-artifact-id>[:<plugin-version>]:<goal>. Available lifecycle phases are: validate, initialize, generate-sources, process-sources, generate-resources, process-resources, compile, process-classes, generate-test-sources, process-test-sources, generate-test-resources, process-test-resources, test-compile, process-test-classes, test, prepare-package, package, pre-integration-test, integration-test, post-integration-test, verify, install, deploy, pre-clean, clean, post-clean, pre-site, site, post-site, site-deploy. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/NoGoalSpecifiedException
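This ERROR is expected: mvn was run without a goal, so the build has nothing to do. The output itself shows that Maven is installed and on the PATH.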
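The error message above also lists Maven's lifecycle phases. As a rough illustration of why a later command such as mvn clean install also compiles, tests, and packages the project, the sketch below models the default lifecycle as an ordered list (copied from the message, validate through deploy) and returns every phase up to the one requested. The class and method names are hypothetical; this is not Maven's own code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LifecycleSketch {
    // Default-lifecycle phases, in the order printed by the mvn error message.
    static final List<String> DEFAULT_PHASES = Collections.unmodifiableList(Arrays.asList(
        "validate", "initialize", "generate-sources", "process-sources",
        "generate-resources", "process-resources", "compile", "process-classes",
        "generate-test-sources", "process-test-sources", "generate-test-resources",
        "process-test-resources", "test-compile", "process-test-classes", "test",
        "prepare-package", "package", "pre-integration-test", "integration-test",
        "post-integration-test", "verify", "install", "deploy"));

    /** Returns the phases that run when the given phase is requested, inclusive. */
    static List<String> phasesUpTo(String phase) {
        int idx = DEFAULT_PHASES.indexOf(phase);
        if (idx < 0) {
            throw new IllegalArgumentException("Not a default-lifecycle phase: " + phase);
        }
        return DEFAULT_PHASES.subList(0, idx + 1);
    }

    public static void main(String[] args) {
        // Requesting "install" implies every earlier phase of the default lifecycle.
        for (String phase : phasesUpTo("install")) {
            System.out.println(phase);
        }
    }
}
```

The clean phase belongs to a separate lifecycle, which is why it is passed explicitly in mvn clean install.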

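III. Installing the Maven Plugin for Eclipse: Maven Integration for Eclipse

Method 1: requires a good network connection
1. Start Eclipse, click Window -> Preferences -> Install/Update -> Available Software Sites, and click Add,

copy the download URL http://download.eclipse.org/technology/m2e/releases

and install it from Eclipse's Help menu -> Install New Software

Work with: http://download.eclipse.org/technology/m2e/releases
Add... - Name: maven - OK. After a short wait, two m2e entries appear. Select m2e - Maven Integration for Eclipse (includes Incubating components)    1.6.1.20150625-2338
Installation takes about half an hour or more. (Because my network was unstable, it took several attempts before it succeeded.)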


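Method 2: recommended
Offline installation of the Eclipse Maven plugin using a link file

1. In the root directory of your Eclipse installation, create two folders: links and myplugins (any names will do). Mine are located under D:/eclipse/ (for reference; used below)
2. Extract eclipse-maven3-plugin.7z, available from http://download.csdn.net/download/bluerebel/7407455, into the myplugins directory; a maven directory should appear
3. In the links directory, create maven.txt (any name will do), open it, and enter: path=D:/eclipse/myplugins/maven (adjust this to the location of your Maven plugin)
4. Save and close maven.txt, rename it to maven.link, and restart Eclipse (if Eclipse is not running, just start it)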
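Steps 1, 3, and 4 amount to creating a one-line text file, so they can also be scripted. The following is a minimal sketch, assuming the directory names from the steps above; the class and method names are hypothetical, and the demo writes into a temporary directory rather than a real Eclipse installation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class LinkFileSketch {
    /** Creates <eclipseRoot>/links/maven.link pointing at pluginDir and returns the file. */
    static Path writeLinkFile(Path eclipseRoot, String pluginDir) throws IOException {
        Path links = eclipseRoot.resolve("links");
        Files.createDirectories(links);
        Path linkFile = links.resolve("maven.link");
        // Write the path with forward slashes, as in step 3.
        String content = "path=" + pluginDir.replace('\\', '/') + System.lineSeparator();
        Files.write(linkFile, content.getBytes(StandardCharsets.UTF_8));
        return linkFile;
    }

    public static void main(String[] args) throws IOException {
        // Demo only: a temporary directory stands in for the Eclipse root (D:/eclipse in the post).
        Path fakeEclipseRoot = Files.createTempDirectory("eclipse-root");
        Path link = writeLinkFile(fakeEclipseRoot, "D:/eclipse/myplugins/maven");
        System.out.println("Wrote " + link);
    }
}
```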


Check whether the Eclipse Maven plugin was installed successfully: Window  -->  Preferences
Maven - User Settings
D:\toolkit\maven3\conf\settings.xml

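IV. Building the Hadoop Environment with Maven
1. Create a standard Java project with Maven
2. Import the project into Eclipse
3. Add the Hadoop dependency by editing pom.xml
4. Download the dependencies
5. Download the Hadoop configuration files from the Hadoop cluster
6. Configure the local hosts file


1). Create a standard Java project with Maven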

D:\workspace>mvn archetype:generate -DarchetypeGroupId=org.apache.maven.archetypes -DgroupId=org.conan.myhadoop.mr -DartifactId=myHadoop -DpackageName=org.conan.myhadoop.mr -Dversion=1.0-SNAPSHOT -DinteractiveMode=false

[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Maven Stub Project (No POM) 1
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom >>>
[INFO]
[INFO] <<< maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom <<<
[INFO]
[INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom ---
[INFO] Generating project in Batch mode
[INFO] No archetype defined. Using maven-archetype-quickstart (org.apache.maven.archetypes:maven-archetype-quickstart:1.0)
Downloading: http://repo.maven.apache.org/mav ... -quickstart-1.0.jar
Downloaded: http://repo.maven.apache.org/mav ... ven/archetypes/maven-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.jar (5 KB at 4.5 KB/sec)
Downloading: http://repo.maven.apache.org/mav ... -quickstart-1.0.pom
Downloaded: http://repo.maven.apache.org/mav ... -quickstart-1.0.pom (703 B at 1.2 KB/sec)
[INFO] ----------------------------------------------------------------------------
[INFO] Using following parameters for creating project from Old (1.x) Archetype: maven-archetype-quickstart:1.0
[INFO] ----------------------------------------------------------------------------
[INFO] Parameter: groupId, Value: org.conan.myhadoop.mr
[INFO] Parameter: packageName, Value: org.conan.myhadoop.mr
[INFO] Parameter: package, Value: org.conan.myhadoop.mr
[INFO] Parameter: artifactId, Value: myHadoop
[INFO] Parameter: basedir, Value: C:\Users\licz
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] project created from Old (1.x) Archetype in dir: C:\Users\licz\myHadoop
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 36.633s
[INFO] Finished at: Thu Jan 16 10:31:44 CST 2014
[INFO] Final Memory: 11M/490M
[INFO] ------------------------------------------------------------------------



Enter the project directory and run the mvn command

D:\workspace>cd myHadoop

D:\workspace\myHadoop>mvn clean install

……
[INFO] Installing C:\Users\licz\myHadoop\target\myHadoop-1.0-SNAPSHOT.jar to C:\Users\licz\.m2\repository\org\conan\myhadoop\mr\myHadoop\1.0-SNAPSHOT\myHadoop-1.0-SNAPSHOT.jar
[INFO] Installing C:\Users\licz\myHadoop\pom.xml to C:\Users\licz\.m2\repository\org\conan\myhadoop\mr\myHadoop\1.0-SNAPSHOT\myHadoop-1.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:06.911s
[INFO] Finished at: Thu Jan 16 14:52:00 CST 2014
[INFO] Final Memory: 9M/490M
[INFO] ------------------------------------------------------------------------
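The two "Installing" lines show how Maven lays out the local repository: the groupId becomes nested directories, followed by the artifactId, the version, and a file named artifactId-version. The sketch below reproduces that mapping for illustration; the class and method names are hypothetical, and the result is relative to the repository root (C:\Users\licz\.m2\repository in the log above).

```java
final class RepoPathSketch {
    /** Maps Maven coordinates to a path relative to the local repository root. */
    static String artifactPath(String groupId, String artifactId, String version, String extension) {
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version + "/"
            + artifactId + "-" + version + "." + extension;
    }

    public static void main(String[] args) {
        // The project built above.
        System.out.println(artifactPath("org.conan.myhadoop.mr", "myHadoop", "1.0-SNAPSHOT", "jar"));
        // The Hadoop library this tutorial depends on.
        System.out.println(artifactPath("org.apache.hadoop", "hadoop-core", "1.2.1", "jar"));
    }
}
```

Knowing this layout makes it easy to locate a specific JAR on disk when one needs to be inspected or replaced by hand.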


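2). Import the project into Eclipse

We now have a basic Maven project, which we import into Eclipse. The Maven plugin should already be installed at this point.

The steps are:
File -> Import -> Maven -> Existing Maven Projects, then click Next
Enter the directory of the Maven project created above
Click Finish to complete the import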


3). Add the Hadoop dependency

I use hadoop-1.2.1 here. In Eclipse, edit the file pom.xml

// The main change is adding the hadoop-core dependency

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.conan.myhadoop.mr</groupId>
  <artifactId>myHadoop</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>myHadoop</name>
  <url>http://maven.apache.org</url>
  <dependencies>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>1.2.1</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>


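4). Download the dependencies

D:\workspace\myHadoop>mvn clean install

After the command finishes, many dependency JARs appear under Maven Dependencies.
The project's dependencies are added to the library path automatically.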
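One quick way to confirm that the dependency is really visible to the project is to ask the class loader for a Hadoop class. The sketch below is an illustration only: the helper name is hypothetical, and it assumes that hadoop-core 1.2.1 provides org.apache.hadoop.conf.Configuration and org.apache.hadoop.fs.FileUtil. It runs without Hadoop too, in which case it simply reports false.

```java
final class ClasspathCheck {
    /** Returns true if the named class can be found, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "org.apache.hadoop.conf.Configuration", // assumed to ship in hadoop-core
            "org.apache.hadoop.fs.FileUtil"
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + isOnClasspath(name));
        }
    }
}
```

Run it from inside the Eclipse project so that it sees the same classpath as your own code.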


5). Download the Hadoop configuration files from the Hadoop cluster
core-site.xml
hdfs-site.xml
mapred-site.xml

Save them under the src/main/resources/hadoop directory

Delete the automatically generated files App.java and AppTest.java
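The three files from step 5 share one simple layout: a configuration root element holding property entries, each with a name and a value. To check what a downloaded file actually contains, it can be read with the JDK's XML parser alone. The sketch below is a hedged illustration: the class name is hypothetical, the sample value hdfs://hadoop:9000 is made up, and this is not how Hadoop itself loads its configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class SiteXmlSketch {
    /** Parses Hadoop-style <property><name/><value/></property> entries into a map. */
    static Map<String, String> parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            // Assumes every property has both a name and a value element.
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            result.put(name, value);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical core-site.xml content; the host and port are placeholders.
        String sample =
            "<configuration>"
            + "<property><name>fs.default.name</name><value>hdfs://hadoop:9000</value></property>"
            + "</configuration>";
        for (Map.Entry<String, String> e : parse(sample).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```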


6). Configure the local hosts file, adding a hostname entry for the master node nticket1
Edit the hosts file:
c:/Windows/System32/drivers/etc/hosts
192.168.19.214 hadoop
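After editing the hosts file, it is worth confirming that the JVM resolves the name, since Hadoop client code looks up the host named in the configuration files. A minimal sketch using only java.net follows; the class and method names are hypothetical, and the default name hadoop is taken from the hosts entry above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class HostCheck {
    /** Returns the resolved IP address as text, or null if the name cannot be resolved. */
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "hadoop";
        String address = resolve(host);
        if (address == null) {
            System.out.println(host + " does not resolve; check the hosts file entry");
        } else {
            System.out.println(host + " -> " + address);
        }
    }
}
```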


Problem 1:
Console error on the first run
2013-9-30 19:25:02 org.apache.hadoop.util.NativeCodeLoader
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-9-30 19:25:02 org.apache.hadoop.security.UserGroupInformation doAs
SEVERE: PriviledgedActionException as:Administrator cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1702422322\.staging to 0700

Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1702422322\.staging to 0700
       at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
       at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)

This error is specific to development on Windows. It is a file permission problem, and the same code runs normally on Linux.
The fix is to edit /hadoop-1.2.1/src/core/org/apache/hadoop/fs/FileUtil.java:

comment out lines 688-692, then recompile the source and build a new hadoop.jar.

685 private static void checkReturnValue(boolean rv, File p,
686                                        FsPermission permission
687                                        ) throws IOException {
688     /*if (!rv) {
689       throw new IOException("Failed to set permissions of path: " + p +
690                             " to " +
691                             String.format("%04o", permission.toShort()));
692     }*/
693   }
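To see where a false return value can come from, the following sketch tries to apply owner-only permissions (0700) to a temporary file with plain java.io.File setters and reports which calls return false. It assumes, without confirmation from this post, that FileUtil.setPermission relies on the same setters; the class and method names are hypothetical. The java.io.File documentation states that removing a permission fails on file systems that do not implement it, which is a plausible reason for the failure on Windows.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class PermissionProbe {
    /** Tries to set owner-only rwx on f and returns the calls that returned false. */
    static List<String> apply0700(File f) {
        List<String> failed = new ArrayList<String>();
        // For each permission: clear it for everyone, then grant it to the owner only.
        if (!f.setReadable(false, false)) failed.add("setReadable(false, false)");
        if (!f.setReadable(true, true)) failed.add("setReadable(true, true)");
        if (!f.setWritable(false, false)) failed.add("setWritable(false, false)");
        if (!f.setWritable(true, true)) failed.add("setWritable(true, true)");
        if (!f.setExecutable(false, false)) failed.add("setExecutable(false, false)");
        if (!f.setExecutable(true, true)) failed.add("setExecutable(true, true)");
        return failed;
    }

    public static void main(String[] args) throws IOException {
        File probe = File.createTempFile("perm-probe", ".tmp");
        try {
            List<String> failed = apply0700(probe);
            System.out.println(failed.isEmpty()
                ? "All permission calls succeeded"
                : "Calls that returned false: " + failed);
        } finally {
            probe.delete();
        }
    }
}
```

Commenting out the check, as shown above, silences the exception but does not make the permissions take effect, so this workaround is best kept to a development machine.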

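For convenience, I downloaded an already compiled hadoop-core-1.2.1.jar from the web.
Download link: http://download.csdn.net/detail/yunlong34574/7079951

We also need to replace the Hadoop library in the local Maven repository.
Copy hadoop-core-1.2.1.jar to C:\Users\Administrator\.m2\repository\org\apache\hadoop\hadoop-core\1.2.1\hadoop-core-1.2.1.jar (adjust this directory to match your own setup)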

Problem 2:
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
Solution:
The message means that this user is not allowed to write files to HDFS.
Add the following to conf/hdfs-site.xml
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>


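Problem 3:
Fixing a red cross or exclamation mark on the project:
Right-click the project's JRE System Library [JavaSE-1.5] -> Properties
Under Execution environment, select JavaSE-1.6 (jre6S)
After you click OK, the exclamation-mark warning on the project name disappears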


Reposted from: http://blog.chinaunix.net/uid-25691489-id-5122374.html
