This post was last edited by pig2 on 2014-2-27 22:18
Previous post: Hadoop training notes on HDFS -- HDFS pros and cons
Keep the following questions in mind while reading:
1. Given a shell command, can you write the corresponding Java code? For example, how do you create a file, how many steps does it take, and which classes are involved?
2. Can you name the RPC protocol used between the Client and the Namenode?
3. Which feature lets a Hadoop cluster be aware of the rack - physical host - Hadoop node three-layer topology?
4. What is the return type of DistributedFileSystem's create method?
5. Which Java class reads files fastest?
6. What is the default Namenode web UI port? (Simple, but easy to get wrong.)
HDFS API reference: http://hadoop.apache.org/docs/current1/api/ (Hadoop 1.2.1). The code below demonstrates some of these APIs as JUnit tests. You can check the results with commands such as:
./hadoop fs -lsr /
./hadoop fs -cat /test/a.txt
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;
import org.junit.Test;

import junit.framework.TestCase;

public class TestHDFS extends TestCase {

    public static String hdfsUrl = "hdfs://192.168.56.101:9100";

    // create an HDFS directory
    @Test
    public void testHDFSMkdir() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        fs.mkdirs(new Path("/test"));
    }

    // create a file and write to it
    @Test
    public void testCreateFile() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        FSDataOutputStream out = fs.create(new Path("/test/a.txt"));
        out.write("hello hadoop".getBytes());
        out.close(); // close the stream so the data is actually flushed to HDFS
    }

    // rename a file
    @Test
    public void testRenameFile() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        Path path = new Path("/test/a.txt");
        Path newpath = new Path("/test/b.txt");
        System.out.println(fs.rename(path, newpath)); // prints true on success
    }

    // upload a local file to HDFS with copyFromLocalFile
    @Test
    public void testUploadFile1() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        Path src = new Path("/home/xwchen/hadoop/hadoop-1.2.1/bin/rcc");
        Path dst = new Path("/test");
        fs.copyFromLocalFile(src, dst);
    }

    // upload by copying a local input stream into an HDFS output stream
    @Test
    public void testUploadFile2() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        InputStream in = new BufferedInputStream(new FileInputStream(
                new File("/home/xwchen/hadoop/hadoop-1.2.1/bin/rcc")));
        FSDataOutputStream out = fs.create(new Path("/test/rcc1"));
        IOUtils.copyBytes(in, out, 4096, true); // true: close both streams when done
    }

    // upload with a progress callback: one dot per chunk written
    @Test
    public void testUploadFile3() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        InputStream in = new BufferedInputStream(new FileInputStream(
                new File("/home/xwchen/hadoop/hadoop-1.2.1/bin/rcc")));
        FSDataOutputStream out = fs.create(new Path("/test/rcc2"), new Progressable() {
            @Override
            public void progress() {
                System.out.print(".");
            }
        });
        IOUtils.copyBytes(in, out, 4096, true);
    }

    // same as above with a larger file, created locally with:
    // dd if=/dev/zero of=data bs=1024 count=1024
    @Test
    public void testUploadFile4() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        InputStream in = new BufferedInputStream(new FileInputStream(
                new File("/home/xwchen/hadoop/hadoop-1.2.1/bin/data")));
        FSDataOutputStream out = fs.create(new Path("/test/data"), new Progressable() {
            @Override
            public void progress() {
                System.out.print(".");
            }
        });
        IOUtils.copyBytes(in, out, 4096, true);
    }

    // list the files under a directory
    @Test
    public void testListFiles() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        FileStatus[] files = fs.listStatus(new Path("/test"));
        for (FileStatus file : files) {
            System.out.println(file.getPath().toString());
        }
    }

    // list the block locations of a file
    @Test
    public void testGetBlockInfo() throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsUrl), conf);
        FileStatus fileStatus = fs.getFileStatus(new Path("/test/data"));
        BlockLocation[] blkLoc = fs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
        for (BlockLocation loc : blkLoc) {
            for (String host : loc.getHosts()) {
                System.out.println(host); // the datanodes holding each block
            }
        }
    }
}
Quiz answers:
FileSystem is an abstract class.
The RPC protocol between the Client and the Namenode is ClientProtocol.
A traditional Hadoop cluster topology has two layers, rack and host; virtualization adds a hypervisor layer on top of that. HVE (Hadoop Virtualization Extensions) lets a Hadoop cluster be aware of the rack - physical host - Hadoop node three-layer topology, and its placement algorithms keep communication between storage nodes and compute nodes running on the same physical host data-local.
DistributedFileSystem's create method returns an FSDataOutputStream.
The fastest way to read a file in Java is FileChannel, followed by BufferedInputStream, FileInputStream, and RandomAccessFile (http://bbs.itheima.com/thread-48379-1-1.html).
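As a rough illustration of the FileChannel approach, here is a minimal sketch using only the JDK (class and file names are made up for the example; this is not part of the HDFS code above):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelRead {
    // read an entire file into memory through a FileChannel
    public static byte[] readAll(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep filling the buffer until the channel is exhausted
            }
            return buf.array();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "hello hadoop".getBytes());
        System.out.println(new String(readAll(tmp))); // prints: hello hadoop
        Files.delete(tmp);
    }
}
```

FileChannel can also hand the buffer directly to native I/O (especially with a direct ByteBuffer or a memory-mapped file), which is where its speed advantage over the stream classes comes from.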
FSDataOutputStream implements the interfaces Closeable, DataOutput, Flushable, CanSetDropBehind, and Syncable.
File systems such as ZFS, MooseFS, GlusterFS, and Lustre expose themselves through FUSE; FastDFS does not provide a FUSE interface.
Filesystem in Userspace (FUSE) is an operating-system concept: a file system implemented entirely in user space. (http://zh.wikipedia.org/wiki/FUSE)
The default Namenode web UI port is 50070.
DirectByteBuffer vs. ByteBuffer: a heap ByteBuffer can wrap an existing byte array via wrap() and is allocated on the JVM heap; a DirectByteBuffer offers faster byte access, but its memory is native, so unlike a heap ByteBuffer it is not reclaimed by the JVM garbage collector in the ordinary way.
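The heap/direct distinction can be seen directly in the JDK API (a minimal standalone sketch, unrelated to the Hadoop code above):

```java
import java.nio.ByteBuffer;

public class BufferDemo {
    public static void main(String[] args) {
        // heap buffer: wraps an existing byte array, backed by the JVM heap
        ByteBuffer heap = ByteBuffer.wrap(new byte[16]);
        System.out.println(heap.isDirect());   // false
        System.out.println(heap.hasArray());   // true: backing array is on the heap

        // direct buffer: native memory outside the garbage-collected heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        System.out.println(direct.isDirect()); // true
        System.out.println(direct.hasArray()); // false: no heap backing array
    }
}
```

allocateDirect is typically worth it only for long-lived buffers used in heavy native I/O, since direct buffers are more expensive to create and release than heap buffers.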