Guiding questions
1. The partition column of a Hive partitioned table is not stored in the data files, so how can Sqoop export it to a new target database?
2. What is the key to solving this problem?
3. What steps does the implementation involve?
Problem analysis:
Under the hood, a Hive partitioned table is just individual files under several HDFS directories, and exporting data from Hive essentially means exporting those HDFS files.
Because the (static) partition column of a Hive partitioned table is not stored in the data files, Sqoop cannot export the partition column directly.
Approach: create a temporary table in Hive, copy the partitioned table into it so that the partition column becomes an ordinary column, and then export that tmp table with Sqoop; the HDFS layout that causes the problem is sketched below.
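To make the problem concrete, here is an illustrative view of how the partitioned table sits on HDFS (the path follows the warehouse location used by the Sqoop command later in this article; file names will differ in practice). The partition value exists only as a directory name, never inside the data files, which is exactly why exporting the files alone loses it:

# Illustrative only: list the partitioned table's layout on HDFS
hdfs dfs -ls -R /user/hive/warehouse/hive_db.db/dept_partition
# .../dept_partition/month=2019-10-19/dept.txt   <- "month" lives only in the directory name
# .../dept_partition/month=2019-10-20/dept.txt
# each data file holds only deptno, dname and loc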
The steps are as follows:
1. Create the target table (a partitioned table)
hive> CREATE TABLE `dept_partition`(
        `deptno` int,
        `dname` string,
        `loc` string)
      PARTITIONED BY (`month` string) row format delimited fields terminated by '\t';
1.1 Check the table structure
hive> show create table dept_partition;
+----------------------------------------------------+--+
| createtab_stmt |
+----------------------------------------------------+--+
| CREATE TABLE `dept_partition`( |
| `deptno` int, |
| `dname` string, |
| `loc` string) |
| PARTITIONED BY ( |
| `month` string)
2. Load the data (a PARTITION clause is required here because month is a static partition; the second partition shown below, month='2019-10-20', is loaded the same way)
hive> load data inpath '/user/hive/hive_db/data/dept.txt' into table dept_partition partition(month='2019-10-19');
Contents of dept.txt:
10  ACCOUNTING  1700
20  RESEARCH    1800
30  SALES       1900
40  OPERATIONS  1700
3. Query the table dept_partition
hive> select * from dept_partition;
+------------------------+-----------------------+---------------------+-----------------------+--+
| dept_partition.deptno | dept_partition.dname | dept_partition.loc | dept_partition.month |
+------------------------+-----------------------+---------------------+-----------------------+--+
| 10 | ACCOUNTING | 1700 | 2019-10-19 |
| 20 | RESEARCH | 1800 | 2019-10-19 |
| 30 | SALES | 1900 | 2019-10-19 |
| 40 | OPERATIONS | 1700 | 2019-10-19 |
| 10 | ACCOUNTING | 1700 | 2019-10-20 |
| 20 | RESEARCH | 1800 | 2019-10-20 |
| 30 | SALES | 1900 | 2019-10-20 |
| 40 | OPERATIONS | 1700 | 2019-10-20 |
+------------------------+-----------------------+---------------------+-----------------------+--+
4. Create the temporary table tmp_dept_partition (a note on its field delimiter follows the statement)
hive> create table tmp_dept_partition as select * from dept_partition;
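Note: a table created with CREATE TABLE ... AS SELECT uses Hive's default field delimiter '\001' (Ctrl-A), which is why the Sqoop command in step 8 passes --input-fields-terminated-by "\001". If a visible delimiter is preferred, the temporary table could instead be created with an explicit ROW FORMAT, roughly as sketched below (this variant is not part of the original walkthrough; if you use it, change the Sqoop delimiter option to match):

# Alternative sketch only: build the temporary table with a tab delimiter
# instead of Hive's default \001 (database name hive_db taken from the
# warehouse path used later in this article)
hive -e "CREATE TABLE hive_db.tmp_dept_partition
         ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
         AS SELECT * FROM hive_db.dept_partition;"
# then export in step 8 with: --input-fields-terminated-by "\t"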
5. Query the temporary table
hive> select * from tmp_dept_partition;
+----------------------------+---------------------------+-------------------------+---------------------------+--+
| tmp_dept_partition.deptno | tmp_dept_partition.dname | tmp_dept_partition.loc | tmp_dept_partition.month |
+----------------------------+---------------------------+-------------------------+---------------------------+--+
| 10 | ACCOUNTING | 1700 | 2019-10-19 |
| 20 | RESEARCH | 1800 | 2019-10-19 |
| 30 | SALES | 1900 | 2019-10-19 |
| 40 | OPERATIONS | 1700 | 2019-10-19 |
| 10 | ACCOUNTING | 1700 | 2019-10-20 |
| 20 | RESEARCH | 1800 | 2019-10-20 |
| 30 | SALES | 1900 | 2019-10-20 |
| 40 | OPERATIONS | 1700 | 2019-10-20 |
+----------------------------+---------------------------+-------------------------+---------------------------+--+
6. Check the table structure (at this point the partitioned table has become an ordinary, non-partitioned table)
hive> show create table tmp_dept_partition;
+----------------------------------------------------+--+
| createtab_stmt |
+----------------------------------------------------+--+
| CREATE TABLE `tmp_dept_partition`( |
| `deptno` int, |
| `dname` string, |
| `loc` string, |
| `month` string)
7. Create the table dept_partition in MySQL (inside the database that the Sqoop command in step 8 connects to; see the note after this block)
mysql> drop table if exists dept_partition;
create table dept_partition(
  `deptno` int,
  `dname` varchar(20),
  `loc` varchar(20),
  `month` varchar(50));
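One assumption worth making explicit: the partitionTb database referenced by the --connect URL in step 8 must already exist on hadoop01, and the dept_partition table above must be created inside it. If it does not exist yet, something along these lines (illustrative; host and credentials taken from the Sqoop command) creates it first:

# Assumption: partitionTb may not exist yet; create it before the table above
mysql -h hadoop01 -u root -p -e "CREATE DATABASE IF NOT EXISTS partitionTb;"
# then run the CREATE TABLE statement from step 7 inside partitionTb (USE partitionTb;)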
8. Export to MySQL with Sqoop (a quick sanity check of the export directory follows the command)
bin/sqoop export \
--connect jdbc:mysql://hadoop01:3306/partitionTb \
--username root \
--password 123456 \
--table dept_partition \
--num-mappers 1 \
--export-dir /user/hive/warehouse/hive_db.db/tmp_dept_partition \
--input-fields-terminated-by "\001"
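Before trusting the result, it can help to confirm that the export directory really contains the temporary table's data files and that the fields are separated by the invisible '\001' character. A minimal check, assuming a typical CTAS output file name (000000_0 may differ in your cluster):

# Optional sanity check of the export directory and its delimiter
hdfs dfs -ls /user/hive/warehouse/hive_db.db/tmp_dept_partition
hdfs dfs -cat /user/hive/warehouse/hive_db.db/tmp_dept_partition/000000_0 | cat -v | head -n 2
# \001 is rendered as ^A by cat -v, e.g. 10^AACCOUNTING^A1700^A2019-10-19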
9. Query MySQL to verify that the export succeeded
mysql> select * from dept_partition;
+--------+------------+------+------------+
| deptno | dname      | loc  | month      |
+--------+------------+------+------------+
|     10 | ACCOUNTING | 1700 | 2019-10-19 |
|     20 | RESEARCH   | 1800 | 2019-10-19 |
|     30 | SALES      | 1900 | 2019-10-19 |
|     40 | OPERATIONS | 1700 | 2019-10-19 |
|     10 | ACCOUNTING | 1700 | 2019-10-20 |
|     20 | RESEARCH   | 1800 | 2019-10-20 |
|     30 | SALES      | 1900 | 2019-10-20 |
|     40 | OPERATIONS | 1700 | 2019-10-20 |
+--------+------------+------+------------+