[mw_shl_code=bash,true]set hive.enforce.bucketing=true;
set hive.exec.compress.output=true;
set mapred.output.compress=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;[/mw_shl_code]
Run the statements above at the Hive command line. They apply to the current session only, and the codec used here is Gzip.
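To confirm the session settings took effect, one option is to check that the files Hive wrote are really gzip streams. Below is a minimal Python sketch, assuming an output file has already been copied to local disk (for example with `hadoop fs -get`). The helper names `is_gzip` and `preview_gzip_file` are hypothetical and not part of Hive or Hadoop.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip(data):
    """Return True if the byte string starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def preview_gzip_file(path, max_lines=5):
    """Decompress a local gzip file and return its first few text lines."""
    lines = []
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if len(lines) >= max_lines:
                break
            lines.append(line.rstrip("\n"))
    return lines
```

If `is_gzip` returns False for the first bytes of an output file, the job most likely ran without the compression settings in effect.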
2. Configuring compression through XML files
mapred-site.xml
[mw_shl_code=bash,true]<property>
<name>mapred.output.compress</name>
<value>true</value>
<description>Should the job outputs be compressed?
</description>
</property>
<property>
<name>mapred.output.compression.codec</name>
<value>org.apache.hadoop.io.compress.GzipCodec</value>
<description>If the job outputs are compressed, how should they be compressed?
</description>
</property>[/mw_shl_code]
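After editing the file, it can help to check that both properties are actually present with the intended values. Below is a minimal Python sketch, assuming the `<property>` entries sit inside the usual `<configuration>` root element of mapred-site.xml. The function names `read_properties` and `missing_settings` are hypothetical helpers, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# The two settings shown in the mapred-site.xml fragment above.
EXPECTED = {
    "mapred.output.compress": "true",
    "mapred.output.compression.codec": "org.apache.hadoop.io.compress.GzipCodec",
}


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_settings(xml_text, expected=EXPECTED):
    """List the expected settings that are absent or have a different value."""
    props = read_properties(xml_text)
    return [key for key, value in expected.items() if props.get(key) != value]
```

Usage: read the file contents (the path depends on your installation) and pass them to `missing_settings`. An empty list means both compression properties are set as shown above.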