本帖最后由 BB_BDATA 于 2018-7-25 17:56 编辑
自己写了个连续日期udf函数running_dates,部署后,通过查询源表src_test_1 中最小日期,最大日期,获取之间的连续日期结果集,在通过左关联实现:
select g.monthid,
if(w.monthid is null, 0, w.cnt)
from (select regexp_replace(k.monthid, '-', '') as monthid
from (select explode(split(running_dates(
min(from_unixtime(unix_timestamp(a.monthid,'yyyyMMdd'), 'yyyy-MM-dd')),
max(from_unixtime(unix_timestamp(a.monthid,'yyyyMMdd'), 'yyyy-MM-dd'))),
',')) as monthid from src_test_1 a) k) g
left join src_test_1 w on w.monthid = g.monthid;
结果如下:
20180101 101
20180102 102
20180103 100
20180104 0
20180105 104
20180106 0
20180107 105
20180108 102
20180109 103 |