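Problem description:
A Hive table holds a large amount of data, including these two fields:
referrer_url: the previously visited page
url: the current page, reached from that previous page
I ran this query: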
[mw_shl_code=sql,true]-- Top 10 (url, referrer_url) pairs reached from the three entry pages, by hit count.
select url, referrer_url, count(url) as c_url from table
where (referrer_url = 'http://www.xxx.xxx.cn/' or referrer_url = 'http://www.xxx.xxx.xx' or referrer_url = 'www.xxxx.xxx.xx')
and (domain = '100103') group by url, referrer_url order by c_url desc limit 10;
[/mw_shl_code]
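One of the rows returned is the URL www.abc.com, with a count of 63.
I stored the result above in a temporary table and then JOINed it with the large table: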
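[mw_shl_code=sql,true]-- Pages reached from the urls saved in temp_B; columns are qualified with p because temp_B has the same column names.
select p.url, p.referrer_url, count(p.url) as cc_url from table p
JOIN temp_B ON (p.referrer_url = temp_B.url and p.domain = '100103')
group by p.referrer_url, p.url order by cc_url desc;
[/mw_shl_code]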
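However, the URL rows in the www.abc.com group do not add up to 63.
I have been looking at this for a long time without any ideas. If anyone has run into something similar or knows an approach, please give me a hint!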
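Two checks that might help narrow this down, as a rough sketch rather than a confirmed diagnosis. They assume temp_B kept the column names of the first query (url, referrer_url, c_url), and `table` is the same placeholder name used in the queries above. Two things seem worth ruling out. First, the 63 counts rows whose url is www.abc.com and whose referrer_url is one of the three entry pages, while the JOIN counts rows whose referrer_url is www.abc.com, so the two numbers may simply describe different sets of rows. Second, the first query groups by (url, referrer_url), so the same url can land in temp_B more than once, and every extra copy repeats the matching rows in the JOIN.

[mw_shl_code=sql,true]-- Check 1: urls that occur more than once in temp_B
-- (each extra copy multiplies the joined rows for that url).
select url, count(*) as n from temp_B
group by url having count(*) > 1;

-- Check 2: rows of the big table whose referrer is www.abc.com, counted without any JOIN.
-- If www.abc.com appears exactly once in temp_B, the cc_url values of the
-- www.abc.com groups in the JOIN result should sum to this number.
select count(p.url) as direct_cnt from table p
where p.referrer_url = 'www.abc.com' and p.domain = '100103';
[/mw_shl_code]

If Check 1 returns rows, de-duplicating temp_B on url before the JOIN (for example with select distinct url) would be the first thing to try.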