分享

工作经验记录:clickhouse执行查询内存超出限制问题

hyj 2021-5-25 11:49:20 发表于 Nosql [显示全部楼层] 回帖奖励 阅读模式 关闭右栏 0 6544


今天用clickhouse执行分布式查询报错:

SQL 错误 [241]: ClickHouse exception, code: 241, host: 192.168.29.124, port: 18123; Code: 241, e.displayText() = DB::Exception: Memory limit (total) exceeded: would use 15.44 GiB (attempt to
        allocate chunk of 4294967296 bytes), maximum: 13.95 GiB (version 20.6.3.28 (official build))

查询的语句如下:

  1.   SELECT DISTINCT plate_number FROM
  2.          (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
  3.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  4.          ) AS t0
  5.          INNER JOIN
  6.          (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
  7.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  8.          ) AS t1 ON t0.plate_number = t1.plate_number
  9.          INNER JOIN
  10.          (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
  11.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  12.           ) AS t2 ON t1.plate_number = t2.plate_number
  13.          INNER JOIN
  14.          (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
  15.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  16.           ) AS t3 ON t2.plate_number = t3.plate_number
复制代码
以上语句的意思是要查询同时出现在不同时段的车牌,每个时段数据量在10万左右,在选择3组join查询时结果也1到2秒内查询出结果,但是选择4组时段join查询就报内存超出限制
也就是说3组子查询join的话,没问题,4组甚至更多组就会报出内存限制问题。
由于业务上只要查询出车牌,所以上面每组子查询都用了DISTINCT做去重,之所以用DISTINCT不用group by的原因是我不想把语句写太长,太复杂。
看了下clickhouse的DISTINCT与group by功能上是差不多的(实际上用起来并非如此),于是我在user.xml中设置如下:

  1. <max_bytes_before_external_group_by>7500000000</max_bytes_before_external_group_by>
复制代码
也就是设置group by超过7.5g就让group by走磁盘查询避免内存超出,设置完重启,上面这个查询依然报内存超出限制。也就是说这个配置不会对DISTINCT有影响。
于是我只好把上面的子查询语句DISTINCT全部修改成group by:

  1. SELECT DISTINCT plate_number FROM
  2.          (SELECT e.plate_number FROM base_db.event_local AS e
  3.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  4.          group by e.plate_number) AS t0
  5.          INNER JOIN
  6.          (SELECT e.plate_number FROM base_db.event_local AS e
  7.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  8.          group by e.plate_number) AS t1 ON t0.plate_number = t1.plate_number
  9.          INNER JOIN
  10.          (SELECT e.plate_number FROM base_db.event_local AS e
  11.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  12.           group by e.plate_number) AS t2 ON t1.plate_number = t2.plate_number
  13.          INNER JOIN
  14.          (SELECT e.plate_number FROM base_db.event_local AS e
  15.          WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
  16.           group by e.plate_number) AS t3 ON t2.plate_number = t3.plate_number
复制代码


终于,没有再报内存问题,而且这个查询只用了2秒。
总结一下:group by与DISTINCT虽说查询结果一样,但是在内存限制的计算中并不太一样,max_bytes_before_external_group_by参数配置也只会对group by有影响,另外我还发现
分布式查询中除响应节点外其他节点被分发的语句,以上面报错的语句为例(不是我修改后的),从t0这个子查询开始的DISTINCT关键字会被去掉,分发节点的查询被篡改,可能是ck自动优化化语句吧。所以基于内存的限制问题来说clickhouse的复杂查询能不用DISTINCT就不用,我这就掉坑里了。

以上问题解决办法就是将DISTINCT修改为group by,并且设置max_bytes_before_external_group_by参数就可以了



原文链接:
http://classinstance.cn/detail/116.html

没找到任何评论,期待你打破沉寂

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

关闭

推荐上一条 /2 下一条