hyj 发表于 2021-5-25 11:49:20

工作经验记录:clickhouse执行查询内存超出限制问题



今天用clickhouse执行分布式查询报错:

SQL 错误 : ClickHouse exception, code: 241, host: 192.168.29.124, port: 18123; Code: 241, e.displayText() = DB::Exception: Memory limit (total) exceeded: would use 15.44 GiB (attempt to
        allocate chunk of 4294967296 bytes), maximum: 13.95 GiB (version 20.6.3.28 (official build))

查询的语句如下:

SELECT DISTINCT plate_number FROM
       (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
       ) AS t0
       INNER JOIN
       (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
       ) AS t1 ON t0.plate_number = t1.plate_number
       INNER JOIN
       (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
          ) AS t2 ON t1.plate_number = t2.plate_number
       INNER JOIN
       (SELECT DISTINCT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
          ) AS t3 ON t2.plate_number = t3.plate_number 以上语句的意思是要查询同时出现在不同时段的车牌,每个时段数据量在10万左右,在选择3组join查询时结果也1到2秒内查询出结果,但是选择4组时段join查询就报内存超出限制
也就是说3组子查询join的话,没问题,4组甚至更多组就会报出内存限制问题。
由于业务上只要查询出车牌,所以上面每组子查询都用了DISTINCT做去重,之所以用DISTINCT不用group by的原因是我不想把语句写太长,太复杂。
看了下clickhouse的DISTINCT与group by功能上是差不多的(实际上用起来并非如此),于是我在user.xml中设置如下:

<max_bytes_before_external_group_by>7500000000</max_bytes_before_external_group_by>
也就是设置group by超过7.5g就让group by走磁盘查询避免内存超出,设置完重启,上面这个查询依然报内存超出限制。也就是说这个配置不会对DISTINCT有影响。
于是我只好把上面的子查询语句DISTINCT全部修改成group by:

SELECT DISTINCT plate_number FROM
       (SELECT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
       group by e.plate_number) AS t0
       INNER JOIN
       (SELECT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
       group by e.plate_number) AS t1 ON t0.plate_number = t1.plate_number
       INNER JOIN
       (SELECT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
          group by e.plate_number) AS t2 ON t1.plate_number = t2.plate_number
       INNER JOIN
       (SELECT e.plate_number FROM base_db.event_local AS e
       WHERE (e.snapshot_time >= '2020-11-18 00:00:00') AND (e.snapshot_time <= '2020-11-24 23:59:59')
          group by e.plate_number) AS t3 ON t2.plate_number = t3.plate_number

终于,没有再报内存问题,而且这个查询只用了2秒。
总结一下:group by与DISTINCT虽说查询结果一样,但是在内存限制的计算中并不太一样,max_bytes_before_external_group_by参数配置也只会对group by有影响,另外我还发现
分布式查询中除响应节点外其他节点被分发的语句,以上面报错的语句为例(不是我修改后的),从t0这个子查询开始的DISTINCT关键字会被去掉,分发节点的查询被篡改,可能是ck自动优化化语句吧。所以基于内存的限制问题来说clickhouse的复杂查询能不用DISTINCT就不用,我这就掉坑里了。

以上问题解决办法就是将DISTINCT修改为group by,并且设置max_bytes_before_external_group_by参数就可以了



原文链接:
http://classinstance.cn/detail/116.html
页: [1]
查看完整版本: 工作经验记录:clickhouse执行查询内存超出限制问题