Bug Report
【Bug Impact】
A three-table join bypasses the tidb_mem_quota_query limit and triggers an OOM.
【Possible Steps to Reproduce】
1. Disable swap on the system.
2. TiDB configuration:
server_configs:
  tidb:
    enable-batch-dml: true
    mem-quota-query: 4294967296
    performance.server-memory-quota: 30064771072
    performance.txn-total-size-limit: 1073741824
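3. Join three tables A, B, and C. Table A holds about 200 million rows and is partitioned by day, with 700+ partitions. The problem occurs when the application issues a query of the following form: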
select
`B`.`code` as `c0`,
`C`.`br_name` as `c1`,
sum(`A`.`ss_num`) as `m0`,
sum(`A`.`a_ss_num`) as `m1`,
sum(`A`.`cb_num`) as `m2`
from
`test`.`A2` as `A`,
`test`.`B` as `B`,
`test`.`C` as `C`
where
`B`.`code` = '1010'
and
`A`.`s_id` = `B`.`s_id`
and
`A`.`b_code` = `C`.`b_code`
group by
`B`.`code`,
`C`.`br_name`;
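【Unexpected Behavior Observed】
If A is too large, Out of Memory Quota is not triggered. The tidb server's memory fills up and the process is killed by the system oom-killer.
【Expected Behavior】
Out of Memory Quota is triggered and the tidb server does not need to restart.
【Related Components and Versions】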
5.4.0
tidb server: 8 cores, 32 GB memory, three nodes
tikv: 8 cores, 32 GB memory, three nodes
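【Other Background Information or Screenshots】
Heap profile captured before the process was killed: tidb_10.speedscope.json (177.3 KB)
Screenshot below:
The call stack observed holding the memory at this point: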
github.com/pingcap/tidb/util/chunk.NewColumn (/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/chunk/column.go:0)
github.com/pingcap/tidb/util/chunk.New (/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/chunk/chunk.go:0)
github.com/pingcap/tidb/executor.(*HashJoinExec).fetchBuildSideRows (/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/join.go:0)
github.com/pingcap/tidb/executor.(*HashJoinExec).fetchAndBuildHashTable.func2 (/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/join.go:0)
github.com/pingcap/tidb/util.WithRecovery (/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/misc.go:0)
The memory attributed to fetchAndBuildHashTable reached 10 GB.
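For scale, the byte values from the configuration in step 2 can be converted to GiB and compared with the figure from the heap profile. The following is a small Python sketch; it assumes the "10G" figure means roughly 10 GiB, and the variable names are illustrative only.

```python
GIB = 1024 ** 3

# Values copied from the TiDB configuration in the reproduction steps.
config_bytes = {
    "mem-quota-query": 4294967296,
    "performance.server-memory-quota": 30064771072,
    "performance.txn-total-size-limit": 1073741824,
}


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


config_gib = {name: to_gib(value) for name, value in config_bytes.items()}

# Assumption: "10G" in the heap profile is read as roughly 10 GiB.
observed_build_gib = 10
over_quota_factor = observed_build_gib / config_gib["mem-quota-query"]

for name, gib in config_gib.items():
    print(f"{name}: {gib:g} GiB")
print(f"hash table build memory is {over_quota_factor:.1f}x the per-query quota")
```

Under that assumption, the build phase of a single query holds about 2.5 times the 4 GiB per-query quota without the query being cancelled.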
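The execution plan shows the following:
The scan result of A is first hash-joined with C, with C as the build side and A as the probe side. The result of A and C is then hash-joined with B, with the A-and-C result as the build side and B as the probe side. I suspect this second step is the problem: the A-and-C result is too large.
Execution plan: 执行计划 (6.1 KB)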
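The expected behavior can be illustrated with a toy model. This is a minimal sketch, not TiDB's implementation: `MemTracker`, `QuotaExceeded`, `hash_join`, and the fixed `row_bytes` cost are all hypothetical. It shows that the build side of a hash join is fully held in memory while the probe side is streamed, and that charging each build row against a quota makes an oversized build side fail the query instead of exhausting the process.

```python
class QuotaExceeded(Exception):
    """Raised when tracked memory goes over the quota (hypothetical name)."""


class MemTracker:
    """Toy per-query memory tracker; not TiDB's actual tracker."""

    def __init__(self, quota_bytes: int):
        self.quota_bytes = quota_bytes
        self.used_bytes = 0

    def consume(self, n_bytes: int) -> None:
        self.used_bytes += n_bytes
        if self.used_bytes > self.quota_bytes:
            raise QuotaExceeded(
                f"used {self.used_bytes} bytes, quota {self.quota_bytes} bytes"
            )


def hash_join(build_rows, probe_rows, key, tracker, row_bytes):
    """Join two iterables of dicts on `key`, charging build rows to the tracker."""
    table = {}
    # Build phase: every build-side row is held in memory until the join ends.
    for row in build_rows:
        tracker.consume(row_bytes)
        table.setdefault(row[key], []).append(row)
    # Probe phase: rows are streamed one at a time and not retained.
    for row in probe_rows:
        for match in table.get(row[key], []):
            yield {**match, **row}


small = [{"s_id": i, "code": "1010"} for i in range(3)]
large = [{"s_id": i % 3, "ss_num": i} for i in range(1000)]

# Small build side, large probe side: stays within the quota.
ok = list(hash_join(small, large, "s_id", MemTracker(1000), row_bytes=100))

# Large build side: the tracker aborts the query during the build phase.
try:
    list(hash_join(large, small, "s_id", MemTracker(1000), row_bytes=100))
    aborted = False
except QuotaExceeded:
    aborted = True
```

In this model the same two inputs either complete or abort depending only on which one is chosen as the build side, which matches the suspicion that the A-and-C intermediate result is the costly build input.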