执行计划显示hashagg和build的疑问?

【 TiDB 使用环境】测试
【 TiDB 版本】6.5.3
【复现路径】做过哪些操作出现的问题
【遇到的问题:问题现象及影响】
各位老师好:
在学习tidb的执行计划时候,有下列疑问,数据来源都来自于tpch

mysql> explain analyze  select * from customer  where C_CUSTKEY in (select O_CUSTKEY from orders limit 1);
+----------------------------------+---------+---------+-----------+-----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| id                               | estRows | actRows | task      | access object                                 | execution info                                                                                                                                                                                            | operator info                                                                                                                                                           | memory    | disk    |
+----------------------------------+---------+---------+-----------+-----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| IndexHashJoin_16                 | 1.00    | 1       | root      |                                               | time:2.73ms, loops:2, inner:{total:1.05ms, concurrency:5, task:1, construct:12.1µs, fetch:1.03ms, build:3.69µs, join:4.19µs}                                                                              | inner join, inner:TableReader_11, outer key:tpch2.orders.o_custkey, inner key:tpch2.customer.c_custkey, equal cond:eq(tpch2.orders.o_custkey, tpch2.customer.c_custkey) | 146.4 KB  | N/A     |
| ├─HashAgg_21(Build)              | 1.00    | 1       | root      |                                               | time:1.53ms, loops:3                                                                                                                                                                                      | group by:tpch2.orders.o_custkey, funcs:firstrow(tpch2.orders.o_custkey)->tpch2.orders.o_custkey                                                                         | 898 Bytes | 0 Bytes |
| │ └─Limit_22                     | 1.00    | 1       | root      |                                               | time:1.52ms, loops:2                                                                                                                                                                                      | offset:0, count:1                                                                                                                                                       | N/A       | N/A     |
| │   └─IndexReader_27             | 1.00    | 1       | root      |                                               | time:1.51ms, loops:1, cop_task: {num: 1, max: 1.57ms, proc_keys: 1, rpc_num: 1, rpc_time: 1.54ms, copr_cache: disabled, distsql_concurrency: 1}                                                           | index:Limit_26                                                                                                                                                          | 241 Bytes | N/A     |
| │     └─Limit_26                 | 1.00    | 1       | cop[tikv] |                                               | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 46, total_keys: 2, get_snapshot_time: 601.8µs, rocksdb: {key_skipped_count: 1, block: {cache_hit_count: 5}}}  | offset:0, count:1                                                                                                                                                       | N/A       | N/A     |
| │       └─IndexFullScan_25       | 1.00    | 1       | cop[tikv] | table:orders, index:indexO_CUSTKEY(O_CUSTKEY) | tikv_task:{time:0s, loops:1}                                                                                                                                                                              | keep order:false                                                                                                                                                        | N/A       | N/A     |
| └─TableReader_11(Probe)          | 1.00    | 1       | root      |                                               | time:951.8µs, loops:2, cop_task: {num: 1, max: 907µs, proc_keys: 1, rpc_num: 1, rpc_time: 888.3µs, copr_cache: disabled, distsql_concurrency: 15}                                                         | data:TableRangeScan_10                                                                                                                                                  | N/A       | N/A     |
|   └─TableRangeScan_10            | 1.00    | 1       | cop[tikv] | table:customer                                | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 184, total_keys: 1, get_snapshot_time: 407.4µs, rocksdb: {block: {cache_hit_count: 5}}}                       | range: decided by [tpch2.orders.o_custkey], keep order:false                                                                                                            | N/A       | N/A     |
+----------------------------------+---------+---------+-----------+-----------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
8 rows in set (0.00 sec)

有下列2个疑问:
1.对于这种limit 1 这种语句,为什么出现hashagg 这种算子 ?
2.hashagg 这种算子 为什么有build的显示,是不是这个build 是不是上面算子IndexHashJoin的显示呢?

mysql> explain analyze select O_CUSTKEY from orders limit 1;
+--------------------------+---------+---------+-----------+-----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+------+
| id                       | estRows | actRows | task      | access object                                 | execution info                                                                                                                                                                                                     | operator info          | memory  | disk |
+--------------------------+---------+---------+-----------+-----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+------+
| Limit_6                  | 1.00    | 1       | root      |                                               | time:1.59ms, loops:2                                                                                                                                                                                               | offset:0, count:1      | N/A     | N/A  |
| └─IndexReader_11         | 1.00    | 32      | root      |                                               | time:1.59ms, loops:1, cop_task: {num: 1, max: 1.44ms, proc_keys: 224, rpc_num: 1, rpc_time: 1.41ms, copr_cache: disabled, distsql_concurrency: 15}                                                                 | index:IndexFullScan_10 | 4.19 KB | N/A  |
|   └─IndexFullScan_10     | 1.00    | 224     | cop[tikv] | table:orders, index:indexO_CUSTKEY(O_CUSTKEY) | tikv_task:{time:0s, loops:3}, scan_detail: {total_process_keys: 224, total_process_keys_size: 10304, total_keys: 225, get_snapshot_time: 611.7µs, rocksdb: {key_skipped_count: 224, block: {cache_hit_count: 6}}}  | keep order:false       | N/A     | N/A  |
+--------------------------+---------+---------+-----------+-----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+------+
3 rows in set (0.00 sec)
1 个赞
mysql> explain analyze  select * from customer  where C_CUSTKEY in (select C_CUSTKEY from tempcustomer limit 1);
+--------------------------------+---------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id                             | estRows | actRows | task      | access object      | execution info                                                                                                                                                                                           | operator info                                                                                                                                                                       | memory    | disk |
+--------------------------------+---------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| IndexHashJoin_17               | 1.00    | 1       | root      |                    | time:2.38ms, loops:2, inner:{total:1.58ms, concurrency:5, task:1, construct:10.7µs, fetch:1.56ms, build:3.23µs, join:4.12µs}                                                                             | inner join, inner:TableReader_12, outer key:tpch2.tempcustomer.c_custkey, inner key:tpch2.customer.c_custkey, equal cond:eq(tpch2.tempcustomer.c_custkey, tpch2.customer.c_custkey) | 138.3 KB  | N/A  |
| ├─Limit_22(Build)              | 1.00    | 1       | root      |                    | time:696.4µs, loops:3                                                                                                                                                                                    | offset:0, count:1                                                                                                                                                                   | N/A       | N/A  |
| │ └─TableReader_26             | 1.00    | 1       | root      |                    | time:694.9µs, loops:1, cop_task: {num: 1, max: 712.4µs, proc_keys: 1, rpc_num: 1, rpc_time: 690µs, copr_cache: disabled, distsql_concurrency: 1}                                                         | data:Limit_25                                                                                                                                                                       | 231 Bytes | N/A  |
| │   └─Limit_25                 | 1.00    | 1       | cop[tikv] |                    | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 27, total_keys: 2, get_snapshot_time: 19.4µs, rocksdb: {key_skipped_count: 1, block: {cache_hit_count: 5}}}  | offset:0, count:1                                                                                                                                                                   | N/A       | N/A  |
| │     └─TableFullScan_24       | 1.00    | 1       | cop[tikv] | table:tempcustomer | tikv_task:{time:0s, loops:1}                                                                                                                                                                             | keep order:false                                                                                                                                                                    | N/A       | N/A  |
| └─TableReader_12(Probe)        | 1.00    | 1       | root      |                    | time:1.5ms, loops:2, cop_task: {num: 1, max: 1.46ms, proc_keys: 1, rpc_num: 1, rpc_time: 1.45ms, copr_cache: disabled, distsql_concurrency: 15}                                                          | data:TableRangeScan_11                                                                                                                                                              | N/A       | N/A  |
|   └─TableRangeScan_11          | 1.00    | 1       | cop[tikv] | table:customer     | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 184, total_keys: 1, get_snapshot_time: 987.6µs, rocksdb: {block: {cache_hit_count: 5}}}                      | range: decided by [tpch2.tempcustomer.c_custkey], keep order:false                                                                                                                  | N/A       | N/A  |
+--------------------------------+---------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
7 rows in set (0.00 sec)

像这个limit 也有build

这个是不是在B+树这样的存储中查找的次数

首先build和probe一定是成对出现的,一般是一个join的两端。

https://docs.pingcap.com/zh/tidb/stable/explain-overview#算子的执行顺序

文档有些大致的介绍。翻代码,你会发现build和probe都起了一个线程去做,不好说实际谁先谁后。
最多是build的线程先建立。

然后limit1里面为啥会有hashagg这个算子,我觉的这个确实是一个有问题的执行计划。应该是有优化空间的。

是的。tidb的执行计划对于build和probe 显示确实有点乱

1.这种加 group by才等价,所以会出现 HASH AGG 算子,参考: https://docs.pingcap.com/zh/tidb/stable/subquery-optimization#-in-select--from-

2.这个是因为它是 index hash join 的 build 端。

1 个赞

2.这个是因为它是 index hash join 的 build 端。
那它应该显示再Indexhashjoin 后面把

不应该,简单说,AB 两张表关联,内表和外表,驱动表和被驱动表,build 端和 probe 端,说的都是 AB 表,而不是俩表选择的关联算法

是我误解了,当然我说的不是某个特定的连接算法,我这里有点记忆错了
感谢大佬,解决我的疑问,子查询的那个优化,其它数据库也是有的,感谢

学习了

大佬,请教1个问题 ,安装官网对于子查询的优化

子查询
select * from t1 where t1.a in (select t2.a from t2) 会被改写为 select t1.* from t1, (select distinct(a) a from t2) t2 where t1.a = t2.a

那么t2只能当驱动表?不能当被驱动表?我觉得应该可以把,我记得MySQL可以

TiDB 现在还不行,你可以改写成 exists

学习了,学习了

学习了学习了

想问下build和probe代表了什么含义

谢谢老师的回复

学习了

学习了。目前我了解到的是除了build和probe知道先后执行或者并行执行,其他都按照myslq的方式来理解