HashJoin占用内存过大,落盘不及时

【 TiDB 使用环境】Poc
【 TiDB 版本】6.5.0
【复现路径】 组件涉及:tidb、tikv 在默认参数情况下执行TPCH的Q5语句时候发现在错误的执行计划下总是出现语句级别的OOM。
Q5语句本来是对结果集做分组聚合,之类进行简化,测试语句和执行计划为(因表较小这里没有对nation和region表做统计信息搜集,其它表做了搜集,导致了错误的执行计划):

select 
        count(*)
from
        customer,
        orders,
        lineitem,
        supplier,
        nation,
        region
where
        c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and l_suppkey = s_suppkey
        and c_nationkey = s_nationkey
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'ASIA'
        and o_orderdate >= date '1994-01-01'
        and o_orderdate < date '1994-01-01' + interval '1' year;

+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                             | estRows     | task      | access object  | operator info                                                                                                                                                                                                                 |
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_24                                     | 1.00        | root      |                | funcs:count(1)->Column#48                                                                                                                                                                                                     |
| └─IndexJoin_30                                 | 8738986.41  | root      |                | inner join, inner:TableReader_27, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey, equal cond:eq(tpch.orders.o_orderkey, tpch.lineitem.l_orderkey), eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey) |
|   ├─HashJoin_37(Build)                         | 2196527.02  | root      |                | inner join, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)]                                                                                                                                                        |
|   │ ├─HashJoin_39(Build)                       | 1431131.00  | root      |                | inner join, equal:[eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)]                                                                                                                                                  |
|   │ │ ├─HashJoin_41(Build)                     | 25.00       | root      |                | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)]                                                                                                                                                    |
|   │ │ │ ├─HashJoin_54(Build)                   | 0.01        | root      |                | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)]                                                                                                                                                      |
|   │ │ │ │ ├─TableReader_59(Build)              | 0.01        | root      |                | data:Selection_58                                                                                                                                                                                                             |
|   │ │ │ │ │ └─Selection_58                     | 0.01        | cop[tikv] |                | eq(tpch.region.r_name, "ASIA")                                                                                                                                                                                                |
|   │ │ │ │ │   └─TableFullScan_57               | 5.00        | cop[tikv] | table:region   | keep order:false, stats:pseudo                                                                                                                                                                                                |
|   │ │ │ │ └─TableReader_56(Probe)              | 25.00       | root      |                | data:TableFullScan_55                                                                                                                                                                                                         |
|   │ │ │ │   └─TableFullScan_55                 | 25.00       | cop[tikv] | table:nation   | keep order:false, stats:pseudo                                                                                                                                                                                                |
|   │ │ │ └─TableReader_61(Probe)                | 100000.00   | root      |                | data:TableFullScan_60                                                                                                                                                                                                         |
|   │ │ │   └─TableFullScan_60                   | 100000.00   | cop[tikv] | table:supplier | keep order:false                                                                                                                                                                                                              |
|   │ │ └─TableReader_63(Probe)                  | 1500000.00  | root      |                | data:TableFullScan_62                                                                                                                                                                                                         |
|   │ │   └─TableFullScan_62                     | 1500000.00  | cop[tikv] | table:customer | keep order:false                                                                                                                                                                                                              |
|   │ └─TableReader_66(Probe)                    | 2196527.02  | root      |                | data:Selection_65                                                                                                                                                                                                             |
|   │   └─Selection_65                           | 2196527.02  | cop[tikv] |                | ge(tpch.orders.o_orderdate, 1994-01-01), lt(tpch.orders.o_orderdate, 1995-01-01)                                                                                                                                              |
|   │     └─TableFullScan_64                     | 14457280.00 | cop[tikv] | table:orders   | keep order:false                                                                                                                                                                                                              |
|   └─TableReader_27(Probe)                      | 2196527.02  | root      |                | data:TableRangeScan_26                                                                                                                                                                                                        |
|     └─TableRangeScan_26                        | 2196527.02  | cop[tikv] | table:lineitem | range: decided by [eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)], keep order:false                                                                                                                                    |
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
20 rows in set (0.00 sec)

当参数为默认设置时,内存相关参数为:

mysql> show variables like 'tidb_mem_quota_query';
+----------------------+------------+
| Variable_name        | Value      |
+----------------------+------------+
| tidb_mem_quota_query | 1073741824 |
+----------------------+------------+
1 row in set (0.01 sec)
mysql> show variables like '%oom%';
+--------------------------------+--------+
| Variable_name                  | Value  |
+--------------------------------+--------+
| tidb_enable_tmp_storage_on_oom | ON     |
| tidb_mem_oom_action            | CANCEL |
+--------------------------------+--------+
2 rows in set (0.00 sec)

mysql> show config where name like 'tmp-storage%';
+------+---------------------+-------------------+-----------------------------------------------------------------+
| Type | Instance            | Name              | Value                                                           |
+------+---------------------+-------------------+-----------------------------------------------------------------+
| tidb | 192.168.31.201:4002 | tmp-storage-path  | /tmp/1000_tidb/MC4wLjAuMDo0MDAyLzAuMC4wLjA6MTAwODI=/tmp-storage |
| tidb | 192.168.31.201:4002 | tmp-storage-quota | -1                                                              |
+------+---------------------+-------------------+-----------------------------------------------------------------+
2 rows in set (0.01 sec)

执行该语句:

mysql> explain analyze select          count(*) from         customer,         orders,         lineitem,         supplier,         nation,         region where         c_custkey = o_custkey         and l_orderkey = o_orderkey
       and l_suppkey = s_suppkey         and c_nationkey = s_nationkey         and s_nationkey = n_nationkey         and n_regionkey = r_regionkey         and r_name = 'ASIA'         and o_orderdate >= date '1994-01-01'
 and o_orderdate < date '1994-01-01' + interval '1' year;
ERROR 1105 (HY000): context canceled
mysql> explain analyze select          count(*) from         customer,         orders,         lineitem,         supplier,         nation,         region where         c_custkey = o_custkey         and l_orderkey = o_orderkey
       and l_suppkey = s_suppkey         and c_nationkey = s_nationkey         and s_nationkey = n_nationkey         and n_regionkey = r_regionkey         and r_name = 'ASIA'         and o_orderdate >= date '1994-01-01'
 and o_orderdate < date '1994-01-01' + interval '1' year;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id:    415
Current database: tpch

ERROR 1105 (HY000): Out Of Memory Quota![conn_id=293158387267404191]

可以看到发生了OOM,这里我已经设置了落盘,但是还是被cancel掉了!

修改参数set global tidb_mem_oom_action=LOG; 让其超出内存后仅打印日志,不杀掉该语句,但是根据oom优先级还是会尽可能的落到磁盘,这里观察语句真正执行时候到底会不得不用到多少内存,explain analyze执行结果如下:

+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
| id                                             | estRows     | actRows    | task      | access object  | execution info                                                                                                                                                                                                                                                                                                                                                                                                              | operator info                                                                                                                                                                                                                 | memory    | disk      |
+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
| HashAgg_24                                     | 1.00        | 1          | root      |                | time:40m16.7s, loops:2, partial_worker:{wall_time:40m16.708802251s, concurrency:5, task_num:72, tot_wait:3h21m23.538992743s, tot_exec:2.902767ms, tot_time:3h21m23.543541588s, max:40m16.708722824s, p95:40m16.708722824s}, final_worker:{wall_time:40m16.709129743s, concurrency:5, task_num:5, tot_wait:3h21m23.543673912s, tot_exec:342.368µs, tot_time:3h21m23.544019075s, max:40m16.709041758s, p95:40m16.709041758s}  | funcs:count(1)->Column#48                                                                                                                                                                                                     | 165.3 KB  | N/A       |
| └─IndexJoin_30                                 | 8738986.41  | 72985      | root      |                | time:40m16.7s, loops:73, inner:{total:12m43.5s, concurrency:5, task:71508, construct:10m57.2s, fetch:1m44.8s, build:1.33s}, probe:2m15s                                                                                                                                                                                                                                                                                     | inner join, inner:TableReader_27, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey, equal cond:eq(tpch.orders.o_orderkey, tpch.lineitem.l_orderkey), eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey) | 10.2 MB   | N/A       |
|   ├─HashJoin_37(Build)                         | 2196527.02  | 1830400706 | root      |                | time:39m51.8s, loops:1787505, build_hash_table:{total:15m1s, fetch:11m33.8s, build:3m27.3s}, probe:{concurrency:5, total:3h21m17.6s, max:40m16.7s, probe:2h6m12.3s, fetch:1h15m5.3s}                                                                                                                                                                                                                                        | inner join, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)]                                                                                                                                                        | 17.9 GB   | 44.8 GB   |
|   │ ├─HashJoin_39(Build)                       | 1431131.00  | 1203307921 | root      |                | time:12m31.8s, loops:1175109, build_hash_table:{total:15.5ms, fetch:13.4ms, build:2.11ms}, probe:{concurrency:5, total:1h15m1.6s, max:15m1s, probe:1h15m1.3s, fetch:245ms}                                                                                                                                                                                                                                                  | inner join, equal:[eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)]                                                                                                                                                  | 846.4 KB  | 782.7 KB  |
|   │ │ ├─HashJoin_41(Build)                     | 25.00       | 20037      | root      |                | time:14.5ms, loops:23, build_hash_table:{total:2.36ms, fetch:2.35ms, build:10.1µs}, probe:{concurrency:5, total:75.2ms, max:15.2ms, probe:15ms, fetch:60.2ms}                                                                                                                                                                                                                                                               | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)]                                                                                                                                                    | 1.59 KB   | 120 Bytes |
|   │ │ │ ├─HashJoin_54(Build)                   | 0.01        | 5          | root      |                | time:2.34ms, loops:2, build_hash_table:{total:2.14ms, fetch:2.13ms, build:10.1µs}, probe:{concurrency:5, total:10.7ms, max:2.22ms, probe:66.6µs, fetch:10.6ms}                                                                                                                                                                                                                                                              | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)]                                                                                                                                                      | 31.1 KB   | 36 Bytes  |
|   │ │ │ │ ├─TableReader_59(Build)              | 0.01        | 1          | root      |                | time:2.11ms, loops:2, cop_task: {num: 1, max: 2.6ms, proc_keys: 5, rpc_num: 1, rpc_time: 2.57ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                                                                                                                                                                       | data:Selection_58                                                                                                                                                                                                             | 312 Bytes | N/A       |
|   │ │ │ │ │ └─Selection_58                     | 0.01        | 1          | cop[tikv] |                | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 5, total_process_keys_size: 559, total_keys: 6, get_snapshot_time: 3.1µs, rocksdb: {key_skipped_count: 5, block: {cache_hit_count: 4}}}                                                                                                                                                                                                                     | eq(tpch.region.r_name, "ASIA")                                                                                                                                                                                                | N/A       | N/A       |
|   │ │ │ │ │   └─TableFullScan_57               | 5.00        | 5          | cop[tikv] | table:region   | tikv_task:{time:0s, loops:1}                                                                                                                                                                                                                                                                                                                                                                                                | keep order:false, stats:pseudo                                                                                                                                                                                                | N/A       | N/A       |
|   │ │ │ │ └─TableReader_56(Probe)              | 25.00       | 25         | root      |                | time:469.1µs, loops:2, cop_task: {num: 1, max: 961.6µs, proc_keys: 25, rpc_num: 1, rpc_time: 938.3µs, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                                                                                                                                                                  | data:TableFullScan_55                                                                                                                                                                                                         | 679 Bytes | N/A       |
|   │ │ │ │   └─TableFullScan_55                 | 25.00       | 25         | cop[tikv] | table:nation   | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 25, total_process_keys_size: 3109, total_keys: 26, get_snapshot_time: 7.15µs, rocksdb: {key_skipped_count: 25, block: {cache_hit_count: 4}}}                                                                                                                                                                                                                | keep order:false, stats:pseudo                                                                                                                                                                                                | N/A       | N/A       |
|   │ │ │ └─TableReader_61(Probe)                | 100000.00   | 100000     | root      |                | time:11.8ms, loops:100, cop_task: {num: 10, max: 2.55ms, min: 770.1µs, avg: 1.29ms, p95: 2.55ms, max_proc_keys: 480, p95_proc_keys: 480, rpc_num: 10, rpc_time: 12.8ms, copr_cache_hit_ratio: 0.90, distsql_concurrency: 15}                                                                                                                                                                                                | data:TableFullScan_60                                                                                                                                                                                                         | 964.4 KB  | N/A       |
|   │ │ │   └─TableFullScan_60                   | 100000.00   | 100000     | cop[tikv] | table:supplier | tikv_task:{proc max:40ms, min:1ms, avg: 11ms, p80:20ms, p95:40ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 480, total_process_keys_size: 87745, total_keys: 481, get_snapshot_time: 254.5µs, rocksdb: {key_skipped_count: 480, block: {cache_hit_count: 6}}}                                                                                                                                                 | keep order:false                                                                                                                                                                                                              | N/A       | N/A       |
|   │ │ └─TableReader_63(Probe)                  | 1500000.00  | 1500000    | root      |                | time:153ms, loops:1468, cop_task: {num: 54, max: 31ms, min: 242.7µs, avg: 9.32ms, p95: 27.2ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 395ms, tot_wait: 14ms, rpc_num: 54, rpc_time: 502ms, copr_cache_hit_ratio: 0.37, distsql_concurrency: 15}                                                                                                                                                              | data:TableFullScan_62                                                                                                                                                                                                         | 3.07 MB   | N/A       |
|   │ │   └─TableFullScan_62                     | 1500000.00  | 1500000    | cop[tikv] | table:customer | tikv_task:{proc max:27ms, min:0s, avg: 10.5ms, p80:19ms, p95:24ms, iters:1678, tasks:54}, scan_detail: {total_process_keys: 1044960, total_process_keys_size: 212609983, total_keys: 1044994, get_snapshot_time: 12.4ms, rocksdb: {key_skipped_count: 1044960, block: {cache_hit_count: 3603}}}                                                                                                                             | keep order:false                                                                                                                                                                                                              | N/A       | N/A       |
|   │ └─TableReader_66(Probe)                    | 2196527.02  | 2275919    | root      |                | time:52.4ms, loops:2217, cop_task: {num: 230, max: 212.8ms, min: 310.7µs, avg: 32.6ms, p95: 133.5ms, max_proc_keys: 331744, p95_proc_keys: 219104, tot_proc: 7.06s, tot_wait: 40ms, rpc_num: 230, rpc_time: 7.5s, copr_cache_hit_ratio: 0.21, distsql_concurrency: 15}                                                                                                                                                      | data:Selection_65                                                                                                                                                                                                             | 12.0 MB   | N/A       |
|   │   └─Selection_65                           | 2196527.02  | 2275919    | cop[tikv] |                | tikv_task:{proc max:202ms, min:0s, avg: 30.7ms, p80:56ms, p95:125ms, iters:15588, tasks:230}, scan_detail: {total_process_keys: 14790624, total_process_keys_size: 2245939801, total_keys: 14790805, get_snapshot_time: 32.7ms, rocksdb: {key_skipped_count: 14790624, block: {cache_hit_count: 38074}}}                                                                                                                    | ge(tpch.orders.o_orderdate, 1994-01-01), lt(tpch.orders.o_orderdate, 1995-01-01)                                                                                                                                              | N/A       | N/A       |
|   │     └─TableFullScan_64                     | 14457280.00 | 15000000   | cop[tikv] | table:orders   | tikv_task:{proc max:185ms, min:0s, avg: 28.2ms, p80:51ms, p95:115ms, iters:15588, tasks:230}                                                                                                                                                                                                                                                                                                                                | keep order:false                                                                                                                                                                                                              | N/A       | N/A       |
|   └─TableReader_27(Probe)                      | 2196527.02  | 3246208    | root      |                | time:1m32.3s, loops:143016, cop_task: {num: 105579, max: 128.6ms, min: 352.2µs, avg: 1.09ms, p95: 1.72ms, max_proc_keys: 74, p95_proc_keys: 54, tot_proc: 294ms, tot_wait: 3.11s, rpc_num: 105579, rpc_time: 1m51.6s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                                                  | data:TableRangeScan_26                                                                                                                                                                                                        | N/A       | N/A       |
|     └─TableRangeScan_26                        | 2196527.02  | 3246208    | cop[tikv] | table:lineitem | tikv_task:{proc max:12ms, min:0s, avg: 179.2µs, p80:0s, p95:1ms, iters:160687, tasks:105579}, scan_detail: {total_process_keys: 3246208, total_process_keys_size: 636572514, total_keys: 4058610, get_snapshot_time: 2.9s, rocksdb: {key_skipped_count: 3246208, block: {cache_hit_count: 2613923}}}                                                                                                                        | range: decided by [eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)], keep order:false                                                                                                                                    | N/A       | N/A       |
+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
20 rows in set (40 min 17.39 sec)

可以看到HashJoin_37最多时候用了17.9GB的内存和44.8GB的磁盘空间。

heap和CPU信息截图如下:
image
image
CPU这里主要是落盘慢,参考这里:https://asktug.com/t/topic/994873 (sort、hashjoin、agg等落盘都一套逻辑,优化后均受益)

对nation和region表做统计信息搜集让其走正确的执行计划后即可较快完成:

mysql> explain analyze
    -> select 
    ->         count(*)
    -> from
    ->         customer,
    ->         orders,
    ->         lineitem,
    ->         supplier,
    ->         nation,
    ->         region
    -> where
    ->         c_custkey = o_custkey
    ->         and l_orderkey = o_orderkey
    ->         and l_suppkey = s_suppkey
    ->         and c_nationkey = s_nationkey
    ->         and s_nationkey = n_nationkey
    ->         and n_regionkey = r_regionkey
    ->         and r_name = 'ASIA'
    ->         and o_orderdate >= date '1994-01-01'
    ->         and o_orderdate < date '1994-01-01' + interval '1' year;
+--------------------------------------------+-------------+----------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| id                                         | estRows     | actRows  | task      | access object  | execution info                                                                                                                                                                                                                                                                                                                                                              | operator info                                                                                                                   | memory    | disk    |
+--------------------------------------------+-------------+----------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| HashAgg_24                                 | 1.00        | 1        | root      |                | time:5.69s, loops:2, partial_worker:{wall_time:5.690198408s, concurrency:5, task_num:74, tot_wait:28.445986387s, tot_exec:3.734988ms, tot_time:28.449773521s, max:5.689979264s, p95:5.689979264s}, final_worker:{wall_time:5.690280272s, concurrency:5, task_num:5, tot_wait:28.449951686s, tot_exec:22.054µs, tot_time:28.449976184s, max:5.690022202s, p95:5.690022202s}  | funcs:count(1)->Column#48                                                                                                       | 165.3 KB  | N/A     |
| └─HashJoin_34                              | 2196527.02  | 72985    | root      |                | time:5.69s, loops:75, build_hash_table:{total:735ms, fetch:60.2ms, build:674.8ms}, probe:{concurrency:5, total:28.4s, max:5.69s, probe:1.06s, fetch:27.4s}                                                                                                                                                                                                                  | inner join, equal:[eq(tpch.orders.o_custkey, tpch.customer.c_custkey) eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)] | 117.9 MB  | 0 Bytes |
|   ├─TableReader_80(Build)                  | 1500000.00  | 1500000  | root      |                | time:48.3ms, loops:1469, cop_task: {num: 54, max: 47.9ms, min: 456.6µs, avg: 14.5ms, p95: 35.6ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 72ms, tot_wait: 404ms, rpc_num: 54, rpc_time: 784ms, copr_cache_hit_ratio: 0.74, distsql_concurrency: 15}                                                                                                           | data:TableFullScan_79                                                                                                           | 3.07 MB   | N/A     |
|   │ └─TableFullScan_79                     | 1500000.00  | 1500000  | cop[tikv] | table:customer | tikv_task:{proc max:27ms, min:0s, avg: 10.6ms, p80:19ms, p95:24ms, iters:1678, tasks:54}, scan_detail: {total_process_keys: 187199, total_process_keys_size: 38071167, total_keys: 187213, get_snapshot_time: 2.83ms, rocksdb: {key_skipped_count: 187199, block: {cache_hit_count: 677}}}                                                                                  | keep order:false                                                                                                                | N/A       | N/A     |
|   └─HashJoin_48(Probe)                     | 2196527.02  | 1825042  | root      |                | time:5.69s, loops:1786, build_hash_table:{total:1.3s, fetch:345.2ms, build:954.8ms}, probe:{concurrency:5, total:28.4s, max:5.69s, probe:6.4s, fetch:22s}                                                                                                                                                                                                                   | inner join, equal:[eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)]                                                        | 246.2 MB  | 0 Bytes |
|     ├─TableReader_78(Build)                | 2196527.02  | 2275919  | root      |                | time:360.3ms, loops:2217, cop_task: {num: 230, max: 274.5ms, min: 366.2µs, avg: 44.1ms, p95: 170.9ms, max_proc_keys: 331744, p95_proc_keys: 219104, tot_proc: 7.97s, tot_wait: 1.08s, rpc_num: 230, rpc_time: 10.1s, copr_cache_hit_ratio: 0.25, distsql_concurrency: 15}                                                                                                   | data:Selection_77                                                                                                               | 8.86 MB   | N/A     |
|     │ └─Selection_77                       | 2196527.02  | 2275919  | cop[tikv] |                | tikv_task:{proc max:246ms, min:0s, avg: 35.1ms, p80:73ms, p95:146ms, iters:15588, tasks:230}, scan_detail: {total_process_keys: 14750944, total_process_keys_size: 2239915746, total_keys: 14751117, get_snapshot_time: 29ms, rocksdb: {key_skipped_count: 14750944, block: {cache_hit_count: 37943}}}                                                                      | ge(tpch.orders.o_orderdate, 1994-01-01), lt(tpch.orders.o_orderdate, 1995-01-01)                                                | N/A       | N/A     |
|     │   └─TableFullScan_76                 | 14457280.00 | 15000000 | cop[tikv] | table:orders   | tikv_task:{proc max:236ms, min:0s, avg: 32.8ms, p80:67ms, p95:135ms, iters:15588, tasks:230}                                                                                                                                                                                                                                                                                | keep order:false                                                                                                                | N/A       | N/A     |
|     └─HashJoin_51(Probe)                   | 12074487.12 | 12021065 | root      |                | time:4.45s, loops:11742, build_hash_table:{total:102.3ms, fetch:98.9ms, build:3.42ms}, probe:{concurrency:5, total:28.4s, max:5.69s, probe:21.4s, fetch:7.06s}                                                                                                                                                                                                              | inner join, equal:[eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey)]                                                        | 1.77 MB   | 0 Bytes |
|       ├─HashJoin_53(Build)                 | 20000.00    | 20037    | root      |                | time:100.6ms, loops:25, build_hash_table:{total:358.3µs, fetch:349.1µs, build:9.23µs}, probe:{concurrency:5, total:508.4ms, max:101.8ms, probe:21.4ms, fetch:487.1ms}                                                                                                                                                                                                       | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)]                                                      | 1.59 KB   | 0 Bytes |
|       │ ├─HashJoin_66(Build)               | 5.00        | 5        | root      |                | time:333.7µs, loops:2, build_hash_table:{total:156.6µs, fetch:148.2µs, build:8.39µs}, probe:{concurrency:5, total:769.8µs, max:174.1µs, probe:39.8µs, fetch:730µs}                                                                                                                                                                                                          | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)]                                                        | 28.5 KB   | 0 Bytes |
|       │ │ ├─TableReader_71(Build)          | 1.00        | 1        | root      |                | time:39.4µs, loops:2, cop_task: {num: 1, max: 1.19ms, proc_keys: 5, rpc_num: 1, rpc_time: 1.15ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                                                                                                                      | data:Selection_70                                                                                                               | 313 Bytes | N/A     |
|       │ │ │ └─Selection_70                 | 1.00        | 1        | cop[tikv] |                | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 5, total_process_keys_size: 559, total_keys: 6, get_snapshot_time: 35.8µs, rocksdb: {key_skipped_count: 5, block: {cache_hit_count: 4}}}                                                                                                                                                                    | eq(tpch.region.r_name, "ASIA")                                                                                                  | N/A       | N/A     |
|       │ │ │   └─TableFullScan_69           | 5.00        | 5        | cop[tikv] | table:region   | tikv_task:{time:0s, loops:1}                                                                                                                                                                                                                                                                                                                                                | keep order:false                                                                                                                | N/A       | N/A     |
|       │ │ └─TableReader_68(Probe)          | 25.00       | 25       | root      |                | time:41.6µs, loops:2, cop_task: {num: 1, max: 1.08ms, proc_keys: 25, rpc_num: 1, rpc_time: 1.06ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                                                                                                                                     | data:TableFullScan_67                                                                                                           | 680 Bytes | N/A     |
|       │ │   └─TableFullScan_67             | 25.00       | 25       | cop[tikv] | table:nation   | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 25, total_process_keys_size: 3109, total_keys: 26, get_snapshot_time: 29.7µs, rocksdb: {key_skipped_count: 25, block: {cache_hit_count: 4}}}                                                                                                                                                                | keep order:false                                                                                                                | N/A       | N/A     |
|       │ └─TableReader_73(Probe)            | 100000.00   | 100000   | root      |                | time:95.4ms, loops:100, cop_task: {num: 10, max: 19.4ms, min: 931.9µs, avg: 9.87ms, p95: 19.4ms, max_proc_keys: 480, p95_proc_keys: 480, rpc_num: 10, rpc_time: 98.5ms, copr_cache_hit_ratio: 0.90, distsql_concurrency: 15}                                                                                                                                                | data:TableFullScan_72                                                                                                           | 964.4 KB  | N/A     |
|       │   └─TableFullScan_72               | 100000.00   | 100000   | cop[tikv] | table:supplier | tikv_task:{proc max:40ms, min:1ms, avg: 11ms, p80:20ms, p95:40ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 480, total_process_keys_size: 87745, total_keys: 481, get_snapshot_time: 342.1µs, rocksdb: {key_skipped_count: 480, block: {cache_hit_count: 6}}}                                                                                                 | keep order:false                                                                                                                | N/A       | N/A     |
|       └─TableReader_75(Probe)              | 59986052.00 | 59986052 | root      |                | time:924.6ms, loops:58781, cop_task: {num: 2197, max: 103.9ms, min: 724.9µs, avg: 23.6ms, p95: 51.4ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 31.3s, tot_wait: 4.27s, rpc_num: 2197, rpc_time: 51.7s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}                                                                                                   | data:TableFullScan_74                                                                                                           | 7.78 MB   | N/A     |
|         └─TableFullScan_74                 | 59986052.00 | 59986052 | cop[tikv] | table:lineitem | tikv_task:{proc max:90ms, min:0s, avg: 14ms, p80:26ms, p95:35ms, iters:67264, tasks:2197}, scan_detail: {total_process_keys: 59972516, total_process_keys_size: 11760420816, total_keys: 59974704, get_snapshot_time: 265.8ms, rocksdb: {key_skipped_count: 59972516, block: {cache_hit_count: 192071}}}                                                                    | keep order:false                                                                                                                | N/A       | N/A     |
+--------------------------------------------+-------------+----------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
20 rows in set (5.69 sec)

根据误打正着,沿着错误的执行计划分析,有一点不解:
为何在语句控制内存为1GB且 tidb_mem_oom_action=CANCEL 和tidb_enable_tmp_storage_on_oom=ON的前提下本应该超过内存落盘但是这里被cancel掉了呢?
是不是这里的没有落盘问题导致的?https://github.com/pingcap/tidb/blob/d0b72a2cc79bc8568935e30d49d1f269af1ce0ee/executor/hash_table.go#L242

这样的大表还是放在cdh大数据里面跑比较合适

可能是这个原因,可以验证一下,将 tidb_enable_tmp_storage_on_oom 关掉,试试 。https://docs.pingcap.com/zh/tidb/dev/system-variables#tidb_mem_oom_action-从-v610-版本开始引入

和这个没关系吧,红色圈起来的不能再利用临时磁盘应该是如果有就优先用,如果没有才会cancel。
如果tidb_enable_tmp_storage_on_oom=OFF一定会被cancel,更不符合要求。
这里我认为应该先落盘,当磁盘没有空间后才cancel(也不会用超过tidb_mem_quota_query的内存)。
我实际测试时候磁盘没有满。

hashjoin的hashtable相关落盘应该在这里:https://github.com/pingcap/tidb/blob/706c3fa3c526cdba5b3e9f066b1a568fb96c56e3/executor/join.go#L1206
MemTracker.FallbackOldAndSetNewAction 这个方法是给当前对象上的动作队列上添加新的动作,然后根据动作的优先级属性进行重排,优先级高的调用时候最先执行。
当打开了落盘后,落盘的优先级要高于cancel。
优先级的顺序为:https://github.com/pingcap/tidb/blob/706c3fa3c526cdba5b3e9f066b1a568fb96c56e3/util/memory/action.go#L81
开启限流则优先限流,然后落盘,然后(oom-action=log则只是打印日志警告,如果为cancel则直接kill)。
因此这里落盘和cancel的逻辑不冲突,应该先落盘,无法落盘后再cancel。这里的问题感觉是做hashtable时候攒批落盘的“批”过大导致内存不足然后被杀掉,但是不知道是不是这样子,如果是是不是应该代码层面做下优化。

如果磁盘没有满, 应该不会 cancel。稍后测试一下。

这里有测试结果,是多次重复测试的结果。
你应该比较难复现,因为需要nation和regon表不搜集统计信息(psudo)。

如果复现需要把这两个表的统计信息清空(可以重建表方式来清空统计信息)。

Mark,这里的OOM主要是内部发生了cross join,issues见这里:https://github.com/pingcap/tidb/issues/40500
后续应该会修复。