【 TiDB 使用环境】Poc
【 TiDB 版本】6.5.0
【复现路径】 组件涉及:tidb、tikv 在默认参数情况下执行TPCH的Q5语句时候发现在错误的执行计划下总是出现语句级别的OOM。
Q5语句本来是对结果集做分组聚合,之类进行简化,测试语句和执行计划为(因表较小这里没有对nation和region表做统计信息搜集,其它表做了搜集,导致了错误的执行计划):
select
count(*)
from
customer,
orders,
lineitem,
supplier,
nation,
region
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and l_suppkey = s_suppkey
and c_nationkey = s_nationkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'ASIA'
and o_orderdate >= date '1994-01-01'
and o_orderdate < date '1994-01-01' + interval '1' year;
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_24 | 1.00 | root | | funcs:count(1)->Column#48 |
| └─IndexJoin_30 | 8738986.41 | root | | inner join, inner:TableReader_27, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey, equal cond:eq(tpch.orders.o_orderkey, tpch.lineitem.l_orderkey), eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey) |
| ├─HashJoin_37(Build) | 2196527.02 | root | | inner join, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)] |
| │ ├─HashJoin_39(Build) | 1431131.00 | root | | inner join, equal:[eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)] |
| │ │ ├─HashJoin_41(Build) | 25.00 | root | | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)] |
| │ │ │ ├─HashJoin_54(Build) | 0.01 | root | | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)] |
| │ │ │ │ ├─TableReader_59(Build) | 0.01 | root | | data:Selection_58 |
| │ │ │ │ │ └─Selection_58 | 0.01 | cop[tikv] | | eq(tpch.region.r_name, "ASIA") |
| │ │ │ │ │ └─TableFullScan_57 | 5.00 | cop[tikv] | table:region | keep order:false, stats:pseudo |
| │ │ │ │ └─TableReader_56(Probe) | 25.00 | root | | data:TableFullScan_55 |
| │ │ │ │ └─TableFullScan_55 | 25.00 | cop[tikv] | table:nation | keep order:false, stats:pseudo |
| │ │ │ └─TableReader_61(Probe) | 100000.00 | root | | data:TableFullScan_60 |
| │ │ │ └─TableFullScan_60 | 100000.00 | cop[tikv] | table:supplier | keep order:false |
| │ │ └─TableReader_63(Probe) | 1500000.00 | root | | data:TableFullScan_62 |
| │ │ └─TableFullScan_62 | 1500000.00 | cop[tikv] | table:customer | keep order:false |
| │ └─TableReader_66(Probe) | 2196527.02 | root | | data:Selection_65 |
| │ └─Selection_65 | 2196527.02 | cop[tikv] | | ge(tpch.orders.o_orderdate, 1994-01-01), lt(tpch.orders.o_orderdate, 1995-01-01) |
| │ └─TableFullScan_64 | 14457280.00 | cop[tikv] | table:orders | keep order:false |
| └─TableReader_27(Probe) | 2196527.02 | root | | data:TableRangeScan_26 |
| └─TableRangeScan_26 | 2196527.02 | cop[tikv] | table:lineitem | range: decided by [eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)], keep order:false |
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
20 rows in set (0.00 sec)
当参数为默认设置时,内存相关参数为:
mysql> show variables like 'tidb_mem_quota_query';
+----------------------+------------+
| Variable_name | Value |
+----------------------+------------+
| tidb_mem_quota_query | 1073741824 |
+----------------------+------------+
1 row in set (0.01 sec)
mysql> show variables like '%oom%';
+--------------------------------+--------+
| Variable_name | Value |
+--------------------------------+--------+
| tidb_enable_tmp_storage_on_oom | ON |
| tidb_mem_oom_action | CANCEL |
+--------------------------------+--------+
2 rows in set (0.00 sec)
mysql> show config where name like 'tmp-storage%';
+------+---------------------+-------------------+-----------------------------------------------------------------+
| Type | Instance | Name | Value |
+------+---------------------+-------------------+-----------------------------------------------------------------+
| tidb | 192.168.31.201:4002 | tmp-storage-path | /tmp/1000_tidb/MC4wLjAuMDo0MDAyLzAuMC4wLjA6MTAwODI=/tmp-storage |
| tidb | 192.168.31.201:4002 | tmp-storage-quota | -1 |
+------+---------------------+-------------------+-----------------------------------------------------------------+
2 rows in set (0.01 sec)
执行该语句:
mysql> explain analyze select count(*) from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'ASIA' and o_orderdate >= date '1994-01-01'
and o_orderdate < date '1994-01-01' + interval '1' year;
ERROR 1105 (HY000): context canceled
mysql> explain analyze select count(*) from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'ASIA' and o_orderdate >= date '1994-01-01'
and o_orderdate < date '1994-01-01' + interval '1' year;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id: 415
Current database: tpch
ERROR 1105 (HY000): Out Of Memory Quota![conn_id=293158387267404191]
可以看到发生了OOM,这里我已经设置了落盘,但是还是被cancel掉了!
修改参数set global tidb_mem_oom_action=LOG; 让其超出内存后仅打印日志,不杀掉该语句,但是根据oom优先级还是会尽可能的落到磁盘,这里观察语句真正执行时候到底会不得不用到多少内存,explain analyze执行结果如下:
+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
| HashAgg_24 | 1.00 | 1 | root | | time:40m16.7s, loops:2, partial_worker:{wall_time:40m16.708802251s, concurrency:5, task_num:72, tot_wait:3h21m23.538992743s, tot_exec:2.902767ms, tot_time:3h21m23.543541588s, max:40m16.708722824s, p95:40m16.708722824s}, final_worker:{wall_time:40m16.709129743s, concurrency:5, task_num:5, tot_wait:3h21m23.543673912s, tot_exec:342.368µs, tot_time:3h21m23.544019075s, max:40m16.709041758s, p95:40m16.709041758s} | funcs:count(1)->Column#48 | 165.3 KB | N/A |
| └─IndexJoin_30 | 8738986.41 | 72985 | root | | time:40m16.7s, loops:73, inner:{total:12m43.5s, concurrency:5, task:71508, construct:10m57.2s, fetch:1m44.8s, build:1.33s}, probe:2m15s | inner join, inner:TableReader_27, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey, equal cond:eq(tpch.orders.o_orderkey, tpch.lineitem.l_orderkey), eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey) | 10.2 MB | N/A |
| ├─HashJoin_37(Build) | 2196527.02 | 1830400706 | root | | time:39m51.8s, loops:1787505, build_hash_table:{total:15m1s, fetch:11m33.8s, build:3m27.3s}, probe:{concurrency:5, total:3h21m17.6s, max:40m16.7s, probe:2h6m12.3s, fetch:1h15m5.3s} | inner join, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)] | 17.9 GB | 44.8 GB |
| │ ├─HashJoin_39(Build) | 1431131.00 | 1203307921 | root | | time:12m31.8s, loops:1175109, build_hash_table:{total:15.5ms, fetch:13.4ms, build:2.11ms}, probe:{concurrency:5, total:1h15m1.6s, max:15m1s, probe:1h15m1.3s, fetch:245ms} | inner join, equal:[eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)] | 846.4 KB | 782.7 KB |
| │ │ ├─HashJoin_41(Build) | 25.00 | 20037 | root | | time:14.5ms, loops:23, build_hash_table:{total:2.36ms, fetch:2.35ms, build:10.1µs}, probe:{concurrency:5, total:75.2ms, max:15.2ms, probe:15ms, fetch:60.2ms} | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)] | 1.59 KB | 120 Bytes |
| │ │ │ ├─HashJoin_54(Build) | 0.01 | 5 | root | | time:2.34ms, loops:2, build_hash_table:{total:2.14ms, fetch:2.13ms, build:10.1µs}, probe:{concurrency:5, total:10.7ms, max:2.22ms, probe:66.6µs, fetch:10.6ms} | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)] | 31.1 KB | 36 Bytes |
| │ │ │ │ ├─TableReader_59(Build) | 0.01 | 1 | root | | time:2.11ms, loops:2, cop_task: {num: 1, max: 2.6ms, proc_keys: 5, rpc_num: 1, rpc_time: 2.57ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:Selection_58 | 312 Bytes | N/A |
| │ │ │ │ │ └─Selection_58 | 0.01 | 1 | cop[tikv] | | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 5, total_process_keys_size: 559, total_keys: 6, get_snapshot_time: 3.1µs, rocksdb: {key_skipped_count: 5, block: {cache_hit_count: 4}}} | eq(tpch.region.r_name, "ASIA") | N/A | N/A |
| │ │ │ │ │ └─TableFullScan_57 | 5.00 | 5 | cop[tikv] | table:region | tikv_task:{time:0s, loops:1} | keep order:false, stats:pseudo | N/A | N/A |
| │ │ │ │ └─TableReader_56(Probe) | 25.00 | 25 | root | | time:469.1µs, loops:2, cop_task: {num: 1, max: 961.6µs, proc_keys: 25, rpc_num: 1, rpc_time: 938.3µs, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:TableFullScan_55 | 679 Bytes | N/A |
| │ │ │ │ └─TableFullScan_55 | 25.00 | 25 | cop[tikv] | table:nation | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 25, total_process_keys_size: 3109, total_keys: 26, get_snapshot_time: 7.15µs, rocksdb: {key_skipped_count: 25, block: {cache_hit_count: 4}}} | keep order:false, stats:pseudo | N/A | N/A |
| │ │ │ └─TableReader_61(Probe) | 100000.00 | 100000 | root | | time:11.8ms, loops:100, cop_task: {num: 10, max: 2.55ms, min: 770.1µs, avg: 1.29ms, p95: 2.55ms, max_proc_keys: 480, p95_proc_keys: 480, rpc_num: 10, rpc_time: 12.8ms, copr_cache_hit_ratio: 0.90, distsql_concurrency: 15} | data:TableFullScan_60 | 964.4 KB | N/A |
| │ │ │ └─TableFullScan_60 | 100000.00 | 100000 | cop[tikv] | table:supplier | tikv_task:{proc max:40ms, min:1ms, avg: 11ms, p80:20ms, p95:40ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 480, total_process_keys_size: 87745, total_keys: 481, get_snapshot_time: 254.5µs, rocksdb: {key_skipped_count: 480, block: {cache_hit_count: 6}}} | keep order:false | N/A | N/A |
| │ │ └─TableReader_63(Probe) | 1500000.00 | 1500000 | root | | time:153ms, loops:1468, cop_task: {num: 54, max: 31ms, min: 242.7µs, avg: 9.32ms, p95: 27.2ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 395ms, tot_wait: 14ms, rpc_num: 54, rpc_time: 502ms, copr_cache_hit_ratio: 0.37, distsql_concurrency: 15} | data:TableFullScan_62 | 3.07 MB | N/A |
| │ │ └─TableFullScan_62 | 1500000.00 | 1500000 | cop[tikv] | table:customer | tikv_task:{proc max:27ms, min:0s, avg: 10.5ms, p80:19ms, p95:24ms, iters:1678, tasks:54}, scan_detail: {total_process_keys: 1044960, total_process_keys_size: 212609983, total_keys: 1044994, get_snapshot_time: 12.4ms, rocksdb: {key_skipped_count: 1044960, block: {cache_hit_count: 3603}}} | keep order:false | N/A | N/A |
| │ └─TableReader_66(Probe) | 2196527.02 | 2275919 | root | | time:52.4ms, loops:2217, cop_task: {num: 230, max: 212.8ms, min: 310.7µs, avg: 32.6ms, p95: 133.5ms, max_proc_keys: 331744, p95_proc_keys: 219104, tot_proc: 7.06s, tot_wait: 40ms, rpc_num: 230, rpc_time: 7.5s, copr_cache_hit_ratio: 0.21, distsql_concurrency: 15} | data:Selection_65 | 12.0 MB | N/A |
| │ └─Selection_65 | 2196527.02 | 2275919 | cop[tikv] | | tikv_task:{proc max:202ms, min:0s, avg: 30.7ms, p80:56ms, p95:125ms, iters:15588, tasks:230}, scan_detail: {total_process_keys: 14790624, total_process_keys_size: 2245939801, total_keys: 14790805, get_snapshot_time: 32.7ms, rocksdb: {key_skipped_count: 14790624, block: {cache_hit_count: 38074}}} | ge(tpch.orders.o_orderdate, 1994-01-01), lt(tpch.orders.o_orderdate, 1995-01-01) | N/A | N/A |
| │ └─TableFullScan_64 | 14457280.00 | 15000000 | cop[tikv] | table:orders | tikv_task:{proc max:185ms, min:0s, avg: 28.2ms, p80:51ms, p95:115ms, iters:15588, tasks:230} | keep order:false | N/A | N/A |
| └─TableReader_27(Probe) | 2196527.02 | 3246208 | root | | time:1m32.3s, loops:143016, cop_task: {num: 105579, max: 128.6ms, min: 352.2µs, avg: 1.09ms, p95: 1.72ms, max_proc_keys: 74, p95_proc_keys: 54, tot_proc: 294ms, tot_wait: 3.11s, rpc_num: 105579, rpc_time: 1m51.6s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:TableRangeScan_26 | N/A | N/A |
| └─TableRangeScan_26 | 2196527.02 | 3246208 | cop[tikv] | table:lineitem | tikv_task:{proc max:12ms, min:0s, avg: 179.2µs, p80:0s, p95:1ms, iters:160687, tasks:105579}, scan_detail: {total_process_keys: 3246208, total_process_keys_size: 636572514, total_keys: 4058610, get_snapshot_time: 2.9s, rocksdb: {key_skipped_count: 3246208, block: {cache_hit_count: 2613923}}} | range: decided by [eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)], keep order:false | N/A | N/A |
+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
20 rows in set (40 min 17.39 sec)
可以看到HashJoin_37最多时候用了17.9GB的内存和44.8GB的磁盘空间。
heap和CPU信息截图如下:
CPU这里主要是落盘慢,参考这里:https://asktug.com/t/topic/994873 (sort、hashjoin、agg等落盘都一套逻辑,优化后均受益)
对nation和region表做统计信息搜集让其走正确的执行计划后即可较快完成:
mysql> explain analyze
-> select
-> count(*)
-> from
-> customer,
-> orders,
-> lineitem,
-> supplier,
-> nation,
-> region
-> where
-> c_custkey = o_custkey
-> and l_orderkey = o_orderkey
-> and l_suppkey = s_suppkey
-> and c_nationkey = s_nationkey
-> and s_nationkey = n_nationkey
-> and n_regionkey = r_regionkey
-> and r_name = 'ASIA'
-> and o_orderdate >= date '1994-01-01'
-> and o_orderdate < date '1994-01-01' + interval '1' year;
+--------------------------------------------+-------------+----------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+--------------------------------------------+-------------+----------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| HashAgg_24 | 1.00 | 1 | root | | time:5.69s, loops:2, partial_worker:{wall_time:5.690198408s, concurrency:5, task_num:74, tot_wait:28.445986387s, tot_exec:3.734988ms, tot_time:28.449773521s, max:5.689979264s, p95:5.689979264s}, final_worker:{wall_time:5.690280272s, concurrency:5, task_num:5, tot_wait:28.449951686s, tot_exec:22.054µs, tot_time:28.449976184s, max:5.690022202s, p95:5.690022202s} | funcs:count(1)->Column#48 | 165.3 KB | N/A |
| └─HashJoin_34 | 2196527.02 | 72985 | root | | time:5.69s, loops:75, build_hash_table:{total:735ms, fetch:60.2ms, build:674.8ms}, probe:{concurrency:5, total:28.4s, max:5.69s, probe:1.06s, fetch:27.4s} | inner join, equal:[eq(tpch.orders.o_custkey, tpch.customer.c_custkey) eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)] | 117.9 MB | 0 Bytes |
| ├─TableReader_80(Build) | 1500000.00 | 1500000 | root | | time:48.3ms, loops:1469, cop_task: {num: 54, max: 47.9ms, min: 456.6µs, avg: 14.5ms, p95: 35.6ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 72ms, tot_wait: 404ms, rpc_num: 54, rpc_time: 784ms, copr_cache_hit_ratio: 0.74, distsql_concurrency: 15} | data:TableFullScan_79 | 3.07 MB | N/A |
| │ └─TableFullScan_79 | 1500000.00 | 1500000 | cop[tikv] | table:customer | tikv_task:{proc max:27ms, min:0s, avg: 10.6ms, p80:19ms, p95:24ms, iters:1678, tasks:54}, scan_detail: {total_process_keys: 187199, total_process_keys_size: 38071167, total_keys: 187213, get_snapshot_time: 2.83ms, rocksdb: {key_skipped_count: 187199, block: {cache_hit_count: 677}}} | keep order:false | N/A | N/A |
| └─HashJoin_48(Probe) | 2196527.02 | 1825042 | root | | time:5.69s, loops:1786, build_hash_table:{total:1.3s, fetch:345.2ms, build:954.8ms}, probe:{concurrency:5, total:28.4s, max:5.69s, probe:6.4s, fetch:22s} | inner join, equal:[eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)] | 246.2 MB | 0 Bytes |
| ├─TableReader_78(Build) | 2196527.02 | 2275919 | root | | time:360.3ms, loops:2217, cop_task: {num: 230, max: 274.5ms, min: 366.2µs, avg: 44.1ms, p95: 170.9ms, max_proc_keys: 331744, p95_proc_keys: 219104, tot_proc: 7.97s, tot_wait: 1.08s, rpc_num: 230, rpc_time: 10.1s, copr_cache_hit_ratio: 0.25, distsql_concurrency: 15} | data:Selection_77 | 8.86 MB | N/A |
| │ └─Selection_77 | 2196527.02 | 2275919 | cop[tikv] | | tikv_task:{proc max:246ms, min:0s, avg: 35.1ms, p80:73ms, p95:146ms, iters:15588, tasks:230}, scan_detail: {total_process_keys: 14750944, total_process_keys_size: 2239915746, total_keys: 14751117, get_snapshot_time: 29ms, rocksdb: {key_skipped_count: 14750944, block: {cache_hit_count: 37943}}} | ge(tpch.orders.o_orderdate, 1994-01-01), lt(tpch.orders.o_orderdate, 1995-01-01) | N/A | N/A |
| │ └─TableFullScan_76 | 14457280.00 | 15000000 | cop[tikv] | table:orders | tikv_task:{proc max:236ms, min:0s, avg: 32.8ms, p80:67ms, p95:135ms, iters:15588, tasks:230} | keep order:false | N/A | N/A |
| └─HashJoin_51(Probe) | 12074487.12 | 12021065 | root | | time:4.45s, loops:11742, build_hash_table:{total:102.3ms, fetch:98.9ms, build:3.42ms}, probe:{concurrency:5, total:28.4s, max:5.69s, probe:21.4s, fetch:7.06s} | inner join, equal:[eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey)] | 1.77 MB | 0 Bytes |
| ├─HashJoin_53(Build) | 20000.00 | 20037 | root | | time:100.6ms, loops:25, build_hash_table:{total:358.3µs, fetch:349.1µs, build:9.23µs}, probe:{concurrency:5, total:508.4ms, max:101.8ms, probe:21.4ms, fetch:487.1ms} | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)] | 1.59 KB | 0 Bytes |
| │ ├─HashJoin_66(Build) | 5.00 | 5 | root | | time:333.7µs, loops:2, build_hash_table:{total:156.6µs, fetch:148.2µs, build:8.39µs}, probe:{concurrency:5, total:769.8µs, max:174.1µs, probe:39.8µs, fetch:730µs} | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)] | 28.5 KB | 0 Bytes |
| │ │ ├─TableReader_71(Build) | 1.00 | 1 | root | | time:39.4µs, loops:2, cop_task: {num: 1, max: 1.19ms, proc_keys: 5, rpc_num: 1, rpc_time: 1.15ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:Selection_70 | 313 Bytes | N/A |
| │ │ │ └─Selection_70 | 1.00 | 1 | cop[tikv] | | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 5, total_process_keys_size: 559, total_keys: 6, get_snapshot_time: 35.8µs, rocksdb: {key_skipped_count: 5, block: {cache_hit_count: 4}}} | eq(tpch.region.r_name, "ASIA") | N/A | N/A |
| │ │ │ └─TableFullScan_69 | 5.00 | 5 | cop[tikv] | table:region | tikv_task:{time:0s, loops:1} | keep order:false | N/A | N/A |
| │ │ └─TableReader_68(Probe) | 25.00 | 25 | root | | time:41.6µs, loops:2, cop_task: {num: 1, max: 1.08ms, proc_keys: 25, rpc_num: 1, rpc_time: 1.06ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:TableFullScan_67 | 680 Bytes | N/A |
| │ │ └─TableFullScan_67 | 25.00 | 25 | cop[tikv] | table:nation | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 25, total_process_keys_size: 3109, total_keys: 26, get_snapshot_time: 29.7µs, rocksdb: {key_skipped_count: 25, block: {cache_hit_count: 4}}} | keep order:false | N/A | N/A |
| │ └─TableReader_73(Probe) | 100000.00 | 100000 | root | | time:95.4ms, loops:100, cop_task: {num: 10, max: 19.4ms, min: 931.9µs, avg: 9.87ms, p95: 19.4ms, max_proc_keys: 480, p95_proc_keys: 480, rpc_num: 10, rpc_time: 98.5ms, copr_cache_hit_ratio: 0.90, distsql_concurrency: 15} | data:TableFullScan_72 | 964.4 KB | N/A |
| │ └─TableFullScan_72 | 100000.00 | 100000 | cop[tikv] | table:supplier | tikv_task:{proc max:40ms, min:1ms, avg: 11ms, p80:20ms, p95:40ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 480, total_process_keys_size: 87745, total_keys: 481, get_snapshot_time: 342.1µs, rocksdb: {key_skipped_count: 480, block: {cache_hit_count: 6}}} | keep order:false | N/A | N/A |
| └─TableReader_75(Probe) | 59986052.00 | 59986052 | root | | time:924.6ms, loops:58781, cop_task: {num: 2197, max: 103.9ms, min: 724.9µs, avg: 23.6ms, p95: 51.4ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 31.3s, tot_wait: 4.27s, rpc_num: 2197, rpc_time: 51.7s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:TableFullScan_74 | 7.78 MB | N/A |
| └─TableFullScan_74 | 59986052.00 | 59986052 | cop[tikv] | table:lineitem | tikv_task:{proc max:90ms, min:0s, avg: 14ms, p80:26ms, p95:35ms, iters:67264, tasks:2197}, scan_detail: {total_process_keys: 59972516, total_process_keys_size: 11760420816, total_keys: 59974704, get_snapshot_time: 265.8ms, rocksdb: {key_skipped_count: 59972516, block: {cache_hit_count: 192071}}} | keep order:false | N/A | N/A |
+--------------------------------------------+-------------+----------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
20 rows in set (5.69 sec)
根据误打正着,沿着错误的执行计划分析,有一点不解:
为何在语句控制内存为1GB且 tidb_mem_oom_action=CANCEL 和tidb_enable_tmp_storage_on_oom=ON的前提下本应该超过内存落盘但是这里被cancel掉了呢?
是不是这里的没有落盘问题导致的?https://github.com/pingcap/tidb/blob/d0b72a2cc79bc8568935e30d49d1f269af1ce0ee/executor/hash_table.go#L242