【需求涉及的问题场景】select * from a,b where a.id=b.id (走索引) and substr(a.name,1,10)=substr(b.name,1,10),在指定hint的情况下无法走IndexJoin
【期望的需求行为】可以强制走IndexJoin
表结构:
mysql> show create table customer \G
*************************** 1. row ***************************
Table: customer
Create Table: CREATE TABLE `customer` (
`C_CUSTKEY` bigint(20) NOT NULL,
`C_NAME` varchar(25) NOT NULL,
`C_ADDRESS` varchar(40) NOT NULL,
`C_NATIONKEY` bigint(20) NOT NULL,
`C_PHONE` char(15) NOT NULL,
`C_ACCTBAL` decimal(15,2) NOT NULL,
`C_MKTSEGMENT` char(10) NOT NULL,
`C_COMMENT` varchar(117) NOT NULL,
PRIMARY KEY (`C_CUSTKEY`) /*T![clustered_index] CLUSTERED */,
KEY `idx1` (`C_PHONE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.01 sec)
mysql> show create table orders \G
*************************** 1. row ***************************
Table: orders
Create Table: CREATE TABLE `orders` (
`O_ORDERKEY` bigint(20) NOT NULL,
`O_CUSTKEY` bigint(20) NOT NULL,
`O_ORDERSTATUS` char(1) NOT NULL,
`O_TOTALPRICE` decimal(15,2) NOT NULL,
`O_ORDERDATE` date NOT NULL,
`O_ORDERPRIORITY` char(15) NOT NULL,
`O_CLERK` char(15) NOT NULL,
`O_SHIPPRIORITY` bigint(20) NOT NULL,
`O_COMMENT` varchar(79) NOT NULL,
PRIMARY KEY (`O_ORDERKEY`) /*T![clustered_index] CLUSTERED */,
KEY `idx1` (`O_CUSTKEY`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.01 sec)
语句:explain analyze select /*+ INL_JOIN(a,b) / count() from customer a,orders b where a.c_phone=‘25-989-741-2988’ and a.c_custkey=b.o_custkey and a.C_COMMENT=b.O_COMMENT;
b.o_custkey是索引字段,b.o_comment不是索引字段,可以正常走IndexJoin算法。
mysql> explain analyze select /*+ INL_JOIN(a,b) */ count(*) from customer a,orders b where a.c_phone='25-989-741-2988' and a.c_custkey=b.o_custkey and a.C_COMMENT=b.O_COMMENT;
+--------------------------------------+---------+---------+-----------+--------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+--------------------------------------+---------+---------+-----------+--------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+------+
| StreamAgg_10 | 1.00 | 1 | root | | time:16.8ms, loops:2, RU:2.519252 | funcs:count(1)->Column#18 | 8 Bytes | N/A |
| └─IndexJoin_43 | 1.04 | 0 | root | | time:16.8ms, loops:1, inner:{total:14.9ms, concurrency:5, task:1, construct:11.9ms, fetch:2.97ms, build:28.2µs}, probe:2.48µs | inner join, inner:IndexLookUp_42, outer key:tpch1.customer.c_custkey, inner key:tpch1.orders.o_custkey, equal cond:eq(tpch1.customer.c_comment, tpch1.orders.o_comment), eq(tpch1.customer.c_custkey, tpch1.orders.o_custkey) | 51.1 KB | N/A |
| ├─IndexLookUp_37(Build) | 1.04 | 1 | root | | time:1.56ms, loops:3, index_task: {total_time: 750.8µs, fetch_handle: 748.9µs, build: 728ns, wait: 1.12µs}, table_task: {total_time: 708.2µs, num: 1, concurrency: 5}, next: {wait_index: 889.4µs, wait_table_lookup_build: 0s, wait_table_lookup_resp: 648.9µs} | | 2.13 KB | N/A |
| │ ├─IndexRangeScan_35(Build) | 1.04 | 1 | cop[tikv] | table:a, index:idx1(C_PHONE) | time:747.4µs, loops:3, cop_task: {num: 1, max: 668.4µs, proc_keys: 1, tot_proc: 92.2µs, tot_wait: 93.2µs, rpc_num: 1, rpc_time: 646.1µs, copr_cache_hit_ratio: 0.00, build_task_duration: 17.9µs, max_distsql_concurrency: 1}, tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 55, total_keys: 2, get_snapshot_time: 24.2µs, rocksdb: {key_skipped_count: 1, block: {cache_hit_count: 2}}} | range:["25-989-741-2988","25-989-741-2988"], keep order:false | N/A | N/A |
| │ └─TableRowIDScan_36(Probe) | 1.04 | 1 | cop[tikv] | table:a | time:630.3µs, loops:2, cop_task: {num: 1, max: 548µs, proc_keys: 1, tot_proc: 57µs, tot_wait: 71µs, rpc_num: 1, rpc_time: 533.1µs, copr_cache_hit_ratio: 0.00, build_task_duration: 24µs, max_distsql_concurrency: 1, max_extra_concurrency: 1}, tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 184, total_keys: 1, get_snapshot_time: 14.4µs, rocksdb: {block: {cache_hit_count: 1}}} | keep order:false | N/A | N/A |
| └─IndexLookUp_42(Probe) | 1.04 | 6 | root | | time:2.81ms, loops:2, index_task: {total_time: 834.1µs, fetch_handle: 830.6µs, build: 804ns, wait: 2.64µs}, table_task: {total_time: 1.05ms, num: 1, concurrency: 5}, next: {wait_index: 1.71ms, wait_table_lookup_build: 98.5µs, wait_table_lookup_resp: 961.5µs} | | 9.34 KB | N/A |
| ├─IndexRangeScan_40(Build) | 1.04 | 6 | cop[tikv] | table:b, index:idx1(O_CUSTKEY) | time:827.2µs, loops:3, cop_task: {num: 2, max: 713.5µs, min: 654µs, avg: 683.8µs, p95: 713.5µs, max_proc_keys: 6, p95_proc_keys: 6, tot_proc: 186µs, tot_wait: 255.8µs, rpc_num: 2, rpc_time: 1.33ms, copr_cache_hit_ratio: 0.00, build_task_duration: 814.7µs, max_distsql_concurrency: 2}, tikv_task:{proc max:0s, min:0s, avg: 0s, p80:0s, p95:0s, iters:2, tasks:2}, scan_detail: {total_process_keys: 6, total_process_keys_size: 276, total_keys: 8, get_snapshot_time: 86.8µs, rocksdb: {key_skipped_count: 6, block: {cache_hit_count: 1, read_count: 1, read_byte: 3.81 KB, read_time: 12.4µs}}} | range: decided by [eq(tpch1.orders.o_custkey, tpch1.customer.c_custkey)], keep order:false | N/A | N/A |
| └─TableRowIDScan_41(Probe) | 1.04 | 6 | cop[tikv] | table:b | time:944.2µs, loops:2, cop_task: {num: 3, max: 825.8µs, min: 0s, avg: 275.3µs, p95: 825.8µs, max_proc_keys: 3, p95_proc_keys: 3, tot_proc: 190.9µs, tot_wait: 280.1µs, rpc_num: 1, rpc_time: 810.6µs, copr_cache_hit_ratio: 0.00, build_task_duration: 29.8µs, max_distsql_concurrency: 1, max_extra_concurrency: 1, store_batch_num: 2}, tikv_task:{proc max:0s, min:0s, avg: 0s, p80:0s, p95:0s, iters:3, tasks:3}, scan_detail: {total_process_keys: 6, total_process_keys_size: 929, total_keys: 6, get_snapshot_time: 39.6µs, rocksdb: {block: {cache_hit_count: 6}}} | keep order:false | N/A | N/A |
+--------------------------------------+---------+---------+-----------+--------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+------+
8 rows in set (0.02 sec)
当在 a.C_COMMENT=b.O_COMMENT 关联条件字段上(b.O_COMMENT)加上函数后则不走IndexJoin:
mysql> explain analyze select /*+ INL_JOIN(a,b) */ count(*) from customer a,orders b where a.c_phone='25-989-741-2988' and a.c_custkey=b.o_custkey and substr(a.C_COMMENT,1,10)=substr(b.O_COMMENT,1,10);
+----------------------------------------+------------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+----------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+----------------------------------------+------------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+----------+---------+
| StreamAgg_13 | 1.00 | 1 | root | | time:345ms, loops:2, RU:2322.550974 | funcs:count(1)->Column#18 | 8 Bytes | N/A |
| └─HashJoin_27 | 1.04 | 0 | root | | time:345ms, loops:1, build_hash_table:{total:3.87ms, fetch:3.85ms, build:13.6µs}, probe:{concurrency:5, total:1.72s, max:344.8ms, probe:165.2ms, fetch:1.56s} | inner join, equal:[eq(tpch1.customer.c_custkey, tpch1.orders.o_custkey) eq(Column#19, Column#20)] | 17.8 KB | 0 Bytes |
| ├─Projection_16(Build) | 1.04 | 1 | root | | time:3.84ms, loops:2, Concurrency:OFF | tpch1.customer.c_custkey, substr(tpch1.customer.c_comment, 1, 10)->Column#19 | 17.5 KB | N/A |
| │ └─IndexLookUp_22 | 1.04 | 1 | root | | time:3.81ms, loops:2, index_task: {total_time: 1.06ms, fetch_handle: 1.05ms, build: 832ns, wait: 1.3µs}, table_task: {total_time: 2.55ms, num: 1, concurrency: 5}, next: {wait_index: 1.33ms, wait_table_lookup_build: 0s, wait_table_lookup_resp: 2.47ms} | | 34.2 KB | N/A |
| │ ├─IndexRangeScan_20(Build) | 1.04 | 1 | cop[tikv] | table:a, index:idx1(C_PHONE) | time:1.05ms, loops:3, cop_task: {num: 1, max: 968.3µs, proc_keys: 1, tot_proc: 40.9µs, tot_wait: 398.6µs, rpc_num: 1, rpc_time: 955.2µs, copr_cache_hit_ratio: 0.00, build_task_duration: 11.9µs, max_distsql_concurrency: 1}, tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 55, total_keys: 2, get_snapshot_time: 11.1µs, rocksdb: {key_skipped_count: 1, block: {cache_hit_count: 2}}} | range:["25-989-741-2988","25-989-741-2988"], keep order:false | N/A | N/A |
| │ └─TableRowIDScan_21(Probe) | 1.04 | 1 | cop[tikv] | table:a | time:2.46ms, loops:2, cop_task: {num: 1, max: 2.34ms, proc_keys: 1, tot_proc: 55.4µs, tot_wait: 88.8µs, rpc_num: 1, rpc_time: 2.33ms, copr_cache_hit_ratio: 0.00, build_task_duration: 32.7µs, max_distsql_concurrency: 1, max_extra_concurrency: 1}, tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 184, total_keys: 1, get_snapshot_time: 19.3µs, rocksdb: {block: {cache_hit_count: 1}}} | keep order:false | N/A | N/A |
| └─Projection_23(Probe) | 1500000.00 | 1500000 | root | | time:331.8ms, loops:1471, Concurrency:5 | tpch1.orders.o_custkey, substr(tpch1.orders.o_comment, 1, 10)->Column#20 | 495.7 KB | N/A |
| └─TableReader_25 | 1500000.00 | 1500000 | root | | time:92.3ms, loops:1471, cop_task: {num: 69, max: 94.7ms, min: 231.5µs, avg: 16.5ms, p95: 63ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 673.7ms, tot_wait: 23.6ms, rpc_num: 69, rpc_time: 1.13s, copr_cache_hit_ratio: 0.58, build_task_duration: 12µs, max_distsql_concurrency: 5} | data:TableFullScan_24 | 15.4 MB | N/A |
| └─TableFullScan_24 | 1500000.00 | 1500000 | cop[tikv] | table:b | tikv_task:{proc max:294ms, min:0s, avg: 61.3ms, p80:89ms, p95:259ms, iters:1737, tasks:69}, scan_detail: {total_process_keys: 893024, total_process_keys_size: 135281341, total_keys: 893053, get_snapshot_time: 1.02ms, rocksdb: {key_skipped_count: 893024, block: {cache_hit_count: 4574}}} | keep order:false | N/A | N/A |
+----------------------------------------+------------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+----------+---------+
9 rows in set, 2 warnings (0.35 sec)
mysql> show warnings;
+---------+------+--------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------------------------------------------+
| Warning | 1815 | Optimizer Hint /*+ INL_JOIN(a, b) */ or /*+ TIDB_INLJ(a, b) */ is inapplicable |
| Warning | 1815 | Optimizer Hint /*+ INL_JOIN(a, b) */ or /*+ TIDB_INLJ(a, b) */ is inapplicable |
+---------+------+--------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
另外,对于a表(很小数据量)join b表(超大数据量)on a.date between (b.date,b.date+1) 这种关联条件都是非等值情况也不支持IndexJoin,但是如果支持IndexJoin,有一些场景下IndexJoin效率要比HashJoin高很多。参考帖子:TIDB 索引失效
希望IndexJoin算法可以支持更多形式的的On Condition条件。
1、希望支持关联条件全是非等值情况下,支持可走IndexJoin。
2、希望支持存在部分关联条件等值on condition(索引)情况下,均支持可走IndexJoin。