TiDB优化器未对关联条件中“or”进行改写

对于TPCH表有如下测试语句(该语句无任何意义,仅为考察or改写能力):
select count(a.S_NAME),sum(b.s_acctbal) from supplier a,supplier b where a.S_ADDRESS=b.S_ADDRESS or a.S_PHONE=b.S_PHONE;

查看其执行计划:

mysql> explain select count(a.S_NAME),sum(b.s_acctbal) from supplier a,supplier b where a.S_ADDRESS=b.S_ADDRESS or a.S_PHONE=b.S_PHONE;
+-------------------------------+----------------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| id                            | estRows        | task      | access object | operator info                                                                                                                                   |
+-------------------------------+----------------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| StreamAgg_10                  | 1.00           | root      |               | funcs:count(tpch1.supplier.s_name)->Column#15, funcs:sum(tpch1.supplier.s_acctbal)->Column#16                                                   |
| └─HashJoin_17                 | 10000000000.00 | root      |               | CARTESIAN inner join, other cond:or(eq(tpch1.supplier.s_address, tpch1.supplier.s_address), eq(tpch1.supplier.s_phone, tpch1.supplier.s_phone)) |
|   ├─TableReader_16(Build)     | 100000.00      | root      |               | data:TableFullScan_15                                                                                                                           |
|   │ └─TableFullScan_15        | 100000.00      | cop[tikv] | table:b       | keep order:false                                                                                                                                |
|   └─TableReader_14(Probe)     | 100000.00      | root      |               | data:TableFullScan_13                                                                                                                           |
|     └─TableFullScan_13        | 100000.00      | cop[tikv] | table:a       | keep order:false                                                                                                                                |
+-------------------------------+----------------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.01 sec)

mysql> explain analyze select a.l_partkey from lineitem a cross join lineitem b limit 10;
+-------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+-----------+---------+
| id                            | estRows     | actRows  | task      | access object | execution info                                                                                                                                                                                                                                                                                                                                                      | operator info         | memory    | disk    |
+-------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+-----------+---------+
| Limit_12                      | 10.00       | 10       | root      |               | time:2m46.8s, loops:2                                                                                                                                                                                                                                                                                                                                               | offset:0, count:10    | N/A       | N/A     |
| └─HashJoin_13                 | 10.00       | 1024     | root      |               | time:2m46.8s, loops:1, build_hash_table:{total:17.4s, fetch:10.5s, build:6.87s}, probe:{concurrency:3, total:8m23.6s, max:2m48.7s, probe:7m31.4s, fetch:52.1s}                                                                                                                                                                                                      | CARTESIAN inner join  | 1023.6 MB | 1.34 GB |
|   ├─TableReader_18(Build)     | 59986052.00 | 59986052 | root      |               | time:10.3s, loops:58813, cop_task: {num: 2197, max: 281.3ms, min: 1.1ms, avg: 31.4ms, p95: 68.4ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 56.1s, tot_wait: 9.09s, rpc_num: 2197, rpc_time: 1m9s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 5}                                                                                                 | data:TableFullScan_17 | 1.86 MB   | N/A     |
|   │ └─TableFullScan_17        | 59986052.00 | 59986052 | cop[tikv] | table:b       | tikv_task:{proc max:271ms, min:0s, avg: 25.6ms, p80:42ms, p95:58ms, iters:67264, tasks:2197}, scan_detail: {total_process_keys: 59986052, total_process_keys_size: 2159497872, total_keys: 59988249, get_snapshot_time: 204.3ms, rocksdb: {key_skipped_count: 59986052, block: {cache_hit_count: 17464, read_count: 185633, read_byte: 3.77 GB, read_time: 15.7s}}} | keep order:false      | N/A       | N/A     |
|   └─TableReader_16(Probe)     | 0.00        | 3072     | root      |               | time:10.2ms, loops:3, cop_task: {num: 9, max: 11.9ms, min: 956.6µs, avg: 6.62ms, p95: 11.9ms, max_proc_keys: 992, p95_proc_keys: 992, tot_proc: 22ms, tot_wait: 26ms, rpc_num: 9, rpc_time: 59.4ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 5}                                                                                                             | data:TableFullScan_15 | 48.4 KB   | N/A     |
|     └─TableFullScan_15        | 0.00        | 3552     | cop[tikv] | table:a       | tikv_task:{proc max:9ms, min:0s, avg: 2.78ms, p80:9ms, p95:9ms, iters:32, tasks:9}, scan_detail: {total_process_keys: 3552, total_process_keys_size: 697625, total_keys: 3561, get_snapshot_time: 18.2ms, rocksdb: {key_skipped_count: 3552, block: {cache_hit_count: 71, read_count: 14, read_byte: 750.3 KB, read_time: 10.4ms}}}                                 | keep order:false      | N/A       | N/A     |
+-------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+-----------+---------+
6 rows in set (2 min 48.75 sec)

可以看到还是使用了hashjoin+内部cross join的方式导致hashjoin过程非常缓慢。
如果可以进行union改写则在此场景下效率会快很多。
因此是否应该增强基于代价的执行计划改写,来优化此类问题?

只能说数据库的优化器真的很难,像oracle这种驰骋江湖几十年的老牌数据库也没做到尽善尽美。。。

人无我有,人有我强,以此为目标