指定INL_JOIN,但执行计划并不走

一种 workaround 的方式是把这个查询的 left join 改写为子查询的形式且让优化器不对子查询进行解关联,这种情况下用 Apply 算子去执行(a 每取一行,然后去 b 里找匹配的行),就能利用上 orderso_custkey 索引。

mysql> explain select
    ->     a.C_NAME,
    ->     (
    ->         select
    ->             sum(O_TOTALPRICE)
    ->         from
    ->             tpch.orders b
    ->         where
    ->             b.O_CUSTKEY = a.C_CUSTKEY
    ->         group by
    ->             o_custkey
    ->     ) as price
    -> from
    ->     tpch.customer a use index(customer_idx1)
    -> where
    ->     a.C_PHONE = '23-768-687-3665';
+------------------------------------------+---------+-----------+---------------------------------------+-------------------------------------------------------------------------------------------------------+
| id                                       | estRows | task      | access object                         | operator info                                                                                         |
+------------------------------------------+---------+-----------+---------------------------------------+-------------------------------------------------------------------------------------------------------+
| Projection_11                            | 0.00    | root      |                                       | tpch.customer.c_name, Column#27                                                                       |
| └─Apply_13                               | 0.00    | root      |                                       | CARTESIAN left outer join                                                                             |
|   ├─IndexLookUp_16(Build)                | 1.00    | root      |                                       |                                                                                                       |
|   │ ├─IndexRangeScan_14(Build)           | 1.00    | cop[tikv] | table:a, index:customer_idx1(C_PHONE) | range:["23-768-687-3665","23-768-687-3665"], keep order:false, stats:pseudo                           |
|   │ └─TableRowIDScan_15(Probe)           | 1.00    | cop[tikv] | table:a                               | keep order:false, stats:pseudo                                                                        |
|   └─MaxOneRow_17(Probe)                  | 1.00    | root      |                                       |                                                                                                       |
|     └─StreamAgg_22                       | 2.00    | root      |                                       | group by:tpch.orders.o_custkey, funcs:sum(tpch.orders.o_totalprice)->Column#27                        |
|       └─Projection_40                    | 2.50    | root      |                                       | tpch.orders.o_custkey, tpch.orders.o_totalprice                                                       |
|         └─IndexLookUp_39                 | 2.50    | root      |                                       |                                                                                                       |
|           ├─IndexRangeScan_37(Build)     | 2.50    | cop[tikv] | table:b, index:o_custkey(O_CUSTKEY)   | range: decided by [eq(tpch.orders.o_custkey, tpch.customer.c_custkey)], keep order:true, stats:pseudo |
|           └─TableRowIDScan_38(Probe)     | 2.50    | cop[tikv] | table:b                               | keep order:false, stats:pseudo                                                                        |
+------------------------------------------+---------+-----------+---------------------------------------+-------------------------------------------------------------------------------------------------------+
11 rows in set (0.00 sec)

子查询中是否加 group by o_custkey 语义都是一样的,这里加了 group by o_custkey 是为了让优化器不能进行解关联(把 Apply 转化成 Join)。

或者在 6.4 引入 NO_DECORRELATE hint 之后,也去掉 group by o_custkey 然后加这个 hint 避免解关联。

mysql> explain select
    ->     a.C_NAME,
    ->     (
    ->         select /*+ NO_DECORRELATE() */
    ->             sum(O_TOTALPRICE)
    ->         from
    ->             tpch.orders b
    ->         where
    ->             b.O_CUSTKEY = a.C_CUSTKEY
    ->     ) as price
    -> from
    ->     tpch.customer a use index(customer_idx1)
    -> where
    ->     a.C_PHONE = '23-768-687-3665';
+----------------------------------------+---------+-----------+---------------------------------------+--------------------------------------------------------------------------------------------------------+
| id                                     | estRows | task      | access object                         | operator info                                                                                          |
+----------------------------------------+---------+-----------+---------------------------------------+--------------------------------------------------------------------------------------------------------+
| Projection_11                          | 0.00    | root      |                                       | tpch.customer.c_name, Column#27                                                                        |
| └─Apply_13                             | 0.00    | root      |                                       | CARTESIAN left outer join                                                                              |
|   ├─IndexLookUp_16(Build)              | 1.00    | root      |                                       |                                                                                                        |
|   │ ├─IndexRangeScan_14(Build)         | 1.00    | cop[tikv] | table:a, index:customer_idx1(C_PHONE) | range:["23-768-687-3665","23-768-687-3665"], keep order:false, stats:pseudo                            |
|   │ └─TableRowIDScan_15(Probe)         | 1.00    | cop[tikv] | table:a                               | keep order:false, stats:pseudo                                                                         |
|   └─MaxOneRow_17(Probe)                | 1.00    | root      |                                       |                                                                                                        |
|     └─HashAgg_29                       | 1.00    | root      |                                       | funcs:sum(Column#30)->Column#27                                                                        |
|       └─IndexLookUp_30                 | 1.00    | root      |                                       |                                                                                                        |
|         ├─IndexRangeScan_27(Build)     | 10.00   | cop[tikv] | table:b, index:o_custkey(O_CUSTKEY)   | range: decided by [eq(tpch.orders.o_custkey, tpch.customer.c_custkey)], keep order:false, stats:pseudo |
|         └─HashAgg_19(Probe)            | 1.00    | cop[tikv] |                                       | funcs:sum(tpch.orders.o_totalprice)->Column#30                                                         |
|           └─TableRowIDScan_28          | 10.00   | cop[tikv] | table:b                               | keep order:false, stats:pseudo                                                                         |
+----------------------------------------+---------+-----------+---------------------------------------+--------------------------------------------------------------------------------------------------------+
11 rows in set (0.00 sec)